Easy-to-Integrate Voice Control for Embedded Design

In this blog post, we introduce Digi ConnectCore® Voice Control, a new addition to Digi's family of embedded solutions that enables voice processing on devices at the network edge, with no cloud connectivity required. Voice integration is of keen interest in product design today for many reasons, and we expect growth in this space as applications across vertical industries adopt interactive speech recognition.

Ease of use often sets a successful product apart from the “also-rans” in the marketplace. For OEMs building solutions with embedded computing capabilities, that often comes down to creating an intuitive, user-friendly product interface. And interfaces don’t get much more user-friendly than voice-controlled device operation.

The benefits of voice control include better hygiene, rapid interaction between humans and machines, precision operation, and more. And processing at the edge reduces connectivity costs and data privacy concerns, while providing faster response times than what is possible with cloud-based voice processing.
 

Getting Humans and Machines on the Same Page  

Automated External Defibrillator voice control example
Many products with embedded computing require user input, and display information that must be understood or acted on by device users. This portion of product functionality is known as the Human Machine Interface (HMI). Today, the HMI is typically provided through display screens, and user input methods have evolved from buttons, mice and keyboards to touchscreens that mimic the operation of our smartphones.  

As of 2022, most users expect a smartphone-like interface on electronic products. But for OEMs, this can be difficult and expensive to develop in Embedded Linux and requires talented UI developers and additional graphical user interface (GUI) software tools to build. While the software may be open source, more powerful tools typically require the purchase of a development environment and licensing for devices.  

In addition, touchscreen hardware for the finished product is expensive and adds significantly to the bill of materials (BOM) cost for embedded products. A glass display is easily broken or damaged in everyday use in industrial environments, requiring expensive repair or replacement. Device makers in the medical and food industries face a further issue: hygiene, and the problem of bacteria on surfaces being transferred between users.

Finally, most touch/display products designed for the smartphone market do not deliver the long service life (10+ years) expected of commercial or industrial products. 

Voice Control — the Ideal Human Machine Interface 

Human Machine Interaction example
The ideal answer to many of these issues lies in voice control. Voice-controlled devices allow users to interact with a device at a distance, even when they cannot see what they are interacting with. This means that their concentration and focus can be on the task at hand, rather than on the device.

Speech is also a very efficient form of data input. Most people speak at around 150 words per minute, compared with an average typing speed of 40 words per minute. Combined, these two benefits allow users to make relatively complex requests quickly.
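To put those two rates in perspective, here is a minimal sketch of the arithmetic. The 150 and 40 words-per-minute figures come from the text above; the 30-word request length and the function name are assumptions chosen purely for illustration.

```python
# Speaking and typing rates quoted in the text (words per minute).
SPEECH_WPM = 150
TYPING_WPM = 40


def seconds_to_enter(words: int, words_per_minute: float) -> float:
    """Return the time in seconds needed to enter `words` words at the given rate."""
    return words / words_per_minute * 60


# Hypothetical 30-word request, chosen only for illustration.
request_words = 30
spoken = seconds_to_enter(request_words, SPEECH_WPM)
typed = seconds_to_enter(request_words, TYPING_WPM)
speedup = SPEECH_WPM / TYPING_WPM

print(f"spoken: {spoken:.0f} s, typed: {typed:.0f} s, speedup: {speedup:.2f}x")
```

At these rates, speaking is 3.75 times faster than typing, whatever the length of the request. Real-world figures will vary with the user, the vocabulary, and the input device.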

Voice control provides significant advantages in industrial applications where, for example, it can increase the safety of users who are free to focus on the ultimate task rather than on controlling a device through touch interaction. In a medical setting such as an operating room, voice-controlled devices allow touchless interaction, which helps to avoid the transfer of bacteria.  

Introducing Digi ConnectCore Voice Control 

Digi ConnectCore Voice Control
Digi ConnectCore Voice Control is a ready-to-use software solution that is pre-integrated into Digi Embedded Yocto, for use with the Digi ConnectCore family of System-on-Modules (SOMs). ConnectCore Voice Control provides real-time voice recognition and text-to-speech capabilities with a customizable wake word, a customizable 60,000-word vocabulary, and support for 30 national languages.

ConnectCore Voice Control brings full voice processing at the IoT edge to any device with a Digi ConnectCore module, enabling zero-touch user interaction with the device. It does not require hardware-based AI/ML accelerators to operate, so product developers can add voice capabilities without additional hardware costs, beyond off-the-shelf microphones and speakers. 

Voice Processing Works Better at the Edge 

Voice control for parking meters
Why do the processing at the IoT edge? When you use popular consumer voice control applications like Apple Siri or Amazon Alexa, you might have noticed a slight delay in the interaction, even when the device is right in your hand or on the kitchen counter. That delay occurs because the processing behind nearly all consumer voice applications is performed in the cloud.

While a few tenths of a second delay may not be a problem if you’re selecting a song or sending a text message, that latency can make voice control less effective in the flow of information, or when making precise adjustments. Needless to say, any interruption in connectivity to the cloud makes the problem worse.  

ConnectCore Voice Control, however, performs its voice processing locally, at the edge, enabling real-time performance with reaction times of less than 100 milliseconds, in contrast to the variable latencies of voice processing in the cloud. It also eliminates the connection costs of cloud-based solutions.

30 Languages, 60,000 Words  

Most voice control applications on the market operate in only two languages: English and Mandarin Chinese. ConnectCore Voice Control can communicate in 30 national languages, providing a great advantage when developing a product for global deployment.

Processing data locally virtually eliminates the privacy and security issues that arise when transferring data to cloud services over the network. Voice data stays private because the device never needs to connect to the Internet to process it. ConnectCore Voice Control is compliant with the European Union’s General Data Protection Regulation (GDPR), another key benefit for global deployment.

Voice Control Use Cases 

Voice control for industrial use cases
Voice control is a valuable capability in any number of use cases. Because most people speak at approximately 150 words per minute but type at an average of 40 words per minute, there is enormous value in improving speed and precision in a range of Human Machine Interaction scenarios. Here are some examples:

  • Smart City and retail
    • Parking meters
    • Informational kiosks or terminals that provide wayfinding or event information
    • Vending machines
  • Industrial operations
    • Industrial crane control with voice allows the crane operator to watch the materials being moved, rather than a control unit
    • Robot control that enables users to initiate operations with preset commands
    • Process control, e.g. in harsh environments where gloves are needed and touchscreens do not work well
    • Measurement devices with voice interaction to collect sensor readings and other measurement data
    • Technician work reports and data collection
  • Medical and healthcare
    • Operating room devices — interacting with devices via voice offers advantages in convenience and hygiene over touchscreens or keyboards 
    • Home healthcare — nurse log keeping for medication, treatment, etc. 
    • Voice-prompted medical checklists in hospitals, e.g. to prepare or check patients prior to treatments
    • Clinical notes transcription offers greater efficiency
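
Several of the use cases above rely on preset commands. As a rough sketch of how an application might map recognized phrases to actions, the example below assumes the speech recognizer delivers each utterance as a plain text string. The `CommandDispatcher` class, the phrases, and the handlers are all hypothetical and are not part of the ConnectCore Voice Control API.

```python
from typing import Callable, Dict


def normalize(text: str) -> str:
    """Lowercase the text and collapse whitespace so matching is forgiving."""
    return " ".join(text.lower().split())


class CommandDispatcher:
    """Maps preset phrases to handler functions (hypothetical helper)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], str]] = {}

    def register(self, phrase: str, handler: Callable[[], str]) -> None:
        """Associate a preset phrase with the action to run when it is heard."""
        self._handlers[normalize(phrase)] = handler

    def dispatch(self, recognized_text: str) -> str:
        """Run the handler for the recognized phrase, or report an unknown command."""
        handler = self._handlers.get(normalize(recognized_text))
        if handler is None:
            return "unknown command"
        return handler()


# Illustrative commands only; a real product would drive actual hardware here.
dispatcher = CommandDispatcher()
dispatcher.register("start conveyor", lambda: "conveyor started")
dispatcher.register("stop conveyor", lambda: "conveyor stopped")

print(dispatcher.dispatch("  Start   Conveyor "))
print(dispatcher.dispatch("open the hatch"))
```

Restricting the device to a fixed set of registered phrases keeps its behavior predictable, which matters in settings such as crane or robot control, where an unrecognized utterance should be rejected rather than guessed at.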

Adding Digi ConnectCore Voice Control to Your Next Product 

Digi ConnectCore Voice Control concept
For OEM developers considering a voice interface for their next product, either as a current feature or as a future enhancement, Digi ConnectCore Voice Control provides pre-integrated, ready-to-use software for developing on Digi ConnectCore modules.  

The development software is available for download on the Digi ConnectCore Voice Control documentation website. As part of the download, Digi provides a single software license for evaluation and development to customers who have already purchased a Digi ConnectCore 8M Nano Development Kit. (For deployment, OEMs can purchase licenses from the software vendor or through Digi for each device they sell.) This software download can be used to develop a proof of concept, to demonstrate voice capabilities and to design the voice control application for a new customer product.

To learn more, download the Digi ConnectCore Voice Control data sheet.
