Always-on VoiceQ enables the ultimate touch-less user interface for mobile devices
Average mobile phone users reach for their devices up to 150 times per day to check the time, glance at email, play music, and much more. Any task we perform starts the same way: we reach for the device and press a button to activate the screen. VoiceQ technology removes this step, so that consumers can use their devices without touching them, even when the device is asleep or just out of reach.
Audience VoiceQ is a voice sensing technology that enables dependable, Always-on voice detection and actuation for mobile devices equipped with an Audience eS700 series processor. It enables devices to continuously listen to their surroundings and act upon a simple, configurable voice command.
How VoiceQ works in three stages
- Using a low-power, Always-on voice activity detector (VAD), the device continuously listens for a voice signal while staying in an ultra-low power mode.
- Once voice activity is detected, the incoming signal is compared to pre-stored key phrases or triggers. During stages one and two, only the Audience processor and digital microphone are awake. All other components in the device remain in a low-power sleep mode.
- When the key phrase or “wake-up trigger” is detected, the Audience processor wakes the device, indicating the user’s intent to interact with the device via a voice UI.
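The three stages above can be pictured as a small state machine. The following is only an illustrative sketch, not Audience's implementation: the `Stage` names, the `step` function, the energy-based voice check, and the `vad_threshold` value are all hypothetical, and real key-phrase matching works on audio features rather than on already-recognized text.

```python
from enum import Enum, auto


class Stage(Enum):
    LISTENING = auto()  # stage 1: only the low-power VAD is running
    MATCHING = auto()   # stage 2: voice detected, comparing to stored key phrases
    AWAKE = auto()      # stage 3: the rest of the device has been woken


def step(stage, frame_energy, heard_phrase, key_phrases, vad_threshold=0.1):
    """Advance the hypothetical wake pipeline by one audio frame.

    frame_energy  -- assumed stand-in for the VAD decision on this frame
    heard_phrase  -- assumed stand-in for the matcher output (None if nothing yet)
    """
    if stage is Stage.LISTENING:
        # Stage 1: stay in low-power listening until a frame looks like voice.
        return Stage.MATCHING if frame_energy >= vad_threshold else Stage.LISTENING
    if stage is Stage.MATCHING:
        # Stage 2: a stored key phrase wakes the device (stage 3).
        if heard_phrase in key_phrases:
            return Stage.AWAKE
        # Voice stopped without a match: drop back to low-power listening.
        return Stage.LISTENING if frame_energy < vad_threshold else Stage.MATCHING
    # Stage 3: once awake, the host takes over.
    return Stage.AWAKE
```

In this sketch only the first two states correspond to the period in which the Audience processor and digital microphone are the only components awake.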
The Wake-up Trigger
Audience VoiceQ offers the choice of OEM selectable and user selectable wake-up triggers.
OEM selectable wake-up triggers
With this option, OEMs can select a keyword of their own to wake the device. It is speaker independent and does not require any user training.
User selectable wake-up triggers
With this option, users can personalize their mobile device by selecting their own personal wake-up trigger. In this mode, the user trains the system by speaking the key phrase four times in a relatively quiet environment. The user selectable trigger is speaker dependent, which means that once a user has trained the system to their voice, the system recognizes and responds to that user’s voice only.
Support for speaker dependent, user selectable key phrases, together with the capability to store multiple key phrases, allows two or more users to train the same mobile device, which then tells them apart by the unique properties of their voices. This delivers a highly personalized experience. For example, two people could use VoiceQ to access their private email on the same device via voice commands, making it an ideal solution for a shared home computer or family room tablet.
Continuous VoiceQ for a More Natural User Experience
Audience's continuous VoiceQ provides a superior user experience by eliminating the need for a pause between trigger and command. With this feature enabled, a user can simply speak naturally, without having to say the trigger first (e.g. “OK Audience”), wait for the device to respond, and then say the command (e.g. “Check my email”).
As before, VoiceQ technology continuously listens for the trigger phrase while the application processor is in low-power sleep mode. In continuous mode, once the Voice Wake trigger phrase is detected (e.g. “OK Audience”), a signal is sent to the application processor to wake up. In the meantime, the actual command (“Check my email”) is buffered on-chip.
Audience then employs its ASR Assist technology to help interpret the command. Because ASR Assist technology dramatically reduces background noise while preserving useful speech, it provides the speech recognition engine with a clean voice signal from which to determine what command has been given and to deliver first-time successful task completion.
Once the host has woken up and is ready to receive the audio data, the Audience processor sends both the buffered and real-time audio data to the host. The host then either applies local ASR or streams the audio data up to a cloud-based ASR system.
The whole process ensures a seamless transition between trigger key phrase and the command, providing a natural voice user interface, while maintaining low-power consumption.
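The buffering behavior described in this section can be sketched as a bounded queue that holds audio frames until the host is ready, then delivers the backlog ahead of live audio. This is a minimal illustration under assumptions: the `CommandBuffer` class, its `max_frames` capacity, the `host_ready` flag, and the `send_to_host` callback are hypothetical names, and the actual on-chip buffer size and host interface are not described in the text.

```python
from collections import deque


class CommandBuffer:
    """Holds audio frames captured while the host is still waking up."""

    def __init__(self, max_frames):
        # A deque with maxlen silently drops the oldest frame when full,
        # which models a fixed-size on-chip buffer (capacity is an assumption).
        self._frames = deque(maxlen=max_frames)
        self.host_ready = False

    def on_frame(self, frame, send_to_host):
        """Handle one audio frame arriving after the trigger phrase."""
        if not self.host_ready:
            # Host is still asleep: keep the command audio on-chip.
            self._frames.append(frame)
            return
        # Host is awake: deliver the buffered frames first, in order,
        # then the current real-time frame.
        while self._frames:
            send_to_host(self._frames.popleft())
        send_to_host(frame)
```

A short usage example, with strings standing in for audio frames:

```python
received = []
buf = CommandBuffer(max_frames=4)
buf.on_frame("check", received.append)   # buffered, host still waking
buf.on_frame("my", received.append)      # buffered
buf.host_ready = True
buf.on_frame("email", received.append)   # backlog flushed, then live frame
```

Because the backlog is flushed before the live frame, the host sees the command as one uninterrupted stream.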
Audience VoiceQ can be applied to all major spoken languages with appropriate language modeling.
False acceptances are one of the largest factors affecting the effectiveness of voice wake technologies. During a false acceptance, the display, radio, and speech recognition engine may be on for 30-60 seconds before the device goes back to sleep. In fact, just one false acceptance per hour can add up to 12-24 minutes of additional, unintended phone use per day.
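The 12-24 minute figure follows from simple arithmetic if it is read as a daily total: 24 false acceptances per day, each keeping the device on for 30 to 60 seconds. A minimal sketch of that calculation follows; the function name and the 24-hour default are assumptions made for illustration.

```python
def extra_minutes_per_day(false_accepts_per_hour, seconds_on_per_accept, hours_per_day=24):
    """Unintended on-time per day, in minutes, caused by false acceptances."""
    total_seconds = false_accepts_per_hour * hours_per_day * seconds_on_per_accept
    return total_seconds / 60


low = extra_minutes_per_day(1, 30)   # 24 * 30 s = 720 s  = 12.0 minutes
high = extra_minutes_per_day(1, 60)  # 24 * 60 s = 1440 s = 24.0 minutes
```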
Audience VoiceQ provides best-in-class accuracy, a result of lowering the False Rejection Ratio (FRR), which measures how often a system fails to recognize a command.
Audience also provides the lowest system power consumption, due both to the tight link between the custom Audience hardware and our in-house developed VoiceQ algorithm and to its performance in minimizing the number of false detects, i.e. the False Acceptance Ratio (FAR).
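As a rough illustration of the two metrics, the sketch below computes both ratios from a list of labeled test trials. The exact definitions Audience uses are not given in the text, so the ones here are assumptions: FRR is taken as missed wake-ups divided by the number of trials in which the trigger was spoken, and FAR as unwanted wake-ups divided by the number of trials in which it was not. The function name and trial format are hypothetical.

```python
def rejection_and_acceptance_ratios(trials):
    """Return (FRR, FAR) from (trigger_spoken, device_woke) boolean pairs."""
    trials = list(trials)
    # Outcomes for trials where the trigger phrase really was spoken.
    genuine = [woke for spoken, woke in trials if spoken]
    # Outcomes for trials containing other speech or noise only.
    other = [woke for spoken, woke in trials if not spoken]
    # False rejection: trigger spoken but the device did not wake.
    frr = genuine.count(False) / len(genuine) if genuine else 0.0
    # False acceptance: no trigger spoken but the device woke anyway.
    far = other.count(True) / len(other) if other else 0.0
    return frr, far
```

Lowering the first ratio improves accuracy from the user's point of view, while lowering the second avoids the unintended wake-ups that cost power.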