Amazon introduced cloud-based Alexa wake word verification last week to little fanfare. This type of announcement will never capture as much media attention as introducing the Echo Show or payments to game developers, but it is just as important for third-party developers using Alexa Voice Service (AVS). In case you don’t know, a “wake word” or “hot word” is what you say to put your voice assistant into listening mode. For Amazon it is “Alexa,” “Echo,” or “Amazon.” For Apple it is “Hey Siri.” Google uses “Okay Google.”
Here is the problem. Sometimes other words sound like the standard wake word. Developer documentation suggests that terms such as “Alex,” “election,” and “Alexis” can cause Alexa to inadvertently wake up and start listening for a request. That’s not a great user experience. The question is how to fix it given how many phonetically similar phrases are out there.
Amazon’s New Two-Step Solution
Historically, wake word activation has been handled locally by the Wake Word Engine (WWE) software on the device, with the requests that follow handled by cloud resources. Because the local engine has less compute power than the cloud, it is less able to handle edge cases that trigger false positives.
The new solution uses the local WWE to determine whether a wake word has been said, then uses the cloud WWE to verify whether it was actually the wake word. Ted Karczewski from Amazon wrote in a blog post last week:
With this update, the wake word engine (WWE) on the device handles the initial detection of “Alexa”, and then a secondary cloud-based check verifies the utterance. If a false wake word is detected, the verification process directs the device to close the audio stream and turn off the LED indicator.
In this way, local compute resources are used to identify a likely wake word utterance, and the more robust cloud resources then verify whether the local interpretation was correct. If the local service detects a suspected wake word utterance, the device activates and waits for a request from the user. If you have an Echo, you can try it out by saying “election.” The blue ring will likely activate, showing that Alexa is listening. However, the cloud WWE resources then evaluate the utterance and, if it is not the actual wake word, shut down the session. In our “election” example, the blue ring will disappear once the cloud service identifies the false trigger.
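The two-step flow described above can be sketched in a few lines of Python. This is a toy simulation, not AVS code: the names Device, local_wwe_detects, cloud_verifies and handle_utterance are hypothetical, and matching on spelled-out words stands in for real acoustic detection. The list of confusable words comes from the examples cited earlier in this article.

```python
from dataclasses import dataclass

WAKE_WORD = "alexa"
# Assumption: words the local engine mistakes for the wake word,
# taken from the examples quoted in this article.
LOCAL_FALSE_POSITIVES = {"alex", "election", "alexis"}


@dataclass
class Device:
    """Hypothetical device state: LED ring and audio stream to the cloud."""
    led_on: bool = False
    stream_open: bool = False


def local_wwe_detects(word: str) -> bool:
    """Stand-in for the on-device engine: cheap and over-eager."""
    w = word.lower()
    return w == WAKE_WORD or w in LOCAL_FALSE_POSITIVES


def cloud_verifies(word: str) -> bool:
    """Stand-in for the cloud check: stricter than the local engine."""
    return word.lower() == WAKE_WORD


def handle_utterance(device: Device, word: str) -> str:
    # Step 1: local detection decides whether to wake at all.
    if not local_wwe_detects(word):
        return "ignored"
    device.led_on = True
    device.stream_open = True

    # Step 2: cloud verification confirms or rejects the wake-up.
    if cloud_verifies(word):
        return "listening"

    # False wake: close the audio stream and turn off the LED.
    device.stream_open = False
    device.led_on = False
    return "rejected"


if __name__ == "__main__":
    for spoken in ("Alexa", "election", "hello"):
        print(spoken, "->", handle_utterance(Device(), spoken))
```

In this sketch the LED turns on as soon as the local check passes, which mirrors the brief flash of the blue ring before the cloud rejects “election.”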
A Step in the Right Direction
Travis Teague, the manager of the Alexaslack group, commented:
It is a step in the right direction. I think the Amazon cloud wake word is really going to push a lot of projects that are still using the old API to move up to the new one. It will be interesting to see how this stands up to other wake-word engines such as Snowboy which have the capability of custom wake word support.
Snowboy is a Kitt.ai project that enables users to create custom wake words. This is a solution for developers who want to differentiate their custom products from Amazon, Apple and Google offerings.
When working with your own wake word through Snowboy, tuning and optimization are up to you. If you are working with AVS, Amazon has added this feature to improve the user experience. And, as Mr. Teague notes, it is an incentive for third-party manufacturers that have embedded AVS into their products to move to the new API, which supports the feature. There are apparently many developers still on an older, deprecated API.
If You Are Embedding Alexa, You Should Update Your Code
There are two takeaways from this announcement. First, Amazon is continually working to improve the user experience for Alexa users. Wake word verification might not have been a top feature request, but it is a worthwhile improvement to product performance. Second, developers of IoT devices that utilize AVS need to make some code changes to enable the cloud-based wake word verification. That may also mean updating to the new API to take advantage of the feature. Release notes on the update can be found here.