Many people were frustrated when Amazon Alexa and Google Assistant weren’t immediately available in the UK, Australia and India after their respective launches in the U.S. The devices clearly understood English, so why not roll out in other English-speaking countries? It was commonly assumed that voice assistant localization challenges were primarily related to accent recognition. Accents do pose problems, even within a single country that has regional variations. However, localization involves much more than accents. You should think about this problem in two parts: automated speech recognition (ASR) and natural language understanding (NLU). Speech and accent recognition fall under ASR. Recognizing the correct intent behind a phrase like “independence day,” which refers to a different holiday in each country, along with the incorporation of foreign words in daily discourse, grammar variants and colloquial expressions, requires a specialized NLU.
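To make the ASR/NLU split concrete, here is a minimal sketch in Python. It assumes the ASR step has already produced a correct transcript and shows why the NLU step still needs per-country knowledge. The function name, the locale table and its structure are hypothetical illustrations, not how Alexa, Google Assistant or Siri are actually built.

```python
from datetime import date

# Hypothetical per-locale knowledge an NLU layer would need.
# Keys are language-region tags; values are (month, day).
INDEPENDENCE_DAY = {
    "en-US": (7, 4),   # United States
    "en-IN": (8, 15),  # India
}

def resolve_independence_day(transcript, locale, year):
    """Toy NLU step: map an already-transcribed phrase to a locale-specific date.

    Returns None when the phrase is absent or the locale has no entry,
    which stands in for a market that has not been localized yet.
    """
    if "independence day" not in transcript.lower():
        return None
    month_day = INDEPENDENCE_DAY.get(locale)
    if month_day is None:
        return None
    return date(year, *month_day)

# The same transcript resolves differently depending on the locale.
question = "When is Independence Day?"
for loc in ("en-US", "en-IN", "en-GB"):
    print(loc, resolve_independence_day(question, loc, 2018))
```

The transcript is identical in all three cases, so perfect accent recognition alone would not help. The answer depends entirely on the locale-specific data behind the NLU step, and a locale with no entry gets no answer at all.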
What does this mean when forecasting voice assistant roll-outs into new languages? One of the most obvious implications relates to the world’s second most spoken native language. Google Assistant announced Spanish language support earlier this week, but only for Spain, Mexico and the U.S. Spanish is the official language of 20 countries. You should not expect all of them to receive Alexa or Google Assistant support at once. Peru is very different culturally from Mexico, Spain and Argentina. This is why Google Assistant is already rolling out Spanish support in waves.
Language Localization Goes Beyond Accent Recognition
Google has a significant advantage over Amazon thanks to the speech-to-text ASR capabilities that support the company’s voice search products. In August, Google expanded speech recognition to another 30 languages, bringing the total to 119. When Google Assistant enters a new country, the ASR is largely in place. Google Assistant group product manager Brad Abrams also suggested that the company has a head start on NLU.
“It’s actually really challenging to roll out in the different languages if you think about what has to come together in terms of the speech recognition that has to be there, the speech synthesis that has to be there, the knowledge graph understanding of large portions of the internet in that language. So, all of that has to come together. I think we are in a really fortunate place. Because we’ve had Google search in all of those languages for so long that it has given us an opportunity to build up the technology. So, really we’re not building wholesale new technology as we roll out the Assistant, we’re just leveraging this long history we have with search… Even a lot of the NLU, like we’ve built a lot of this technology because we have to understand user queries to be able to give them good answers.”
Amazon Faces More Challenges in Language Support
Amazon can still introduce Alexa into new languages. The lack of ASR models and NLU already in production just means that Amazon has more work to do when introducing language support to a new country. That is why Voicebot originally predicted that Alexa would roll out in Japanese after English and German. The logic was simple. Amazon already had Japanese support in its Fire TV products. The ASR was there along with some NLU. Bringing Alexa to French, Spanish, Hindi and other languages requires more work. This is why it was so critical for Amazon to have an early lead in launching Alexa. The company needed more time to build out its technology stack and establish voice assistant distribution.
Apple Has a Lot of Speech Assets
Another company that has a lot of ASR models already in production is Apple. The NLUs may not be as advanced as what Google has to draw from in its search products, but the ASR models already support 21 languages localized for 36 countries. Each of these ASR language models has been refined over many years. If you are looking at what assets Apple can leverage to catch up despite a late start moving Siri into more robust voice assistant use cases, language support is critical. The other asset is a large global user base with existing Siri distribution through Apple products. Apple’s moves in voice so far have been lackluster. The limitations of Siri integration on HomePod are an illustrative example. However, it’s too early to count Apple out because of its extensive ASR assets. On the other end of the spectrum, Google is a clear leader. Its assets in ASR and NLU were critical in helping Google catch up quickly to a surging Amazon Alexa.