At Dashbot’s Superbots conference in San Francisco this month, Voicebot had the chance to catch up with Stephane Nguyen, VP of Product and Engineering at Assist. The company is part agency, part technology provider, and fully focused on bots of both the chat and voice varieties. Assist has also developed some of the better-known Alexa skills and Google Actions for high-profile clients.
Tell me about Assist. When were you formed? What is your focus? Who do you serve?
We started in January 2014, raising a $1.5 million seed round and then a $4 million Series A in April. At Assist we support four platforms: Twitter DMs, Facebook Messenger, Google Assistant, and Amazon Alexa. We really focus on the friction. The copy is important, but people are driven by the friction. We want to make the experience better and faster for users. Our philosophy is really to be curious and work with people on how to make things better. And we are growing, so we are hiring.
What are some chatbot and Voice skills you have built?
Lonely Planet was one of the first Actions on Google Home. 1-800 Flowers was featured in the Super Bowl ad for Alexa. There are a lot more coming up. We are working on the next wave of experiences that are going to be released soon.
Do you work with brands predominantly?
We like corporate brands like everyone. But, it is more about working with people we love and partnering with the right people. Curiosity means everything to us. It is ingrained in our culture. We love to discover everything together with our clients and even with the platforms. We are focusing on developing partnerships with the platforms as well to discover how people are interacting.
What is the difference between chat and voice?
The ability to change your mind at any time is really important for both. When you think about how people talk, they change their mind all of the time and go back into the conversation. It is never acceptable to make users say “go back,” like in IVR (interactive voice response). We want to always move conversations forward without moving back. This is the Random Access Navigation that I discussed earlier. You never go back. You are always going forward and always able to change your mind. On voice this approach works well. But the ability to change your mind is really not a difference between the two.
In addition, we thought that NLU (natural language understanding) applied to a single query was the holy grail for development in chat. We would interpret a block of text as one query. It turns out that is not how people actually text. They text in snippets, change their minds, and add fragments. So, developers process each input individually, but people text in groups. We asked if there was a way to group user inputs before answering. We created an act of listening. When you use a typing indicator, it is a signal to wait and listen. This can be a signal to both the bot and the user. You wait a certain amount of time after the user finishes a group of text. While we don’t text groups of snippets on voice, we could code an act of listening where the device waits until you are done talking.
How do you reduce friction?
Yes it’s about design, but also about observation. When you build for visual you don’t pay attention to how people talk to each other. So, how do you implement this to set expectations [for users] on what the UI should be?
Okay. Then how do you set the expectations?
On voice this is something we are currently thinking about. We don’t have all of the answers. There is not a silver bullet. Each experience is different. But we are designing with that in mind – how people interact and how to always move the conversation forward.
What is your background?
I’m an engineer, but I have always had artistic things in my life. I was a fashion photographer in Paris for three years. Because of that, I always wanted to understand the front end to build a better backend. When we built our dashboard, which allows you to better understand the performance of your bot, it was about how you can improve the signal.
Are you building the dashboard for your customers to use?
We are. We are just not opening it up to an external audience. We can use it to provide better service to our customers.
What is your view of the voice assistant experience on the phone vs the smart speaker?
When we started thinking about the voice assistant on the phone we wanted to think about the environment. At home, you are mostly in a quiet environment. On the phone, you want to interact wherever you are. Voice on mobile devices is challenging. You ask about a restaurant and there is a truck going by. The environment is a concern.
Voice is also challenging when you have special characters, for example in email addresses. You have the @ symbol and the dot. We also know that voice-to-text is satisfying and easy to edit afterward. So, these are two challenges to keep in mind: user input for special characters and the environment.
What is Assist’s aspiration?
Removing the friction is everything to us. Conversational commerce is coming. Conversational commerce will be way better [than what we have today]. Removing the friction while doing a simple transaction is a main focus right now. [We expect Assist to be] more of a platform type of company, tying an environment and framework together. We will have products and raise them to be a platform.
What is your favorite Alexa skill?
Philips Hue and Harmony. You cannot turn on the lights in my home without voice. You cannot turn on the TV without voice. I am using it every day. I don’t have a choice.