Recently we were prototyping a voice user interface (VUI). Personally, I approached this topic with a certain arrogance. I have interacted with a few voice assistants, and it is safe to say that they have come a long way (Siri at one point refused to understand my accent), but let’s face it, we are still not satisfied. I reasoned that since I knew the pain points, I had the right thinking to create a voice interaction that would be nothing short of perfect. Wrong! This design exercise gave me a deeper understanding of human behaviour and how small nuances make all the difference.
This little read highlights a few of the struggles and thoughts I had while trying to design the perfect interaction.
Higher expectations - More chance for disappointment
Human to human: In real life, you walk around and ask people a question or pass a comment as a thought occurs to you. Your expectation at this point is satisfied if the person simply acknowledges your comment or thought. This is not the case when you are carrying five bags and trying to open the door while replying to a text message, and your friend walks by without even offering to help. You are now disappointed in the friend for not being more considerate.
Human to Voice assistant: You approach the voice assistant like you would your friend when you are holding the five bags: you ask for something very specific and you expect it to be done now. If the task is done well, you make no comment; you smile and think how much you love technology. But if the message from your assistant is ‘I am sorry I didn’t quite get that’, you are furious and cannot understand why technology cannot be more intelligent. You are now disappointed.
Do I really talk like that?
Human to human: Imagine a really stressful situation. There is a deadline coming up for your team, and you haven't had much rest or food. You need multiple files to come together to finish the job. Communication at this point should be at its absolute best, but very often this is exactly when you strip your polite emails down to a few words that represent an abstract and cluttered thought. You rely heavily on the people in your team, who understand perfectly what you mean and in turn give you exactly what you are looking for. Job done.
Human to Voice assistant: I found the interaction to be a bit different with voice assistants. Here you must know exactly what you want and, more importantly, be able to communicate it with absolute clarity to get your result. This is often frustrating, as we have an idea of what we want but are not necessarily very good at communicating it. This is not the result of a badly designed piece of technology; it has more to do with improving the communication channels between human and technology.
It’s all in the personality
Human to human: Humans tend to run on emotions. Many factors can affect the way a person feels, and therefore the way they respond. Based on how we feel, we carefully select who we want to interact with and how we want to interact with them. We are rational beings that often respond and behave irrationally. When we interact with fellow irrational humans, we get some sympathy or we end up in a fight, but we know and understand both of these situations, and they are a natural part of our daily interactions.
Human to Voice assistant: The current model is that one personality fits all. The big focus at the moment is to design a personality that is witty or sympathetic, one whose responses sound more like a human conversation. This is interesting for sure, but it is still not going to be enough, as we don't need one personality but several. When you want to get something done it might not be ideal to have a grumpy voice assistant, but it would definitely feel much more like a real interaction if one could elicit multiple personalities from a voice assistant, as opposed to the always polite and diplomatic one.
- - - - - - - - - - - - -
As designers, it is always interesting to see how we have to continually improve the experiences we create. Voice interaction is a particularly interesting field as it has more challenges to address. How do you show a response to an action? How do you ensure privacy? How can it be a less embarrassing interaction? As we look into the needs from a user’s perspective, we hope to address the concerns and problems that exist today and take this interaction forward so that it looks and feels closer to perfect.