The Furhat Robot uses humankind’s oldest interface – literally the face itself – and as humans, we are hardwired from birth to interact with faces. This means that as social robot skill developers, we have some catching up to do. One of the main components of an interaction with Furhat is speech. In this blog post we have summarized five key guidelines that we keep in mind when designing interactions.
1. Less is more, probably less than you think
Using phrases that are too long is the most common mistake we see when people try to create interactions with Furhat. What may feel like a very short paragraph in writing will sound a lot longer when you hear a robot say it. Don’t believe me? Try reading this paragraph out loud in a monotone voice.
It is always worth spending some extra time working on the speech rate, pauses and expressivity of the voice. TTS voices, such as Amazon’s neural voices, are improving rapidly, but for now they are still limited. If you don’t put in the effort, the robot will sound far more monotonous than a human would, or most humans anyway…
If the speech is long and flat, the user will lose interest. We all have that friend who just goes on and on and on and on… Quicker turn-taking is one way to remedy this. Give the user a chance to talk as well, or the conversation turns from a dialogue into a one-way flow of information. But generally, keep it short.
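As a rough illustration of working on rate and pauses, here is a minimal Python sketch that wraps a few short sentences in standard SSML, with a slightly slower speaking rate and a pause between sentences. The function name `to_ssml` and the default values are our own assumptions for this example, and whether a given voice honours these tags depends on the TTS engine you use.

```python
from xml.sax.saxutils import escape


def to_ssml(sentences, rate="95%", pause_ms=300):
    """Join short sentences into one SSML string with pauses in between.

    `rate` and `pause_ms` are illustrative defaults, not recommended values.
    """
    # A <break> between sentences gives the listener a moment to process.
    pause = f'<break time="{pause_ms}ms"/>'
    # Escape characters such as & and < so the markup stays well-formed.
    body = pause.join(escape(sentence) for sentence in sentences)
    return f'<speak><prosody rate="{rate}">{body}</prosody></speak>'


print(to_ssml(["Hi there.", "How are you?"]))
```

Splitting the text into a list of sentences also makes it obvious when a single turn is getting too long.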
2. Remember, you’re talking
It’s not only the length of what is being said that is important, it’s also the type of language you use. Most interactions with a social robot take the form of a conversation, and this means that the language style should also be conversational – which often means a more casual style than formal, written communication. This will of course vary depending on the type of application and the robot’s character, but even serious applications such as medical screening or banking need to adopt a conversational tone.
Also consider what type of content actually works with speech. Some things are better communicated in a different way, and legal texts are a particularly poor fit. We’ve had projects where legal departments wanted Furhat to read a written disclaimer or terms of service, and so far we’ve not seen anyone manage to make it bearable without cutting it by 90%. Try having the robot refer to a printed note or text on a screen instead, or the user might leave before the actual conversation has even started.
Trying to be correct often leads to language that is too formal. Human speech is full of grammatical errors and hesitations, which can feel strange to add to your code on purpose. But unless you want to sound really robotic, don’t be afraid to include imperfections.
3. Don’t explain how language works
Imagine having a conversation with a human who begins your meeting by instructing you “If you want to continue, say ‘continue’” or “If you want to stop, say ‘stop’”.
If you need to start by explaining which words to use, you’re not doing it right.
So what should you do instead? Well, one of the advantages of a voice-based interface is that we can express ourselves as we normally do. So for example, instead of:
“To repeat a question, say repeat” / “To continue, say continue”
You can say something like:
“You can always ask me to repeat a question” / “Let me know if you want to continue”
Sounds better, doesn’t it? And by doing this you’re still priming the user to use the words “repeat” and “continue”, without explicitly telling them to. They are more likely to use those words in the first place, or to recall them if they struggle, and they can still use their own words.
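To make the “own words” point concrete, here is a minimal Python sketch that maps several natural phrasings to the same intent instead of requiring one exact command. The keyword sets and the `match_intent` helper are hypothetical and purely illustrative; a real skill would normally rely on its platform’s intent recognition rather than simple word matching.

```python
import re

# Hypothetical phrasings per intent, for illustration only.
INTENT_KEYWORDS = {
    "repeat": {"repeat", "again", "pardon"},
    "continue": {"continue", "next", "go on", "move on"},
}


def match_intent(utterance):
    """Return the first intent whose keyword appears in the utterance, else None."""
    # Lowercase and drop punctuation so "Again?" and "again" match alike.
    text = " ".join(re.findall(r"[a-z']+", utterance.lower()))
    padded = f" {text} "
    for intent, keywords in INTENT_KEYWORDS.items():
        # Pad with spaces so we only match whole words or whole phrases.
        if any(f" {keyword} " in padded for keyword in keywords):
            return intent
    return None


print(match_intent("Could you say that again?"))
print(match_intent("Let's move on."))
```

The primed words “repeat” and “continue” are covered, but so are the variations users tend to come up with on their own.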
4. Don’t be a form, it’s boring
If the user can go through the entire interaction only by saying yes or no, it’s probably not going to be an engaging conversation. Change things around, and get creative!
An exception would be if the robot needs to handle many people in quick succession, for example with Furhat as a queue manager at an airport or in a busy reception. But even then you can change things around with small tweaks. Instead of “Are you a member?”, ask something like “So, are you a visitor or a member?”
Include parts that are just there to improve the conversation, even if you don’t really care about the answer.
5. Variance!
No one likes to hear the same line over and over, no matter how well written it is.
“Say things in different ways!” / “Add variance!” / “Don’t keep repeating yourself!” / “Don’t say the same thing over and over!” / “More is more when it comes to variation!”
Adding variance is particularly important for phrases that are used fairly often, like the robot’s response if it doesn’t understand the user. It can also be good to keep in mind for situations when a user has multiple interactions or if another user could be watching.
Think of Furhat in the reception of a co-working office, for example. Members might interact with Furhat on a regular basis, and while one visitor is checking in, the next might be standing behind them, overhearing the interaction. If the exact same words are used again, the user will become painfully aware that it’s a robot talking.
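One simple way to add variance in code is to keep a handful of phrasings for each frequently used line and pick one at random, while making sure the same one is never used twice in a row. The sketch below is in Python and the `PhraseRotator` class and its example phrases are our own illustrative assumptions, not part of any SDK.

```python
import random


class PhraseRotator:
    """Pick a phrasing at random, never the same one twice in a row."""

    def __init__(self, phrases, rng=None):
        if not phrases:
            raise ValueError("need at least one phrase")
        self.phrases = list(phrases)
        self.rng = rng or random.Random()
        self.last = None

    def next(self):
        # Exclude the previous phrase unless it is the only one available.
        candidates = [p for p in self.phrases if p != self.last] or self.phrases
        self.last = self.rng.choice(candidates)
        return self.last


# Example: the robot's response when it does not understand the user.
fallback = PhraseRotator([
    "Sorry, I didn't catch that.",
    "Could you say that again?",
    "Hmm, I'm not sure I followed.",
])

for _ in range(3):
    print(fallback.next())
```

Even three or four alternatives per line go a long way for someone standing in the queue and overhearing the previous check-in.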
With these five guidelines in mind, it’s time to get into roleplay mode! Write a script for Furhat, grab a colleague and test the interaction. Or why not try it the other way around: record the interaction you want to recreate and write the script based on that? In addition to being a great time-saver and a way to catch potential issues early on, it’s also a lot of fun!
A great way to do both corridor tests and proper user tests early in development is with our Wizard-of-Oz interface. If you haven’t already, check out this blog post by our chief scientist Gabriel on the Wizard of Oz. Our rapid prototyping tool Blockly is also a good way to quickly test your ideas.
Are you ready to start working? If you don’t yet have your own Furhat, make sure to download our free SDK and create your interaction with the Virtual Furhat today!