Google Duplex will call salons, restaurants, and pretend to be your personal human.
NEW YORK — In a bid to quell concerns about the new technology — and prove that it can work — Google showed journalists in restaurants in New York and California how its Duplex artificial intelligence system can make reservations with remarkably human-sounding conversation. But first, it identified itself as a bot.
I fielded the call at the THEP Thai Restaurant on Manhattan’s Upper East Side to experience firsthand the Duplex system that the search giant announced to some fanfare — and then a flare of criticism over the ethics of its Google Assistant pretending to be a human — at its I/O developer conference in May.
What sounded like a female caller on the other end of the line wanted to book a dinner reservation for a party of four. Acting as a proxy for the restaurant staff, I told her the requested time was booked and suggested alternatives. After she chose 9 PM, I asked for her name and a phone number where I could reach her. “Valerie,” she said, before giving me the number. She also could spell her last name.
But when I further asked Valerie if she could also supply an email address so that I could send her offers, she briefly put me on hold and had another person come on the line. He explained that he didn’t have permission to share the email, a nod to her privacy.
That second person was a live human being. And if the perfectly natural sounding first voice on the phone hadn’t identified herself as the Google Assistant using Google’s automated booking service—adding that the call would be recorded—I might have mistaken Valerie for a real person, too.
Indeed, Valerie sounded uncannily real, right down to the pacing and intonation in her voice, and the “um’s” and “ah’s” that punctuate typical human conversation.
Google is slowly beginning to experiment with Duplex in more public settings, starting by having the Google Assistant call retailers to confirm their store hours during the holidays so that it can update Google search. Later this summer, the Google Assistant will follow with automated calls that make hair salon appointments and book restaurant reservations.
Google’s ultimate vision is that you will issue a command into your phone, something along the lines of “Hey Google, book a table for two at eight at so-and-so,” after which the Assistant may ask you for alternate times that would also work, as shown in a promotional Google video showcasing what will be possible with Duplex. The Google Assistant will then call the eatery on your behalf and notify you on your phone when the reservation has been made.
The technology wowed with its uncanny ability to mimic human conversation when it was first demonstrated in May, but the company took heat for not letting the human on the line know that the “person” trying to book an appointment was actually software. “Straight up, deliberate, deception,” said one critic. After days of negative headlines and vague answers, Google announced it would disclose at the start of each call that the human-sounding caller was the Google Assistant.
Businesses can opt out
Google also says that if businesses don’t want to receive such calls for any reason, it will honor those requests and provide some sort of opt-out mechanism, both during the conversation and online.
“We obviously don’t want to push them on it too aggressively for any reason. If they don’t want to be recorded, if they don’t want to interact with a machine or whatever it is, then that’s totally up to them,” says Nick Fox, Google vice president for product and design for the Google Assistant.
Fox adds that if a person doesn’t want to be recorded during a call, the failsafe is “they can hang up.” But Google is still trying to determine whether it will be possible to complete a reservation under such a scenario, or whether to bow out.
Google says it will also put safeguards in place to protect against spam and abuse.
Earlier this month, Google CEO Sundar Pichai announced a series of AI principles, one of which is to “incorporate our privacy principles in the development and use of our AI technologies,” and to “give opportunity for notice and consent, encourage architectures with privacy safeguards, and provide appropriate transparency and control over the use of data.”
And Google obviously sees opportunity: of the small businesses relying on customer bookings that it surveyed in April, only 60 percent have an online booking system set up. The THEP Thai restaurant gets about 100 calls a day, mostly from callers seeking reservations.
And there are technical challenges. According to Google, four out of five calls can be handled completely automatically, meaning that 20 percent of the time a human may have to intervene, perhaps because of language barriers or a conversation that goes completely off the rails.
A call may also go beyond the scope of what the system has been trained to do. At this stage of testing, for example, a Duplex call to the Thai restaurant would not be able to order take-out food or answer a query about, say, vegetarian menu options.
Google says the system has a self-monitoring capability, which allows it to recognize the tasks it cannot complete autonomously. At that point it signals a human operator, who can complete the task.
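To make that handoff concrete, here is a minimal sketch of how such a routing decision could look. It is not Google's implementation: the task names, the `Turn` structure, the `route` function and the 0.8 confidence threshold are all assumptions chosen for illustration, loosely based on the tasks the article says Duplex does and does not handle.

```python
from dataclasses import dataclass

# Hypothetical task labels, loosely based on what the article says the
# system is being tested on. These are illustrative, not Google's names.
SUPPORTED_TASKS = {"book_reservation", "book_appointment", "confirm_hours"}

# Hypothetical cutoff below which the system would not trust itself.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Turn:
    task: str          # what the system believes the call is about
    confidence: float  # how sure it is, from 0.0 to 1.0


def route(turn: Turn) -> str:
    """Decide whether the automated system or a human operator continues."""
    if turn.task not in SUPPORTED_TASKS:
        # Out of scope, such as ordering take-out or a menu question.
        return "human_operator"
    if turn.confidence < CONFIDENCE_THRESHOLD:
        # In scope, but the conversation has become too uncertain.
        return "human_operator"
    return "automated"


if __name__ == "__main__":
    print(route(Turn("book_reservation", 0.95)))
    print(route(Turn("order_takeout", 0.99)))
    print(route(Turn("book_reservation", 0.40)))
```

In this sketch, a confident reservation request stays automated, while a take-out order or a low-confidence exchange is passed to a person, which mirrors the behavior Google describes.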
Google brought in human operators to monitor actual calls people make as part of its training. The system is designed wherever possible to direct the call back to the task at hand—“I’m not sure what you said but can I get you a table for three?”
In natural spontaneous speech people talk faster and less clearly than they do when they speak to a machine, Google adds, so speech recognition is harder and the error rates climb. Phone calls with poor audio or loud background noises may only compound the challenges.
And Google says that depending on the context of a conversation, the same sentence can have a very different meaning: For example, “Ok for 4” can mean the time a reservation was booked, or the number of people in that reservation.
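The “Ok for 4” ambiguity can be illustrated with a small sketch in which the meaning of a number depends on the question that came before it. This is only an illustration under simple assumptions, not how Duplex works internally: the function name `interpret_number_reply` and the `last_question` labels are hypothetical.

```python
import re
from typing import Optional


def interpret_number_reply(reply: str, last_question: str) -> Optional[dict]:
    """Interpret a bare number in a reply using the preceding question.

    `last_question` is a hypothetical label for what the caller just
    asked about: "time" or "party_size".
    """
    match = re.search(r"\d+", reply)
    if match is None:
        return None  # no number to interpret
    value = int(match.group())
    if last_question == "time":
        return {"slot": "time", "value": value}
    if last_question == "party_size":
        return {"slot": "party_size", "value": value}
    return None  # context unknown, so the number stays ambiguous


if __name__ == "__main__":
    # The same sentence fills a different slot depending on context.
    print(interpret_number_reply("Ok for 4", "time"))
    print(interpret_number_reply("Ok for 4", "party_size"))
```

The same three words yield a reservation time in one case and a head count in the other, which is why the system has to track the conversation and not just the sentence.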
Meanwhile, Google intentionally incorporates “speech disfluencies,” the “Mmhmm’s” and “Uh’s” that the company says are conversational acknowledgments and a natural part of language, as in “Do you have a table for, uh, 5 PM?” When one side is talking, those hmm’s and uh’s may provide cues that let the other person know that the system is still listening to them, but that it is not yet their turn to talk.
Apparently, Valerie, um, wouldn’t have it any other way.
Follow USA TODAY Personal Tech Columnist @edbaig on Twitter