Google Duplex conducts natural conversation for real-world tasks

Natural-language conversation with computers is a fiction we have long enjoyed in TV series and movies. A recent example is ‘Her’, a romantic sci-fi film in which Theodore Twombly, the main character, falls in love and develops a relationship with Samantha, an intelligent computer operating system personified through a female voice.

Now that artificial intelligence (AI) fiction has nearly become fact with Google’s new AI system, which accomplishes real-world tasks over the phone. In recent years we have heard constant buzz about AI and about research in natural language processing, machine learning, and deep learning. We have also used a few of its applications, such as Google Voice Search and WaveNet.

Over the years, automated phone systems have shown several downsides: they do not understand natural language, do not engage with callers efficiently, get stuck in rigid flows, and force callers to adjust to the system instead of the system adjusting to the caller.

However, Google’s new technology, Google Duplex, aims to avoid stilted computerized voices and make such calls more comfortable by conducting natural conversations to complete real-world tasks over the phone.

What are the real-world tasks accomplished by Google Duplex?

Google Duplex is designed to handle natural language efficiently and smoothly. It targets specific tasks such as booking a service, scheduling appointments, and getting information about a product. For such tasks, Google wants to make the conversation as comfortable and natural as possible, so that people can speak normally, without adjusting to the machine, to carry out real-time tasks.

It is important to note that Duplex does not carry out general conversations; it is constrained to closed domains. It can hold a natural conversation only after being trained deeply in such a domain, one narrow enough to explore extensively. So there is no danger yet from AI.

How natural is natural conversation using Google Duplex?

Google Duplex is meant to sound natural, so that users and businesses have a good experience with the service. But that is not easy to achieve in practice: natural language is very difficult to understand, natural behavior is tricky to model, and real-time responses require fast processing. The intent and context of a conversation are also hard to build into a machine.

However, Google has used the following technology stack to make Duplex sound natural.

Duplex uses a recurrent neural network (RNN) to cope with the above challenges. It is built using TensorFlow Extended (TFX). The Duplex RNN is trained on a corpus of anonymized phone conversation data. As input, it uses the output of Automatic Speech Recognition (ASR) technology, features from the audio, the history of the conversation, and the parameters of the conversation.
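To make the idea of a recurrent network more concrete, here is a minimal sketch of a single recurrent step in plain Python. It is not Google's architecture, which the source does not describe in detail. The feature layout, sizes, weights, and the helper names `rnn_step` and `encode_turn` are all assumptions made for illustration. The point is only that the hidden state is carried from one turn to the next, which is how a recurrent model can take the history of the conversation into account.

```python
import math
import random

def rnn_step(x, h_prev, W_x, W_h, b):
    """One Elman-style recurrent step: h = tanh(W_x x + W_h h_prev + b)."""
    h = []
    for i in range(len(b)):
        total = b[i]
        total += sum(W_x[i][j] * x[j] for j in range(len(x)))
        total += sum(W_h[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h

def make_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these from data."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def encode_turn(asr_features, audio_features, conversation_params):
    """Hypothetical layout: concatenate the per-turn feature groups."""
    return list(asr_features) + list(audio_features) + list(conversation_params)

rng = random.Random(0)
INPUT_SIZE, HIDDEN_SIZE = 6, 4  # illustrative sizes only
W_x = make_matrix(HIDDEN_SIZE, INPUT_SIZE, rng)
W_h = make_matrix(HIDDEN_SIZE, HIDDEN_SIZE, rng)
b = [0.0] * HIDDEN_SIZE

# Two made-up conversation turns: ASR features, one audio feature, two parameters.
turns = [
    encode_turn([1.0, 0.0, 0.0], [0.2], [1.0, 0.5]),
    encode_turn([0.0, 1.0, 0.0], [0.7], [1.0, 0.5]),
]

hidden = [0.0] * HIDDEN_SIZE  # empty conversation history
for x in turns:
    hidden = rnn_step(x, hidden, W_x, W_h, b)  # history flows through `hidden`
```

After the loop, `hidden` summarizes both turns. A production system would feed such a state into further layers and train all the weights, rather than leaving them random as this sketch does.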

The understanding model is trained separately for each task, and hyperparameter optimization from TFX is used to improve the model.
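As a rough illustration of per-task tuning, the sketch below runs a small grid search separately for each task. It does not use TFX or any of its APIs. The task names, the search space, and the `validation_loss` formula are hypothetical stand-ins, chosen so that the example runs without an ML library.

```python
from itertools import product

# Hypothetical tasks and the learning rate that happens to suit each one.
BEST_LR_FOR_TASK = {"restaurant_booking": 0.01, "hair_salon": 0.1}

# Hypothetical search space.
SEARCH_SPACE = {"learning_rate": [0.001, 0.01, 0.1], "hidden_size": [32, 64]}

def validation_loss(task, learning_rate, hidden_size):
    """Stand-in for 'train a model, then measure validation loss'.

    The formula is made up: loss grows as the learning rate moves away
    from the task's preferred value and shrinks with a larger hidden size.
    """
    return abs(learning_rate - BEST_LR_FOR_TASK[task]) + 1.0 / hidden_size

def tune(task):
    """Grid search: try every combination and keep the lowest loss."""
    candidates = product(SEARCH_SPACE["learning_rate"], SEARCH_SPACE["hidden_size"])
    best = min(candidates, key=lambda c: validation_loss(task, *c))
    return {"learning_rate": best[0], "hidden_size": best[1]}

# One independent search per task, mirroring "trained separately for each task".
best_per_task = {task: tune(task) for task in BEST_LR_FOR_TASK}
```

Each task ends up with its own settings. A real tuner would replace the toy loss with actual training runs and would usually search more cleverly than an exhaustive grid.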

To sound natural, Duplex uses a combination of a concatenative text-to-speech (TTS) engine and a synthesis TTS engine to control intonation depending on the circumstance. It also includes speech disfluencies (e.g. “hmm”s and “uh”s) to make the conversation more natural.
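The disfluency idea can be sketched as a small text-processing step that runs before the text is handed to a TTS engine. The source does not say how Duplex decides where to place fillers, so the random rule below is only an assumption, and `add_disfluencies`, `FILLERS`, and the `rate` parameter are hypothetical names.

```python
import random

FILLERS = ["hmm,", "uh,", "um,"]  # illustrative filler words

def add_disfluencies(text, rate=0.2, seed=None):
    """Insert a filler before some words, each with probability `rate`.

    A toy stand-in: a real system would place fillers where a person
    would naturally pause, not at random positions.
    """
    rng = random.Random(seed)  # seedable so results can be reproduced
    out = []
    for word in text.split():
        if rng.random() < rate:
            out.append(rng.choice(FILLERS))
        out.append(word)
    return " ".join(out)

sentence = "I would like to book a table for four"
spoken = add_disfluencies(sentence, rate=0.2, seed=7)
```

The original words always survive in their original order. Only fillers are added, so the meaning of the request is unchanged.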

Benefits for Businesses and Users

  • It allows the Google Assistant to book appointments, tickets, and services for its users automatically.
  • It also makes it simple to handle cancellations and rescheduling of appointments.
  • It increases customer satisfaction through uninterrupted service and up-to-date information online.
  • Customers can now get information about a business that is not available on the internet, because the business can be called and asked directly.
  • Google Duplex is always available to its users, making supported tasks that require a phone call easier.
