Bridging the Communication Between Man and Machine
September 5, 2022


Written by: Khairul Haqeem, Journalist, AOPG.

Imagine yourself and your computer having a thoughtful discussion. Though our manner of communicating is rooted in interpersonal relationships, it can be adapted into a more strictly logical language suitable for computers. However, human language is in a constant state of flux and full of ambiguity, so it usually takes an instinctive familiarity with the way the world works to comprehend what we mean by certain terms.

Human Languages Are Confusing to Robots

The only language computers can comprehend is computer code. Robots are typically programmed to follow concise instructions and operate within strict limits. If you have taken an introductory C++ course in the past, you will have written instructions like these, even if it was just something as basic as this:
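As a minimal sketch of such an instruction (the `Robot` type, its `moveForward` method and the limit of 10 steps are hypothetical, invented here purely for illustration and not taken from any real robotics library):

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical hard limit: the robot may never go past this position.
constexpr int kMaxPosition = 10;

// Hypothetical robot used only for illustration; not a real robotics API.
struct Robot {
    int position = 0;  // current position, in steps from the start

    // Move forward by an exact number of steps, stopping at the limit.
    // Returns the number of steps actually taken.
    int moveForward(int steps) {
        assert(steps >= 0 && "the instruction must be explicit and non-negative");
        int target = std::min(position + steps, kMaxPosition);
        int taken = target - position;
        position = target;
        return taken;
    }
};
```

A call such as `moveForward(3)` says exactly what to do and how far to go, and the limit decides what happens when the request is too large.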

Your robot will be able to easily follow these directions and respond accordingly. However, a command like “Please fetch me something to drink” is so general that it might confuse your AI companion and cause it to short circuit!

While we can convert human speech into computer code with relative ease, there is still a significant linguistic gap between the two. Humans express themselves in three different ways (verbally, in writing, and through body language), whereas machines can only use written language. AI chatbots, which typically feature generative animation and speech synthesis, are one example of how written language on a computer can drive both spoken and visual output. But in the end, it’s still written words.

Lingua Franca

Google’s parent company, Alphabet, is bringing together two of its most advanced research fields, robotics and language understanding, in its Everyday Robots project to develop a “helper robot” that can understand instructions in natural human speech. It’s like something out of a science fiction book: pretty soon we may be able to communicate with robots in a natural way and have them follow our instructions.

Although the Everyday Robots project is still in its infancy (the robots are sluggish and timid), its robots have just received an upgrade: enhanced language understanding thanks to PaLM (Pathways Language Model), Google’s large language model (LLM). This combination, dubbed PaLM-SayCan, points a way toward easier human-robot conversations and more efficient task completion by robots.

LLMs such as GPT-3 and Google’s MUM excel at understanding the meaning of complex commands. Suppose you were to say to a prototype of Google’s Everyday Robots, “I spilt my food, can you help?” The robot checks this command against its internal database of probable procedures and understands it to mean, “Get me the dishcloth.” Is this as smart as it gets for a robot? Perhaps not, but to borrow Neil Armstrong’s famous phrase, it is “one small step for man”, and an impressive one.

Humanising the Machine

The PaLM-SayCan technology was tried out by the teams at Google Research and Everyday Robots with a robot in a kitchen setting. To do this, they ‘grounded’ PaLM in the scenario of a robot receiving high-level instructions from a human: the robot must determine both what would be a helpful action and what it is actually capable of in the given setting.
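One way to picture this grounding, as a minimal sketch rather than Google’s actual implementation: give each candidate skill a usefulness score (the language model’s view of how much it helps with the request) and a feasibility score (the robot’s view of whether it can do it right now), then pick the skill with the highest product. The `Skill` struct, the `pickSkill` function and every score mentioned here are assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One candidate skill with two hypothetical scores in [0, 1].
struct Skill {
    std::string name;
    double usefulness;   // how helpful the language model judges the skill for the request
    double feasibility;  // how likely the robot is to succeed at it in the current setting
};

// Return the index of the skill with the highest combined score
// (usefulness multiplied by feasibility). Ties keep the earlier skill.
std::size_t pickSkill(const std::vector<Skill>& skills) {
    assert(!skills.empty());
    std::size_t best = 0;
    for (std::size_t i = 1; i < skills.size(); ++i) {
        double score = skills[i].usefulness * skills[i].feasibility;
        double bestScore = skills[best].usefulness * skills[best].feasibility;
        if (score > bestScore) {
            best = i;
        }
    }
    return best;
}
```

With made-up scores, a skill that sounds very helpful but is barely feasible in the kitchen (say 0.8 and 0.1) loses to a slightly less helpful one that the robot can actually carry out (say 0.6 and 0.9).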

The Google researchers explain the approach in more detail in their paper, ‘Do As I Can, Not As I Say’.

Now, whenever a Google employee says, “I spilt my drink, can you help?” the robot returns with a sponge and even attempts to recycle the empty can. Teaching it to clean up the mess itself would be a useful next step in training.

Google claims that its robots using PaLM-SayCan planned a correct response to 101 user instructions 84% of the time and successfully executed them 74% of the time. You should take those results with a grain of salt, but that’s still a really good hit rate. The open question is how well those 101 instructions represent the range and complexity of language we’d expect a genuine home assistance robot to handle.

The Ghost in the Machine

Many of the requests we’d have for a real house robot are far more complicated, such as “sauté the onions for my rendang” or “clean up the coffee I just spilt beneath the table”. Both commands contain a vast amount of implied knowledge, from how to clean up spilt coffee to where the onions are in the fridge and how to prepare them.

Will a machine ever be able to imitate this kind of causal reasoning? Maybe. Until true artificial intelligence is developed, however, the world will have to wait.