Waymo has indicated it will use Google Gemini AI for its self-driving “robotaxis”. The company appears to be developing a new training model for its autonomous vehicles that builds on Gemini, Google’s Multimodal Large Language Model (MLLM).
Waymo releases new research paper about MLLMs helping robotaxis
Waymo LLC was formerly known as the Google Self-Driving Car Project. It is an American autonomous driving technology company. Waymo has been gradually building hardware and software for robotaxis to safely ferry passengers on busy roads.
Waymo has released a new research paper, The Verge reported. Titled “End-to-End Multimodal Model for Autonomous Driving,” or EMMA, the paper describes a new MLLM dedicated to autonomous vehicles.
This new end-to-end training model would process sensor data and generate “future trajectories for autonomous vehicles.” This would help Waymo’s driverless vehicles decide where to go and how to avoid obstacles on the road.
How will Google Gemini help Waymo?
For several years, algorithms for driverless vehicles have relied on compartmentalized solutions, or modules, to address each critical function. In other words, tech companies have tried to handle perception, mapping, prediction, and planning independently of each other.
This modular approach has solved many problems for autonomous vehicles, but companies have struggled to scale it. The reason is “accumulated errors among modules and limited inter-module communication,” Waymo said in the research paper.
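One way to see why errors accumulate across modules is a back-of-the-envelope calculation. The sketch below is illustrative only: the four stage names are the modules listed above, but the 95% figures and the assumption that stages fail independently are hypothetical and do not come from Waymo’s paper.

```python
from math import prod

# Hypothetical per-module success rates. These numbers are made up for
# illustration and are NOT taken from Waymo's research paper.
MODULE_ACCURACY = {
    "perception": 0.95,
    "mapping": 0.95,
    "prediction": 0.95,
    "planning": 0.95,
}


def pipeline_accuracy(stage_accuracy):
    """Chance that every stage is right, assuming stages fail independently."""
    return prod(stage_accuracy.values())


overall = pipeline_accuracy(MODULE_ACCURACY)
print(f"Chance that all four stages are correct: {overall:.1%}")
```

With these made-up numbers, four stages that are each right 95% of the time produce a fully correct result only about 81% of the time, because a mistake in any one stage is passed on to the next.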
Modular solutions also rely on “pre-defined” parameters, which cause them to falter in “novel environments” because they struggle to “adapt”.
This is where Google’s Gemini could help. First, Gemini is a Generative Artificial Intelligence (Gen AI), a “generalist” model that the search giant has trained on vast sets of data scraped from the internet.
Second, Gen AI platforms demonstrate “superior” reasoning capabilities through techniques like “chain-of-thought reasoning,” Waymo suggested. Simply put, Gemini can mimic human reasoning, so the LLM could “think” like a driver.
Although Google Gemini could help Waymo, EMMA would still need to cope with new data, something autonomous vehicles have to do constantly. In particular, EMMA has had trouble incorporating 3D sensor inputs from lidar or radar, Waymo admitted.
2024-10-31 15:10:01