Comment by godelski
You are confidently wrong
> Powered by Gemini, a multimodal large language model developed by Google, EMMA employs a unified, end-to-end trained model to generate future trajectories for autonomous vehicles directly from sensor data. Trained and fine-tuned specifically for autonomous driving, EMMA leverages Gemini’s extensive world knowledge to better understand complex scenarios on the road.
https://waymo.com/blog/2024/10/introducing-emma/
You were confidently wrong for judging them to be confidently wrong
> While EMMA shows great promise, we recognize several of its challenges. EMMA's current limitations in processing long-term video sequences restricts its ability to reason about real-time driving scenarios — long-term memory would be crucial in enabling EMMA to anticipate and respond in complex evolving situations...
They're still in the process of researching it, noting in that post implies VLM are actively being used by those companies for anything in production.