Waymo announced that it is building its next-generation driving simulator on top of Google DeepMind's Genie 3, the general-purpose world model that Google made publicly available to AI Ultra subscribers just eight days earlier. The system, called the Waymo World Model, generates photorealistic driving environments complete with both camera and lidar data, and it can conjure scenarios that Waymo's fleet has never actually encountered on the road.
Elephants. Tornadoes. Snow on tropical streets lined with palm trees. A plane landing on a freeway.
The demos are impressive, if a bit theatrical. But the announcement lands during what might be the worst two weeks Waymo has had since it started giving commercial rides.
The timing is hard to ignore
On January 23, a Waymo robotaxi struck a child near an elementary school in Santa Monica during morning drop-off. The child ran into the street from behind a double-parked SUV. Waymo says its system detected the child and braked from roughly 17 mph to under 6 mph before contact, and that the child sustained minor injuries. NHTSA opened an investigation. So did the NTSB.
That same week, the NTSB announced a separate probe into Waymo vehicles illegally passing stopped school buses in Austin. The Austin Independent School District had documented 19 instances since the start of the 2025-26 school year.
Then, on February 5, the day before the World Model announcement, Waymo's chief safety officer Mauricio Peña testified before the Senate and confirmed that some of the company's "fleet response agents," the humans who help robotaxis navigate confusing situations, are based in the Philippines. Senator Ed Markey called it "a safety issue" and questioned whether overseas operators introduce cybersecurity vulnerabilities. Peña couldn't say how many agents were offshore versus domestic.
So: a child struck near a school, a federal investigation into school bus behavior, and a congressional grilling about offshore human operators. And then, the next morning, a blog post about simulating elephants.
I'm not suggesting the timing was deliberate PR. Engineering projects like this take months. But it is the context in which you should read every claim Waymo makes about improving safety through simulation.
What the World Model actually does
The technical pitch is straightforward, and legitimately interesting. Most AV simulation systems train only on data collected by a company's own fleet. That means your simulator can only recreate scenarios you've already seen. If your cars have never driven through a tornado, your simulator can't produce one.
Waymo's approach is different. By building on Genie 3, which was pre-trained on a massive corpus of diverse video data, the World Model inherits broad knowledge about how the physical world looks and behaves. Waymo then post-trains this model specifically for driving, teaching it to output not just camera video but also 3D lidar point clouds that match Waymo's proprietary sensor suite.
The system offers three control mechanisms. Driving action control lets engineers test counterfactuals: what if the car had accelerated instead of yielding? Scene layout control lets them rearrange road geometry, traffic signals, and other vehicles. And language control, the flashiest feature, lets engineers type text prompts to change weather, time of day, or spawn entirely new scenarios.
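Waymo hasn't published an interface for any of this, but the three mechanisms suggest what a scenario request might look like in practice: a real drive log as the seed, plus structured counterfactual and layout edits, plus a free-text prompt. A purely hypothetical sketch; every name below is invented for illustration and none of it comes from Waymo's announcement:

```python
from dataclasses import dataclass, field

@dataclass
class ScenarioRequest:
    """Hypothetical request combining the three control mechanisms."""
    seed_log: str                                              # real drive log as starting point
    driving_actions: list[str] = field(default_factory=list)   # counterfactual action control
    scene_edits: dict[str, str] = field(default_factory=dict)  # scene layout control
    prompt: str = ""                                           # free-text language control

    def describe(self) -> str:
        # Summarize the request for logging/triage.
        parts = [f"seed={self.seed_log}"]
        if self.driving_actions:
            parts.append(f"actions={','.join(self.driving_actions)}")
        if self.scene_edits:
            parts.append(f"edits={len(self.scene_edits)}")
        if self.prompt:
            parts.append(f'prompt="{self.prompt}"')
        return " | ".join(parts)

# Example: replay a logged scene, but have the car accelerate instead of
# yielding, flip one signal to red, and swap in winter weather via text.
req = ScenarioRequest(
    seed_log="phoenix_2025_10_03_log_0042",
    driving_actions=["accelerate_instead_of_yield"],
    scene_edits={"traffic_signal_3": "red"},
    prompt="heavy snow at dusk",
)
print(req.describe())
```

The point of the sketch is the combination: structured edits give repeatable, measurable variations, while the language channel covers everything nobody thought to parameterize.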
It can also take a regular dashcam video and transform it into a multi-sensor simulation, showing how the Waymo Driver would perceive that same scene. Waymo showed examples using footage from Norway and Utah's Arches National Park.
The long-tail problem (and whether this solves it)
The core argument is about what the industry calls "long-tail events," the rare scenarios that happen once in millions of miles but can be catastrophic when they do. A construction zone with an unusual cone layout. A mattress on the freeway. A pedestrian in a T-Rex costume. These events are almost impossible to collect enough real-world data on, and you can't safely stage most of them.
Waymo says the World Model can generate these at scale, exposing the Waymo Driver to rare situations in simulation before it meets them on actual streets. The company has reportedly logged nearly 200 million fully autonomous miles on public roads, and it racks up billions more in simulation.
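The scale problem behind that claim is easy to see with back-of-envelope arithmetic. The event rate below is an illustrative assumption, not a Waymo figure:

```python
# If a long-tail event occurs once per 10 million miles on average (assumed,
# for illustration), a 200-million-mile fleet history contains about 20
# examples of it -- far too few to train on, which is the argument for
# generating such scenarios synthetically at simulation scale.
real_miles = 200_000_000
event_rate = 1 / 10_000_000  # assumed events per mile

real_examples = real_miles * event_rate
print(real_examples)
```

Twenty examples of a safety-critical scenario, accumulated over the company's entire operating history, is the gap simulation is supposed to fill.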
Here's the thing, though: Waymo didn't share any benchmark results or independent evaluations alongside this announcement. We don't know how well simulated training translates to improved real-world performance. We don't know if exposure to a simulated elephant actually helps the car handle a real dog darting into traffic. The gap between "we can generate realistic-looking rare scenarios" and "training on these scenarios measurably reduces incidents" is significant, and Waymo hasn't bridged it publicly.
The company positions simulation as one of three safety pillars alongside real-world testing and rigorous validation. But the Santa Monica incident, where the core challenge was a child emerging from behind an occluding vehicle, isn't really a long-tail event. Kids running into streets near schools during drop-off is about as predictable as it gets. That's not a scenario you need Genie 3 to simulate.
Waymo isn't alone here
UK-based Wayve has been working on generative world models for driving since 2023, when it published GAIA-1. Its latest iteration, GAIA-3, is a 15-billion-parameter model that Wayve says can run structured, repeatable evaluations of driving AI. Wayve has attracted significant investment from Nvidia for this work.
Nvidia itself offers its Cosmos world foundation models to partners like Plus and Oxa. Waabi has Copilot4D. Pony.ai recently partnered with Moore Threads to scale its PonyWorld simulator. Tesla, meanwhile, has focused on its Dojo supercomputer and a massive real-world video corpus, with less public emphasis on generative simulation.
What Waymo has that most of these competitors don't is the Alphabet corporate umbrella. Genie 3 was built by DeepMind for general purposes. Waymo gets to adapt it for driving. That's a real structural advantage, even if it's hard to quantify.
The consumer Genie 3 isn't exactly confidence-inspiring
Project Genie, the consumer-facing version that launched on January 29, generates explorable 3D worlds from text prompts at 720p and 20-24 fps. Early feedback has been mixed. The Verge reported that one demo failed to maintain continuity, and the overall result was "much worse than an actual handcrafted video game." Generated worlds are capped at 60 seconds. And it costs $250 a month through Google's AI Ultra subscription.
Now, the Waymo World Model is a different beast, post-trained specifically for driving with Waymo's own sensor data. But the underlying model's known limitations with consistency and physics accuracy are worth keeping in mind when evaluating claims about "hyper-realistic" driving simulation.
Waymo says it built a leaner variant of the model for scenarios that require longer rollouts, like navigating a narrow lane with oncoming traffic, achieving what it calls a "dramatic reduction in compute." No numbers attached to "dramatic."
What I'm watching for
Waymo is currently raising more than $15 billion at a valuation approaching $100 billion, according to Bloomberg. It operates in six US markets and plans aggressive expansion this year into cities including Nashville, Las Vegas, Seattle, and Miami, plus international markets including the UK.
The company claims its data shows an 81% reduction in injury-causing crashes compared to human drivers, though it has still driven relatively few miles compared to the overall human driving population. That statistic will face sharper scrutiny as the fleet grows and incidents like Santa Monica pile up.
The World Model is a research announcement, not a product launch. Waymo hasn't said when simulation-trained improvements will ship to its fleet, or how it plans to validate that simulated scenario exposure translates to real-world safety gains. Until it does, this is a promising piece of infrastructure wrapped in a very well-timed blog post.