Fei-Fei Li's World Models Taxonomy: Renderer, Simulator, Planner

Fei-Fei Li and the team at her startup World Labs published an essay this week trying to fix a term the AI industry has nearly worn out: the "world model." Their fix is to stop treating it as one thing and split it into three jobs, then point out which one actually matters. The piece, a functional taxonomy, builds on her earlier argument that spatial intelligence is where AI goes next.

The problem they're actually solving

Everyone uses "world model" and almost nobody means the same thing by it. Computer vision people, roboticists, reinforcement learning researchers, generative AI labs: each has quietly attached its own definition. The essay's move is to drag all of them back to one diagram any RL textbook already has. An agent takes an action, the world changes state, the agent gets an observation, repeat. Formally it's a partially observable Markov decision process, which sounds worse than it is.
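For readers who want the loop spelled out, here is a minimal sketch in Python. Everything in it is an assumption made for illustration and none of it comes from the essay: a one-dimensional track from 0 to 10, a noisy position sensor, and the hypothetical names step, observe, policy and run_episode. The "partially observable" part is simply that the policy only ever sees the noisy reading, never the true state.

```python
import random

def step(state, action):
    """Hidden world dynamics: the state moves by the action, clamped to [0, 10]."""
    return max(0, min(10, state + action))

def observe(state, rng):
    """Partial observation: the true position plus noise of -1, 0, or +1."""
    return state + rng.choice([-1, 0, 1])

def policy(observation, goal):
    """Choose an action from the observation alone; the true state stays hidden."""
    if observation < goal:
        return 1
    if observation > goal:
        return -1
    return 0

def run_episode(start, goal, steps, seed=0):
    """Run the observe -> act -> state-change loop and record each turn."""
    rng = random.Random(seed)
    state = start
    trace = []
    for _ in range(steps):
        obs = observe(state, rng)      # agent gets an observation
        action = policy(obs, goal)     # agent picks an action
        state = step(state, action)    # world changes state
        trace.append((obs, action, state))
    return trace

if __name__ == "__main__":
    for obs, action, state in run_episode(start=2, goal=7, steps=8):
        print(f"observed {obs:2d}  action {action:+d}  true state {state}")
```

Because the sensor is noisy, the agent tends to hover near the goal instead of parking on it, which is the practical cost of partial observability in even a toy setting.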

From that single loop, Li and her coauthors carve out three projections. A renderer turns state into pixels and cares about whether the picture looks right. A simulator produces the state itself, geometry and physics that hold up whether a human or an algorithm pokes at it. A planner takes an observation and a goal and spits out the next action, closing the perception-to-action circle.
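One way to read the three projections is as three function signatures over the same loop. The sketch below is an illustration, not the essay's own formalism: the State class, the ASCII "frame" standing in for pixels, and the assumption that the simulator takes an action as input are all choices made here to keep the example small.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class State:
    position: int  # cell index of the object on a 1D track
    width: int     # number of cells in the track

Observation = str  # a rendered ASCII frame, standing in for pixels
Action = int       # -1, 0, or +1
Goal = int         # target cell index

# The three projections, written as types.
Renderer = Callable[[State], Observation]        # state -> picture
Simulator = Callable[[State, Action], State]     # state, action -> next state
Planner = Callable[[Observation, Goal], Action]  # observation, goal -> action

def render(state: State) -> Observation:
    """Draw the state; only appearance matters here."""
    return "".join("o" if i == state.position else "." for i in range(state.width))

def simulate(state: State, action: Action) -> State:
    """Advance the state; the track ends act as walls."""
    new_position = max(0, min(state.width - 1, state.position + action))
    return State(new_position, state.width)

def plan(observation: Observation, goal: Goal) -> Action:
    """Read the object's position off the frame and step toward the goal."""
    seen = observation.index("o")
    return (goal > seen) - (goal < seen)

if __name__ == "__main__":
    state = State(position=2, width=5)
    frame = render(state)
    action = plan(frame, goal=4)
    print(frame, action, render(simulate(state, action)))
```

Note that the planner here never touches State directly. It works from the frame, which is the distinction the taxonomy draws between producing state and acting on observations.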

Why the boring one wins

Here's the part with an opinion attached. Of the three, the simulator gets the least public attention and, the authors argue, is the most consequential. Renderers are where the money and the demos are right now: the crop of image and video generators optimized to produce a convincing frame. But a pretty frame is useless if you want to stress-test a building or train a robot arm. The physics underneath isn't real; it just looks real.

Planners are the opposite trade: the most exciting category and the rawest. The essay notes that years of robotics demos still mostly live in lab conditions, a long way from a kitchen that hasn't been tidied for the camera. That gap is the whole problem with embodied AI, and nobody has closed it.

So the simulator sits in the middle as the structural backbone. Master it and you can derive a renderer's appearance and a planner's action consequences from the same understanding of geometry and physics. The framing is clean, maybe a little too clean, since "just solve simulation" is doing a lot of quiet work in that sentence. A faithful simulator needs data that ties actions to the movement they cause, and by the authors' own admission elsewhere, the field doesn't really have that data yet.
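The "backbone" claim can be made concrete with a toy. In the sketch below, which is an assumption-laden illustration and not World Labs' method, a single piece of shared geometry (a hypothetical WALLS set on a 1D track) feeds both the picture and the plan. The renderer draws the wall; the planner learns that walking into it goes nowhere by trying each action in the simulator first.

```python
WALLS = frozenset({3})  # shared geometry: cell 3 is solid (toy assumption)
WIDTH = 6               # track length (toy assumption)

def simulate(position, action):
    """Toy physics: move by the action unless a wall occupies the target cell."""
    target = position + action
    return position if target in WALLS else target

def render(position):
    """Appearance derived from the same geometry the simulator uses."""
    cells = []
    for i in range(WIDTH):
        if i in WALLS:
            cells.append("#")
        elif i == position:
            cells.append("o")
        else:
            cells.append(".")
    return "".join(cells)

def plan(position, goal, actions=(-1, 0, 1)):
    """One-step lookahead: simulate each action, keep the one ending nearest the goal."""
    return min(actions, key=lambda a: abs(simulate(position, a) - goal))

if __name__ == "__main__":
    print(render(2))        # the wall shows up in the frame
    print(plan(2, goal=5))  # and the planner knows pushing into it is futile
```

A planner this shallow gets stuck behind the wall, and the toy hides the hard part entirely: the simulate function is written by hand here, whereas the essay's bet is that it can be learned.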

One model to do all three

Li's stated endgame is a single base model that switches modes on demand: render, simulate, or plan depending on the prompt. As evidence that this isn't pure manifesto, she points at Marble, World Labs' own platform, which launched publicly in November 2025.

Marble generates 3D scenes and, from one model, exports both Gaussian splats for visual inspection and collider meshes a physics engine can chew on. The splats look gorgeous and carry no collision properties at all, which is exactly why the dual export exists. NVIDIA has already wired Marble's PLY splats and GLB colliders into Isaac Sim for robotics work, cutting environment setup from weeks to hours, per the company's own measurement. Take that timing claim with the usual grain of salt that comes with a vendor describing its own tooling.

Whether one model can genuinely flip between rendering a film shot and simulating a robot's afternoon is the open bet. World Labs raised a billion dollars on roughly that thesis. The essay is the clearest statement yet of what they think they're buying.

Fei-Fei Li Splits World Models Into Renderers, Simulators, and Planners

Oliver Senti
