Overworld dropped Waypoint-1 on Monday: a 2.3-billion-parameter video diffusion model that generates playable game environments in real time. The company, founded by former Stability AI researchers Louis Castricato and Shahbuland Matiana, trained the model on 10,000 hours of gameplay footage.
The pitch: feed it a few frames and a text prompt, and it builds a world you can walk around in with keyboard and mouse. Unlike existing world models that only let you nudge the camera every few frames, Waypoint-1 processes every input with zero added latency, according to Overworld's technical blog. The team claims 30 FPS at 4 denoising steps on Nvidia's RTX 5090, or 60 FPS if you cut to 2 steps.
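Those two figures are consistent with frame time scaling linearly in denoising steps. A back-of-the-envelope sketch (the per-step latency below is an illustrative assumption, not a measured number):

```python
# Back-of-the-envelope: frames per second vs. denoising steps.
# If one denoising step takes t seconds, a frame with k steps takes k * t,
# so FPS = 1 / (k * t).
def fps(steps: int, seconds_per_step: float) -> float:
    return 1.0 / (steps * seconds_per_step)

# Hypothetical ~8.3 ms per denoising step (chosen to match the reported numbers).
t = 1.0 / 120
print(round(fps(4, t)))  # 4 steps -> 30 FPS
print(round(fps(2, t)))  # 2 steps -> 60 FPS
```

Halving the step count doubles the frame rate, which is exactly the 30-to-60 FPS trade-off Overworld reports.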
To keep long sessions coherent, the model uses self-forcing via DMD (distribution matching distillation), a post-training technique that aligns inference behavior with training conditions by having the model learn from its own autoregressive rollouts. The original self-forcing paper showed the approach prevents the quality degradation that plagues autoregressive video models over extended rollouts.
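The mismatch self-forcing targets can be shown with a toy rollout loop. Everything below is an illustrative sketch, not Overworld's code: the "model" is a stand-in callable with a small per-step bias, and the real method applies a distribution-matching loss over full rollouts rather than this naive comparison.

```python
# Toy sketch of the train/test gap behind self-forcing.
# Teacher forcing conditions each step on clean ground-truth frames, so
# training never sees the model's own (slightly wrong) outputs. Self-forced
# rollouts condition on the model's previous output, matching inference.
def rollout(model, first_frame, ground_truth, self_force):
    frames = [first_frame]
    for t in range(1, len(ground_truth)):
        if self_force:
            context = frames[-1]           # condition on own output (inference-like)
        else:
            context = ground_truth[t - 1]  # teacher forcing: clean training frames
        frames.append(model(context))
    return frames

biased_model = lambda x: x + 1.1   # hypothetical model with a +0.1 drift per frame
gt = [0.0, 1.0, 2.0, 3.0]

print(rollout(biased_model, 0.0, gt, self_force=True))   # error compounds each step
print(rollout(biased_model, 0.0, gt, self_force=False))  # error stays one step deep
```

In the self-forced rollout the drift accumulates (the last frame is off by 0.3), while teacher forcing hides the accumulation (off by only 0.1). Training against self-forced rollouts exposes the model to exactly the errors it will make at inference, which is why quality holds up over long sessions.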
Overworld raised $4.5 million in pre-seed funding led by Kindred Ventures, with angels including Google's Logan Kilpatrick and executives from Snowflake and Roblox. The code lives on GitHub.
The Bottom Line: Consumer-grade real-time world simulation just went open-source, though visual fidelity and stability are still experimental.
QUICK FACTS
- Model size: 2.3B parameters (Waypoint-1-Small)
- Training data: 10,000 hours of video game footage
- Performance: 30 FPS at 4 steps, 60 FPS at 2 steps (RTX 5090, company-reported)
- Funding: $4.5M pre-seed led by Kindred Ventures
- Availability: Open weights on Hugging Face