Google DeepMind on Tuesday shipped Gemini Robotics-ER 1.6, an upgrade to its embodied reasoning model that handles spatial understanding, task planning and success detection for robots. The model does not drive motors itself; it calls out to a separate vision-language-action system, Gemini Robotics 1.5, and to tools like Google Search when it needs them.
The headline new trick, worked out with Boston Dynamics, is reading industrial instruments. Spot can now point its camera at an analog pressure gauge, a sight glass, or a digital readout and get an answer. DeepMind's research blog explains that the model zooms in on a cropped region, places points on tick marks and the needle, then runs code to work out the proportions. Boston Dynamics' Marco da Silva said the capabilities let Spot "react to real-world challenges completely autonomously," which is the sort of line a VP gives a partner on launch day.
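DeepMind hasn't published the code the model generates for this, but once the points are placed, the proportion step is simple geometry. Here is a minimal sketch, assuming a linear dial that sweeps clockwise from its minimum tick to its maximum; the function name, coordinates and scale values are illustrative, not DeepMind's:

```python
import math

def gauge_reading(center, min_tick, max_tick, needle_tip, min_value, max_value):
    """Estimate an analog gauge value from points placed on the dial.

    center, min_tick, max_tick and needle_tip are (x, y) pixel coordinates;
    min_value and max_value are the scale endpoints printed on the dial.
    Assumes a linear scale swept clockwise from min_tick to max_tick.
    """
    def angle(p):
        # Angle of the ray from the dial center to point p, in radians.
        return math.atan2(p[1] - center[1], p[0] - center[0])

    def clockwise_sweep(a_from, a_to):
        # In image coordinates (y grows downward), increasing atan2 angle
        # corresponds to a visually clockwise rotation.
        return (a_to - a_from) % (2 * math.pi)

    total = clockwise_sweep(angle(min_tick), angle(max_tick))
    swept = clockwise_sweep(angle(min_tick), angle(needle_tip))
    fraction = swept / total if total else 0.0
    return min_value + fraction * (max_value - min_value)

# Example: a 0-10 bar gauge with the needle most of the way around the dial.
print(gauge_reading(center=(120, 118), min_tick=(55, 170), max_tick=(185, 170),
                    needle_tip=(178, 70), min_value=0.0, max_value=10.0))
```

The arithmetic is the easy part; the agentic-vision work the blog post describes, cropping, zooming and deciding where those points go, is what the benchmark is really measuring.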
On DeepMind's own instrument-reading benchmark, the new model scores 93% with agentic vision enabled and 86% without. Gemini Robotics-ER 1.5 managed 23% on the same task, and Gemini 3.0 Flash hit 67%. The numbers are self-reported and the benchmark is internal. Pointing, counting and multi-view success detection also improved, though the multi-view and single-view evals don't share examples, so those comparisons aren't apples to apples.
Safety instruction following improved by 6% on text scenarios and 10% on video against Gemini 3.0 Flash. Developers get access now through Google AI Studio and the Gemini API, with a sample Colab covering typical prompts.
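The request itself looks like any other multimodal Gemini call. A sketch using the google-genai Python SDK, assuming the photo is already on disk; the model ID string is a guess based on prior release naming, and the prompt is illustrative rather than lifted from the sample Colab:

```python
# pip install google-genai pillow
from google import genai
from PIL import Image

client = genai.Client(api_key="YOUR_API_KEY")

# A photo of the instrument panel, e.g. captured from the robot's camera.
gauge_photo = Image.open("pressure_gauge.jpg")

response = client.models.generate_content(
    # Assumed model ID; check Google AI Studio for the exact string.
    model="gemini-robotics-er-1.6",
    contents=[
        gauge_photo,
        "Read the analog pressure gauge in this image. "
        "Report the value, the units printed on the dial, and your confidence.",
    ],
)
print(response.text)
```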
Bottom Line
Gemini Robotics-ER 1.6 hits 93% on DeepMind's instrument-reading benchmark with agentic vision, up from 23% on the prior ER 1.5 model.
Quick Facts
- Released April 14, 2026
- Instrument reading: 93% with agentic vision, company-reported
- Previous ER 1.5 scored 23% on the same internal benchmark
- Safety: +6% text, +10% video vs Gemini 3.0 Flash
- Available via Gemini API and Google AI Studio