Google is reworking how its Gemini app burns through usage quota, less than two weeks after switching to a compute-based system at I/O 2026. VP Josh Woodward laid out the changes in an X thread on May 28, framing them as a response to paying customers who said they were hitting limits within a handful of prompts.
What broke
The new system, introduced right after I/O, weighs request complexity, tool usage, and chat length instead of just counting prompts. Reasonable on paper. In practice, subscribers started posting receipts. One user claimed a single avatar video attempt ate his entire five-hour allowance in about four minutes, and the video reportedly failed to even generate. Woodward's public reply at the time was "Yikes, let us take a look!", which is not the sentence you want from the person running a product you pay for.
That specific case traced back to a bug in Omni, Google's world model. One or two videos could wipe a quota. It's now patched, and Ultra subscribers get double the Omni generations as compensation. Whether "double" means much depends on how low the starting number was, which Google hasn't said.
The fixes that actually matter
Two changes do real work here. The first: failed requests stop counting. Woodward noted that roughly 1 in 10 requests fail on system errors, and until now you paid quota for those anyway. "Your quota is used only for successful completions," he wrote. Charging people for your own server hiccups was always going to be hard to defend, so this reads less like generosity and more like fixing something that should never have shipped.
The second: Gemini 3.1 Pro now caps how much quota a single prompt can consume. Big files and multi-step requests were the worst offenders. The prompt still runs in full, but it can't torch a whole session's budget on its own.
Two smaller changes round things out. Flash-Lite prompts are now free and don't touch your quota at all. And a model choice now sticks across sessions rather than silently resetting, with the system only switching you down to a lighter model when you hit a limit.
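Taken together, the rules Woodward described amount to a small piece of accounting logic. Here is a rough sketch of how they might combine, purely for illustration: Google hasn't published its unit sizes, the per-prompt cap, or any of its internals, so the model labels, the numbers, and every name below are made up.

```python
from dataclasses import dataclass

# All values here are hypothetical. Google has not published the real
# unit sizes, the per-prompt cap, or how it labels models internally.
FREE_MODELS = {"flash-lite"}          # prompts that never touch quota
PER_PROMPT_CAP = {"3.1-pro": 20.0}    # max units one prompt may consume


@dataclass
class Request:
    model: str
    raw_cost: float    # compute-based cost before any rule is applied
    succeeded: bool


def quota_charge(req: Request) -> float:
    """Return the quota units charged for one request under the new rules."""
    if not req.succeeded:
        return 0.0                     # failed requests no longer count
    if req.model in FREE_MODELS:
        return 0.0                     # Flash-Lite prompts are free
    cap = PER_PROMPT_CAP.get(req.model)
    if cap is not None:
        return min(req.raw_cost, cap)  # a single prompt can't exceed the cap
    return req.raw_cost


def remaining(budget: float, requests: list[Request]) -> float:
    """Budget left after charging every request in the session."""
    return budget - sum(quota_charge(r) for r in requests)
```

Under these invented numbers, a Pro prompt that would have cost 95 units of a 100-unit session is charged 20, and the same prompt costs nothing if it fails on a system error.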
What Google still hasn't said
Deep Research and similar heavy tasks are getting more detailed usage breakdowns and notifications, eventually. The current dashboard at gemini.google.com/usage only shows a high-level view. Woodward gave no timeline for the items still in development, which is most of the transparency-related ones.
For context on the stakes: AI Pro runs $19.99 a month, and the two Ultra tiers sit at $99.99 and $199.99. These are the people who complained loudest, and they're the ones Google can least afford to annoy. Pay-as-you-go top-up credits are also coming at some unspecified point, which suggests Google expects people to keep running out.
The bug fixes and the failed-request change are live now. The reporting and notification improvements have no announced date.




