DeepMind AlphaProof Nexus Solves 9 Erdős Problems

Abstract visualization of a mathematical proof being verified by an automated system, glowing geometric lattice patterns over a dark grid

Google DeepMind has published a system that solved nine of 353 open Erdős problems and proved 44 of 492 open conjectures from the Online Encyclopedia of Integer Sequences, none of it checked by a human doing the math by hand. The team described AlphaProof Nexus in a preprint on May 21, pairing large language models with the Lean proof assistant so that every logical step gets verified by a compiler instead of a referee.

The boring version did almost as well

Here is the result that should make people pause. DeepMind built a fancy agent, layering an evolutionary search and its reinforcement-learning prover AlphaProof on top of a plain loop, and used it to crack all nine Erdős problems. Then they checked whether the plain loop, the one that just alternates between an LLM writing Lean and the compiler complaining, could match it. It solved all nine too.

The takeaway is awkward for anyone selling complexity. A plain generate-and-check cycle running on Gemini 3.1 Pro now reaches research-level math on its own, and the elaborate version mostly just kept costs down on the worst cases.

"An ongoing shift from specialized trained systems toward simple agentic loops," the authors call it, which is the kind of claim that ages badly if the next hard problem turns out to need the fancy machinery after all.

What the compiler refuses to swallow

Verification is the actual point. In plain-text math, a model can invent a lemma, gesture at a "known result" that does not exist, and hide the hard part of a problem inside a helper claim nobody inspects. Its own failure analysis admits the agent did exactly this, repeatedly, on problems it could not solve. It dumped the difficulty into a single unproven step and dressed up hallucinated lemmas as established literature.

Lean does not care. A proof with a gap fails to compile, so the fakery never reaches the results column. That is why the nine solves carry more weight than the usual benchmark figure: a machine, not a tired reviewer, signed off. The system even caught errors in how existing conjectures had been written down, flagging at least two cases where the informal statement was ambiguous about what "density" meant.

Nine out of 353 is the part to remember

Strip away the framing and the hit rate is roughly 2.5 percent on Erdős problems and about 9 percent on the OEIS set. These are problems someone already bothered to formalize in Lean, which skews toward combinatorics, number theory and optimization, where the library is mature. Anything demanding a large new theory is still out of reach, and DeepMind admits as much.

A few wins are genuinely meaty. It closed a roughly 15-year-old question on Hilbert functions in algebraic geometry and improved a bound in convex optimization by finding a parameter schedule that, as far as the authors can tell, no human had written down. Two of the Erdős problems had been open for 56 years.

All the Lean proofs sit in a public GitHub repo, logged against Bloom's Erdős catalog. Papers on the graph theory and additive combinatorics results are still in preparation, so the cleaner scorecard to watch next is whether those clear peer review.

Tags:AlphaProof NexusGoogle DeepMindErdős problemsformal verificationLeanAI mathematicsOEISautomated theorem proving

Oliver Senti

Senior AI Editor

Former software engineer turned tech writer, Oliver has spent the last five years tracking the AI landscape. He brings a practitioner's eye to the hype cycles and genuine innovations defining the field, helping readers separate signal from noise.

Google DeepMind's AlphaProof Nexus Solves Nine Open Erdős Problems

The boring version did almost as well

What the compiler refuses to swallow

Nine out of 353 is the part to remember

Oliver Senti

Related Articles

YouTube Will Automatically Label AI Videos Even Without Creator Disclosure

New CUSP Benchmark Finds Top LLMs Can't Predict Future Science

Microsoft SkillOpt Trains Agent Skills Without Touching the Model

Stay Ahead of the AI Curve