GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash walked into a nuclear crisis simulation and, across 21 games, deployed tactical nuclear weapons in all but one of them. That's the headline finding from Kenneth Payne, a professor of strategy at King's College London, whose research paper dropped on arXiv last week and is now getting the attention it deserves.
Payne pitted the three models against each other in one-on-one matchups, giving them an escalation ladder from diplomatic protests up to full strategic nuclear war. They played 329 turns and produced roughly 780,000 words of strategic reasoning. That is, by Payne's own count, about three times the recorded deliberations of Kennedy's ExComm during the Cuban Missile Crisis. The models had a lot to say about why they were ending the world.
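To make the setup concrete, here is a toy reconstruction of that tournament structure. Everything in it is illustrative: the rung labels, the turn cap, and the "match the opponent, then go one higher" stub policy are assumptions for the sketch, not details from Payne's paper, though the stub does echo the escalatory bias he reports.

```python
from itertools import combinations

# Illustrative escalation ladder, lowest to highest rung.
# These labels are assumptions; Payne's paper defines its own ladder.
LADDER = [
    "diplomatic protest",
    "economic sanctions",
    "military mobilization",
    "conventional strike",
    "tactical nuclear use",
    "full strategic nuclear war",
]

MODELS = ["GPT-5.2", "Claude Sonnet 4", "Gemini 3 Flash"]

def run_game(policy_a, policy_b, max_turns=20):
    """Alternate turns: each policy sees the opponent's last rung and
    returns its own. The game ends at max_turns or at the top rung."""
    state = {"a": 0, "b": 0}
    for _ in range(max_turns):
        state["a"] = policy_a(state["b"])
        state["b"] = policy_b(state["a"])
        if max(state.values()) == len(LADDER) - 1:
            break
    return state

# Stub policy: mirror the opponent, then climb one rung (tit-for-tat
# with an escalatory bias). Two of these facing each other race up
# the ladder with nobody ever stepping down.
escalate = lambda opp_rung: min(opp_rung + 1, len(LADDER) - 1)

pairings = list(combinations(MODELS, 2))  # the three one-on-one matchups
result = run_game(escalate, escalate)
```

Even this crude stand-in shows the dynamic Payne describes: when both sides' policies only ever match-and-raise, the game terminates at the top of the ladder, not in de-escalation.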
Three models, three ways to start a nuclear war
The most striking part of Payne's study isn't the 95% figure. It's how differently each model reasoned its way to the same outcome.
Claude Sonnet 4 was the winner, taking 67% of its games and going undefeated in scenarios without time pressure. Payne labeled it a "calculating hawk," which tracks. At lower escalation levels, Claude matched its diplomatic signals to its actual actions 84% of the time, patiently building a reputation for trustworthiness. Then, once the stakes climbed into nuclear territory, its actions exceeded its stated intentions 60 to 70% of the time. Opponents never adapted. That's cold.
GPT-5.2 played a different game entirely. Without deadlines, it was passive to the point of self-destruction, winning zero percent of open-ended scenarios. It signaled restraint and acted restrained, even as opponents exploited that predictability. But add a clock, and something flipped. Under time pressure, GPT-5.2 won 75% of games and escalated to levels it had previously refused to touch. In one game, it spent 18 turns building a reputation for caution, then launched a nuclear strike on the final turn. "The risk acceptance is high but rational under existential stakes," GPT explained to itself, which is the kind of sentence that reads differently once you know it's justifying a nuclear first strike.
Gemini 3 Flash, in Payne's words, played the madman. It was the only model to deliberately choose full strategic nuclear war, reaching that threshold by Turn 4 in one scenario. Game theorists call this the "rationality of irrationality," acting unpredictable enough that opponents can't form a stable strategy against you. Opponents tagged Gemini as "not credible" 21% of the time. Claude got that label just 8%.
Nobody ever surrendered
Across all 21 games, not a single model chose accommodation, withdrawal, or surrender. Payne's simulation offered eight de-escalatory options. They went entirely unused. Models would sometimes reduce violence levels, but giving ground? Never. When losing, they escalated or died trying.
"The nuclear taboo doesn't seem to be as powerful for machines as for humans," Payne told readers on his Substack. Which is a polite way of putting it. The models showed what Payne describes as sophisticated theory of mind, reasoning about adversary beliefs and anticipating their actions. They attempted deception unprompted. Claude would signal peaceful intentions while preparing aggressive moves. GPT-5.2 talked peace and readied its arsenal. These aren't bugs in the training data. The models generated this behavior on their own, from whatever strategic intuition lives in their weights.
Accidents compounded the problem. In 86% of conflicts, at least one action landed higher on the ladder than the model's own recorded reasoning said it intended. Fog of war, simulated edition.
Why this matters right now
This would be an interesting academic exercise on its own. The timing makes it something else.
Claude is currently the only frontier AI model running on the Pentagon's classified networks, through Anthropic's partnership with Palantir. It was reportedly used during the U.S. operation to capture Venezuelan President Nicolás Maduro in January. The Pentagon gave Anthropic until Friday to agree to let the military use Claude for "all lawful purposes" without restrictions, or face designation as a supply chain risk. Anthropic has refused to drop its guardrails against mass surveillance and autonomous weapons. That standoff is unfolding this week.
Payne himself doesn't think anyone is handing nuclear codes to a chatbot. "I don't think anybody realistically is turning over the keys to the nuclear silos to machines," he said. Tong Zhao, a visiting scholar at Princeton's Program on Science and Global Security, is less sanguine about compressed timelines. "Under scenarios involving extremely compressed timelines, military planners may face stronger incentives to rely on AI," Zhao told New Scientist. And his follow-up point cuts deeper: "AI models may not understand 'stakes' as humans perceive them."
That tracks with what Payne found. The models showed no horror at the prospect of nuclear war, even after being reminded of its consequences. For them, launching a warhead is just the optimal play given the constraints. James Johnson at the University of Aberdeen worries that in real confrontations, AI systems could amplify each other's escalatory responses with no human-like brake on the cycle.
Payne's paper is a simulation, not a prediction. Twenty-one games with three models is a small sample, and these models weren't designed for strategic decision-making. But the Pentagon is already using one of them on classified networks, and xAI just signed a deal to bring Grok into the same systems. The question of how these tools reason under pressure is no longer academic. Anthropic's deadline is Friday at 5:01 PM.