
Anthropic's Safeguards Research Lead Resigns, Warns 'the World Is in Peril'

Mrinank Sharma posted his resignation letter on X Monday. His post passed a million views within hours.

By Liza Chan, AI & Emerging Tech Correspondent

February 11, 2026 · 4 min read
Empty desk with open laptop in a dim tech office overlooking the San Francisco skyline at dusk, symbolizing an AI researcher's departure

Mrinank Sharma, who led Anthropic's Safeguards Research Team, announced his departure from the company on February 9 in a post on X that read more like a manifesto than a resignation letter. It ran to four footnotes, cited multiple poets, and closed with a poem by William Stafford. It made no specific accusations against his employer.

"I continuously find myself reckoning with our situation," Sharma wrote. "The world is in peril. And not just from AI, or bioweapons, but from a whole series of interconnected crises unfolding in this very moment." Which crises, exactly? He doesn't say. A footnote points to something called the "meta-crisis," then cites a book on CosmoErotic Humanism.

The letter that launched a thousand takes

Sharma joined Anthropic in August 2023 after finishing his PhD in machine learning at the University of Oxford. His team worked on defenses against AI-assisted bioterrorism, researched why chatbots tend to flatter users rather than challenge them, and produced what Sharma called "one of the first AI safety cases." His final project, he said, examined how AI assistants might "make us less human or distort our humanity."

All of that is real, useful safety work. And then the letter takes a turn.

"Throughout my time here, I've repeatedly seen how hard it is to truly let our values govern our actions," Sharma wrote, noting pressures within the organization "to set aside what matters most." That's as specific as it gets. No names, no incidents, no policy disagreements. Just a gesture toward tension between what Anthropic says publicly about safety and what happens internally, which is a serious claim to make while offering zero evidence for it.

His next move? "I hope to explore a poetry degree and devote myself to the practice of courageous speech." He plans to return to the UK and, as he put it on X, "let myself become invisible for a period of time."

The footnote problem

Several outlets flagged that Sharma's letter cited a book on CosmoErotic Humanism, published under the collective pseudonym David J. Temple. As Futurism reported, one of the primary authors behind that pseudonym is Marc Gafni, who has faced accusations of sexual exploitation from multiple people, including minors. Gizmodo noted that the philosophy frames humanity's evolution as "the Love Story of the Universe."

Sharma hasn't publicly addressed the Gafni connection. It's a strange footnote (literally) in a letter about moral clarity.

Not the only one leaving

Sharma's exit fits a pattern. According to Business Insider, R&D researcher Harsh Mehta and AI scientist Behnam Neyshabur both announced departures from Anthropic in the past week, though their posts praised the company's culture. Former Anthropic safety researcher Dylan Scandinaro recently joined OpenAI as head of preparedness, which is its own kind of statement.

The timing lands days after Anthropic shipped Claude Opus 4.6 and amid reports the company is pursuing a funding round at a $350 billion valuation. BusinessToday suggested that pressure to ship fast and compete with OpenAI may be squeezing safety work, though that's speculation from unnamed observers, not from Sharma himself.

Anthropic hasn't commented on the resignation.

What it actually means

A safety lead leaving an AI company is worth paying attention to. But this resignation is frustratingly vague about the one thing that would make it matter: what, specifically, went wrong. Sharma's letter is heavy on existential framing and light on operational detail.

Compare this to Tom Cunningham's departure from OpenAI, where the ex-researcher reportedly pointed to the company becoming more reluctant to publish research critical of AI. That's a specific, verifiable claim. Sharma's letter reads like someone processing a spiritual crisis alongside a professional one, and the two get tangled in ways that make it hard to extract actionable information.

A study Sharma published shortly before leaving did find that AI chatbot interactions can produce distorted perceptions of reality in users, with "thousands" of such interactions occurring daily. That research, at least, speaks for itself.

Anthropic was founded by former OpenAI employees who left over concerns about commercialization. The irony of that origin story getting replayed at Anthropic itself is not lost on anyone.

Tags: Anthropic, AI safety, Mrinank Sharma, Claude AI, AI ethics, tech departures, AI industry, safeguards research
Liza Chan

AI & Emerging Tech Correspondent

Liza covers the rapidly evolving world of artificial intelligence, from breakthroughs in research labs to real-world applications reshaping industries. With a background in computer science and journalism, she translates complex technical developments into accessible insights for curious readers.

