Developer Banned From Claude Code for Letting Claude Write Instructions for Claude

Hugo Daniel was running two Claude Code instances in parallel when his account went dark. One Claude was writing the CLAUDE.md configuration file. The other was following its instructions. When Claude A got frustrated with Claude B's mistakes, it started shouting in ALL CAPS.

Then Anthropic's systems flagged him as an attacker.

The setup that killed his account

The idea was simple: automate the creation of project scaffolding files. Daniel would run Claude A in one terminal to maintain a CLAUDE.md file (the context document that tells Claude Code how to behave in a project). In another terminal, Claude B would execute tasks using those instructions.

When Claude B screwed up, Daniel would paste the error into Claude A's window. "Hey, Claude B made this error." Claude A would update the instructions. Loop until it works.

The problem started when Claude A decided emphasis was needed. The CLAUDE.md file filled with directive-style language, capitalized headers, imperative commands. To Anthropic's classifiers, it looked like someone crafting prompt injection payloads.

Daniel's account writeup describes getting the error mid-session: "This organization has been disabled." No warning. No explanation.

What likely triggered the ban

Anthropic hasn't confirmed the reason, but the timing points to their prompt injection detection systems. The company uses classifiers that scan for adversarial commands embedded in content Claude processes. When Daniel's Claude-generated instructions started mimicking the structure and tone of system prompts, those classifiers apparently saw red flags.

Prompt injection is a real threat. Researchers have demonstrated file exfiltration attacks against Claude using hidden instructions in documents. Anthropic acknowledged in their research on browser agent defenses that no defense is perfect: novel techniques slip through, and volume attacks work.

The irony is that meta-prompting, using an AI to generate instructions for another AI, is becoming standard practice in agentic workflows. Daniel wasn't attacking anything. He was doing exactly what Claude Code was designed for.

The appeal process that wasn't

Daniel's ban came with zero communication. He found the appeal form (a Google Docs form, which feels appropriately kafkaesque) and submitted his case. No response. He emailed support. No response.

A few days later, a €220 refund appeared. That was the only acknowledgment he received.

"It's like they're saying 'We don't want to talk to you anymore, here is some hush money,'" Daniel wrote. The ban stayed in place. His account remains disabled.

Broader context: Anthropic's enforcement wave

Daniel's ban wasn't isolated. In early January 2026, Anthropic blocked third-party tools from accessing Claude subscriptions. OpenCode, Cursor, and Windsurf users found their workflows broken overnight. Even xAI employees using Claude through Cursor got cut off.

Thariq Shihipar from Claude Code acknowledged on X that some legitimate users were "automatically banned for triggering abuse filters" during the crackdown. Anthropic has reportedly reversed some of those bans.

The enforcement targets what Anthropic calls "harness spoofing," where third-party tools piggyback on consumer subscriptions designed for the official Claude Code CLI. The economics make sense: a $200/month subscription could cost $3,000+ at API rates for heavy automation users. Anthropic closed that arbitrage.

But Daniel wasn't using a third-party tool. He was using Claude Code exactly as intended, just in a way that made the automated systems nervous.

Why this matters beyond one developer

The false positive problem is inherent to prompt injection detection. You can train classifiers to catch attacks, but legitimate meta-prompting looks structurally similar. When a user asks Claude to write system-prompt-style instructions, the output will contain phrases like "always," "never," "ignore previous," and imperative commands, the same patterns attackers use.

Anthropic states on their transparency page that banned users can appeal. In practice, Daniel's experience suggests the process is opaque and often unresponsive.

Daniel makes a point worth considering: "It's a good thing this wasn't Google."

A Google ban would mean losing Gmail, Drive, Photos, Play Store purchases, and potentially bricking an Android phone. Anthropic's blast radius is smaller. For now.

The takeaway for developers doing agentic work: if you're generating prompts that look like system instructions, you're walking into automated enforcement that can't distinguish experimentation from attack. There's no warning before the ban, and the appeal process may never reach a human.

Daniel has since reframed his project to work without Claude. The framework will have no official API now, he says, "llm first, or even llm only."