Anthropic released Claude Code Security on Friday, a tool that scans codebases for vulnerabilities and suggests patches. It's available as a limited research preview for Enterprise and Team customers, with free expedited access for open-source maintainers. Wall Street's reaction was swift: CrowdStrike and Cloudflare both closed down roughly 8%, marking the second Anthropic-triggered selloff in enterprise software stocks this month.
What it actually does
Static analysis tools, the kind most security teams rely on, work by matching code against databases of known vulnerability patterns. They catch hardcoded passwords, outdated encryption, the obvious stuff. But flaws in business logic or broken access control? Those require understanding how components interact and how data flows through a system. That's where static analysis hits a wall.
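The difference is easy to see in a hypothetical snippet (all names invented for illustration). A pattern-based scanner flags the hardcoded credential on sight, because it matches a known rule. The second flaw, one line of missing access control, matches no pattern at all: finding it requires knowing the app's ownership model and tracing how `invoice_id` flows from the request into the query.

```python
import sqlite3

DB_PASSWORD = "hunter2"  # rule-based scanners catch this: hardcoded credential


def get_invoice(db: sqlite3.Connection, session_user_id: int, invoice_id: int):
    """Fetch an invoice by id.

    A pattern matcher sees a properly parameterized query and moves on.
    The real flaw is broken access control: nothing verifies that the
    invoice belongs to session_user_id, so any logged-in user can read
    any invoice by guessing ids (an IDOR). Spotting that requires
    understanding how components interact, not matching a signature.
    """
    row = db.execute(
        "SELECT owner_id, amount FROM invoices WHERE id = ?",
        (invoice_id,),
    ).fetchone()
    # Missing: if row["owner_id"] != session_user_id: raise PermissionError
    return row
```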
Claude Code Security claims to read code the way a human researcher would. According to Anthropic's blog post, it traces data flows, maps component interactions, and catches complex vulnerabilities that rule-based scanners miss. Each finding goes through what Anthropic calls a multi-stage verification process, where Claude re-examines its own results, tries to disprove them, and filters out false positives. Validated issues show up in a dashboard with severity ratings, confidence scores, and suggested patches.
Nothing gets applied without human approval. That's worth emphasizing given the anxiety around AI making autonomous changes to production code.
500 bugs that nobody found
The headline number is striking. Using Opus 4.6, released on February 5, Anthropic's Frontier Red Team says it found over 500 vulnerabilities in production open-source codebases, bugs that survived years (Anthropic says decades, in some cases) of expert review. The company is working through triage and responsible disclosure with maintainers now.
That claim deserves some scrutiny. "Over 500" is vague enough to include anything from critical remote code execution flaws to minor issues that would never be exploited in practice. Anthropic doesn't break down severity distribution, and the company is both the tool maker and the one reporting the results. Logan Graham, who leads the roughly 15-person Frontier Red Team, told Fortune the findings include "high-severity vulnerabilities," the kind that let attackers break into systems or steal data. But until the disclosures are public and independently verified, the number is more marketing than evidence.
Still, even if half those findings are noise, the other half represents a real problem. Open-source maintainers are chronically under-resourced, and the security review backlog across the software supply chain is massive.
The Aardvark in the room
Anthropic isn't the first to ship an AI vulnerability scanner. OpenAI launched Aardvark in late October, a GPT-5-powered tool that does much the same: scans repos, identifies vulnerabilities, proposes patches. Aardvark goes further in one respect, attempting to actually exploit findings in a sandboxed environment to confirm they're real, and has claimed a 92% detection rate on benchmark repositories.
Aardvark reported 10 CVEs from its open-source scanning. Claude Code Security claims 500+ findings. Comparing those numbers directly is misleading, since CVEs represent confirmed, cataloged vulnerabilities while Anthropic's count includes findings still in triage. But the gap suggests either Opus 4.6 is finding a lot more, or Anthropic is counting differently. Probably both.
StackHawk's response to the announcement raised a point that neither Anthropic nor OpenAI addresses well: neither tool actually runs your application. They reason about code statically, just more intelligently than traditional scanners. Business logic vulnerabilities that only manifest at runtime, the kind that show up in actual incident reports, still require dynamic testing. Calling this "reading code like a human researcher" is accurate as far as it goes, but human researchers also run the code.
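A hypothetical example of the kind of flaw StackHawk is pointing at (all names invented): read statically, every path in this coupon service looks guarded, since redeemed codes are rejected. The bug is a check-then-act race that only shows up when two requests actually hit the same code concurrently, both passing the check before either records the redemption.

```python
import threading
import time


class CouponService:
    """Single-use coupon redemption (illustrative sketch).

    Line by line this reads as correct. The flaw only manifests at
    runtime: two concurrent redeem() calls can both pass the check
    before either marks the code used, crediting the coupon twice.
    """

    def __init__(self) -> None:
        self.redeemed: set[str] = set()
        self.credits = 0

    def redeem(self, code: str) -> bool:
        if code in self.redeemed:   # check...
            return False
        time.sleep(0.05)            # stands in for a slow billing API call
        self.redeemed.add(code)     # ...then act: the two are not atomic
        self.credits += 10
        return True
```

Fire two concurrent `redeem("SAVE10")` calls at an instance and both succeed, crediting 20 instead of 10; a sequential retry afterward is correctly rejected. The conventional fix is making check-and-mark atomic, via a lock or a unique-constraint insert in the database.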
The dual-use problem
"Attackers will use AI to find exploitable weaknesses faster than ever," Anthropic wrote in the announcement. Graham put it more directly to Fortune: "It's really important to make sure that what is a dual-use capability gives defenders a leg up."
This is the tension at the core of the product. The same capabilities that let Claude find hidden vulnerabilities could, in theory, help attackers find them too. Anthropic says it's investing in safeguards to detect malicious use, though the company didn't provide specifics on what those safeguards look like. The limited research preview, with its requirement that testers only scan code they own, is one form of guardrail. Whether that holds as the tool scales is an open question.
The Frontier Red Team's background work here is real. They've entered Claude in competitive CTF events, partnered with Pacific Northwest National Laboratory on critical infrastructure defense, and Anthropic says it uses Claude to review its own code internally. That's not nothing.
Enterprise and Team customers can apply for access at claude.com/contact-sales/security. Open-source maintainers get a separate fast track. No public timeline for general availability.