OpenAI shipped Codex Security on Friday, an AI agent that scans codebases for vulnerabilities, confirms they're real, and proposes patches. The tool, formerly called Aardvark, spent roughly a year in private beta before this public research preview. It's available to ChatGPT Pro, Enterprise, Business, and Edu customers through the Codex web interface, with free usage for the first month. Pricing after that? Not disclosed.
The pitch: most security scanners drown teams in false positives. Codex Security tries to fix that by first building a project-specific threat model, mapping architecture and trust boundaries before hunting for bugs. When it flags something, the agent spins up a sandboxed copy and attempts to actually exploit the vulnerability. Only confirmed issues get surfaced, along with a suggested patch. OpenAI's Ian Brelinsky told Axios the goal was to empower defenders, not just generate alerts.
The beta numbers are notable, though they're company-reported. Over the past 30 days, the agent scanned more than 1.2 million commits across external repositories, flagging 792 critical and 10,561 high-severity issues. Critical findings appeared in under 0.1% of commits. OpenAI says false-positive rates fell by more than 50% during the beta, and findings with over-reported severity fell by 90%. The company also ran Codex Security against open-source projects it depends on, including OpenSSH, GnuTLS, PHP, and Chromium, and claims 14 CVEs were assigned from those scans.
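The "under 0.1%" figure can be sanity-checked from the reported counts. The sketch below is a rough check, not OpenAI's method: it assumes each finding maps to a distinct commit and uses 1.2 million as the commit total, though the company says only "more than" that many. The variable and function names are illustrative.

```python
# Rough check of the company-reported beta figures.
# Assumption: one finding per commit, and exactly 1.2 million commits
# (the reported figure is "more than 1.2 million", so the real rates
# would be at or below the ones computed here).
COMMITS_SCANNED = 1_200_000
CRITICAL_FINDINGS = 792
HIGH_FINDINGS = 10_561


def rate_percent(findings: int, commits: int) -> float:
    """Return findings as a percentage of commits scanned."""
    return 100.0 * findings / commits


critical_pct = rate_percent(CRITICAL_FINDINGS, COMMITS_SCANNED)
high_pct = rate_percent(HIGH_FINDINGS, COMMITS_SCANNED)

print(f"critical: {critical_pct:.3f}% of commits")  # critical: 0.066% of commits
print(f"high:     {high_pct:.3f}% of commits")      # high:     0.880% of commits
```

On those assumptions, critical findings come to roughly 0.066% of commits, consistent with the "under 0.1%" claim, while high-severity findings come to just under 0.9%.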
The launch puts OpenAI in direct competition with established players like Snyk and Semgrep, and follows Anthropic's Claude Code Security release by about two weeks. OpenAI is also opening a Codex for OSS program giving open-source maintainers free access to the tool.
Bottom Line
Codex Security scanned more than 1.2 million commits in the final 30 days of its beta, surfacing 792 critical issues, and OpenAI reports that false positives fell by more than 50% over the beta.
Quick Facts
- Available to ChatGPT Pro, Enterprise, Business, and Edu users
- Free for the first month; post-trial pricing not announced
- More than 1.2 million commits scanned in the final 30 days of the beta (company-reported)
- 792 critical and 10,561 high-severity findings in that 30-day window
- 14 CVEs assigned from open-source project scans




