OpenAI quietly released GPT-5.2 Codex to the Responses API on January 14th. The model had been locked inside OpenAI's Codex environment since its December launch, limited to paid ChatGPT subscribers using the CLI or IDE extensions. That wall just came down.
The pitch
OpenAI is positioning this as their go-to for what they're calling "agentic coding," which means tasks that run autonomously over long sessions: refactors, migrations, feature builds. According to the announcement, the model handles extended workflows without losing track of what it was doing, even when "plans change or attempts fail."
The cybersecurity angle is interesting. OpenAI claims this is their strongest model yet for vulnerability detection. A researcher apparently used the previous version to find multiple React bugs, including one with a CVSS score of 10.0. Whether that makes it a good idea to hand this to every developer with API credentials is a different question.
The numbers
Pricing: $1.75 per million input tokens, $14 per million output. That's a 40% jump from GPT-5.1 Codex, which ran $1.25 and $10 respectively. OpenAI's argument is that token efficiency makes the total cost lower for equivalent quality. I couldn't verify that claim.
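The token-efficiency argument is easy to sanity-check with arithmetic. Using the published rates, here's a back-of-envelope comparison; the token counts are hypothetical, chosen only to illustrate where the break-even point sits:

```python
# Per-million-token rates from the announcement.
GPT52_CODEX = {"input": 1.75, "output": 14.00}
GPT51_CODEX = {"input": 1.25, "output": 10.00}

def task_cost(rates, input_tokens, output_tokens):
    """Dollar cost of one request at the given per-1M-token rates."""
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000

# Hypothetical workload: the same 50k-token prompt, but suppose the newer
# model needs 30% fewer output tokens for an equivalent result.
old = task_cost(GPT51_CODEX, 50_000, 20_000)   # $0.2625
new = task_cost(GPT52_CODEX, 50_000, 14_000)   # $0.2835

print(f"GPT-5.1 Codex: ${old:.4f}  GPT-5.2 Codex: ${new:.4f}")
```

At these made-up numbers the 40% rate hike still wins: a 30% cut in output tokens isn't enough, and on this input/output mix the new model would need roughly 37% fewer output tokens just to break even. Whether it actually achieves that in practice is exactly the claim I couldn't verify.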
The model supports four reasoning effort levels: low, medium, high, and "xhigh." Text and image inputs work, and all the usual suspects are here: function calling, structured outputs, streaming, prompt caching.
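Assuming the model follows the same Responses API conventions as OpenAI's existing models, a request selecting the top reasoning effort would look roughly like this. This is an untested sketch: the "xhigh" effort value comes from the announcement, but the exact model string ("gpt-5.2-codex") and parameter support are my assumptions, not confirmed details.

```python
def build_request(prompt: str, effort: str = "xhigh") -> dict:
    """Sketch of a Responses API payload for GPT-5.2 Codex.

    Hedge: the model string is a guess based on OpenAI's naming pattern;
    parameter names follow the existing Responses API and may differ here.
    """
    allowed = {"low", "medium", "high", "xhigh"}   # the four stated levels
    if effort not in allowed:
        raise ValueError(f"effort must be one of {sorted(allowed)}")
    return {
        "model": "gpt-5.2-codex",          # assumed identifier
        "input": prompt,
        "reasoning": {"effort": effort},
        "stream": True,                    # streaming is supported
    }

payload = build_request("Refactor this module to remove the circular import.")
# With the official SDK, you'd pass this as client.responses.create(**payload).
```

The dict-building indirection is deliberate: it keeps the sketch runnable without an API key while showing which knobs the announcement says exist.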
SWE-Bench Pro score: 56.4%. Sounds modest until you compare it to GPT-5.2 at 55.6% and GPT-5.1 at 50.8%. Terminal-Bench 2.0 hit 64.0%.
Already in the wild
Cursor added support immediately. Windsurf did too, and they're running a half-price promotion for paid users. Both platforms clearly had early access, which tells you something about OpenAI's partnership priorities.
Windsurf's changelog shows they've made it the default model, which is a bold move given the price increase. Their CEO called it "the biggest leap for GPT models in agentic coding since GPT-5." Marketing, sure, but they're betting their user experience on it.
What's actually new
The technical improvements center on "context compaction," a native feature that preserves task state across long sessions without blowing through your context window. Previous models struggled with multi-hour coding sessions. Whether this fixes that completely remains unclear.
Windows performance is supposedly better. If you've tried running AI coding tools on Windows, you know why that matters.
Vision capabilities got an upgrade. The model can interpret screenshots, UI mockups, and technical diagrams during coding sessions. Useful for translating design files into functional prototypes, if the marketing is accurate.
The security question
Here's where it gets complicated. OpenAI acknowledges the dual-use problem directly in their documentation: the same capabilities that help defenders also help attackers. They say GPT-5.2 Codex doesn't reach "High" cyber capability under their Preparedness Framework, but they're already planning for future models that will.
There's a "Trusted Access Pilot" for vetted security professionals who want fewer restrictions for legitimate red-teaming work. Invite-only for now.
What happens next
API access is live now, though OpenAI's documentation suggests the rollout may still be expanding. The prompting guide is available on their developer site for anyone trying to optimize their usage.
Expect the IDE integrations to proliferate. GitHub Copilot hasn't announced anything yet, but with Cursor and Windsurf already onboard, the pressure's there.