Andrej Karpathy, OpenAI co-founder and former head of Tesla Autopilot, posted on X this week that programming has become "unrecognizable" in the span of two months. His claim: coding agents crossed from broken to functional sometime in December 2025, and the shift wasn't gradual. "You're not typing computer code into an editor like the way things were since computers were invented," he wrote. "That era is over."
This is Karpathy's third major public statement on AI-assisted coding in roughly a year. He popularized the term vibe coding in February 2025, then declared a "magnitude 9 earthquake" in software engineering in a viral post on December 26. That December tweet alone pulled over 14 million views. Now he's back with something more specific: a concrete before-and-after line drawn at December 2025.
The DGX Spark demo
To make his point, Karpathy described a weekend project: building a local video analytics dashboard for his home cameras. He says he gave a coding agent a single prompt that included his DGX Spark credentials and a list of tasks: set up SSH keys, deploy vLLM, download and benchmark Qwen3-VL, build a web dashboard, wire up systemd services, write a report. The agent ran for about 30 minutes, hit problems, Googled solutions, debugged its own code, and came back with everything working.
"I didn't touch anything," he wrote. Three months ago, he says, this would have been a full weekend project.
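Karpathy didn't publish the prompt, but the dashboard half of the workflow is easy to picture: vLLM exposes an OpenAI-compatible HTTP API, so a dashboard can post camera frames to the local endpoint and ask the model to describe them. A minimal sketch of that piece, assuming a vLLM server on localhost; the model name and question are placeholders, not details from his post:

```python
import base64
import json
from urllib import request

# vLLM's OpenAI-compatible chat endpoint (default port shown; adjust to taste).
VLLM_URL = "http://localhost:8000/v1/chat/completions"

def build_frame_request(jpeg_bytes: bytes, question: str,
                        model: str = "Qwen/Qwen3-VL") -> dict:
    """Build an OpenAI-style chat payload carrying one camera frame as a data URL."""
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    return {
        "model": model,  # placeholder id; use whatever checkpoint the server loaded
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
        "max_tokens": 128,
    }

def describe_frame(jpeg_bytes: bytes, question: str) -> str:
    """POST a frame to the local vLLM server and return the model's answer."""
    payload = json.dumps(build_frame_request(jpeg_bytes, question)).encode()
    req = request.Request(VLLM_URL, data=payload,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Nothing here is exotic, which is rather the point: the novelty in the anecdote is not the code, it's that an agent wrote, deployed, and debugged it unattended.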
It's a compelling anecdote, though anecdotes from people with DGX hardware sitting around the house should come with an asterisk. Karpathy is describing the upper bound of what's possible when you have exactly the right setup: a clearly specified task, verifiable outputs, and the kind of infrastructure most developers don't have on their desks. He acknowledges as much, noting the approach works best "for tasks with clear specs where you can verify and test the result." The gap between that scenario and the messy reality of most production codebases is wide enough to drive a truck through.
What he thinks changed
Karpathy attributes the shift to models gaining "long-term coherence and tenacity," which is a polite way of saying they stopped falling apart on tasks that take more than a few minutes. The timing lines up with the release of Claude Opus 4.5 in late November 2025, which Anthropic positioned as a step change in agentic coding. METR's evaluations showed Opus 4.5 could autonomously code for up to five hours without crashing, a metric that sounds impressive until you remember it's measuring persistence, not correctness.
In his year-in-review blog post, Karpathy called Anthropic's Claude Code "the first convincing demonstration of what an LLM Agent looks like" and said OpenAI got the architecture wrong by focusing on cloud-based agents rather than ones running locally on the developer's machine. Boris Cherny, the creator of Claude Code, replied to Karpathy's December post with a similar observation, and Karpathy responded with a characteristically blunt metaphor: "You point the thing around and it shoots pellets or sometimes even misfires and then once in a while when you hold it just right a powerful beam of laser erupts and melts your problem."
That metaphor, honestly, is more useful than the headline claim. The laser-versus-pellet distinction captures what most developers actually experience: wildly inconsistent results where the gap between success and failure depends on factors that are hard to articulate.
From vibe coding to agent engineering
There's a subtle but important shift in Karpathy's framing since last year. Vibe coding was about surrendering control. "Fully give in to the vibes, embrace exponentials, and forget that the code even exists," he wrote in February 2025. A year and change later, he's describing something that sounds like management. You spin up agents, give them tasks in English, review their work, iterate. The big prize, he says, is learning to "climb higher in the levels of abstraction: setting up long-lived orchestrator agents with the right tools, memory, and instructions that productively manage several parallel coding agent instances for you."
That's not vibing. That's project management for robots.
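Stripped of the grandeur, "orchestrator agents managing parallel coding agents" is a familiar pattern: fan tasks out to workers, review what comes back, re-queue anything that fails. A minimal sketch of that loop, where `run_agent` and `review` are hypothetical stand-ins for whatever CLI or API drives the real agent and for the human (or reviewer agent) judgment step:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

def orchestrate(tasks: list[str],
                run_agent: Callable[[str], str],
                review: Callable[[str, str], bool],
                max_rounds: int = 3,
                workers: int = 4) -> dict[str, str]:
    """Fan tasks out to parallel agent instances; re-queue work that fails review."""
    accepted: dict[str, str] = {}
    pending = list(tasks)
    for _ in range(max_rounds):
        if not pending:
            break
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = dict(zip(pending, pool.map(run_agent, pending)))
        # The judgment Karpathy describes lives here: keep what passes,
        # send the rest back for another attempt.
        next_pending = []
        for task, output in results.items():
            if review(task, output):
                accepted[task] = output
            else:
                next_pending.append(task)
        pending = next_pending
    return accepted
```

The skeleton is trivial; the hard parts are exactly the ones Karpathy names, namely decomposing `tasks` well and writing a `review` step you actually trust.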
The security and quality concerns that plagued vibe coding haven't disappeared with this framing shift. A CodeRabbit analysis from December 2025 found AI co-authored code contained 1.7 times more major issues than human-written code, with security vulnerabilities running 2.74 times higher. In May 2025, a security researcher scanning apps built with the vibe coding platform Lovable found that 170 of 1,645 exposed sensitive user data with no authentication required. Agent orchestration adds another layer of abstraction on top of code nobody fully reviews, which should make anyone building production systems nervous.
The skill gap question
Back in December, Karpathy admitted he'd "never felt this much behind as a programmer" and called the failure to claim a 10x productivity boost a "skill issue." His latest post doubles down. The key skill, he says, is now decomposing tasks correctly: handing agents the pieces they handle well and stepping in at the edges. High-level guidance, judgment, taste, and iteration still matter. Models are stochastic and fallible. You need intuition for what to delegate and what to control.
None of this is controversial, but the "it's not business as usual" framing assumes a consensus that doesn't exist. Plenty of experienced engineers, including Karpathy himself just months earlier, have found agents unreliable for anything novel. He hand-coded his Nanochat project because agents "just didn't work well enough" for it. The question isn't whether agents are useful. They are. The question is whether December 2025 really was a threshold crossing or just another point on a curve that keeps moving.
Karpathy's next follow-up post will probably tell us more. For now, the best summary might be his own, from that same X thread in December: "People who aren't keeping up even over the last 30 days already have a deprecated world view on this topic." If true, most of us are already behind.