Sequoia Capital partners Pat Grady and Sonya Huang published an essay last week titled "2026: This is AGI," declaring that artificial general intelligence has arrived. Their evidence: coding agents like Claude Code can now work autonomously for about 30 minutes without falling apart.
That's the bar. Thirty minutes of semi-reliable task completion.
The Definition Game
The essay opens with a familiar scene: researchers dodging the AGI question with "we'll know it when we see it." Grady and Huang position themselves as the adults in the room, offering a functional definition rather than a technical one. AGI, they write, is simply "the ability to figure things out."
It's a clever move. By defining AGI as autonomous problem-solving rather than human-level cognition, Sequoia can claim the prize has already been won. The goalposts didn't move; they were just never where you thought they were.
Their framework breaks down intelligence into three components: baseline knowledge from pre-training (the ChatGPT moment of 2022), reasoning via inference-time compute (OpenAI's o1 in late 2024), and iteration through long-horizon agents (Claude Code and similar tools in the past few weeks). Stack these together and you get something that can "figure things out." That's AGI, apparently.
The essay's most telling admission comes in a footnote: "We appreciate that such an imprecise definition will not settle any philosophical debates." No kidding.
The METR Extrapolation
The quantitative backbone of Sequoia's argument comes from METR, a research organization tracking AI agent capabilities through task completion time horizons. The metric benchmarks tasks by how long they take human experts, then asks what length of task an AI agent can complete with 50% reliability; that duration is the agent's time horizon.
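To make that concrete, the idea behind the metric can be sketched in a few lines: fit a curve relating an agent's success rate to the log of how long each task takes humans, then read off where that curve crosses 50%. The sketch below simplifies METR's actual methodology and uses made-up numbers purely for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

# Simplified sketch of a time-horizon calculation: fit success probability
# against log(human task duration), then find the duration at which the
# fitted curve crosses 50%. The data points are invented for illustration.
human_minutes = np.array([2, 5, 15, 30, 60, 120, 240, 480])
agent_success = np.array([1.0, 0.95, 0.85, 0.70, 0.50, 0.35, 0.15, 0.05])

def logistic(log_t, midpoint, slope):
    # Decreasing logistic: success falls as tasks take humans longer.
    return 1 / (1 + np.exp(slope * (log_t - midpoint)))

params, _ = curve_fit(logistic, np.log(human_minutes), agent_success,
                      p0=[np.log(60), 1.0])
horizon_minutes = np.exp(params[0])  # duration where predicted success = 50%
print(f"50% time horizon ≈ {horizon_minutes:.0f} human-minutes")
```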
The trend is genuinely striking: a doubling time of roughly seven months over the past six years. METR's March 2025 paper put Claude 3.7 Sonnet's time horizon at approximately one hour, meaning it succeeds about half the time on tasks that take human programmers roughly an hour to complete.
Sequoia takes this exponential and runs with it. If the trend continues, they write, "agents should be able to work reliably to complete tasks that take human experts a full day by 2028, a full year by 2034, and a full century by 2037."
A century. By 2037.
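The arithmetic behind those dates is easy to reproduce, at least approximately. Here is a rough sketch, assuming a one-hour horizon in March 2025, a constant seven-month doubling time, and calendar-time conversions for a day, year, and century; all three assumptions are illustrative rather than taken from Sequoia's essay.

```python
import math
from datetime import date, timedelta

# Back-of-envelope extrapolation of the time-horizon trend. Assumptions
# (illustrative, not METR's or Sequoia's exact model): ~1-hour horizon in
# March 2025, doubling every ~7 months, calendar-time task lengths.
START = date(2025, 3, 1)
START_HORIZON_HOURS = 1.0
DOUBLING_MONTHS = 7
DAYS_PER_MONTH = 30.44

def year_reached(target_hours: float) -> int:
    """Year at which the extrapolated 50%-reliability horizon hits target_hours."""
    doublings = math.log2(target_hours / START_HORIZON_HOURS)
    return (START + timedelta(days=doublings * DOUBLING_MONTHS * DAYS_PER_MONTH)).year

print("a full day:    ", year_reached(24))              # ~2027
print("a full year:   ", year_reached(24 * 365))        # ~2032
print("a full century:", year_reached(24 * 365 * 100))  # ~2036
```

Run as written, this lands a year or two earlier than the essay's 2028, 2034, and 2037, which is part of the point: the headline dates swing meaningfully depending on the starting horizon and on how you convert a "day" or "year" of expert work into hours.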
This is where the investor optimism starts to show. METR itself is considerably more cautious. Their researchers note that performance degrades significantly on "messier" tasks that resemble real-world work. The correlation between AI success and log human time is strong (R² ≈ 0.83) on their benchmark tasks, but those benchmarks are dominated by well-structured software engineering problems. Whether the trend holds for ambiguous, context-dependent work is an open question.
The METR researchers also flag that their extrapolations "do not account for future changes in the trend or external validity concerns, which are responsible for the majority of our uncertainty." Sequoia's essay mentions none of these caveats.
The Recruiting Demo
To illustrate "figuring things out," Sequoia describes an agent completing a recruiting task in 31 minutes. A founder asks for a developer relations lead. The agent searches LinkedIn, pivots to YouTube conference talks, cross-references Twitter activity, identifies someone who seems disengaged from their current role, and drafts outreach.
It's a compelling vignette. It's also hypothetical. There's no citation, no product name, no indication this actually happened. The essay presents it as illustrative rather than documented, which is fine for a thought experiment but thin for a proof of AGI.
The more interesting question: how often does this work? The essay acknowledges that "agents still fail. They hallucinate, lose context, and sometimes charge confidently down exactly the wrong path." But how often? At 50% reliability, you'd need to run the recruiting agent twice on average to get a usable result. At lower reliability levels, you're generating plausible-sounding nonsense that someone still has to verify.
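The arithmetic is unforgiving. Assuming independent runs and a reviewer who can reliably spot failures, the expected number of runs per usable result is simply one over the reliability; a minimal sketch:

```python
# Expected runs to get one usable result, assuming each run succeeds
# independently with probability p and failures are cheap to detect.
# (Both assumptions are generous: real agent failures are often correlated
# and expensive to verify.)
def expected_runs(p: float) -> float:
    if not 0 < p <= 1:
        raise ValueError("reliability must be in (0, 1]")
    return 1 / p  # mean of a geometric distribution

for p in (0.9, 0.5, 0.2):
    print(f"reliability {p:.0%}: ~{expected_runs(p):.1f} runs per usable result")
```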
The Technical Reality
The essay credits two approaches for enabling long-horizon agents: reinforcement learning (training models to maintain focus over extended tasks) and agent harnesses (external scaffolding for memory, handoffs, and guardrails).
Claude Code, which Anthropic released as a command-line tool in early 2025 and expanded to a web interface in October 2025, exemplifies the harness approach. It gives Claude access to the terminal, file system, and bash commands, letting it iterate through code the way a human developer would. Last week, Anthropic added tool search to reduce context overhead.
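For readers who haven't seen one, an agent harness is mostly a loop: the model proposes an action, the harness executes it, and the result goes back into the context until the model declares the task done. A minimal sketch of the pattern follows, with a hypothetical call_model() standing in for any chat API with tool use; the real Claude Code harness layers context management, permissions, and sandboxing on top of this.

```python
import subprocess

def run_bash(command: str) -> str:
    """The 'tool': execute a shell command and return its combined output."""
    result = subprocess.run(command, shell=True, capture_output=True,
                            text=True, timeout=60)
    return result.stdout + result.stderr

def call_model(messages: list[dict]) -> dict:
    """Hypothetical stand-in for a chat API with tool use. Expected to return
    {'type': 'tool_call', 'command': ...} or {'type': 'final', 'text': ...}."""
    raise NotImplementedError("wire this to a model API of your choice")

def agent_loop(task: str, max_steps: int = 20) -> str:
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):                        # guardrail: bounded steps
        action = call_model(messages)
        if action["type"] == "final":
            return action["text"]
        output = run_bash(action["command"])          # terminal access
        messages.append({"role": "assistant", "content": f"ran: {action['command']}"})
        messages.append({"role": "user", "content": f"output:\n{output[:4000]}"})  # crude memory: truncate long output
    return "stopped: step limit reached"
```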
The underlying capabilities are real. Claude Sonnet 4.5 reportedly maintains focus for more than 30 hours on complex tasks, according to Anthropic. Coding benchmarks like SWE-bench Verified show steady improvement, with top agents now solving 50-70% of real GitHub issues. Factory's Droids, Manus, and similar tools are gaining traction with developers.
But there's a gap between "coding agents are useful" and "AGI is here." The first is demonstrably true. The second requires accepting Sequoia's definitional sleight of hand.
The Business Case
Strip away the AGI framing and Sequoia's essay reads like a founder pitch deck. They list AI applications that "function as" specialists: OpenEvidence for medicine, Harvey for law, XBOW for penetration testing, Day AI for sales. The message to portfolio companies is clear: the technology has arrived, start building.
The essay also signals Sequoia's investment thesis. They're backing both Anthropic (reportedly in talks for a $25 billion round) and OpenAI. If AGI is here, every AI company becomes a potential winner, justifying dual bets on competing approaches.
There's nothing wrong with venture capitalists being bullish. That's the job. But the essay's rhetorical strategy, redefining AGI to match current capabilities, deserves scrutiny. It conflates "agents that can complete structured tasks autonomously" with "artificial general intelligence," two claims with very different implications.
What's Actually New
The genuine news in Sequoia's essay isn't the AGI declaration. It's the snapshot of where agent capabilities stand in early 2026.
Long-horizon agents have crossed a usability threshold for certain workflows. Coding assistance has moved from autocomplete to autonomous task execution. Developers are discovering that Claude Code handles non-programming work surprisingly well, which prompted Anthropic to release Cowork, a version for general users, this week.
The METR benchmark, despite its limitations, provides a useful framework for tracking progress. If the exponential holds, agents capable of day-long autonomous work would arrive in the next couple of years. That's worth watching.
What Sequoia gets wrong is the framing. Calling current agents "functionally AGI" doesn't illuminate the technology; it muddies the discourse. When everything is AGI, the term means nothing.
The FTC's next comment period on AI regulation closes in March. Whether agents that can work for 30 minutes count as general intelligence matters less than whether they can be deployed safely. That's a conversation Sequoia's essay seems uninterested in having.