
DeepMind Researchers Propose AGI Could Emerge from Agent Swarms, Not a Single System

A new paper argues AI safety should shift from aligning one superintelligent mind to governing emergent dynamics across networks of coordinating agents.

By Liza Chan, AI & Emerging Tech Correspondent
December 20, 2025 · 4 min read

[Image: Network visualization of interconnected AI agents forming emergent patterns through coordination]

Google DeepMind researchers have published a paper challenging a core assumption in AI safety: that artificial general intelligence will arrive as a single, monolithic system. The paper, posted to arXiv on December 18, argues that AGI may first manifest through coordination among groups of individually sub-AGI agents with complementary skills, a scenario the authors call the "patchwork AGI hypothesis."

The argument

Lead author Nenad Tomasev, a senior research scientist at DeepMind who has previously worked on AI applications in healthcare and knot theory, and four co-authors contend that the rapid deployment of advanced AI agents with tool-use capabilities and the ability to communicate and coordinate makes this an urgent safety consideration.

The framing is provocative. Most AI safety research assumes researchers will eventually build, or stumble into, something that thinks generally and powerfully on its own. This paper says: what if general capability just emerges when you let enough specialized agents talk to each other? The problem then isn't aligning a single mind. It's more like regulating a market.

To that end, the researchers propose a framework for distributional AGI safety that moves beyond evaluating and aligning individual agents. Their solution centers on what they call "virtual agentic sandbox economies," in which agent-to-agent transactions would be governed by robust market mechanisms, coupled with appropriate auditability, reputation management, and oversight.
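The paper stays at the level of framework, but the basic shape of a governed sandbox economy is easy to sketch. The toy Python below (all names hypothetical, not from the paper) shows a mediator that settles agent-to-agent trades only when the buyer can fund them and logs every attempt for later audit:

```python
from dataclasses import dataclass, field

@dataclass
class Transaction:
    buyer: str
    seller: str
    task: str
    price: float

@dataclass
class SandboxLedger:
    """Mediates agent-to-agent trades and keeps an audit trail."""
    balances: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def settle(self, tx: Transaction) -> bool:
        # Reject trades the buyer cannot fund -- a crude market rule.
        if self.balances.get(tx.buyer, 0.0) < tx.price:
            self.audit_log.append(("rejected", tx))
            return False
        self.balances[tx.buyer] -= tx.price
        self.balances[tx.seller] = self.balances.get(tx.seller, 0.0) + tx.price
        self.audit_log.append(("settled", tx))
        return True

ledger = SandboxLedger(balances={"planner": 10.0, "coder": 0.0})
ok = ledger.settle(Transaction("planner", "coder", "write-tests", 4.0))
```

The point of routing every trade through a mediator is exactly the auditability the authors emphasize: the log, not the agents' own reports, becomes the record of what happened.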

Why sandboxes?

The sandbox concept isn't entirely new. Tomasev and colleagues at DeepMind published a related paper in September on "Virtual Agent Economies" that laid out this framework in more detail. That work characterized such systems along two dimensions: their origins (emergent vs. intentional) and their degree of separateness from the established human economy (permeable vs. impermeable).

The worry, essentially, is that we're heading toward a spontaneous emergence of a vast and highly permeable AI agent economy, one that operates at speeds beyond human oversight and bleeds directly into real financial systems. The September paper warned this presents significant challenges, including systemic economic risk and exacerbated inequality.

The new paper builds on this by connecting it explicitly to AGI timelines and safety research. If you buy the patchwork hypothesis, then single-agent alignment techniques miss the point. You'd need to think about auction mechanisms, reputation systems, and the kind of infrastructure that lets you audit what millions of agents are doing with each other.
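Auction theory supplies one concrete example of such a mechanism. A sealed-bid second-price (Vickrey) auction is a standard choice in multi-agent resource allocation because truthful bidding is a dominant strategy. This sketch is illustrative only, not an implementation from the paper:

```python
def second_price_auction(bids):
    """Allocate a task to the highest bidder at the second-highest price.

    `bids` maps agent id -> bid amount. The winner pays the runner-up's
    bid, which removes the incentive to shade bids below true value.
    """
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner, _ = ranked[0]
    price = ranked[1][1]  # winner pays the second-highest bid
    return winner, price

winner, price = second_price_auction({"a1": 5.0, "a2": 8.0, "a3": 3.0})
# -> ("a2", 5.0)
```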

Does the premise hold?

The argument rests on a contestable claim: that coordination among sub-AGI systems could produce general capabilities none of them possess individually. There's some precedent for this in multi-agent research. A February 2025 report from the Cooperative AI Foundation identified "emergent agency" as one of seven key risk factors in advanced multi-agent systems, describing it as qualitatively new goals or capabilities arising from collections of agents.

But "emergent capabilities" is doing a lot of work here. The paper doesn't demonstrate that current agent swarms are approaching anything like general intelligence, even collectively. It's making a bet about the future shape of AI development, and that bet could be wrong.

Still, the deployment trend is real. By mid-2025, over 70% of enterprise AI deployments were expected to involve multi-agent or action-based systems, according to one market analysis. Companies are already building agent networks that coordinate on complex tasks. Whether that coordination will ever produce genuine generality is a separate question from whether the coordination itself needs governance.

What's actually being proposed

The paper's practical contribution is the sandbox framework. The authors envision controlled environments where AI agents transact under rules designed to make their behavior auditable and steerable. Think of it as a regulatory sandbox for algorithmic economies, with mechanisms borrowed from auction theory and distributed systems.

"Reputation mechanisms and verification protocols may play an important role in establishing robust and safe multi-agent collaboration," the researchers write. They suggest using verifiable credentials as machine-readable trust signals between agents.
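The paper doesn't specify how reputation would be computed. One common pattern is an exponential moving average over past transaction outcomes, with a threshold gating which agents count as trustworthy. A minimal, hypothetical sketch:

```python
class ReputationBook:
    """Tracks per-agent reputation as an exponential moving average of outcomes."""

    def __init__(self, alpha=0.3, default=0.5):
        self.alpha = alpha      # weight given to the newest outcome
        self.default = default  # prior for agents with no history
        self.scores = {}

    def record(self, agent, success: bool):
        prev = self.scores.get(agent, self.default)
        outcome = 1.0 if success else 0.0
        self.scores[agent] = (1 - self.alpha) * prev + self.alpha * outcome

    def trusted(self, agent, threshold=0.6) -> bool:
        return self.scores.get(agent, self.default) >= threshold

book = ReputationBook()
for outcome in (True, True, False, True):
    book.record("agent-7", outcome)
```

After three successes and one failure, the agent's score sits just below 0.67, clearing the trust threshold; a run of failures would quickly pull it back under. Real systems would need sybil resistance and verifiable outcome reporting on top of anything this simple.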

Whether any of this would actually work at scale is unclear. Sandbox economies that stay isolated from the real economy might be safe but also useless. Permeable ones, which the authors acknowledge we're more likely to get, bring some risk of contagion in which a crisis in the sandbox sparks a crisis in the real economy.

The paper doesn't resolve this tension so much as name it.

The paper is available on arXiv. DeepMind has not announced any plans to implement the proposed framework.

Tags: AGI, AI safety, DeepMind, multi-agent systems, AI agents, machine learning, AI governance, collective intelligence, AI alignment, AI research
Liza Chan, AI & Emerging Tech Correspondent

Liza covers the rapidly evolving world of artificial intelligence, from breakthroughs in research labs to real-world applications reshaping industries. With a background in computer science and journalism, she translates complex technical developments into accessible insights for curious readers.
