
Context Graphs Went Viral. The Hard Problems Haven't Been Solved Yet.

Foundation Capital's context graph thesis went viral. The open questions are more interesting than the answers.

Oliver Senti, Senior AI Editor
February 16, 2026 · 7 min read
[Image: Abstract visualization of interconnected decision nodes forming a network graph, with temporal layers showing organizational knowledge flow]

Foundation Capital partners Jaya Gupta and Ashu Garg published an essay on December 22nd arguing that the next trillion-dollar enterprise platforms will be built not on better AI models, but on something they called context graphs: living records of decision traces, stitched across systems and time, that make organizational precedent searchable. Within weeks, the piece had circulated through Slack channels at OpenAI and Anthropic, drawn responses from the CEOs of HubSpot, Box, and Glean, and generated hundreds of startup pitches back to Foundation Capital's inbox.

A month later, Garg published a follow-up cataloging what resonated and where the pushback landed. The reception tells you something about the state of enterprise AI right now: everyone agrees there's a missing layer between raw data and autonomous agents. Nobody agrees on what it looks like.

The gap that doesn't have a product yet

The core observation is simple enough that it's almost annoying nobody named it sooner. Enterprise software records outcomes. Salesforce logs the final deal price. Zendesk marks a ticket resolved. SAP captures the shipment. But the reasoning that connected data to action was never treated as data in the first place: the exception logic, the precedent from a similar deal last quarter, the Slack thread where someone explained why healthcare clients always get an extra 10% discount.

Gupta and Garg call these missing records "decision traces." Accumulate enough of them and you get something queryable: a context graph. The pitch is that startups building agent orchestration have a structural advantage here because they sit in the execution path. They're present when decisions happen. Incumbents like Salesforce or Workday would have to retrofit this capability into workflows they don't control.
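The essay doesn't publish a schema for a decision trace, but its description implies a record that captures inputs, precedents, and rationale alongside the action itself. A minimal sketch of what such a record might look like (every field name and value here is hypothetical, not from Foundation Capital's essay):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DecisionTrace:
    """Hypothetical record of one decision.

    Captures not just the outcome (which systems of record already log)
    but the inputs, precedents, and rationale that led to it.
    """
    actor: str            # who or what decided (a person or an agent)
    action: str           # what was done
    inputs: list[str]     # data consulted: CRM record, ticket, Slack thread
    precedents: list[str] # earlier traces this decision leaned on
    rationale: str        # stated reasoning, if any was recorded
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

# A context graph, in this framing, is many such traces linked by their
# shared entities and precedent references, which makes them queryable.
trace = DecisionTrace(
    actor="renewal-agent",
    action="offered 10% healthcare discount",
    inputs=["crm:acct-552", "slack:thread-8841"],
    precedents=["trace:2025-q3-0193"],
    rationale="healthcare clients historically get an extra 10%",
)
print(trace.action)
```

The point of the sketch is the extra fields: `inputs`, `precedents`, and `rationale` are exactly what Salesforce-style outcome records drop on the floor.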

Dharmesh Shah of HubSpot called it a system of record for decisions. Aaron Levie of Box said we've entered an era where context, not models, is the differentiator. Glean CEO Arvind Jain's response was telling: "Everyone is suddenly talking about context graphs. At Glean, we're excited, because it finally has a name."

That last quote is worth sitting with. Glean has been building enterprise search and knowledge graphs for years. When an established company says "it finally has a name," they're claiming prior art, not discovering a new concept.

Prescribed versus learned (and why both camps are partially wrong)

Here's where it gets interesting. The moment you say "context graph," you have to answer a foundational question: how do you model organizational reality? And the industry has split into two camps with genuinely different visions.

On one side, there's the Palantir model. Prescribe your ontology upfront. Define the schema. Map messy enterprise data into it. Deploy forward-deployed engineers to customize it for each customer. It works. It's also expensive, slow, and doesn't scale well.

On the other side, there's Animesh Koratana's vision. Koratana, the founder of PlayerZero and a Stanford DAWN lab alum, wrote what might be the most technically ambitious response to the Foundation Capital essay. His argument: don't prescribe an ontology at all. Let agents discover it through use. When an agent investigates an incident or completes a task, its trajectory through your company's systems is itself a trace through organizational state space. Accumulate thousands of these trajectories and you get learned embeddings that encode how the organization actually functions. The schema isn't the starting point. It's the output.

There's a reinforcement learning analogy here that Koratana pushes hard: the orchestrator is the policy, the context graph is the learned world model, agent traces are the trajectories. Every successful execution reinforces good patterns. The context graph becomes a simulator, not just a search index.
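Stripped of the RL vocabulary, the minimal version of "learn structure from trajectories" is co-occurrence statistics over agent paths. A toy sketch of the idea, with invented trajectory data, that treats transition counts as a crude learned world model:

```python
from collections import Counter, defaultdict

# Each trajectory: the ordered systems/entities an agent touched while
# completing a task. These examples are invented for illustration.
trajectories = [
    ["pagerduty:alert", "grafana:dashboard", "github:service-api", "slack:oncall"],
    ["pagerduty:alert", "grafana:dashboard", "github:service-api"],
    ["zendesk:ticket", "crm:account", "slack:oncall"],
]

# "World model" as transition counts: which node tends to follow which.
transitions: dict[str, Counter] = defaultdict(Counter)
for path in trajectories:
    for src, dst in zip(path, path[1:]):
        transitions[src][dst] += 1

def most_likely_next(node: str) -> str:
    """Use the learned structure as a prior: given where an agent is,
    what does the organization usually do next?"""
    return transitions[node].most_common(1)[0][0]

print(most_likely_next("pagerduty:alert"))  # grafana:dashboard
```

Every successful execution adds counts, which is the "reinforces good patterns" claim in its simplest possible form; real systems would presumably weight by task outcome rather than counting every path equally.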

And then Kirk Marple of Graphlit, who's been building context infrastructure for over three years, called this a false dichotomy. His counterpoint is blunt: you can't learn from agent trajectories until agents are running effectively, and agents can't run effectively without foundational context. You can't wait for thousands of agentic RAG runs to discover that Sarah Chen is a person who works at Acme Corp. You need to know that before agents start reasoning.

The bootstrap problem

Marple's critique cuts deeper than it first appears. The node2vec analogy that Koratana uses actually works against his own argument if you think about it carefully. Node2vec learns embeddings from walk patterns over existing edges. It doesn't discover nodes and edges from scratch. The graph has to exist first.

So who builds the initial graph? Marple's answer: adopt existing ontologies (Schema.org, CDM, WAND) for the basics, then let learning refine and extend them. The entities themselves (Person, Organization, Account, Contact) don't need to be learned; we've known what those are for decades. Waiting for agents to discover that accounts and contacts are important entities is, as Marple puts it, like waiting for a search engine to discover that nouns exist.
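Marple's two-phase bootstrap can be sketched in a few lines: seed the graph with prescribed entity types loosely borrowed from Schema.org, and only then is there anything for walk-based learning to walk on. The specific entities and the `can_learn` helper are invented for illustration:

```python
# Seed phase: prescribed node types we've known for decades
# (loosely Schema.org-style; not an actual Schema.org import).
SEED_TYPES = {"Person", "Organization", "Account", "Contact"}

graph: dict = {"nodes": {}, "edges": []}

def add_node(name: str, node_type: str) -> None:
    if node_type not in SEED_TYPES:
        raise ValueError(f"unknown seed type: {node_type}")
    graph["nodes"][name] = node_type

# The facts agents need *before* they can reason effectively:
add_node("Sarah Chen", "Person")
add_node("Acme Corp", "Organization")
graph["edges"].append(("Sarah Chen", "works_at", "Acme Corp"))

# Refine phase: node2vec-style learning runs random walks over
# existing edges, so it can only start once the seed graph is non-empty.
def can_learn(g: dict) -> bool:
    return len(g["edges"]) > 0

print(can_learn(graph))  # True
```

The sketch makes the bootstrap dependency concrete: with zero edges, `can_learn` is false and trajectory learning has nothing to refine.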

The genuinely hard problem, the one neither prescribed nor learned ontologies solve well, is temporal. Most systems can tell you what's true now. Almost none can tell you what was true at a specific point in the past, or how truth evolved over time. Koratana called this the "event clock," and I think he's right that it's the crux. "Paula works at Microsoft" is a fact. But when did she start? Does she still? What was true in 2022? If your context graph can't answer "what did the system know at 2:14 PM last Tuesday when it made this decision," you're not building what Foundation Capital described.

Is this a category?

The skeptical read is that "context graph" is VC category creation, a nicely branded concept that gives Foundation Capital's portfolio companies (PlayerZero, Maximor, Tessera, Tonkean, Regie) a shared narrative for fundraising. I don't think that's entirely fair, but it isn't entirely wrong either.

Garg himself acknowledges the risk in his follow-up. Some see this becoming "data catalog 3.0": important infrastructure that eventually gets absorbed into warehouses, catalogs, and observability tools rather than standing alone. The comparison to "data mesh," a concept that generated enormous buzz and then mostly dissolved into existing practices, came up repeatedly.

Cognition (the Devin team) attached the context graph framing to their Agent Trace launch, an open spec for recording AI contributions alongside human authorship in version-controlled codebases. That's a concrete implementation, which is more than most of the discourse has produced. But it's also a spec for code commits, not the cross-system decision layer that Foundation Capital's essay describes. The gap between the vision and the implementations remains wide.

What nobody's talking about

There's a question buried in the Metadata Weekly response to this whole debate that I haven't seen adequately addressed. When a renewal agent proposes a 20% discount, the context doesn't come from one system. It comes from everywhere: the CRM, billing, support tickets, Slack, PagerDuty. And every enterprise has a different combination of these systems. One customer runs Salesforce plus Zendesk plus Snowflake. Another runs HubSpot plus Intercom plus Databricks.

The vertical agent startup sees the execution path for its own workflow. But enterprises have dozens of agents, across dozens of vendors, each building their own context silo. We might end up with fragmented context graphs that are no better than the fragmented systems of record they were supposed to transcend.

Garg's response to the "can you really capture the why" question is worth noting here. He concedes: you can't. What you capture is the how, the sequence of actions, and infer the why from patterns over time. That's a more modest claim than the original essay made, and probably a more honest one.

The next concrete test: Foundation Capital says they expect every organization to have multiple context graphs, each shaped by its domain. PlayerZero is building one for production engineering. Maximor for finance. Regie for demand gen. Whether those graphs talk to each other, and whether they compound the way the thesis promises, is the question that will determine if this is a category or a footnote.

Tags: context graphs, enterprise AI, agent trajectories, Foundation Capital, decision traces, AI agents, ontology, agentic AI, systems of record
Oliver Senti

Senior AI Editor

Former software engineer turned tech writer, Oliver has spent the last five years tracking the AI landscape. He brings a practitioner's eye to the hype cycles and genuine innovations defining the field, helping readers separate signal from noise.

