CocoIndex released V1 of its incremental data engine on April 22, scrapping the framework's DSL in favor of plain async Python. The Apache 2.0 project, aimed at developers building RAG, knowledge graphs, and memory for long-running agents, was announced in a launch post by cofounders Linghua Jin and George He.
The framing borrows from Jeff Dean and Bill Dally at GTC 2026: agents now run roughly 50 times faster than humans while the tooling around them, per Dean, was "built for human speed." Nightly index rebuilds don't fit that loop. CocoIndex's pitch is to recompute only the chunks that actually changed and upsert only the rows that moved.
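To make "recompute only what changed" concrete, here is a minimal sketch of fingerprint-based change detection in plain Python. It is not CocoIndex's API or its Rust implementation; `fingerprint`, `incremental_update`, and the in-memory `state` dict are hypothetical names chosen for illustration, and a SHA-256 hash stands in for whatever fingerprint the engine actually uses.

```python
import hashlib
from typing import Callable


def fingerprint(text: str) -> str:
    """Stable content hash used to decide whether a chunk changed."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def incremental_update(
    chunks: dict[str, str],
    state: dict[str, str],
    process: Callable[[str, str], None],
) -> list[str]:
    """Reprocess only chunks whose fingerprint differs from the stored one.

    chunks:  chunk id -> current text
    state:   chunk id -> fingerprint from the previous run (mutated in place)
    process: called once for each new or changed chunk
    Returns the ids that were reprocessed, in input order.
    """
    recomputed = []
    for chunk_id, text in chunks.items():
        fp = fingerprint(text)
        if state.get(chunk_id) != fp:
            process(chunk_id, text)  # e.g. re-embed and upsert this chunk
            state[chunk_id] = fp
            recomputed.append(chunk_id)
    # Forget chunks that have disappeared from the source.
    for stale_id in set(state) - set(chunks):
        del state[stale_id]
    return recomputed
```

On a second run with one edited chunk, only that chunk reaches `process`; unchanged chunks are skipped because their stored fingerprints still match.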
Three other shifts ride along. Postgres is no longer a hard dependency: engine state now lives in an embedded LMDB file, so installation is one pip command. The engine also uses Python's type system directly, letting PIL images, pyarrow tables, and torch tensors pass through functions without wrappers. And sources and targets can be created at runtime, so a pipeline can spin up one component per tenant or per config row.
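The per-tenant pattern can be pictured with ordinary asyncio. The sketch below is illustrative only and does not use CocoIndex's API: `run_tenant`, `run_all`, and the `tenant`/`source` config keys are hypothetical, and the component body is a placeholder for real read-and-index work.

```python
import asyncio


async def run_tenant(tenant: str, source: str) -> str:
    """Stand-in for one pipeline component; a real one would read and index data."""
    await asyncio.sleep(0)  # yield to the event loop, simulating I/O
    return f"{tenant}:{source}"


async def run_all(config_rows: list[dict[str, str]]) -> list[str]:
    """Create one component per config row at runtime and run them concurrently."""
    tasks = [run_tenant(row["tenant"], row["source"]) for row in config_rows]
    # gather returns results in the same order as the tasks passed in.
    return await asyncio.gather(*tasks)
```

Calling `asyncio.run(run_all(rows))` with rows loaded from a config table would start one component per row, with no pipeline definition fixed ahead of time.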
The Rust core stayed put, handling change detection, fingerprinting, and target diffing. The managed-target contract works the same way: declare the desired state of a Postgres table or Kafka topic, and the engine handles create, alter, drop, insert, update, delete. Stop declaring something and it goes away.
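The desired-state idea behind managed targets amounts to diffing what exists against what is declared. The following is a rough sketch of row-level diffing under that assumption, not the engine's actual code; `diff_target` and its dict-based row model are hypothetical.

```python
from typing import Any, Hashable


def diff_target(
    current: dict[Hashable, Any],
    desired: dict[Hashable, Any],
) -> dict[str, Any]:
    """Compute the row operations that turn `current` into `desired`.

    Both arguments map a primary key to a row value.
    """
    # Declared rows that do not exist yet.
    inserts = {k: v for k, v in desired.items() if k not in current}
    # Rows that exist but whose value differs from the declaration.
    updates = {
        k: v for k, v in desired.items() if k in current and current[k] != v
    }
    # Rows that are no longer declared: "stop declaring something and it goes away".
    deletes = [k for k in current if k not in desired]
    return {"insert": inserts, "update": updates, "delete": deletes}
```

Rows that match the declaration produce no operation at all, which is what keeps writes proportional to the change rather than to the size of the table.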
Building an equivalent pipeline in-house would take 10 to 20 engineers six months, by the company's own self-reported estimate. Examples covering knowledge-graph extraction, multi-codebase summarization, and live CSV-to-Kafka flows are included among the V1 examples.
Bottom Line
CocoIndex V1 turns incremental data pipelines into plain async Python and stores engine state in a local LMDB file, removing Postgres as a dependency.
Quick Facts
- Release date: April 22, 2026
- License: Apache 2.0
- Cofounders: Linghua Jin (CEO), George He (CTO)
- Engine state: embedded LMDB, replacing Postgres
- Targets supported: Postgres, LanceDB, Neo4j, Kafka, S3, SurrealDB, SQLite, files




