CocoIndex V1 Drops DSL for Python-Native Pipelines


CocoIndex released V1 of its incremental data engine on April 22, scrapping the framework's DSL in favor of plain async Python. Cofounders Linghua Jin and George He announced the release in the company's launch post. The Apache 2.0 project is aimed at developers building RAG, knowledge graphs, and memory for long-running agents.

The framing borrows from Jeff Dean and Bill Dally at GTC 2026: agents now run roughly 50 times faster than humans while the tooling around them, per Dean, was "built for human speed." Nightly index rebuilds don't fit that loop. CocoIndex's pitch is to recompute only the chunks that actually changed and upsert only the rows that moved.
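The idea of recomputing only what changed can be illustrated with a small fingerprinting sketch. This is not CocoIndex's API or its Rust implementation; the function names, the SHA-256 fingerprint, and the in-memory dictionaries are assumptions chosen for illustration.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Content hash used to decide whether a chunk changed (assumed: SHA-256)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_incremental(
    chunks: dict[str, str], seen: dict[str, str]
) -> tuple[list[str], list[str]]:
    """Compare current chunks against stored fingerprints.

    Returns (ids to recompute and upsert, ids to delete from the target).
    """
    # A chunk needs work if it is new or its fingerprint differs.
    changed = [cid for cid, text in chunks.items() if seen.get(cid) != fingerprint(text)]
    # A chunk that disappeared from the source should leave the target too.
    removed = [cid for cid in seen if cid not in chunks]
    return changed, removed


# Fingerprints recorded by a hypothetical previous run.
previous = {
    "a": fingerprint("alpha"),
    "b": fingerprint("beta"),
    "c": fingerprint("gamma"),
}
# Current source contents: "a" unchanged, "b" edited, "c" gone, "d" new.
current = {"a": "alpha", "b": "beta v2", "d": "delta"}

changed, removed = plan_incremental(current, previous)
print(changed, removed)
```

Only the edited and new chunks would go through the expensive steps such as embedding; the unchanged chunk is skipped entirely.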

Three other shifts ride along. Postgres is no longer a hard dependency: engine state now lives in an embedded LMDB file, so installation is a single pip command. The engine also uses Python's type system directly, letting PIL images, pyarrow tables, and torch tensors pass through functions without wrappers. Sources and targets can be created at runtime, which allows one component per tenant or per config row.
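As a rough picture of what "one component per tenant or per config row" could look like in plain async Python, the sketch below fans out over configuration rows with the standard library's asyncio. It does not use CocoIndex; `TenantConfig`, `index_tenant`, and the paths and table names are hypothetical.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantConfig:
    """Hypothetical config row: one per tenant."""
    tenant: str
    source_dir: str
    target_table: str


async def index_tenant(cfg: TenantConfig) -> str:
    """Stand-in for a per-tenant pipeline; real work would read and write here."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return f"{cfg.source_dir} -> {cfg.target_table}"


async def run_all(rows: list[TenantConfig]) -> dict[str, str]:
    """Create one pipeline per config row at runtime and run them concurrently."""
    results = await asyncio.gather(*(index_tenant(r) for r in rows))
    return {r.tenant: out for r, out in zip(rows, results)}


rows = [
    TenantConfig("acme", "/data/acme", "acme_chunks"),
    TenantConfig("globex", "/data/globex", "globex_chunks"),
]
summary = asyncio.run(run_all(rows))
print(summary)
```

Adding a tenant here means adding a row, not editing pipeline code, which is the property the runtime-created sources and targets are described as enabling.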

The Rust core stayed put, handling change detection, fingerprinting, and target diffing. The managed-target contract works the same way: declare the desired state of a Postgres table or Kafka topic, and the engine handles create, alter, drop, insert, update, delete. Stop declaring something and it goes away.
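The declare-the-desired-state contract amounts to a diff between what should exist and what does. A minimal sketch of that reconciliation for rows follows, assuming both sides fit in dictionaries keyed by primary key. The `Plan` and `reconcile` names are illustrative and are not taken from CocoIndex.

```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """Operations needed to move a target from its actual to its desired state."""
    inserts: dict = field(default_factory=dict)
    updates: dict = field(default_factory=dict)
    deletes: list = field(default_factory=list)


def reconcile(desired: dict, actual: dict) -> Plan:
    """Diff desired rows against actual rows, both keyed by primary key."""
    plan = Plan()
    for key, row in desired.items():
        if key not in actual:
            plan.inserts[key] = row      # declared but missing: insert
        elif actual[key] != row:
            plan.updates[key] = row      # declared but different: update
    # Present in the target but no longer declared: delete.
    plan.deletes = [key for key in actual if key not in desired]
    return plan


actual = {1: {"title": "Intro"}, 2: {"title": "Setup"}, 3: {"title": "Old page"}}
desired = {1: {"title": "Intro"}, 2: {"title": "Setup guide"}, 4: {"title": "FAQ"}}

plan = reconcile(desired, actual)
print(plan)
```

Row 3 lands in `deletes` simply because it is absent from `desired`, which mirrors the "stop declaring something and it goes away" behavior described above. The same pattern would extend to schema-level create, alter, and drop.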

By the company's own count, an equivalent in-house pipeline takes 10 to 20 engineers six months. Examples covering knowledge-graph extraction, multi-codebase summarization, and live CSV-to-Kafka flows ship with the v1 examples.

Bottom Line

CocoIndex V1 turns incremental data pipelines into plain async Python and stores engine state in a local LMDB file, removing Postgres as a dependency.

Quick Facts

Release date: April 22, 2026
License: Apache 2.0
Cofounders: Linghua Jin (CEO), George He (CTO)
Engine state: embedded LMDB, replacing Postgres
Targets supported: Postgres, LanceDB, Neo4j, Kafka, S3, SurrealDB, SQLite, files

Tags: CocoIndex, AI infrastructure, RAG, open source, data pipelines, AI agents

Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.
