Perplexity released pplx-embed, a pair of open-source text embedding models built for production-scale search. The family includes pplx-embed-v1 for standard retrieval and pplx-embed-context-v1 for document-aware embeddings, both available at 0.6B and 4B parameter sizes. The company published a research blog alongside a technical paper detailing the approach.
The interesting architectural move: Perplexity took Qwen3 and converted it from a decoder-only LLM into a bidirectional encoder using diffusion-based pretraining. That lets the model process tokens in both directions simultaneously, which is better suited for embedding tasks than the standard left-to-right approach most LLMs use. Three stages of contrastive learning follow, with the final model produced by merging checkpoints via spherical linear interpolation.
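The final merging step uses spherical linear interpolation (slerp), a standard technique for blending model checkpoints along the arc between them rather than the straight line. A generic sketch over flattened weight vectors (the paper's exact merging recipe and interpolation weights are not specified here):

```python
import numpy as np

def slerp(w_a: np.ndarray, w_b: np.ndarray, t: float) -> np.ndarray:
    """Spherically interpolate between two flattened checkpoint weight vectors.

    t=0 returns w_a, t=1 returns w_b; intermediate t follows the great-circle
    arc defined by the angle between the (normalized) vectors.
    """
    a = w_a / np.linalg.norm(w_a)
    b = w_b / np.linalg.norm(w_b)
    omega = np.arccos(np.clip(np.dot(a, b), -1.0, 1.0))  # angle between checkpoints
    if np.isclose(omega, 0.0):
        # Nearly parallel weights: slerp degenerates, fall back to linear interpolation
        return (1.0 - t) * w_a + t * w_b
    return (np.sin((1.0 - t) * omega) * w_a + np.sin(t * omega) * w_b) / np.sin(omega)
```

In practice, merging tools apply this per-tensor across two checkpoints; the hypothetical `slerp` helper above shows only the core math on a single flattened vector.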
On the ConTEB contextual retrieval benchmark, the 4B model scores 81.96% nDCG@10, ahead of Voyage's voyage-context-3 at 79.45% and Anthropic Contextual at 72.4%. Those are Perplexity's own reported numbers. On the broader MTEB multilingual retrieval benchmark, the 4B variant hits 69.66%, roughly matching Qwen3-Embedding-4B and beating Google's gemini-embedding-001. Independent verification is pending.
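For readers unfamiliar with the metric: nDCG@10 rewards ranking relevant documents near the top of the first ten results, normalized against the ideal ordering. A minimal implementation of the standard formula:

```python
import math

def dcg_at_k(relevances: list[float], k: int = 10) -> float:
    # Discounted cumulative gain: relevance at rank i is discounted by log2(i + 2)
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances: list[float], k: int = 10) -> float:
    # Normalize by the DCG of the ideal (descending-relevance) ranking
    ideal = sorted(relevances, reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(relevances, k) / idcg if idcg > 0 else 0.0
```

A perfect ranking scores 1.0; putting the only relevant document at rank 2 instead of rank 1 drops the score to about 0.63, which is why a ~2.5-point gap on this metric reflects consistently better top-of-list ordering.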
Storage is where things get practical. The models natively output INT8-quantized embeddings (trained that way, not post-hoc compressed), cutting storage 4x versus FP32. Binary quantization pushes that to 32x with under 1.6 percentage points of accuracy loss on the larger model. No instruction prefix required, either.
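The storage arithmetic is easy to verify. A post-hoc sketch (pplx-embed is trained to emit INT8 natively, so this illustrates only the byte counts, not the models' actual quantization scheme):

```python
import numpy as np

def int8_quantize(emb: np.ndarray) -> tuple[np.ndarray, float]:
    # Symmetric per-vector scaling into [-127, 127]: 1 byte per dim vs 4 for FP32
    scale = np.abs(emb).max() / 127.0
    return np.round(emb / scale).astype(np.int8), scale

def binary_quantize(emb: np.ndarray) -> np.ndarray:
    # Sign bit per dimension, packed 8 dims per byte: 32x smaller than FP32
    return np.packbits(emb > 0)

emb = np.random.default_rng(0).standard_normal(1024).astype(np.float32)
q8, scale = int8_quantize(emb)
b = binary_quantize(emb)
# emb: 4096 bytes (FP32), q8: 1024 bytes (4x smaller), b: 128 bytes (32x smaller)
```

At corpus scale the difference compounds: a billion 1024-dim vectors drop from ~4 TB in FP32 to ~128 GB in binary form.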
All four models are on Hugging Face under MIT license and accessible through the Perplexity API.
Bottom Line
Perplexity's 4B contextual embedding model scores 81.96% on ConTEB, topping Voyage's and Anthropic's offerings by roughly 2.5 and 9.6 points respectively, and ships under MIT license.
Quick Facts
- Models: pplx-embed-v1 and pplx-embed-context-v1 (0.6B and 4B sizes)
- ConTEB score: 81.96% nDCG@10 (4B, company-reported)
- MTEB multilingual retrieval: 69.66% nDCG@10 (4B, company-reported)
- Storage reduction: 4x (INT8) to 32x (binary) vs. FP32
- License: MIT, available on Hugging Face and Perplexity API