Elastic Ships Jina v5 Omni Multimodal Embedding Models

Elastic announced jina-embeddings-v5-omni on May 11, 2026, a multimodal embedding family that maps text, images, audio, and video into one vector space. Two sizes ship: small at 1.57B parameters and nano at 0.9B. Both are available on Hugging Face and through the Jina API.

The pitch is drop-in compatibility. v5-omni shares its text embedding space with the existing v5-text models, so teams running text search can index images, audio, and video into the same vector space without rebuilding their pipeline. According to the technical report, Jina froze the text backbone and the new media encoders and trained only the projector layers between them, which account for roughly 0.35% of total weights.
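
What a shared vector space means for search can be sketched in a few lines. This is a minimal illustration, not Jina's or Elastic's implementation: the three-dimensional vectors, the item IDs, and the `search` helper are all hypothetical stand-ins for real embeddings and a real vector index. The point is only that once text and media vectors live in one space, a single similarity ranking covers every modality.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical toy index: made-up 3-d vectors standing in for real embeddings.
# Text, image, and audio items sit side by side because they share one space.
index = [
    {"id": "doc-1", "modality": "text", "vector": [0.9, 0.1, 0.0]},
    {"id": "img-7", "modality": "image", "vector": [0.8, 0.2, 0.1]},
    {"id": "clip-3", "modality": "audio", "vector": [0.0, 0.1, 0.9]},
]

def search(query_vector, items, k=2):
    """Return the ids of the k items most similar to the query vector."""
    scored = [(cosine(query_vector, item["vector"]), item["id"]) for item in items]
    scored.sort(reverse=True)
    return [item_id for _, item_id in scored[:k]]

# A (made-up) text query vector retrieves a text document and an image together.
print(search([1.0, 0.0, 0.0], index))
```

In this toy setup the existing text entry stays where it is and the media entries are simply appended to the same index, which is the behavior the compatibility claim describes.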

"Make multimodal search as easy and scalable as text search already is," said Ken Exner, Elastic's chief product officer. That is the standard pitch. The numbers worth checking: Jina reports that v5-omni-small leads its size class on image retrieval (MIEB) and audio retrieval (MAEB), with self-reported scores of 56.05 on image and 51.46 on audio. Video is the soft spot at 41.20, trailing the larger LCO-7B's 47.41.

The models run locally, through the Jina API, or via Elastic Inference Service. Task-specific variants for retrieval, classification, clustering, and text-matching are also up on Hugging Face. GGUF builds for llama.cpp require a fork; the patches aren't upstream yet.

Bottom Line

v5-omni-small trains just 0.35% of its weights and reuses existing v5-text indexes without re-embedding.

Quick Facts

Small variant: 1.57B parameters
Nano variant: 0.9B parameters
Trainable weights: 0.35% of total (company-reported)
Modalities: text, image, audio, video
Announced: May 11, 2026
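
Taken at face value, the figures above imply a trainable parameter count of roughly 5.5 million for the small variant. The arithmetic below is a back-of-the-envelope sketch: both inputs are the rounded, company-reported numbers from this article, so the exact count in the technical report may differ, and the helper name is made up for illustration.

```python
# Rounded, company-reported figures from the article.
SMALL_TOTAL_PARAMS = 1_570_000_000  # 1.57B parameters
TRAINABLE_PER_10K = 35              # 0.35% written as 35 per 10,000

def trainable_params(total_params: int, per_10k: int) -> int:
    """Integer arithmetic avoids floating-point rounding noise."""
    return total_params * per_10k // 10_000

small_trainable = trainable_params(SMALL_TOTAL_PARAMS, TRAINABLE_PER_10K)
print(f"{small_trainable:,}")  # prints 5,495,000
```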

Tags: Jina AI, Elastic, embeddings, multimodal AI, vector search, RAG, open source

Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.
