Open-Source AI

Ant Group Open-Sources Ling-2.5-1T, a Trillion-Parameter Instant Model

The 1T-parameter MoE model runs 63B active params with a 1M-token context window under MIT license.

Andrés Martínez, AI Content Writer
February 16, 2026 · 2 min read

Ant Group's inclusionAI team has released Ling-2.5-1T, a trillion-parameter open-source model that activates 63 billion parameters per token. It ships under the MIT license. The model upgrades the previous Ling-1T across architecture, training data (29 trillion tokens, up from 20T), and post-training alignment, positioning it as the most capable "instant" (non-thinking) model in the Ling family.
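The appeal of sparse MoE is that compute scales with *active* parameters, not total capacity. A back-of-the-envelope sketch of the numbers above (the ~2-FLOPs-per-active-parameter rule of thumb is a standard approximation, not an Ant Group figure):

```python
# Back-of-the-envelope arithmetic for the MoE activation ratio described above.
total_params = 1_000_000_000_000   # 1T total parameters
active_params = 63_000_000_000     # 63B activated per token

active_fraction = active_params / total_params
print(f"Active fraction per token: {active_fraction:.1%}")

# Per-token decode FLOPs scale with active params (~2 FLOPs per param is the
# usual rough estimate), so Ling-2.5-1T computes like a ~63B dense model
# despite holding 1T parameters' worth of capacity.
flops_per_token = 2 * active_params
print(f"~{flops_per_token:.2e} FLOPs per generated token")
```

In other words, only about 6.3% of the model participates in any given forward pass, which is what makes a trillion-parameter "instant" model economically plausible to serve.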

The architecture is the headline change. Ling-2.5 swaps the grouped-query attention of Ling 2.0 for a hybrid setup: one multi-head latent attention (MLA) layer for every seven Lightning Linear Attention layers. The practical result, per Ant Group's own benchmarks, is 3x higher decode throughput on sequences over 32K tokens compared to the previous generation. Context extends to 1 million tokens via YaRN scaling. On the BFCL-V4 benchmark for tool calling, the model claims leading open-source performance, and it has been trained with agentic RL to work natively with platforms like Claude Code and OpenCode.
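Ant Group has not published the exact layer ordering, but a 1:7 hybrid stack is commonly built by interleaving one full-attention layer into each block of linear-attention layers. A minimal sketch of that pattern, assuming a uniform interleave:

```python
# Illustrative sketch of a 1:7 hybrid attention layout. The interleaving
# scheme here is an assumption; only the 1:7 ratio comes from the announcement.
def hybrid_layout(num_layers: int, ratio: int = 7) -> list[str]:
    """One full (MLA) attention layer per `ratio` linear-attention layers."""
    layers = []
    for i in range(num_layers):
        # Place the quadratic MLA layer once at the end of each
        # (ratio + 1)-layer block; the rest are linear attention.
        layers.append("MLA" if i % (ratio + 1) == ratio else "linear")
    return layers

layout = hybrid_layout(16)
print(layout)
```

The design intuition: linear attention keeps per-token decode cost constant in sequence length, while the occasional MLA layer preserves precise long-range retrieval, which is why the claimed throughput gains show up specifically on long (>32K) sequences.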

The long-context results are where things get interesting. Ling-2.5-1T beats Kimi K2.5 and DeepSeek V3.2 on RULER and MRCR benchmarks (averaged across 16K to 256K windows), and scores perfectly on needle-in-a-haystack tests up to 1M tokens. The team openly acknowledges a gap remains against GPT-5.2 and Gemini 3 Pro on multi-step long-horizon tasks. That kind of candor is unusual in a model card.

A composite reward mechanism combining correctness and "process redundancy" lets the model match the reasoning quality of thinking models that burn roughly 4x more output tokens, according to Ant Group's self-reported numbers. Independent verification hasn't surfaced yet. Weights are available on ModelScope for users in mainland China. No API pricing has been announced for this version.

Bottom Line

Ling-2.5-1T activates 63B of its 1T parameters under MIT license, claiming to match thinking-model reasoning at one-quarter the token cost, though benchmarks are self-reported.

Quick Facts

  • 1 trillion total parameters, 63B active per token
  • 29 trillion pre-training tokens (up from 20T in Ling-1T)
  • Context window: up to 1M tokens via YaRN scaling
  • MIT license, weights on Hugging Face and ModelScope
  • Still trails GPT-5.2 and Gemini 3 Pro on long-horizon tasks (company-reported)
Tags: Ant Group, open-source AI, large language models, mixture of experts, Ling-2.5-1T, agentic AI, linear attention
Andrés Martínez
AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.

