Ant Group on Wednesday released Ling-2.6-flash, a 104-billion-parameter mixture-of-experts model that activates only 7.4 billion parameters at inference. The pitch, per Ant's press release, is token economy over raw scale.
The elephant, revealed
Before the official launch, the model spent days on OpenRouter under the codename "Elephant Alpha." Developers used it blind. Ant says it topped the trending charts during that run and hit roughly 100 billion daily token calls at peak. Take that number with whatever skepticism you apply to a company citing its own traffic figures.
Kilo Code confirmed the identity in a blog post, with the predictable joke that you can't spell Elephant without Ant.
What the tokens actually cost
On the Artificial Analysis Intelligence Index, Ling-2.6-flash scores 26. That's a 10-point jump over its predecessor Ling-flash-2.0, though the index aggregates ten different evals, so a single number flattens a lot of detail.
The interesting part is how few tokens Ant spent to get that score. Total output generated across the full eval run: 15 million tokens. Nemotron-3-Super, which Ant picked as its comparison, consumed more than 110 million. Ant's framing: 86% less spend for comparable intelligence. The caveat: Ant chose the comparison.
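The 86% figure can be checked from the two token counts Ant cites. The short Python sketch below is illustrative only: the function name is made up for this example, and because Ant says Nemotron-3-Super used "more than" 110 million tokens, the result is a lower bound on the reduction.

```python
def token_reduction(baseline_tokens: float, model_tokens: float) -> float:
    """Return the fractional reduction in output tokens versus a baseline."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 1.0 - model_tokens / baseline_tokens


# Figures as reported by Ant; the Nemotron count is a floor ("more than 110 million").
LING_OUTPUT_TOKENS = 15_000_000
NEMOTRON_OUTPUT_TOKENS = 110_000_000

reduction = token_reduction(NEMOTRON_OUTPUT_TOKENS, LING_OUTPUT_TOKENS)
print(f"Output-token reduction: at least {reduction:.1%}")
```

The number describes output tokens only. It says nothing about what either model costs per token, so "86% less spend" holds only if the two are priced the same.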
API pricing lands at $0.10 per million input tokens and $0.30 per million output on Ant's endpoint. Free on OpenRouter for the first week.
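For a sense of scale, here is a minimal Python sketch that turns the listed rates into a bill. The function and the sample workload are hypothetical. Only the two per-million prices and the 15 million output tokens come from Ant's figures, and the eval's input-token count was not published, so the second calculation covers output only.

```python
# Listed rates on Ant's endpoint, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = 0.10
OUTPUT_PRICE_PER_MILLION = 0.30


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API bill in dollars for a given token volume."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical agent workload: 2M input tokens, 500K output tokens.
workload_cost = estimate_cost(2_000_000, 500_000)

# Output side of the eval run alone (input volume unknown, so set to zero).
eval_output_cost = estimate_cost(0, 15_000_000)

print(f"Sample workload: ${workload_cost:.2f}")
print(f"Eval output tokens: ${eval_output_cost:.2f}")
```

At these rates, output dominates the bill for generation-heavy agent loops, which is where a model that emits fewer tokens for the same score would save the most.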
Built for agents
Ant optimized specifically for agentic workflows, citing results on BFCL-V4, SWE-bench Verified, TAU2-bench, Claw-Eval and PinchBench. The company calls it SOTA in its size class, the qualification every size-class claim carries these days. Against genuinely larger models the comparison stops working.
On speed: 215 tokens per second sustained, peaks around 340 on a 4-card H20 setup, with prefill throughput Ant puts at 2.2 times Nemotron-3-Super. Fast, by current standards.
Availability
The model is live on OpenRouter with free access this week, and through Ant's Alipay Tbox platform. A commercial variant called LingDT routes through Ant Digital Technologies for enterprise customers. Previous Ling generations remain in Ant's GitHub repo; the 2.6-flash weights were not posted there at the time of the announcement.
The free OpenRouter tier ends seven days from launch. After that, everything runs on the standard pricing: $0.10 per million input tokens and $0.30 per million output tokens.




