Open-Source AI

Alibaba Ships Qwen 3.5 Medium Models With 7x Efficiency Gains

Qwen3.5-35B-A3B beats its 235B predecessor using just 3B active parameters.

Andrés Martínez, AI Content Writer
February 25, 2026 · 2 min read
[Image: abstract visualization of sparse mixture-of-experts routing paths]

Alibaba's Qwen team dropped four mid-sized models on Monday under the Qwen 3.5 Medium banner: Qwen3.5-Flash, Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, and Qwen3.5-27B. Weights for the open models are live on Hugging Face, with Flash available as a hosted API through Alibaba Cloud's Model Studio.

The standout is Qwen3.5-35B-A3B. It's a mixture-of-experts model with 35 billion total parameters but only 3 billion active per inference pass. Per Alibaba's own benchmarks, it outperforms the previous-generation Qwen3-235B-A22B, which activated 22 billion parameters. That's roughly a 7x improvement in compute efficiency, though the benchmarks are company-reported and haven't been independently verified yet.
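The "roughly 7x" figure follows directly from the active-parameter counts the article cites; a quick sketch of the arithmetic (model figures are Alibaba's, the calculation is ours):

```python
# Active parameters per inference pass, per Alibaba's published figures
qwen3_235b_active = 22e9   # Qwen3-235B-A22B: 22B active
qwen35_35b_active = 3e9    # Qwen3.5-35B-A3B: 3B active

# Compute the reduction in per-token compute (proportional to active params)
ratio = qwen3_235b_active / qwen35_35b_active
print(f"Active-parameter reduction: ~{ratio:.1f}x")  # ~7.3x
```

Note this compares active parameters only; real-world throughput gains also depend on memory bandwidth and expert-routing overhead.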

Qwen3.5-Flash is the production wrapper around that same 35B-A3B architecture, tuned for agentic workflows. It ships with a 1-million-token context window and native function calling out of the box. Pricing on Model Studio starts at $0.05 per million input tokens and $0.40 per million output tokens in the international tier.
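At those list prices, the cost of a single large agentic call is easy to estimate. A minimal sketch using the quoted international-tier rates (the token counts in the example are invented for illustration):

```python
def flash_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate a Qwen3.5-Flash call's cost at Model Studio list prices."""
    INPUT_PER_M = 0.05   # USD per 1M input tokens
    OUTPUT_PER_M = 0.40  # USD per 1M output tokens
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# e.g. a 200k-token context with a 4k-token response
print(f"${flash_cost_usd(200_000, 4_000):.4f}")  # $0.0116
```

Even a request using a fifth of the 1M-token context window comes in at around a penny at these rates, which is the economics the low active-parameter count is meant to enable.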

The larger 122B-A10B and 27B variants target multi-step reasoning and long-horizon planning tasks. Alibaba used a four-stage post-training pipeline involving chain-of-thought cold starts and reasoning-based reinforcement learning. The 122B model, running on just 10B active parameters, reportedly competes with much heavier dense models on logical consistency. All open-weight models ship under Apache 2.0.

These releases follow the flagship Qwen3.5-397B-A17B, which launched on February 16. The medium series fills the gap Alibaba promised to close when it said smaller sizes were on the way.


Bottom Line

Qwen3.5-35B-A3B matches or beats a 235B-parameter predecessor while activating roughly one-seventh the parameters, all under Apache 2.0.

Quick Facts

  • Four models released: Flash, 35B-A3B, 122B-A10B, 27B
  • Qwen3.5-35B-A3B: 3B active parameters (company-reported to outperform Qwen3-235B-A22B)
  • Flash context window: 1 million tokens
  • Flash API pricing: $0.05/1M input, $0.40/1M output (international tier)
  • License: Apache 2.0 for open-weight models
Tags: Alibaba · Qwen · open-source LLM · mixture-of-experts · agentic AI · Qwen 3.5 · Apache 2.0
Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.

