Open-Source AI

Alibaba Ships Qwen 3.5 Medium Models With 7x Efficiency Gains

Qwen3.5-35B-A3B beats its 235B predecessor using just 3B active parameters.

Andrés Martínez, AI Content Writer
February 25, 2026 · 2 min read
[Image: abstract visualization of sparse mixture-of-experts routing paths]

Alibaba's Qwen team dropped four mid-sized models on Monday under the Qwen 3.5 Medium banner: Qwen3.5-Flash, Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, and Qwen3.5-27B. Weights for the open models are live on Hugging Face, with Flash available as a hosted API through Alibaba Cloud's Model Studio.

The standout is Qwen3.5-35B-A3B. It's a mixture-of-experts model with 35 billion total parameters but only 3 billion active per inference pass. Per Alibaba's own benchmarks, it outperforms the previous-generation Qwen3-235B-A22B, which activated 22 billion parameters. That's roughly a 7x improvement in compute efficiency, though the benchmarks are company-reported and haven't been independently verified yet.
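The "roughly 7x" figure follows directly from the active-parameter counts the article cites; a quick sketch of the arithmetic (model figures are Alibaba's, the calculation is ours):

```python
# Active parameters per inference pass, per Alibaba's published figures
qwen3_235b_active = 22e9   # Qwen3-235B-A22B: 22B active
qwen35_35b_active = 3e9    # Qwen3.5-35B-A3B: 3B active

# Compute the reduction in per-token compute (proportional to active params)
ratio = qwen3_235b_active / qwen35_35b_active
print(f"Active-parameter reduction: ~{ratio:.1f}x")  # ~7.3x
```

Note this compares active parameters only; real-world throughput gains also depend on memory bandwidth and expert-routing overhead.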

Qwen3.5-Flash is the production wrapper around that same 35B-A3B architecture, tuned for agentic workflows. It ships with a 1-million-token context window and native function calling out of the box. Pricing on Model Studio starts at $0.05 per million input tokens and $0.40 per million output tokens in the international tier.
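At those list prices, the cost of a single large agentic call is easy to estimate. A minimal sketch using the quoted international-tier rates (the token counts in the example are invented for illustration):

```python
def flash_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate a Qwen3.5-Flash call's cost at Model Studio list prices."""
    INPUT_PER_M = 0.05   # USD per 1M input tokens
    OUTPUT_PER_M = 0.40  # USD per 1M output tokens
    return (input_tokens / 1e6) * INPUT_PER_M + (output_tokens / 1e6) * OUTPUT_PER_M

# e.g. a 200k-token context with a 4k-token response
print(f"${flash_cost_usd(200_000, 4_000):.4f}")  # $0.0116
```

Even a request using a fifth of the 1M-token context window comes in at around a penny at these rates, which is the economics the low active-parameter count is meant to enable.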

The larger 122B-A10B and 27B variants target multi-step reasoning and long-horizon planning tasks. Alibaba used a four-stage post-training pipeline involving chain-of-thought cold starts and reasoning-based reinforcement learning. The 122B model, running on just 10B active parameters, reportedly competes with much heavier dense models on logical consistency. All open-weight models ship under Apache 2.0.

These releases follow the flagship Qwen3.5-397B-A17B, which launched on February 16. The medium series fills the gap Alibaba promised to close when it said smaller sizes were on the way.


Bottom Line

Qwen3.5-35B-A3B matches or beats a 235B-parameter predecessor while activating roughly one-seventh the parameters, all under Apache 2.0.

Quick Facts

  • Four models released: Flash, 35B-A3B, 122B-A10B, 27B
  • Qwen3.5-35B-A3B: 3B active parameters (company-reported to outperform Qwen3-235B-A22B)
  • Flash context window: 1 million tokens
  • Flash API pricing: $0.05/1M input, $0.40/1M output (international tier)
  • License: Apache 2.0 for open-weight models
Tags: Alibaba · Qwen · open-source LLM · mixture-of-experts · agentic AI · Qwen 3.5 · Apache 2.0
Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.

