
Alibaba Drops Qwen3.5, a 397B Open-Weight Multimodal Model

First Qwen3.5 model ships with native vision, hybrid architecture, and Apache 2.0 license.

Andrés Martínez, AI Content Writer
February 16, 2026 · 2 min read
[Image: abstract visualization of a sparse mixture-of-experts network with interconnected expert pathways]

Alibaba's Qwen team released Qwen3.5-397B-A17B today, the first open-weight model in the Qwen3.5 series. It's a Mixture-of-Experts model with 397 billion total parameters but only 17 billion active per token, available on Hugging Face under Apache 2.0.

The big shift from Qwen3: this is natively multimodal. Where Qwen3 required separate vision models (Qwen3-VL), Qwen3.5 fuses text and image understanding through early multimodal token training. On the architecture side, the model adopts the hybrid linear attention approach first seen in Qwen3-Next, combining Gated Delta Networks with standard attention in a 3:1 ratio across 60 layers. That design, paired with 512 routed experts, targets high throughput at long context: up to 262K tokens natively, or 1M via the hosted Qwen3.5-Plus API.
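Those numbers can be sanity-checked with a little arithmetic. The sketch below assumes the 3:1 ratio means repeating blocks of three linear-attention (Gated DeltaNet) layers followed by one standard-attention layer; Alibaba states only the ratio, not the exact interleaving order.

```python
# Sketch of the hybrid layout described above: 60 layers in a 3:1 ratio
# of Gated DeltaNet (linear attention) to standard attention.
# The repeating-block order is an assumption, not confirmed by the release.

TOTAL_LAYERS = 60
PATTERN = ["deltanet", "deltanet", "deltanet", "attention"]  # one 3:1 block

layers = [PATTERN[i % len(PATTERN)] for i in range(TOTAL_LAYERS)]

deltanet_count = layers.count("deltanet")    # 45 linear-attention layers
attention_count = layers.count("attention")  # 15 standard-attention layers

# MoE sparsity: only a small fraction of the 397B total parameters
# is activated for any given token.
total_params = 397e9
active_params = 17e9
active_fraction = active_params / total_params

print(f"{deltanet_count} DeltaNet + {attention_count} attention layers")
print(f"active fraction per token: {active_fraction:.1%}")
```

The resulting 45/15 split matches the layer counts in the Quick Facts below, and the active fraction works out to roughly 4.3% of total parameters per token.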

Benchmark numbers on the GitHub repo tell a mixed story. Qwen3.5 leads on visual math benchmarks like MathVision (88.6, beating Gemini 3 Pro's 86.6) and scores 85.0 on MMMU. On text-only reasoning, it sits slightly behind the top proprietary models: 91.3 on AIME26 versus GPT-5.2's 96.7, and 83.6 on LiveCodeBench v6 compared to Gemini 3 Pro's 90.7. Agentic coding follows the same pattern, with 76.4 on SWE-bench Verified against Claude Opus 4.5's 80.9. These are all self-reported numbers.

Language coverage jumps to 201 languages and dialects, up from Qwen3's 119. The RL training pipeline also scaled up, with Alibaba claiming reinforcement learning across "million-agent environments," though specifics on that infrastructure remain thin. The blog post promises more model sizes are coming.

For an open-weight model you can self-host, matching or approaching GPT-5.2 and Gemini 3 Pro on several benchmarks is notable. Availability starts now via SGLang, vLLM, and the Qwen API.
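For readers who want to try self-hosting, a minimal deployment sketch with vLLM's OpenAI-compatible server might look like the following. The Hugging Face repo id is assumed from the model name (check the actual model card), and the tensor-parallel size is a placeholder; a 397B-parameter MoE model requires a multi-GPU node even with only 17B parameters active per token.

```shell
# Serve the model with vLLM (repo id and GPU count are assumptions).
vllm serve Qwen/Qwen3.5-397B-A17B \
    --tensor-parallel-size 8

# Query it like any OpenAI-compatible endpoint:
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "Qwen/Qwen3.5-397B-A17B",
         "messages": [{"role": "user", "content": "Hello"}]}'
```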


Bottom Line

Qwen3.5-397B-A17B is the strongest open-weight multimodal model available today, competitive with top proprietary models on vision benchmarks while activating only 17B of its 397B parameters.

Quick Facts

  • 397B total parameters, 17B activated per token (MoE)
  • Apache 2.0 license, weights on Hugging Face
  • 201 languages and dialects supported
  • 262K native context, 1M via hosted API
  • 88.6 on MathVision, 85.0 on MMMU (company-reported)
  • 60 layers: 45 Gated DeltaNet + 15 Gated Attention
Tags: Qwen, Alibaba, open-weight, multimodal, mixture-of-experts, Apache 2.0, LLM
Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.


