Alibaba's Qwen team dropped Qwen3.6-35B-A3B on Hugging Face Thursday. The model card puts 35B total parameters behind a mixture-of-experts (MoE) router that activates just 3B per token. It's the first release in the 3.6 line, arriving roughly two months after the Qwen3.5 medium series.
Coding is the pitch. The blog post highlights frontend workflows and repository-level reasoning. Self-reported numbers put SWE-bench Verified at 73.4 and Terminal-Bench 2.0 at 51.5, both up from the 3.5 checkpoint. Independent evals don't exist yet.
On vision, Qwen claims parity or better against Claude Sonnet 4.5 on every benchmark it chose to publish. MMMU: 81.7 vs 79.6. RealWorldQA: 85.3 vs 70.3. MathVista-mini: 86.4 vs 79.8. The RealWorldQA gap is suspiciously wide, and Sonnet's parameter count is undisclosed, so compute-per-token comparisons are guesswork.
The other headline feature is "Thinking Preservation," an option to keep reasoning traces from earlier turns instead of discarding them. Qwen pitches this for long-running agent loops where repeated re-thinking burns tokens.
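Roughly what that looks like from the client side, as a minimal sketch: it assumes an OpenAI-compatible endpoint (e.g. vLLM) that returns the trace in a `reasoning_content` field and accepts it back in the message history. The field name, base URL, repo id, and server behavior are assumptions, not Qwen's documented interface.

```python
from openai import OpenAI

# Assumes a local OpenAI-compatible server hosting the model.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
MODEL = "Qwen/Qwen3.6-35B-A3B"  # hypothetical Hugging Face repo id

history = [{"role": "user", "content": "Plan the refactor, step by step."}]

for step in range(3):  # short agent loop
    resp = client.chat.completions.create(model=MODEL, messages=history)
    msg = resp.choices[0].message
    turn = {"role": "assistant", "content": msg.content}

    # Thinking Preservation, as described: keep the earlier reasoning trace
    # in the history instead of dropping it, so later turns don't re-derive it.
    trace = getattr(msg, "reasoning_content", None)  # field name is an assumption
    if trace:
        turn["reasoning_content"] = trace

    history.append(turn)
    history.append({"role": "user", "content": f"Continue with step {step + 2}."})
```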
Architecture holds steady from 3.5: Gated DeltaNet plus 256 experts (8 routed and 1 shared active per token), and a 262K-token native context window extensible past 1M via YaRN. License is Apache 2.0. Weights are live on Hugging Face and ModelScope, and the model is already running on Qwen Chat.
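For intuition on why only 3B of 35B parameters fire per token, here is a toy numpy sketch of top-k expert routing with an always-on shared expert. The scoring and weight normalization are illustrative assumptions, not Qwen's published router.

```python
import numpy as np

def route_tokens(router_logits: np.ndarray, k: int = 8):
    """Pick the top-k routed experts per token from 256 router logits.
    Toy illustration only; not Qwen's actual implementation."""
    topk_idx = np.argsort(router_logits, axis=-1)[..., -k:]        # (tokens, k)
    topk_logits = np.take_along_axis(router_logits, topk_idx, -1)
    # Normalize weights over the selected experts only (an assumption).
    weights = np.exp(topk_logits - topk_logits.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return topk_idx, weights

# One token, 256 experts: 8 routed experts plus 1 shared expert run,
# so only a small slice of the total parameters is active per token.
logits = np.random.randn(1, 256)
idx, w = route_tokens(logits)
print(idx.shape, float(w.sum()))  # (1, 8) 1.0
```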
Bottom Line
Qwen3.6-35B-A3B activates 3B of 35B parameters per token and posts a self-reported 85.3 on RealWorldQA against Claude Sonnet 4.5's 70.3.
Quick Facts
- 35B total parameters, 3B activated per token
- 256 experts, 8 routed plus 1 shared active
- 262,144-token native context, extensible to 1,010,000 via YaRN
- Apache 2.0 license, available on Hugging Face and ModelScope
- SWE-bench Verified: 73.4 (Qwen self-reported)