Alibaba's Qwen team released Qwen-Scope, an open collection of sparse autoencoders trained on its Qwen3 and Qwen3.5 base models. The release pushes mechanistic interpretability tooling into the open-weights ecosystem, where this kind of work has mostly been the province of closed labs.
Sparse autoencoders, or SAEs, decompose a model's internal activations into thousands of features researchers can name and steer. The collection covers sizes from 1.7B parameters up through the 35B-A3B mixture-of-experts variant, with multiple width and sparsity configurations. There's also a live Space for poking at features directly.
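The core mechanics are simple to sketch. An SAE projects a model's activation vector into a much wider feature space, keeps only a handful of the largest activations (the "L0 sparsity" in the configs below), and reconstructs the original vector from those few active features. The sketch here is illustrative only, in pure Python with made-up dimensions; Qwen-Scope's actual architecture and training recipe are documented in its technical report.

```python
import random

def matvec(matrix, vec):
    """Plain dense matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def sae_forward(x, W_enc, b_enc, W_dec, k):
    """Illustrative TopK sparse-autoencoder forward pass.

    x:      activation vector, length d_model
    W_enc:  d_sae x d_model encoder weights (d_sae >> d_model)
    b_enc:  encoder bias, length d_sae
    W_dec:  d_model x d_sae decoder weights
    k:      number of features kept active (the L0 sparsity)
    """
    pre = [p + b for p, b in zip(matvec(W_enc, x), b_enc)]
    acts = [max(0.0, p) for p in pre]  # ReLU: features fire or stay at zero
    # TopK: zero out everything except the k largest activations.
    top = set(sorted(range(len(acts)), key=lambda i: acts[i], reverse=True)[:k])
    feats = [a if i in top else 0.0 for i, a in enumerate(acts)]
    recon = matvec(W_dec, feats)  # reconstruction from the sparse code
    return feats, recon

# Toy usage with random weights: a 4-dim activation, 16 features, k=2.
random.seed(0)
d_model, d_sae, k = 4, 16, 2
W_enc = [[random.gauss(0, 1) for _ in range(d_model)] for _ in range(d_sae)]
b_enc = [0.0] * d_sae
W_dec = [[random.gauss(0, 1) for _ in range(d_sae)] for _ in range(d_model)]
x = [random.gauss(0, 1) for _ in range(d_model)]
feats, recon = sae_forward(x, W_enc, b_enc, W_dec, k)
```

The nonzero entries of `feats` are the "features" a researcher inspects and names; training pushes the reconstruction `recon` to match `x` while the TopK constraint keeps the code sparse.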
The team's pitch: instead of chasing model behavior with prompt tweaks, you trace it through the activations themselves. Suggested use cases include diagnosing language-switching bugs, tracing the causes of repetition loops, and surfacing features tied to style or tone. That framing is the team's own; independent evaluations haven't landed yet.
Weights are mirrored on ModelScope for users in China, with a technical report documenting setup and evaluations. Anthropic has done the most visible SAE work on a frontier closed model; this is the same broad approach, only with weights anyone can audit.
Bottom Line
Qwen-Scope ships sparse-autoencoder weights for Qwen3 and Qwen3.5 base models from 1.7B up to the 35B-A3B variant, plus a live Hugging Face demo.
Quick Facts
- Coverage: Qwen3 (1.7B, 8B, 30B-A3B) and Qwen3.5 (2B, 9B, 27B, 35B-A3B) base models
- SAE configurations: widths from 32K to 128K, L0 sparsity at 50 and 100
- Live demo Space hosted on Hugging Face under the Qwen organization
- Weights mirrored on Hugging Face and ModelScope
- Technical report published on Alibaba's research CDN; benchmarks self-reported