LLMs & Foundation Models

Alibaba Releases Qwen 3.5 Small Models Down to 0.8B Parameters

Alibaba's Qwen team drops four compact Qwen 3.5 variants for edge and local deployment.

Andrés Martínez, AI Content Writer
March 2, 2026 · 2 min read
[Image: Abstract visualization of a large AI model being compressed into smaller, glowing nodes arranged in descending size]

Alibaba's Qwen team has released the small-model tier of its Qwen 3.5 series, adding 0.8B, 2B, 4B, and 9B parameter variants to a lineup that already spans up to 397 billion parameters. The models are available on Hugging Face under Apache 2.0, with base versions included.

What makes these interesting: they share the same unified architecture as the larger Qwen 3.5 models. That means native multimodality (text, image, video processed in a single model rather than bolted-on adapters), the hybrid Gated Delta Network plus Mixture-of-Experts design, and RL-scaled training. Cramming all of that into a 0.8B model is ambitious. Whether it holds up in practice is another question, as no independent benchmarks exist yet for these small variants.

The release completes a rapid three-wave rollout. Alibaba shipped the flagship 397B-A17B on February 16, followed by medium models (27B, 35B-A3B, 122B-A10B) on February 24. The small models round out the family and target edge devices, phones, and local inference on consumer GPUs. Quantized versions from third-party providers like Unsloth are already appearing.

The 9B model is the one to watch. If it approaches the quality of prior-generation models ten times its size (Qwen3's 4B famously matched Qwen2.5-72B on some benchmarks), it could become a go-to for lightweight multimodal agents. Alibaba hasn't published detailed small-model benchmarks yet, so that comparison remains unverified. All models support 201 languages and the series' default "thinking mode" for chain-of-thought reasoning.


Bottom Line

Qwen 3.5 now spans eight model sizes from 0.8B to 397B, all sharing native multimodal capabilities under Apache 2.0.

Quick Facts

  • Four new sizes: 0.8B, 2B, 4B, 9B parameters
  • License: Apache 2.0 (open-weight, commercial use allowed)
  • Architecture: Gated Delta Networks + Mixture-of-Experts, native multimodal
  • Language support: 201 languages and dialects
  • Base (pretrained) versions also released alongside instruct-tuned variants
Tags: Qwen 3.5, Alibaba, small language models, open source AI, edge AI, multimodal AI, Hugging Face
Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.


