Alibaba's Qwen team shipped Qwen-Image-2512 on December 31, 2025, a substantial update to its text-to-image model that specifically targets the waxy, artificial look that plagues AI-generated humans. The 20-billion-parameter diffusion transformer is fully open-source under Apache 2.0.
The update focuses on three weak spots: human realism, natural textures, and embedded text. Faces now render individual hair strands and age-appropriate details like wrinkles and skin texture. Landscapes, water, and animal fur come through with sharper gradients. Text rendering inside generated images, historically a painful limitation, now handles multilingual layouts more reliably.
In blind testing on Alibaba's own AI Arena platform, Qwen-Image-2512 ranked fourth overall across more than 10,000 comparisons, per the company. That makes it the highest-ranked open-source model in the benchmark, though the testing was conducted by Alibaba itself. It competes against Tencent's HunyuanImage-3.0 and Black Forest Labs' Flux.2, among others.
Weights are available now on Hugging Face and ModelScope. For those who prefer managed infrastructure, Alibaba Cloud offers API access at $0.075 per image through Model Studio.
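For teams weighing the managed API against self-hosting, the per-image price translates into a budget with simple arithmetic. The Python sketch below is illustrative only: it assumes the flat $0.075 rate quoted above with no volume discounts, free tiers, or taxes, and the function names `api_cost_usd` and `images_for_budget` are hypothetical helpers, not part of any Alibaba Cloud SDK.

```python
from decimal import Decimal

# Per-image price quoted in the article for Alibaba Cloud Model Studio.
# Assumption: a flat rate with no volume discounts, free tier, or taxes.
PRICE_PER_IMAGE_USD = Decimal("0.075")


def api_cost_usd(num_images: int) -> Decimal:
    """Return the total API cost in USD for generating num_images images."""
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    return PRICE_PER_IMAGE_USD * num_images


def images_for_budget(budget_usd: str) -> int:
    """Return how many whole images a USD budget covers at the quoted price."""
    return int(Decimal(budget_usd) // PRICE_PER_IMAGE_USD)


if __name__ == "__main__":
    # Print the estimated spend for a few monthly volumes.
    for n in (100, 1_000, 10_000):
        print(f"{n:>6} images: ${api_cost_usd(n):.2f}")
```

Using `Decimal` rather than floats keeps the totals exact to the cent. Self-hosting swaps this per-image fee for GPU and operations costs, which the sketch does not attempt to model.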
The Bottom Line: Qwen-Image-2512 gives enterprises a production-grade image generator they can self-host and fine-tune without licensing fees.
QUICK FACTS
- Model size: 20 billion parameters (MMDiT architecture)
- License: Apache 2.0 (commercial use permitted)
- Release date: December 31, 2025
- AI Arena ranking: 4th overall, 1st among open-source (company-reported)
- API pricing: $0.075 per generated image via Alibaba Cloud




