Alibaba Qwen-VLA Controls Multiple Robot Types With One Model


Alibaba's Qwen team has announced Qwen-VLA, a vision-language-action model that runs across different robot bodies without retraining a separate policy for each one. The technical report landed May 28, and a project repository is live, but the model weights aren't out yet.

VLA models take a camera image plus a text command and spit out robot actions. Qwen-VLA pairs the Qwen3.5-4B vision-language backbone with a 1.15B-parameter DiT (diffusion transformer) flow-matching action decoder. Switching robots means swapping the text description of the platform, which the team calls embodiment-aware prompt conditioning, instead of adding a separate output head per platform.
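To make the prompt-conditioning idea concrete, here is a minimal Python sketch of how the same instruction could be paired with different platform descriptions. It is an illustration only, not Qwen-VLA's actual interface: the prompt template, the `EMBODIMENTS` table, the platform descriptions, and the `build_prompt` and `make_input` helpers are all hypothetical, since the report's real prompt format is not quoted here.

```python
from dataclasses import dataclass

# Hypothetical platform descriptions, written for illustration only.
# The wording Qwen-VLA actually uses is not given in this article.
EMBODIMENTS = {
    "aloha": "bimanual ALOHA setup with two robot arms",
    "nav_base": "mobile robot that navigates by following waypoints",
}


@dataclass(frozen=True)
class PolicyInput:
    """One request to a (hypothetical) VLA policy: an image plus a text prompt."""
    image_path: str
    prompt: str


def build_prompt(embodiment: str, instruction: str) -> str:
    """Prefix the task instruction with a text description of the robot.

    Only the description changes between platforms; the instruction and
    the model that would consume the prompt stay the same.
    """
    if embodiment not in EMBODIMENTS:
        raise KeyError(f"unknown embodiment: {embodiment!r}")
    return f"Robot: {EMBODIMENTS[embodiment]}\nTask: {instruction}"


def make_input(embodiment: str, instruction: str, image_path: str) -> PolicyInput:
    """Bundle the camera image path and the embodiment-conditioned prompt."""
    return PolicyInput(image_path=image_path, prompt=build_prompt(embodiment, instruction))


if __name__ == "__main__":
    # Same instruction, two platforms: only the "Robot:" line differs.
    for name in ("aloha", "nav_base"):
        print(make_input(name, "move to the red cup", "frame_000.png").prompt)
        print("---")
```

The point of the design is that adding a platform means adding a text entry rather than a new network head, which is what lets one set of weights serve several robot types.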

Qwen-VLA folds manipulation, navigation and trajectory prediction into one framework. The pitch is that a single generalist matches specialists tuned per task. On the ALOHA bimanual setup, the comparison points GR00T N1.6 (NVIDIA) and π0.5 (Physical Intelligence) were each fine-tuned separately for every task, while Qwen-VLA ran as one model across all of them.

The self-reported numbers: 97.9% on LIBERO, 87.2% on RoboTwin-Hard, and 76.9% average success on out-of-distribution real-world ALOHA tasks. Independent replication hasn't happened yet, and there's no word on when weights ship or under what license.

Bottom Line

Qwen-VLA reports 97.9% on LIBERO and 76.9% on out-of-distribution ALOHA tasks, but only the report and repo are public, not the weights.

Quick Facts

Backbone: Qwen3.5-4B vision-language model
Action decoder: 1.15B-parameter DiT flow-matching
Technical report posted May 28, 2026 (arXiv 2605.30280)
LIBERO: 97.9%, RoboTwin-Hard: 87.2% (company-reported)
Real-world ALOHA OOD success: 76.9% average (company-reported)

Tags: Qwen, Alibaba, robotics, vision-language-action, embodied AI, VLA models, open source

Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.
