Alibaba Updates HappyHorse Video Model to 1.1


Alibaba has updated HappyHorse, the video model that climbed to the top of an arena leaderboard earlier this year before the company claimed it as its own. Version 1.1's image-to-video build now shows up on Model Studio, with the company citing better skin texture, steadier character identity across clips, and tighter audio-visual sync. The usual upgrade copy, in other words.

Four modes carry over: text-to-video, image-to-video, reference-to-video with up to nine reference images, and video editing. Prompts run to 2,500 characters, output tops out at 1080p, and clips land between 3 and 15 seconds. The reference-to-video mode and multi-shot consistency are the parts that matter, since keeping a character stable across cuts is where most rivals still wobble.
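The limits quoted above lend themselves to a quick pre-flight check before a request is sent. What follows is a minimal sketch in Python, not Alibaba's API: the function name `check_request`, the mode strings, and the parameter names are all hypothetical, and only the numeric limits (2,500 characters, 3 to 15 seconds, one to nine reference images) come from the figures reported here. It also assumes the prompt limit counts characters the way Python's `len` does, which the provider may not.

```python
# Limits as reported in this article; everything else here is illustrative.
MAX_PROMPT_CHARS = 2500
MIN_CLIP_SECONDS, MAX_CLIP_SECONDS = 3, 15
MAX_REFERENCE_IMAGES = 9

# Hypothetical mode labels, not identifiers from any real endpoint.
MODES = {"text-to-video", "image-to-video", "reference-to-video", "video-editing"}


def check_request(mode, prompt, duration_s, reference_images=0):
    """Return a list of problems; an empty list means the request fits the limits."""
    problems = []
    if mode not in MODES:
        problems.append(f"unknown mode: {mode}")
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt is {len(prompt)} characters, limit is {MAX_PROMPT_CHARS}")
    if not MIN_CLIP_SECONDS <= duration_s <= MAX_CLIP_SECONDS:
        problems.append(
            f"duration {duration_s}s is outside {MIN_CLIP_SECONDS}-{MAX_CLIP_SECONDS}s"
        )
    # Only reference-to-video takes reference images in this sketch.
    if mode == "reference-to-video" and not 1 <= reference_images <= MAX_REFERENCE_IMAGES:
        problems.append(
            f"{reference_images} reference images, expected 1-{MAX_REFERENCE_IMAGES}"
        )
    return problems


if __name__ == "__main__":
    print(check_request("reference-to-video", "A rider crosses a bridge", 8, 4))
    print(check_request("reference-to-video", "x" * 3000, 20, 12))
```

Returning a list of problems rather than raising on the first one lets a caller show every violation at once, which is handy when a long prompt and an oversized reference set arrive together.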

Native lip-sync covers seven languages on the 1.0 line. Whether 1.1 shifts that list, Alibaba doesn't say.

Pricing is the soft spot. Alibaba's enterprise channel billed 1.0 text-to-video at roughly 22 cents per second for 1080p, while third-party resellers charge anywhere from 16 to 32 cents per second. The original Russian post pegged the rate at 14 cents in HD and 18 cents at 1080p, plus a 40 percent launch discount, none of which lines up with published provider rates. A 1.1 text-to-video endpoint is already live on reseller Atlas Cloud. Direct access to the previous version through Alibaba's own channels stayed in closed beta and was region-locked, so whether you can reach 1.1 from outside China is the real open question.
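To put the per-second rates in per-clip terms, here is a rough back-of-the-envelope sketch in Python. It assumes billing is strictly linear in clip length, with no minimum charge, rounding rule, or discount, which none of the providers confirm here. The rates are the 1.0 figures for 1080p quoted above, and the labels and function names are purely illustrative.

```python
# 1080p rates for HappyHorse 1.0 as quoted in this article, in USD per second.
# Whether 1.1 is priced the same way is not known.
RATES_USD_PER_SECOND = {
    "Alibaba enterprise (1.0)": 0.22,
    "reseller, low end": 0.16,
    "reseller, high end": 0.32,
}


def clip_cost(seconds, rate_per_second):
    """Cost of one clip in USD, assuming simple linear per-second billing."""
    return round(seconds * rate_per_second, 2)


def cost_table(seconds):
    """Map each quoted rate label to the cost of a clip of the given length."""
    return {
        label: clip_cost(seconds, rate)
        for label, rate in RATES_USD_PER_SECOND.items()
    }


# Shortest and longest clip lengths the model supports.
for seconds in (3, 15):
    print(f"{seconds}s clip:", cost_table(seconds))
```

Under those assumptions, a maximum-length 15-second clip works out to about $3.30 at the enterprise rate and between $2.40 and $4.80 through resellers.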

Bottom Line

HappyHorse 1.1's image-to-video build is live on Alibaba Cloud Model Studio, supporting 2,500-character prompts, 1080p output, and four generation modes including nine-image reference-to-video.

Quick Facts

Model: HappyHorse 1.1, update to HappyHorse 1.0
Modes: text-to-video, image-to-video, reference-to-video (1-9 images), video editing
Max prompt length: 2,500 characters
Output: up to 1080p, clips 3-15 seconds
Lip-sync: seven languages on the 1.0 line
1080p pricing (1.0, Alibaba enterprise): about $0.22/sec; resellers $0.16-$0.32/sec
Source claims of $0.14/$0.18 rates and 40% discount: unverified

Tags: Alibaba, HappyHorse, AI video, text-to-video, Model Studio, generative AI, Qwen

Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.

