Alibaba Updates HappyHorse Video Model to 1.1

The refreshed model adds an image-to-video build on Cloud Model Studio with four generation modes.

Andrés Martínez, AI Content Writer
June 23, 2026 · 2 min read
[Image: Translucent film-strip panels arranged in sequence inside a softly lit data center corridor, suggesting connected video scenes from one prompt]

Alibaba has updated HappyHorse, the video model it quietly pushed to the top of an arena leaderboard earlier this year before claiming it as its own. Version 1.1's image-to-video build is now available on Cloud Model Studio, with the company citing better skin texture, steadier character identity across clips, and tighter audio-visual sync. The usual upgrade copy, in other words.

Four modes carry over: text-to-video, image-to-video, reference-to-video with up to nine reference images, and video editing. Prompts run to 2,500 characters, output tops out at 1080p, and clips land between 3 and 15 seconds. The reference-to-video mode and multi-shot consistency are the parts that matter, since keeping a character stable across cuts is where most rivals still wobble.
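For anyone scripting against these limits, the reported constraints can be captured as a simple pre-flight check. This is a minimal sketch only: the `VideoRequest` class, the `validate` function, and the mode strings are hypothetical names invented here for illustration, not Alibaba's API, and the numbers are the ones cited above.

```python
from dataclasses import dataclass, field

# Limits as reported in the article. All names below are hypothetical
# and do not correspond to any real Alibaba or Model Studio API.
MAX_PROMPT_CHARS = 2500
MIN_CLIP_SECONDS = 3
MAX_CLIP_SECONDS = 15
MAX_REFERENCE_IMAGES = 9
MODES = {"text-to-video", "image-to-video", "reference-to-video", "video-editing"}


@dataclass
class VideoRequest:
    mode: str
    prompt: str
    duration_s: float
    reference_images: list[str] = field(default_factory=list)


def validate(req: VideoRequest) -> list[str]:
    """Return a list of human-readable problems; empty means the request fits."""
    problems = []
    if req.mode not in MODES:
        problems.append(f"unknown mode: {req.mode}")
    if len(req.prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if not MIN_CLIP_SECONDS <= req.duration_s <= MAX_CLIP_SECONDS:
        problems.append(
            f"duration must be {MIN_CLIP_SECONDS}-{MAX_CLIP_SECONDS} seconds"
        )
    # Only the reference-to-video mode takes a set of reference images.
    if req.mode == "reference-to-video":
        if not 1 <= len(req.reference_images) <= MAX_REFERENCE_IMAGES:
            problems.append(f"need 1-{MAX_REFERENCE_IMAGES} reference images")
    return problems
```

Checking locally before submitting a job avoids paying for a request that the service would reject, though the real endpoint may enforce additional rules not covered here.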

Native lip-sync covers seven languages on the 1.0 line. Whether 1.1 shifts that list, Alibaba doesn't say.

Pricing is the soft spot. Alibaba's enterprise channel billed 1.0 text-to-video at roughly 22 cents per second for 1080p, while third-party resellers range from 16 to 32 cents. The original Russian post pegged the rate at 14 cents in HD and 18 cents at 1080p, plus a 40 percent launch discount, none of which lines up with published provider rates. A 1.1 text-to-video endpoint is already live on reseller Atlas Cloud. Direct access through Alibaba's own channels for the previous version stayed in closed beta and region-locked, so whether you can reach 1.1 from outside China is the real open question.
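To put the per-second figures in perspective, here is a rough cost estimate for a maximum-length 15-second clip at 1080p. It is a sketch under stated assumptions: the rates are the approximate 1.0 text-to-video figures quoted above, the dictionary keys and function name are made up for illustration, and nothing here confirms what 1.1 will actually cost.

```python
# Approximate 1080p rates for HappyHorse 1.0 text-to-video, in US cents
# per second, as quoted in the article. Labels are illustrative only.
RATES_CENTS_PER_SECOND = {
    "alibaba_enterprise": 22,
    "reseller_low": 16,
    "reseller_high": 32,
}


def clip_cost_usd(seconds: int, cents_per_second: int) -> float:
    """Cost in US dollars for one clip billed per second of output."""
    return seconds * cents_per_second / 100


if __name__ == "__main__":
    for name, rate in RATES_CENTS_PER_SECOND.items():
        print(f"{name}: ${clip_cost_usd(15, rate):.2f} per 15-second clip")
```

At these rates a single 15-second clip works out to about $3.30 through Alibaba's enterprise channel and between $2.40 and $4.80 through resellers, before any retries or discarded takes.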


Bottom Line

HappyHorse 1.1's image-to-video build is live on Alibaba Cloud Model Studio, supporting 2,500-character prompts, 1080p output, and four generation modes including nine-image reference-to-video.

Quick Facts

  • Model: HappyHorse 1.1, update to HappyHorse 1.0
  • Modes: text-to-video, image-to-video, reference-to-video (1-9 images), video editing
  • Max prompt length: 2,500 characters
  • Output: up to 1080p, clips 3-15 seconds
  • Lip-sync: seven languages on the 1.0 line
  • 1080p pricing (1.0, Alibaba enterprise): about $0.22/sec; resellers $0.16-$0.32/sec
  • Source claims of $0.14/$0.18 rates and 40% discount: unverified
Tags: Alibaba, HappyHorse, AI video, text-to-video, Model Studio, generative AI, Qwen