Coding Assistants

Z.ai Launches GLM-5V-Turbo, a Vision-to-Code AI Model

Z.ai's new multimodal model turns screenshots and mockups into runnable frontend code.

Andrés Martínez
Andrés MartínezAI Content Writer
April 2, 20262 min read
Share:
Abstract visualization of a design mockup being transformed into lines of code through an AI model pipeline

Z.ai, the company formerly known as Zhipu AI, released GLM-5V-Turbo this week: a multimodal model that processes images, video, and text to generate working code. The pitch is straightforward. Feed it a design mockup, get a runnable frontend project back. Available now via API and a free web interface at chat.z.ai, the model is priced at $1.20 per million input tokens and $4 per million output tokens through OpenRouter.

The company-reported benchmarks are eye-catching. On Design2Code, which measures how well models reproduce UI mockups as code, Z.ai claims a score of 94.8 against Claude Opus 4.6's 77.3. It also posts strong numbers on GUI agent benchmarks like AndroidWorld and WebVoyager. These are all self-reported figures, and independent testing hasn't confirmed them. Pure text coding tells a different story: Claude still leads on backend tasks and repo exploration in CC-Bench-V2, per third-party analysis.

Under the hood, the model uses a new CogViT vision encoder and multi-token prediction architecture, with a 200K context window and 128K max output. Z.ai trained it with joint reinforcement learning across 30-plus task types to prevent the usual tradeoff where visual gains erode coding ability. The model integrates with Claude Code and Z.ai's own OpenClaw agent framework, per the MarkTechPost coverage.

So it's a specialist. If your workflow involves turning screenshots into HTML/CSS or debugging UI rendering issues from images, GLM-5V-Turbo looks competitive. For general-purpose coding, it's not displacing anything yet. Z.ai is also accepting trial applications for its Coding Plan.


Bottom Line

GLM-5V-Turbo claims a Design2Code score of 94.8 versus Claude Opus 4.6's 77.3, but those are self-reported numbers and the model trails Claude on pure text coding benchmarks.

Quick Facts

  • Design2Code score: 94.8 (company-reported)
  • API pricing: $1.20/M input tokens, $4/M output tokens
  • Context window: 200K tokens, max output 128K tokens
  • Architecture: CogViT vision encoder, MTP inference
  • RL training across 30+ task types
Tags:Z.aiGLM-5V-Turbomultimodal AIdesign-to-codevision language modelZhipu AIAI coding
Andrés Martínez

Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.

Related Articles

Stay Ahead of the AI Curve

Get the latest AI news, reviews, and deals delivered straight to your inbox. Join 100,000+ AI enthusiasts.

By subscribing, you agree to our Privacy Policy. Unsubscribe anytime.

Z.ai GLM-5V-Turbo: Vision-to-Code AI Model Launches | aiHola