Zhipu AI, which operates internationally as Z.ai, shipped GLM-5.2 on June 13 with a 1-million-token context window, roughly five times the 200K ceiling on GLM-5.1. The model went live the same day across every GLM Coding Plan tier, and the company laid out the details in a tech blog.
It ships in two reasoning modes. High handles fast, everyday generation. Max trades speed for deeper multi-step planning, which Z.ai recommends for harder coding work. The model slots into agents like Claude Code, Cline, Roo Code, and OpenCode, and the full-window variant carries the ID glm-5.2[1m].
The catch is quota burn. Z.ai's own documentation says GLM-5.2 consumes plan quota at 3x during peak hours, 14:00 to 18:00 Beijing time, and 2x off-peak, putting it in the same bracket as Claude Opus. A promotion drops off-peak usage to 1x through the end of September, though that's a limited-time deal, not a permanent rate.
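To make the rate schedule concrete, here is a minimal Python sketch of the multiplier logic as described above. It is an illustration, not Z.ai's billing code. The function name `quota_multiplier` is hypothetical, and three details are assumptions the article does not spell out: the peak window includes 14:00 and excludes 18:00, the promotion ends on September 30, 2026 (Beijing date), and the promotion applies to any off-peak moment up to that date.

```python
from datetime import date, datetime, time, timedelta, timezone

# Beijing time is UTC+8 all year (no daylight saving), so a fixed offset is enough.
BEIJING = timezone(timedelta(hours=8))

PEAK_START = time(14, 0)            # from the article: peak is 14:00 to 18:00 Beijing time
PEAK_END = time(18, 0)              # assumption: the 18:00 boundary is exclusive
PROMO_LAST_DAY = date(2026, 9, 30)  # assumption: "end of September" means Sept 30, 2026


def quota_multiplier(moment: datetime) -> int:
    """Return the plan-quota multiplier (1, 2, or 3) for a timezone-aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")

    # Convert whatever zone the caller used into Beijing wall-clock time.
    local = moment.astimezone(BEIJING)

    # Peak hours always cost 3x; the promotion only touches off-peak usage.
    if PEAK_START <= local.time() < PEAK_END:
        return 3

    # Off-peak: 1x while the promotion runs, 2x afterwards.
    if local.date() <= PROMO_LAST_DAY:
        return 1
    return 2
```

Under these assumptions, a request at 15:00 Beijing time costs 3x regardless of date, while a 09:00 request costs 1x during the promotion and 2x from October 1 onward. Passing a UTC timestamp works too, since the function converts to Beijing time before checking the window.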
The original report framed open weights as a near-term plan. They're already out. The model weights are live on Hugging Face under an MIT license with no regional restrictions, alongside the standalone API. Benchmark numbers came late, after the release rather than with it.
Bottom Line
GLM-5.2 gives Coding Plan subscribers a 1M-token window now, but it burns quota at 3x during peak hours and 2x off-peak once the promotion ends.
Quick Facts
- Released June 13, 2026 by Zhipu AI (Z.ai)
- 1,000,000-token context window, up from ~200K on GLM-5.1
- Max output capped at 131,072 tokens
- Quota multiplier: 3x peak, 2x off-peak (1x off-peak promo through end of September)
- MIT-licensed open weights live on Hugging Face




