
Zyphra Launches ZAYA1-8B MoE Model Trained on AMD

Sub-1B-active MoE trained entirely on AMD MI300X; claims math wins over Claude 4.5 and GPT-5.

Andrés Martínez, AI Content Writer
May 8, 2026 · 2 min read

Zyphra released ZAYA1-8B this week, a mixture-of-experts model with 8.4 billion total parameters but only 760 million active per token. The company claims it matches much larger open-weight rivals on math and coding benchmarks. The differentiator: trained entirely on AMD hardware.

Per Zyphra's research post, ZAYA1-8B was pretrained, midtrained, and supervised fine-tuned on a 1,024-node AMD Instinct MI300X cluster with AMD Pensando Pollara interconnect, built on IBM Cloud. No NVIDIA anywhere in the stack, which is still rare for a serious reasoning model in 2026.

Zyphra reports 89.6 on HMMT'25 versus 88.3 for both Claude 4.5 Sonnet and GPT-5-High. The score is self-reported, and it depends on a new test-time compute method the company calls Markovian RSA, which spawns parallel reasoning traces and recursively aggregates their tail segments to keep context bounded. Under an extra-high compute budget (5.5M tokens per problem), Zyphra says ZAYA1-8B also tops DeepSeek-V3.2 and GPT-OSS-120B High on APEX-shortlist.
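
Zyphra hasn't published reference code for Markovian RSA, so the following is only a hedged sketch of the general shape described above: parallel traces, tail-only aggregation, bounded context. The `generate` callable and the `width`, `depth`, and `tail_chars` parameters are illustrative stand-ins, not Zyphra's API.

```python
from typing import Callable

def markovian_aggregate(
    problem: str,
    generate: Callable[[str], str],  # any LLM completion call
    width: int = 8,
    depth: int = 3,
    tail_chars: int = 4000,
) -> str:
    """Spawn parallel traces, keep only each trace's tail, and fold the
    tails into a bounded prompt for the next round."""
    prompt = problem
    for _ in range(depth):
        # Run `width` independent reasoning traces from the same prompt.
        traces = [generate(prompt) for _ in range(width)]
        # Keep only the last `tail_chars` characters of each trace so the
        # aggregated context stays bounded however long the traces run.
        tails = "\n---\n".join(t[-tail_chars:] for t in traces)
        prompt = (
            f"{problem}\n\nCandidate reasoning endings:\n{tails}\n\n"
            "Synthesize the strongest final answer from these endings."
        )
    return generate(prompt)
```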

Architecture-wise, ZAYA1-8B layers in Compressed Convolutional Attention (CCA), an MLP-based expert router with PID-controller bias balancing, and learned residual scaling. CCA cuts KV-cache memory by 8x versus standard attention, per the company.
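
The post doesn't spell out the controller, but PID control over router biases has a natural reading: treat each expert's deviation from uniform load as the error signal and nudge a per-expert logit bias accordingly. A minimal sketch, with made-up gains:

```python
import numpy as np

class PIDRouterBias:
    """Per-expert logit bias steered by a PID loop toward uniform expert load."""

    def __init__(self, n_experts: int, kp: float = 0.01,
                 ki: float = 0.001, kd: float = 0.005):
        self.kp, self.ki, self.kd = kp, ki, kd  # illustrative gains
        self.bias = np.zeros(n_experts)
        self.integral = np.zeros(n_experts)
        self.prev_error = np.zeros(n_experts)

    def update(self, token_counts: np.ndarray) -> np.ndarray:
        """token_counts[i] = tokens routed to expert i in the last step."""
        load = token_counts / max(token_counts.sum(), 1)
        # Positive error for underloaded experts, negative for overloaded.
        error = 1.0 / len(self.bias) - load
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        # Raise the bias of underloaded experts and lower overloaded ones;
        # the result is added to router logits before top-k selection.
        self.bias += self.kp * error + self.ki * self.integral + self.kd * derivative
        return self.bias
```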

CEO Krithik Puthalath called the result a demonstration of "maximizing the intelligence extracted per parameter and per FLOP," which is also the standard line for any efficiency-focused model release. Independent benchmarks haven't landed yet.

Weights are live on Hugging Face under Apache-2.0. The serverless endpoint runs on Zyphra Cloud.
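
Loading should follow the standard Hugging Face pattern; the repo id below is a guess (check Zyphra's Hugging Face organization for the real one), and a custom block like CCA typically means `trust_remote_code=True`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Zyphra/ZAYA1-8B"  # hypothetical repo id
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    device_map="auto",       # requires the `accelerate` package
    torch_dtype="auto",
    trust_remote_code=True,  # custom architectures often ship this way
)

prompt = "How many positive divisors does 360 have?"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0], skip_special_tokens=True))
```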


Bottom Line

ZAYA1-8B is the first MoE pretrained, midtrained, and SFT'd entirely on AMD's MI300X stack, with weights now on Hugging Face under Apache-2.0.

Quick Facts

  • Active parameters: 760 million (8.4B total)
  • Training cluster: 1,024 AMD Instinct MI300X nodes with AMD Pensando Pollara interconnect, built on IBM Cloud
  • HMMT'25 score: 89.6 (company-reported); Claude 4.5 Sonnet and GPT-5-High at 88.3
  • KV-cache compression: 8x via Compressed Convolutional Attention (company-reported)
  • License: Apache-2.0; available on Hugging Face and Zyphra Cloud
Tags: Zyphra, AMD, ZAYA1-8B, MoE, open-weight models, AI hardware, reasoning models
Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.


