Zyphra released ZAYA1-8B this week, a mixture-of-experts model with 8.4 billion total parameters but only 760 million active per token. The company claims it matches much larger open-weight rivals on math and coding benchmarks. The differentiator: trained entirely on AMD hardware.
Per Zyphra's research post, ZAYA1-8B was pretrained, midtrained, and supervised fine-tuned on a 1,024-GPU AMD Instinct MI300X cluster with AMD Pensando Pollara interconnect, built on IBM Cloud. No NVIDIA in the stack. For a serious reasoning model in 2026, that's still rare.
Zyphra reports 89.6 on HMMT'25 versus 88.3 for Claude 4.5 Sonnet and GPT-5-High. The numbers are self-reported and depend on a new test-time compute method the company calls Markovian RSA, which spawns parallel reasoning traces and recursively aggregates their tail segments to keep context bounded. Under extra-high compute (5.5M tokens per problem), Zyphra says ZAYA1-8B also tops DeepSeek-V3.2 and GPT-OSS-120B High on APEX-shortlist.
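That one-line description is all Zyphra has shared so far, so the sketch below is only a minimal illustration of the general pattern it implies (spawn several traces, keep only their tails, fold them into the next round's prompt), with a hypothetical `generate` callable standing in for the model call; it is not Zyphra's Markovian RSA implementation.

```python
from typing import Callable

def recursive_tail_aggregation(
    problem: str,
    generate: Callable[[str], str],  # hypothetical model call returning one reasoning trace
    width: int = 4,                  # parallel traces per round (illustrative value)
    rounds: int = 3,                 # aggregation rounds (illustrative value)
    tail_chars: int = 2000,          # how much of each trace's tail to carry forward
) -> str:
    """Toy version of 'parallel traces, recursively aggregate the tails':
    each round's prompt holds only the problem plus bounded tail segments,
    so context never grows with the full length of every trace."""
    prompt = problem
    for _ in range(rounds):
        traces = [generate(prompt) for _ in range(width)]  # spawn parallel traces
        tails = [t[-tail_chars:] for t in traces]          # keep only bounded tails
        prompt = problem + "\n\nCandidate endings:\n" + "\n---\n".join(tails)
    return generate(prompt)                                # final consolidated trace
```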
Architecture-wise, ZAYA1-8B layers in Compressed Convolutional Attention (CCA), an MLP-based expert router with PID-controller bias balancing, and learned residual scaling. CCA cuts KV-cache memory by 8x versus standard attention, per the company.
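Zyphra's post gives the 8x figure but not the full CCA math. As a rough intuition for where cache savings of that size come from, here is a toy single-head decode step that caches keys and values only after projecting them into a smaller latent dimension; the 4096-to-512 sizes are assumptions for illustration, not ZAYA1's real dimensions, and this is not the actual CCA design.

```python
import torch
import torch.nn.functional as F

# Toy decode step with a compressed KV cache. Dimensions are made up for
# illustration; this is NOT Zyphra's Compressed Convolutional Attention.
d_model, d_latent = 4096, 512                     # 8x smaller cache entries (assumed sizes)
W_q = torch.randn(d_model, d_latent) / d_model**0.5
W_k = torch.randn(d_model, d_latent) / d_model**0.5
W_v = torch.randn(d_model, d_latent) / d_model**0.5
W_o = torch.randn(d_latent, d_model) / d_latent**0.5

def decode_step(x_t, k_cache, v_cache):
    """Append one token's down-projected K/V, then attend in the latent space.
    The cache stores d_latent floats per token instead of d_model, which is
    where the memory reduction comes from."""
    k_cache = torch.cat([k_cache, x_t @ W_k], dim=0)
    v_cache = torch.cat([v_cache, x_t @ W_v], dim=0)
    q = x_t @ W_q
    attn = F.softmax(q @ k_cache.T / d_latent**0.5, dim=-1)
    return attn @ v_cache @ W_o, k_cache, v_cache

# Usage: start with empty caches and feed one token embedding at a time.
k_cache = torch.empty(0, d_latent)
v_cache = torch.empty(0, d_latent)
out, k_cache, v_cache = decode_step(torch.randn(1, d_model), k_cache, v_cache)
```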
CEO Krithik Puthalath called the result a demonstration of "maximizing the intelligence extracted per parameter and per FLOP," which is also the standard line for any efficiency-focused model release. Independent benchmarks haven't landed yet.
Weights are live on Hugging Face under Apache-2.0. The serverless endpoint runs on Zyphra Cloud.
Bottom Line
ZAYA1-8B is, per Zyphra, the first MoE model pretrained, midtrained, and SFT'd entirely on AMD's MI300X stack, with weights now on Hugging Face under Apache-2.0.
Quick Facts
- Parameters: 8.4B total, 760M active per token
- Training cluster: 1,024 AMD Instinct MI300X GPUs with AMD Pensando Pollara interconnect, built on IBM Cloud
- HMMT'25 score: 89.6 (company-reported); Claude 4.5 Sonnet and GPT-5-High at 88.3
- KV-cache compression: 8x via Compressed Convolutional Attention (company-reported)
- License: Apache-2.0; available on Hugging Face and Zyphra Cloud