Zyphra has released ZAYA1-74B-Preview, a mixture-of-experts model with 4B active and 74B total parameters. The model card went live this week under an Apache 2.0 license, just days after the company shipped its smaller ZAYA1-8B sibling.
This is a pre-RL checkpoint: pretraining, midtraining, and context extension are complete, but instruction tuning and reinforcement learning are not. Zyphra's technical blog leans on pass@4 scores to argue the base model has enough latent capability for RL to unlock. Pass@1 still trails post-trained competitors, and that is the comparison most users actually care about.
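For readers unfamiliar with the metric, pass@k is the probability that at least one of k sampled answers to a problem is correct. The standard unbiased estimator (popularized by the HumanEval benchmark; the blog does not say which estimator Zyphra uses, so this is an assumption) can be sketched as:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate: probability that at least one of k
    samples drawn without replacement from n attempts is correct,
    given that c of the n attempts were correct."""
    if n - c < k:
        # Fewer than k incorrect attempts exist, so any k-sample
        # draw must contain a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# A problem solved on 1 of 8 attempts: pass@1 is low,
# but pass@4 looks far healthier.
print(pass_at_k(8, 1, 1))  # 0.125
print(pass_at_k(8, 1, 4))  # 0.5
```

This is why pass@4 can flatter a base model: sampling four times hides how rarely the first answer is right, which is exactly the gap RL training is meant to close.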
The pitch is: trust the trajectory. For the smaller 8B, Zyphra says the jump from a comparable mid-training checkpoint to the final post-RL release added 20.8 points on AIME'26, 32.4 on HMMT'26, and 19.0 on IFEval. All numbers are self-reported and come from the company's own evaluation harness.
The architecture mirrors the 8B with one twist: every other attention layer uses a 4K sliding window, which Zyphra says improves long-context efficiency. Pretraining covered roughly 15T tokens, midtraining added another 3T across three phases, and context was extended in stages to 32K, then 128K, then 256K.
The entire run was done on AMD MI300X hardware with Pensando Pollara networking, continuing the AMD-native pipeline behind the original ZAYA1 release. Full RL is already underway, and Zyphra plans to ship the post-trained 74B in the coming weeks.
Bottom Line
Zyphra's 4B-active, 74B-total MoE arrives as a pre-RL preview under Apache 2.0, with the full reasoning model due in the coming weeks.
Quick Facts
- 4B active, 74B total parameters (MoE)
- Apache 2.0 license
- Pretraining covered ~15T tokens
- Context extended to 256k
- Trained on AMD MI300X GPUs with Pensando Pollara networking