LLMs & Foundation Models

Zyphra Releases 74B MoE Checkpoint Trained Entirely on AMD

The pre-RL reasoning base ships under Apache 2.0, with full post-training due in the coming weeks.

Andrés Martínez, AI Content Writer
May 10, 2026 · 2 min read
[Image: Glowing server rack with red accent lighting representing a large language model trained on AMD hardware]

Zyphra has released ZAYA1-74B-Preview, a mixture-of-experts model with 4B active and 74B total parameters. The model card went live this week under an Apache 2.0 license, just days after the company shipped its smaller ZAYA1-8B sibling.
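For readers new to mixture-of-experts models, the "4B active / 74B total" split means each token is routed to only a few of the expert feed-forward blocks, so per-token compute tracks the active count while memory holds everything. A rough sketch of that arithmetic, using made-up layer dimensions rather than Zyphra's published configuration:

```python
# Hypothetical MoE parameter-count arithmetic. All dimensions below are
# invented for illustration; they are NOT ZAYA1's actual configuration.

def moe_param_counts(n_layers, d_model, d_ff, n_experts, top_k):
    """Rough total vs. active parameter counts for a decoder with MoE FFN layers."""
    attn_per_layer = 4 * d_model * d_model   # Q, K, V, O projections
    expert_ffn = 2 * d_model * d_ff          # up + down projection per expert
    total_ffn = n_experts * expert_ffn       # all experts live in memory
    active_ffn = top_k * expert_ffn          # only routed experts run per token

    total = n_layers * (attn_per_layer + total_ffn)
    active = n_layers * (attn_per_layer + active_ffn)
    return total, active

total, active = moe_param_counts(n_layers=40, d_model=4096, d_ff=8192,
                                 n_experts=32, top_k=2)
print(f"total ≈ {total / 1e9:.1f}B, active ≈ {active / 1e9:.1f}B")
```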

This is a pre-RL checkpoint. The model has finished pretraining, midtraining, and context extension, but has not yet gone through instruction tuning or reinforcement learning. Zyphra's technical blog leans on pass@4 scores to argue the base has enough latent capability for RL to lift it. Pass@1 still trails post-trained competitors, and that is the comparison most users actually care about.
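Pass@k counts a problem as solved if any of k sampled attempts is correct, so pass@4 can sit well above pass@1 when a base model is inconsistent but not incapable, which is the case Zyphra is making. A minimal sketch of the standard unbiased pass@k estimator (Chen et al., 2021), shown for illustration and not tied to Zyphra's own harness:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k samples,
    drawn without replacement from n generations with c correct, is correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical problem: 16 sampled solutions, 3 of them correct.
print(pass_at_k(n=16, c=3, k=1))  # ≈ 0.19, the single-try success rate
print(pass_at_k(n=16, c=3, k=4))  # ≈ 0.61, the best-of-4 success rate
```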

The pitch is: trust the trajectory. On the smaller 8B, Zyphra says the jump from a similar mid-training checkpoint to the final post-RL release added 20.8 points on AIME'26, 32.4 on HMMT'26, and 19.0 on IFEval. All numbers are self-reported and come from the company's own evaluation harness.

The architecture mirrors the 8B's, with one twist: every other attention layer becomes a 4K sliding window, which Zyphra says boosts long-context efficiency. Pretraining covered roughly 15T tokens, midtraining added another 3T across three phases, and the context window was extended in steps to 32k, 128k, and finally 256k.
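To make the alternating pattern concrete, here is a minimal sketch of a layer plan in which every other layer trades full attention for a 4K sliding window; the depth and interleaving order are assumptions for illustration, not Zyphra's published design:

```python
# Hypothetical layer plan: even layers use full attention, odd layers use a
# 4K-token sliding window. Depth and ordering are illustrative assumptions.

N_LAYERS = 40     # invented depth
WINDOW = 4096     # the 4K sliding window mentioned in the article

layer_plan = [
    {"layer": i, "attention": "sliding_window", "window_size": WINDOW}
    if i % 2 == 1
    else {"layer": i, "attention": "full"}
    for i in range(N_LAYERS)
]

for spec in layer_plan[:4]:
    print(spec)
```

The appeal is that a sliding-window layer only attends to the previous 4,096 tokens, so its cost grows linearly with sequence length rather than quadratically, which is where the long-context efficiency claim comes from.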

The whole run happened on AMD MI300X hardware with Pensando Pollara networking, continuing the AMD-native pipeline behind the original ZAYA1 release. Full RL is already underway, and Zyphra plans to ship the post-trained 74B in the coming weeks.


Bottom Line

Zyphra's 4B-active, 74B-total MoE arrives as a pre-RL preview under Apache 2.0, with the full reasoning model due in the coming weeks.

Quick Facts

  • 4B active, 74B total parameters (MoE)
  • Apache 2.0 license
  • Pretraining covered ~15T tokens
  • Context extended to 256k
  • Trained on AMD MI300X GPUs with Pensando Pollara networking
Tags: Zyphra, ZAYA1, AMD, Mixture of Experts, open-source AI, LLM, reasoning models
Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.
