Cerebras Runs Kimi K2.6 at Near 1,000 Tokens Per Second


Cerebras is now serving Moonshot AI's Kimi K2.6 at close to 1,000 tokens per second for enterprise customers, the chipmaker said this week on its company blog. It's the first trillion-parameter open-weight model the company has put into production.

Benchmarking firm Artificial Analysis clocked the actual output at 981 tokens per second, which Cerebras frames as 6.7x faster than the next-fastest GPU cloud. The 981 figure is a third-party measurement rather than a Cerebras claim; the latency comparison is the company's own. For a 10,000-token coding request, Cerebras says it returned a 500-token answer in 5.6 seconds, versus 163.7 seconds on Kimi's own endpoint.
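To see how these numbers relate to one another, here is a rough back-of-envelope sketch. It uses only the figures quoted above. The function names are illustrative, and it makes two assumptions the article does not confirm: that the 981 tokens-per-second output rate applied to the 500-token answer, and that the rest of the 5.6 seconds went to everything else (reading the 10,000-token input, queueing, network).

```python
# Figures quoted in the article
OUTPUT_TOKENS_PER_SEC = 981.0  # measured by Artificial Analysis
SPEEDUP_VS_NEXT_GPU = 6.7      # Cerebras's framing of the gap
ANSWER_TOKENS = 500            # size of the answer in the coding example
CEREBRAS_SECONDS = 5.6         # company-reported end-to-end time
KIMI_SECONDS = 163.7           # company-reported time on Kimi's endpoint


def implied_baseline_rate(rate: float, speedup: float) -> float:
    """Output rate of the next-fastest GPU cloud implied by the 6.7x claim."""
    return rate / speedup


def generation_seconds(tokens: int, rate: float) -> float:
    """Time to emit `tokens` at a steady output rate (an assumption here)."""
    return tokens / rate


def end_to_end_speedup(slow_seconds: float, fast_seconds: float) -> float:
    """Ratio of the two company-reported wall-clock times."""
    return slow_seconds / fast_seconds


baseline_rate = implied_baseline_rate(OUTPUT_TOKENS_PER_SEC, SPEEDUP_VS_NEXT_GPU)
generation_time = generation_seconds(ANSWER_TOKENS, OUTPUT_TOKENS_PER_SEC)
other_time = CEREBRAS_SECONDS - generation_time  # assumed: input processing and overhead
request_speedup = end_to_end_speedup(KIMI_SECONDS, CEREBRAS_SECONDS)

print(f"Implied next-fastest GPU cloud: {baseline_rate:.0f} tokens/sec")
print(f"Generating 500 tokens at 981 tokens/sec: {generation_time:.2f} s")
print(f"Remaining time in the 5.6 s request: {other_time:.2f} s")
print(f"End-to-end gap vs Kimi's endpoint: {request_speedup:.1f}x")
```

Under those assumptions, the 6.7x claim implies roughly 146 tokens per second for the runner-up, generating the answer itself accounts for only about half a second of the 5.6, and the company-reported request times differ by roughly 29x, a much wider gap than the 6.7x throughput figure.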

The catch: this is enterprise trials only. No public access yet, and Cerebras hasn't said when that changes.

Timing isn't accidental. Cerebras went public on May 14 and raised $5.5 billion, the largest U.S. tech IPO since Uber. Shares priced at $185 and jumped on day one, though the resulting valuation gets reported anywhere from $56 billion to $95 billion depending on how you count diluted shares. The trillion-parameter demo reads like a signal to Wall Street that the wafer-scale chips can handle frontier-scale models, not just mid-sized ones.

Cerebras has also tied itself to OpenAI through a multi-year compute deal worth more than $20 billion, running through 2028. K2.6 ranks near the top on coding benchmarks, but those scores are reported, not independently confirmed.

The piece nobody has committed to is a public launch date. For now it stays behind the enterprise wall.

Bottom Line

Cerebras hit 981 tokens per second on a trillion-parameter model, but only enterprise customers can use it.

Quick Facts

981 output tokens/sec, measured by Artificial Analysis
Kimi K2.6: trillion-parameter open-weight model from Moonshot AI
First trillion-parameter model Cerebras has served in production
10,000-token request answered in 5.6s vs 163.7s on Kimi's endpoint (company-reported)
Cerebras IPO raised $5.5 billion, listed May 14, 2026

Tags: Cerebras, Kimi K2.6, AI inference, Moonshot AI, AI chips, IPO, agentic coding

Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.

