Miami startup Subquadratic came out of stealth Tuesday with $29 million in seed funding and SubQ, a model the company says runs roughly 52 times faster than FlashAttention at one million tokens and supports a 12 million token context window. CTO Alex Whedon framed it in a launch post on X as the first frontier model to break the quadratic attention barrier.
Subquadratic's pitch is aggressive. At 12M tokens, the company announcement claims its architecture cuts attention compute by nearly 1,000x compared with other frontier models. At one million tokens, it says SubQ costs roughly a fifth of what Claude Opus 4.7 or GPT-5.5 charge for comparable workloads. On RULER 128K, the company puts the cost gap at about 300x.
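Those headline ratios can at least be sanity-checked with back-of-envelope arithmetic (ours, not the company's): dense attention over n tokens performs roughly n² token-pair comparisons, so a ~1,000x compute cut at 12M tokens implies each token effectively attends to about n/1,000 positions.

```python
# Back-of-envelope check of the claimed ~1,000x attention-compute cut.
# All numbers here come from the announcement; the arithmetic is ours.
n = 12_000_000                        # claimed context window, in tokens
dense_pairs = n * n                   # dense attention: every token vs. every token
sparse_pairs = dense_pairs // 1_000   # the claimed ~1,000x reduction
per_token = sparse_pairs // n         # positions each token effectively attends to
print(per_token)                      # 12000
```

In other words, a 1,000x cut at that scale still leaves each token comparing against roughly 12,000 others, which is plausible for a learned sparse scheme but says nothing about whether quality holds up.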
The architectural claim
The technique is called Subquadratic Sparse Attention, or SSA. The premise: most token-to-token comparisons in standard attention are wasted compute, so let the model learn which positions actually matter and compute attention only over those. Selection is content-dependent, not based on the fixed positional patterns of older sparse-attention work.
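Subquadratic has not published how SSA picks positions, but the general shape of content-dependent sparse attention can be sketched: each query keeps only its top-k highest-scoring keys and runs softmax attention over that subset. The NumPy toy below is illustrative only; its candidate scoring is still dense (O(n²)), so it does not itself beat the quadratic barrier, which is exactly where real subquadratic schemes need a cheap learned selector.

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k):
    """Toy content-dependent sparse attention: each query attends
    only to its k highest-scoring keys. (The full score matrix below
    is still O(n^2); production systems replace it with a cheap
    selector or block-level routing.)"""
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                     # (n, n) similarity scores
    idx = np.argpartition(scores, -k, axis=-1)[:, -k:]  # top-k key indices per query
    out = np.zeros_like(V)
    for i in range(n):
        s = scores[i, idx[i]]
        w = np.exp(s - s.max())
        w /= w.sum()                                   # softmax over the k kept keys
        out[i] = w @ V[idx[i]]
    return out

rng = np.random.default_rng(0)
n, d, k = 64, 16, 8
Q, K, V = rng.standard_normal((3, n, d))
out = topk_sparse_attention(Q, K, V, k)
print(out.shape)  # (64, 16)
```

With k equal to n, the routine reduces exactly to dense attention; the engineering question Subquadratic is claiming to have answered is how small k can get, and how cheaply the selection can run, before quality collapses.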
If that sounds familiar, it should. Mamba, RWKV, Longformer, DeepSeek Sparse Attention, Kimi Linear. A decade of attempts to escape O(n²) attention. None has displaced dense attention at the frontier.
About those weights
Within hours of launch, AI engineer Will Depue posted that SubQ was "almost surely a sparse attention finetune of Kimi or DeepSeek." Whedon then confirmed it. The company is "using weights from open-source models as a starting point, as a function of our funding and maturity as a company," he wrote on X. Depue followed up arguing the O(n) scaling claims and the reported speedups "don't seem to line up."
That admission complicates the framing. A sparse-attention layer grafted onto somebody else's pretraining run is a legitimate engineering contribution. It is not the same thing as a ground-up redesign of how attention works, which is roughly how the company's launch coverage described it.
The benchmarks, with caveats
The published numbers look strong on paper. 95% on RULER 128K. 81.8% on SWE-Bench Verified, edging Opus 4.6. 92.1% on needle-in-a-haystack at 12M tokens. On MRCR v2 at one million tokens, though, SubQ scores 65.9%, behind GPT-5.5's 74%. According to coverage in The New Stack, each model was run only once due to inference cost, and Whedon himself described SubQ as "way smaller than the big labs."
One run per benchmark on a model the team admits is sub-frontier sized. That is not a fraud claim. It is a reason to wait for independent reproduction before declaring a breakthrough.
We've been here before
The company that should be on every reader's mind is Magic.dev. In August 2024, Magic announced LTM-2-mini, a 100M-token context model with claimed 1,000x efficiency gains, and went on to raise more than $500 million on the strength of those numbers. Nearly two years later, there is no public evidence that LTM-2-mini is in production use outside Magic. VentureBeat draws the parallel directly, and Subquadratic's reported $500 million valuation on a seed round, with no public weights and no peer-reviewed paper, looks like the same movie restarting.
What's next
Subquadratic is taking access requests for three products in private beta: an API exposing the full 12M-token window, SubQ Code (a CLI agent that loads whole repos into context), and SubQ Search, free during beta. A 50-million-token context window is targeted for Q4. A full technical report has not been released, and the model weights are not open. Until that report lands and outside labs can rerun the benchmarks, AI commentator Dan McAteer's read in his own X post is hard to argue with: "either the biggest breakthrough since the Transformer... or it's AI Theranos."