Meta on March 11 unveiled a roadmap for four generations of custom silicon, the MTIA 300, 400, 450, and 500, all slated for deployment by 2027. The first is already running in production. The second has cleared lab testing. The remaining two, code-named Arke and Astrid, are being built in parallel and will ship roughly six months apart.
This comes barely three weeks after Meta signed a deal with AMD worth an estimated $100 billion over five years for 6 gigawatts of GPU capacity, which itself came only days after the company committed to deploying millions of Nvidia's Blackwell and Vera Rubin processors. Meta is also renting Google TPUs. So what exactly is the MTIA program for, if the company is simultaneously locking in tens of billions of dollars in third-party silicon?
The inference bet
Here's the short version: Meta isn't trying to replace Nvidia. The MTIA chips are custom ASICs designed for Meta's own workloads (recommendation systems, ad ranking, generative AI inference) and built to be more cost-efficient at those specific tasks than general-purpose GPUs. The technical blog post is candid about this: MTIA 450 and 500 are optimized for inference first, then adapted backward for training. That's the inverse of how Nvidia and AMD build their flagship chips, which target the hardest workload (large-scale pretraining) and then get pressed into service for everything else.
The logic isn't crazy. Inference is where Meta's compute actually goes. Every time someone scrolls Instagram or asks Meta AI a question, that's inference. Training a frontier model happens once (or a few times); serving it happens billions of times a day. And Meta VP of Engineering Yee Jiun Song told CNBC that the chips give the company supply chain diversity and insulation from price fluctuations, which, given Nvidia's margins, is a polite way of saying they're expensive.
What the numbers look like
The progression from MTIA 300 to 500 is steep. According to Meta's newsroom announcement, HBM bandwidth increases 4.5x across the four chips, and compute FLOPS jumps 25x (measured from MTIA 300's MX8 to MTIA 500's MX4, so part of that headline gain comes from dropping to a lower-precision format). The MTIA 400 delivers 400% higher FP8 FLOPS, i.e. five times its predecessor's, and 51% higher HBM bandwidth. But those are Meta's own comparisons against its own previous chips, not against Nvidia's Blackwell or AMD's MI450.
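Here's one way to read those multipliers, with the caveat that the MX4-to-MX8 throughput ratio below is my assumption (lower-precision formats typically run at roughly twice the rate on the same silicon), not a figure Meta has published:

```python
# Back-of-envelope reading of Meta's own multipliers. All inputs are Meta's
# published comparisons except mx4_vs_mx8_rate, which is an assumption.
headline_gain = 25       # MTIA 300 (MX8) -> MTIA 500 (MX4), per Meta
mx4_vs_mx8_rate = 2      # assumed throughput ratio of MX4 vs MX8 on the same die

matched_precision_gain = headline_gain / mx4_vs_mx8_rate
print(f"~{matched_precision_gain:.1f}x at like-for-like precision")
# Still a large jump across four generations, but roughly half the headline figure.
```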
Meta claims the MTIA 400 offers "raw performance competitive with leading commercial products." That's a carefully worded hedge. Which products? Under what workloads? The company doesn't say. Given that the MTIA 400 uses a 72-chip rack configuration connected via a switched backplane (similar in concept to Nvidia's NVL72), the comparison might hold for specific inference tasks while falling apart for general training. I'd want to see independent benchmarks before taking the claim at face value.
The MTIA 450 is where things get more technically interesting. It doubles HBM bandwidth over the 400, bumps MX4 FLOPS by 75%, and introduces hardware acceleration for attention and FFN computation. Data Center Dynamics reports seven petaflops of FP8 compute, 288GB of HBM, and a 1,400W TDP. The MTIA 500 pushes further: 10 petaflops FP8, 384 to 512GB of HBM, 1,700W, and a 2x2 chiplet design with separate compute, network, and SoC chiplets.
Those power numbers are worth pausing on: 1,700 watts per chip is enormous. For context, Nvidia's B200 draws around 1,000W. Meta's answer is that all four chips share the same rack and network infrastructure, so upgrades are swappable, but the cooling requirements for racks full of 1,700W accelerators are non-trivial.
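Here's a rough sketch of what that means at rack scale, assuming the 450 and 500 keep the 72-accelerator rack Meta describes for the MTIA 400 (Meta hasn't confirmed per-rack counts for the later parts):

```python
# Rough accelerator power per rack. The 72-chip count is the MTIA 400's rack
# configuration; applying it to the 450 and 500 is an assumption.
CHIPS_PER_RACK = 72
TDP_WATTS = {
    "MTIA 450": 1_400,
    "MTIA 500": 1_700,
    "Nvidia B200 (approx.)": 1_000,
}

for chip, tdp in TDP_WATTS.items():
    rack_kw = CHIPS_PER_RACK * tdp / 1_000
    print(f"{chip}: ~{rack_kw:.0f} kW of accelerator power per 72-chip rack")
# The MTIA 500 lands around 122 kW per rack before counting host CPUs,
# networking, or cooling overhead; that is firmly liquid-cooling territory.
```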
The speed question
A chip every six months sounds aggressive, and it is. The industry standard for new accelerator generations is one to two years. But Meta's cadence is possible because of a chiplet-based modular design: you swap out the compute die while keeping the I/O, networking, chassis, and rack infrastructure constant. It's less "we designed four chips from scratch" and more "we iterated four times on a common platform."
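Conceptually, the split looks something like the sketch below. The names and fields are invented to illustrate the idea, not anything Meta has published:

```python
# Illustrative only: a fixed rack/network platform with a swappable compute die.
from dataclasses import dataclass

@dataclass(frozen=True)
class RackPlatform:
    backplane: str          # switched backplane shared across generations
    chassis: str            # rack and chassis infrastructure that stays put

@dataclass(frozen=True)
class Accelerator:
    name: str
    compute_die: str        # the piece that changes every ~6 months
    platform: RackPlatform  # the piece that doesn't

platform = RackPlatform(backplane="switched", chassis="72-slot rack")
mtia_450 = Accelerator("MTIA 450", compute_die="gen-450 die", platform=platform)
mtia_500 = Accelerator("MTIA 500", compute_die="2x2 chiplet package", platform=platform)
```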
Song described the approach as building in parallel, with teams working on multiple generations simultaneously. The MTIA program is developed in partnership with Broadcom, and earlier reporting from Wccftech indicated MediaTek was also involved in developing the Arke variant. That tracks with Meta's stated goal of diversifying its supply chain at every level, not just at the GPU vendor level but down to the ASIC design partners.
The $135 billion context
None of this makes sense without the spending numbers. Meta guided for $115 billion to $135 billion in capital expenditure for 2026. That is, to use a technical term, an absurd amount of money. At that scale, even marginal efficiency gains from custom silicon translate into billions saved. If MTIA chips can serve the same recommendation and inference workloads at, say, 40% lower total cost of ownership than GPUs (Meta's own figure from their ISCA'25 paper on the previous-generation MTIA 2i), the math justifies a dedicated chip program even if it never touches frontier model training.
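Here's a rough sensitivity check. The capex range and the 40% figure are Meta's; the share of that spend going to inference hardware is a placeholder chosen purely for illustration:

```python
# What a 40% TCO advantage is worth against Meta's 2026 capex guidance.
# Capex figures are Meta's guidance; inference_hw_share is an invented placeholder.
tco_advantage = 0.40         # Meta's MTIA 2i figure from the ISCA'25 paper
inference_hw_share = 0.30    # assumption, purely for illustration

for capex in (115e9, 135e9):
    savings = capex * inference_hw_share * tco_advantage
    print(f"${capex/1e9:.0f}B capex -> ~${savings/1e9:.1f}B saved at a 30% inference share")
# Even at a modest share of spend, the implied savings run into the billions,
# which is the core argument for funding a dedicated ASIC program.
```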
But there's a catch. Song acknowledged to CNBC that Meta is "absolutely worried about HBM supply." High-bandwidth memory is the bottleneck for the entire AI chip industry right now, and Meta's roadmap calls for more of it with each successive generation. The company says it has secured supply for its current plans, but wouldn't comment on whether it has signed long-term contracts with Samsung, SK Hynix, or Micron.
What this isn't
This isn't Meta building a competitor to Nvidia's data center business. The company is not going to sell MTIA chips to anyone. These are internal tools, optimized for internal workloads, running Meta's own models on Meta's own infrastructure. Google has been doing this with TPUs for years. Amazon has Trainium and Inferentia. Microsoft launched Maia. The playbook is established.
The more interesting question is whether Meta's inference-first approach actually produces better economics than just buying more Nvidia GPUs. Custom silicon programs are expensive to run, they require dedicated engineering teams, and they lock you into architectural decisions that might not age well as models evolve. Song's own quote to CNBC about AI development accelerating at a pace that has "blown everyone's minds" cuts both ways: fast iteration is great until the workloads shift in a direction your custom hardware can't follow.
The MTIA 400 is heading to data centers now. The 450 and 500 are scheduled for early and mid-2027. By then we should have some real production data on whether the inference-first gamble pays off, or whether Meta's custom chips end up as an expensive hedge that mostly serves ranking and recommendation models while GPUs do the heavy lifting.