
Meta Launches Muse Spark, Its First Model Since Llama 4 Flop

Meta's new Muse Spark model is competitive but not dominant, and it's closed-source. A big shift.

Liza Chan, AI & Emerging Tech Correspondent
April 9, 2026 · 4 min read
[Image: Abstract visualization of a neural network with warm orange and blue nodes connected by glowing pathways, suggesting parallel processing streams]

Meta released Muse Spark on Wednesday, the first AI model out of its nine-month-old Superintelligence Labs unit and a tacit admission that the Llama era needed a reboot. The model, codenamed Avocado, powers the updated Meta AI assistant across meta.ai and the company's apps starting today, with WhatsApp, Instagram, and Messenger rollouts coming in the next few weeks.

This is Alexandr Wang's first big deliverable since Meta paid $14.3 billion for a 49% stake in Scale AI last June and installed him as chief AI officer. The price tag makes Muse Spark one of the most expensive model debuts in history, though Meta would probably prefer you not do that math.

The benchmarks tell a familiar story

Muse Spark scores 52 on the Artificial Analysis Intelligence Index, which slots it behind GPT-5.4 and Gemini 3.1 Pro (both at 57) and Claude Opus 4.6 (53). Competitive, sure. But "competitive" is doing a lot of work in Meta's messaging here.

The health benchmarks are where things get interesting. On HealthBench Hard, Muse Spark posted 42.8%, ahead of every rival, including GPT-5.4 at 40.1%. Meta says it worked with over 1,000 physicians to curate training data for health reasoning, which is either a genuine differentiator or the kind of stat that sounds better in a press release than in practice. Time will tell which.

Coding and abstract reasoning are a different story. On Terminal-Bench 2.0, Muse Spark scored 59.0 against GPT-5.4's 75.1. On ARC-AGI-2, which tests abstract reasoning, it managed 42.5 while both GPT-5.4 and Gemini 3.1 Pro scored in the mid-70s. Meta acknowledged the gaps directly, noting continued investment in "long-horizon agentic systems and coding workflows." At least they're not pretending.
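For readers who want the gaps side by side, here is a minimal Python sketch that tabulates only the exact scores quoted above. The `SCORES` layout and the `gaps_vs` helper are illustrative names, not anything Meta or Artificial Analysis publishes, and ARC-AGI-2 is left out because the rivals' scores are given only as "mid-70s."

```python
# Scores exactly as quoted in this article; the data layout is an assumption
# made for illustration. ARC-AGI-2 is omitted: rival scores are only "mid-70s".
SCORES = {
    "Artificial Analysis Intelligence Index": {
        "Muse Spark": 52,
        "GPT-5.4": 57,
        "Gemini 3.1 Pro": 57,
        "Claude Opus 4.6": 53,
    },
    "HealthBench Hard (%)": {"Muse Spark": 42.8, "GPT-5.4": 40.1},
    "Terminal-Bench 2.0": {"Muse Spark": 59.0, "GPT-5.4": 75.1},
}


def gaps_vs(model, scores):
    """Return {benchmark: {rival: model_score - rival_score}}, rounded to 1 decimal."""
    out = {}
    for bench, table in scores.items():
        base = table[model]
        out[bench] = {
            rival: round(base - score, 1)
            for rival, score in table.items()
            if rival != model
        }
    return out


if __name__ == "__main__":
    for bench, diffs in gaps_vs("Muse Spark", SCORES).items():
        for rival, diff in diffs.items():
            # Positive means Muse Spark leads; negative means it trails.
            print(f"{bench}: {diff:+.1f} vs {rival}")
```

Positive values mean Muse Spark leads and negative values mean it trails. The different benchmarks use different scales, so the gaps are not comparable across rows.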

About those Llama 4 benchmarks

There's an elephant in the room. Fortune noted that Meta previously got caught using specialized, unreleased model variants to inflate Llama 4's benchmark scores. The general-release version didn't match those numbers. Whether you trust the Muse Spark figures at face value depends on how much goodwill you think Meta has rebuilt since then.

Independent evaluation from Artificial Analysis does broadly corroborate the picture: Muse Spark is a genuine jump from Llama 4 Maverick, which scored just 18 on the same index. That's a real improvement. But "third-party confirms it's better than the thing everyone agreed was bad" is a low bar.

Closed source, and that matters

The other big shift: Muse Spark is proprietary. No open weights. Meta's CNBC interview included vague language about hoping to open-source future versions, but for now, access runs through Meta's own apps or a private API preview for select partners. That's a significant departure from the Llama strategy that made Meta popular with the open-source community.

The model requires a Facebook or Instagram login to use, which, as TechCrunch pointed out, raises obvious privacy questions. Meta doesn't explicitly say it uses personal data from those accounts, but the company's track record and its positioning of Muse Spark as a "personal superintelligence" product don't exactly inspire confidence.

What's the actual play here?

Meta's framing of Muse Spark as "small and fast by design" reads like careful expectation management. The company is spending somewhere between $115 billion and $135 billion on AI infrastructure in 2026 alone, according to its latest earnings report. Calling your first model from that investment "small" is a choice.

The multi-agent "Contemplating" mode, where parallel subagents tackle different parts of a query simultaneously, is genuinely novel in how it's deployed at consumer scale. Whether it works as advertised for Meta's 3 billion users is another question entirely. It's rolling out gradually, which in product-speak usually means "we're not sure it's ready."
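Meta has not published how Contemplating is implemented, so the following is only a generic sketch of the fan-out/fan-in pattern that "parallel subagents" usually implies. Every name here (`run_subagent`, `contemplate`, `answer`) is hypothetical, and the subagent is a stub that sleeps instead of calling a model.

```python
import asyncio


async def run_subagent(name: str, subtask: str) -> str:
    # Hypothetical stand-in for a model call; a real system would query an
    # inference service here. The sleep only simulates latency.
    await asyncio.sleep(0.01)
    return f"[{name}] {subtask}"


async def contemplate(query: str, subtasks: list[str]) -> str:
    # Fan out: one subagent per subtask, all running concurrently.
    results = await asyncio.gather(
        *(run_subagent(f"agent-{i}", task) for i, task in enumerate(subtasks, 1))
    )
    # Fan in: asyncio.gather returns results in the order the tasks were given,
    # so the partial answers can be joined deterministically.
    return query + "\n" + "\n".join(results)


def answer(query: str, subtasks: list[str]) -> str:
    """Synchronous entry point around the async fan-out."""
    return asyncio.run(contemplate(query, subtasks))


if __name__ == "__main__":
    print(answer("Plan a trip", ["find flights", "compare hotels"]))
```

The hard parts of a production system, such as splitting the query into subtasks and reconciling conflicting partial answers, are deliberately left out of this sketch.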

Muse Spark is available now at meta.ai. The Contemplating mode and broader app rollouts are expected in the coming weeks. API pricing hasn't been announced.

Tags: Meta, Muse Spark, AI models, Alexandr Wang, Meta Superintelligence Labs, Llama, AI benchmarks
