Perplexity has shipped a major update to its Deep Research feature, pairing Anthropic's Claude Opus 4.5 model with its proprietary search tools and code execution sandbox. The company claims state-of-the-art results on external benchmarks, including 79.5% on Google DeepMind's DeepSearchQA. Max subscribers get immediate access; Pro users will see the update roll out over the coming days.
Alongside the product upgrade, Perplexity released DRACO (Deep Research Accuracy, Completeness, and Objectivity), a new open-source benchmark for evaluating research agents. The dataset includes 100 tasks across ten domains, from medicine and law to finance and UX design. Each task comes with an expert-crafted rubric averaging roughly 40 evaluation criteria. The tasks originated from actual user queries where initial responses fell short, which Perplexity argues makes the benchmark harder than synthetic academic tests.
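For readers curious what rubric-based grading looks like in practice, the sketch below shows one hypothetical way such a benchmark could be scored: each task carries a list of criteria, a grader judges each criterion as met or not, and the headline number is the mean per-task fraction satisfied. The `Criterion` and `Task` classes, the weighting, and the aggregation are illustrative assumptions, not DRACO's published data format or scoring protocol.

```python
from dataclasses import dataclass, field

@dataclass
class Criterion:
    # One expert-written check, e.g. "cites at least one peer-reviewed source"
    description: str
    weight: float = 1.0  # hypothetical: DRACO may or may not weight criteria

@dataclass
class Task:
    prompt: str
    rubric: list[Criterion] = field(default_factory=list)  # ~40 criteria per task, per Perplexity

def score_task(task: Task, judgments: list[bool]) -> float:
    """Weighted fraction of rubric criteria satisfied for one task.

    `judgments[i]` records whether criterion i was met, as decided by a grader
    (human or LLM judge); the grading protocol itself is an assumption here.
    """
    total = sum(c.weight for c in task.rubric)
    earned = sum(c.weight for c, met in zip(task.rubric, judgments) if met)
    return earned / total if total else 0.0

def benchmark_score(per_task_scores: list[float]) -> float:
    """Mean per-task score across the dataset, expressed as a percentage."""
    return 100 * sum(per_task_scores) / len(per_task_scores)
```

Under this kind of scheme, a reported figure like 67.15% would simply be the average share of rubric criteria an agent's reports satisfied across all 100 tasks; how Perplexity actually adjudicates each criterion is not detailed here.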
On its own benchmark, Perplexity reports a 67.15% score, ahead of Gemini Deep Research at 58.97% and OpenAI's o3 at 52.06%. Citation quality stands out: 76% versus 60.4% for o3, per the company's numbers. These are self-reported results, so independent validation remains pending.
The update also delivered speed gains. Perplexity's system completed benchmark tasks in 459.6 seconds on average, compared with 592 to 1,808 seconds for competitors, the company claims.
The Bottom Line: Perplexity is betting that vertically integrated search plus top-tier reasoning models can outperform competitors charging ten times more per query.
QUICK FACTS
- DRACO benchmark: 100 tasks, 10 domains, ~40 evaluation criteria per task
- Perplexity DRACO score: 67.15% (self-reported)
- DeepSearchQA score: 79.5% (self-reported)
- Average completion time: 459.6 seconds vs. 592-1,808 seconds for competitors (company-reported)
- Availability: Max users now, Pro users in coming days