DeepSeek Makes V4-Pro 75% Price Cut Permanent

Downward-trending price chart over a glowing circuit-board backdrop, suggesting falling AI API costs

DeepSeek has made its steep V4-Pro discount permanent, turning what started as a limited April promo into the model's standing rate. The company confirmed it in an X post this week. Input tokens now sit at $0.435 per million, output at $0.87, down from the original $1.74 and $3.48.

That's a 75% cut on both ends, locked in rather than set to expire. DeepSeek also dropped input cache-hit prices across its API to one-tenth of launch levels, a 90% reduction that mostly helps agent loops and apps resending the same system prompt. The full rates are on the company's pricing page.

V4-Pro is a 1.6 trillion parameter mixture-of-experts model, 49 billion active per pass, with a 1M-token context window. DeepSeek shipped it in preview on April 24. Per token it undercuts GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro by a wide margin, going by the OpenRouter listing.

Whether it actually matches those models in real use is the open question. Independent testing is still thin, and the headline benchmark scores are self-reported. "We are making our discount permanent," DeepSeek wrote, which reads less like generosity than a play for developers fed up with Western rate limits. The new rates are live now.

Bottom Line

V4-Pro now permanently costs $0.435 per million input tokens and $0.87 per million output, a 75% cut from its launch pricing.

Quick Facts

Permanent rate: $0.435 per million input tokens, $0.87 output
Down from launch pricing of $1.74 input, $3.48 output
Input cache-hit prices cut to one-tenth (90% reduction)
1.6 trillion parameters, 49 billion active, 1M-token context
Released in preview April 24, 2026; benchmarks company-reported

Tags:DeepSeekV4-ProAI pricingAPIopen-weight modelsChina AI

Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.

DeepSeek Makes Its V4-Pro 75% Price Cut Permanent

Bottom Line

Quick Facts

Andrés Martínez

Related Articles

DeepSeek Sparse Attention Gets a From-Scratch Implementation Built for Reading

Alibaba's Qwen3.7-Max Runs 35 Hours Autonomously on Kernel Task

Open-Weight LLMs in 2026 Reshape Attention to Cut Long-Context Costs

Stay Ahead of the AI Curve