DeepSeek has made its steep V4-Pro discount permanent, turning what started as a limited April promo into the model's standing rate. The company confirmed it in an X post this week. Input tokens now sit at $0.435 per million, output at $0.87, down from the original $1.74 and $3.48.
That's a 75% cut on both ends, locked in rather than set to expire. DeepSeek also dropped input cache-hit prices across its API to one-tenth of launch levels, a 90% reduction that mostly helps agent loops and apps resending the same system prompt. The full rates are on the company's pricing page.
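For readers who want to check the arithmetic, the sketch below recomputes the cut from the quoted rates and estimates the bill for a workload. The function names and the 2M-input, 500K-output workload are hypothetical, chosen only for illustration. Cache-hit pricing is left out because the article gives the ratio, not the absolute rate.

```python
# Per-million-token rates quoted in the article, in USD.
LAUNCH = {"input": 1.74, "output": 3.48}
PERMANENT = {"input": 0.435, "output": 0.87}


def percent_cut(old: float, new: float) -> float:
    """Return the price reduction from old to new as a percentage."""
    return (old - new) / old * 100


def request_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """Estimate cost in USD for a token count at per-million rates.

    Ignores cache-hit discounts (an assumption made for simplicity).
    """
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    for side in ("input", "output"):
        cut = percent_cut(LAUNCH[side], PERMANENT[side])
        print(f"{side}: {cut:.1f}% cut")

    # Hypothetical workload: 2M input tokens, 500K output tokens.
    for label, rates in (("launch", LAUNCH), ("permanent", PERMANENT)):
        print(f"{label}: ${request_cost(2_000_000, 500_000, rates):.3f}")
```

Under those assumptions the example workload comes to roughly $5.22 at launch pricing and $1.305 at the permanent rate, which matches the 75% figure.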
V4-Pro is a 1.6-trillion-parameter mixture-of-experts model with 49 billion parameters active per pass and a 1M-token context window. DeepSeek shipped it in preview on April 24. Per token it undercuts GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro by a wide margin, going by the OpenRouter listing.
Whether it actually matches those models in real use is the open question. Independent testing is still thin, and the headline benchmark scores are self-reported. "We are making our discount permanent," DeepSeek wrote, which reads less like generosity than a play for developers fed up with Western rate limits. The new rates are live now.
Bottom Line
V4-Pro now permanently costs $0.435 per million input tokens and $0.87 per million output, a 75% cut from its launch pricing.
Quick Facts
- Permanent rate: $0.435 per million input tokens, $0.87 per million output tokens
- Down from launch pricing of $1.74 input, $3.48 output (per million tokens)
- Input cache-hit prices cut to one-tenth (90% reduction)
- 1.6 trillion parameters, 49 billion active, 1M-token context
- Released in preview April 24, 2026; benchmarks company-reported