DeepSeek rolled out preview versions of its V4 series on Friday, releasing two open-weight Mixture-of-Experts models with a one-million-token context length. The model card lists V4-Pro at 1.6 trillion total parameters with 49 billion active per token; V4-Flash comes in at 284 billion total and 13 billion active.
Both ship under the MIT license. The lab says it pre-trained the models on more than 32 trillion tokens and built a hybrid attention setup it calls CSA plus HCA. DeepSeek's technical report claims this cuts single-token inference FLOPs to 27% of V3.2's at 1M tokens, with the KV cache shrinking to 10%. Those figures are self-reported and have not yet been independently tested.
DeepSeek bills V4-Pro-Max, the model's highest reasoning mode, as "the best open-source model available today." Its own benchmark table puts the model at 93.5 on LiveCodeBench and a 3206 Codeforces rating, topping Opus 4.6 Max and GPT-5.4 on both. It trails Gemini 3.1 Pro on broader reasoning tests such as GPQA Diamond and HLE.
Users can try the models through Expert and Instant modes in DeepSeek's chat app. API access is live; pricing wasn't disclosed.
Bottom Line
DeepSeek's V4-Pro-Max posts a 3206 Codeforces rating and 93.5 on LiveCodeBench on the company's own benchmarks, pending independent verification.
Quick Facts
- V4-Pro: 1.6T total parameters, 49B active
- V4-Flash: 284B total parameters, 13B active
- Context length: 1 million tokens
- Pre-training corpus: 32T+ tokens (company-reported)
- License: MIT
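As a back-of-the-envelope check on the sparsity these MoE configurations imply, the figures above can be turned into per-token activation fractions. A minimal sketch, using only the company-reported numbers from the list (nothing here is independently measured):

```python
# Sketch: per-token activation fraction implied by the reported MoE specs.
# All inputs come from DeepSeek's own model card; none are verified.

def active_fraction(total_params: float, active_params: float) -> float:
    """Share of an MoE model's weights activated for each token."""
    return active_params / total_params

# V4-Pro: 1.6T total, 49B active. V4-Flash: 284B total, 13B active.
pro_frac = active_fraction(1_600e9, 49e9)
flash_frac = active_fraction(284e9, 13e9)

print(f"V4-Pro activates ~{pro_frac:.1%} of its weights per token")    # ~3.1%
print(f"V4-Flash activates ~{flash_frac:.1%} of its weights per token")  # ~4.6%
```

Both models route each token through roughly 3–5% of their weights, which is the mechanism behind serving trillion-parameter-scale models at the compute cost of a much smaller dense network.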
