DeepSeek is preparing to release its V4 flagship model in late April, running on Huawei's Ascend chips instead of NVIDIA hardware. The move, first reported by Reuters on April 3 citing The Information, makes V4 the first frontier AI model designed to operate entirely on Chinese-made silicon.
The model has been delayed twice. Originally expected around February during the Lunar New Year, then pushed to March, V4 is now targeting the last two weeks of April. On April 8, DeepSeek quietly added "Fast Mode" and "Expert Mode" toggles to its web interface, which industry observers read as infrastructure prep for the full release.
The chip story matters more than the model
Forget the parameter count for a second. The real news is the hardware underneath. DeepSeek spent months working with Huawei and Cambricon Technologies to rewrite core pieces of V4's code for Huawei's CANN architecture, bypassing NVIDIA's CUDA ecosystem entirely. Two additional V4 variants, each optimized for different capabilities, are also reportedly in development for Chinese chips.
Alibaba, ByteDance, and Tencent have reportedly placed bulk orders for hundreds of thousands of Huawei's upcoming chips, and prices have jumped roughly 20% in a matter of weeks. Purchasing behavior at that scale from companies that large isn't speculative. It's a supply chain bet.
DeepSeek also broke with standard practice by refusing to share V4 with US chipmakers for performance optimization. Instead, early access went exclusively to domestic suppliers. Whether that was a political decision or a practical one (or both) is something the company hasn't addressed.
What we think we know about specs
V4 reportedly uses a Mixture-of-Experts architecture scaled to around 1 trillion total parameters, though only about 37 billion activate per token, keeping inference costs manageable. A "V4 Lite" variant briefly appeared on DeepSeek's platform on March 9 before being pulled, suggesting the core architecture was already functional weeks ago.
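To see why a 1-trillion-parameter model can be cheap to run, consider how Mixture-of-Experts routing works in general: a small router picks a handful of expert sub-networks per token, so per-token compute scales with the active parameters, not the total. The sketch below is a generic top-k router in plain NumPy; the expert count, hidden size, and k are toy values I chose for illustration, not anything reported about V4.

```python
import numpy as np

# Generic MoE top-k routing sketch -- NOT DeepSeek's implementation.
# All sizes here are toy values chosen for illustration.
rng = np.random.default_rng(0)

NUM_EXPERTS = 16   # hypothetical expert count
TOP_K = 2          # experts activated per token
D_MODEL = 8        # toy hidden size

def route(token_hidden, gate_weights, k):
    """Return the indices and normalized weights of the top-k experts
    selected for a single token."""
    logits = token_hidden @ gate_weights            # router scores, shape (NUM_EXPERTS,)
    top = np.argsort(logits)[-k:]                   # indices of the k highest-scoring experts
    exp = np.exp(logits[top] - logits[top].max())   # stable softmax over the chosen experts
    return top, exp / exp.sum()

gate = rng.normal(size=(D_MODEL, NUM_EXPERTS))
token = rng.normal(size=D_MODEL)
experts, weights = route(token, gate, TOP_K)

# Only TOP_K of NUM_EXPERTS expert networks run for this token, so
# per-token FLOPs track the active fraction of parameters. Using the
# reported (unconfirmed) V4 figures:
active_fraction = 37e9 / 1e12   # ~3.7% of total parameters active per token
```

The point of the arithmetic at the end: if the leaked figures hold, each token touches under 4% of the model's weights, which is how a trillion-parameter model can have the inference bill of a much smaller dense one.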
Claims of a million-token context window and native multimodal support (text, image, video) are circulating, but none of this is confirmed by DeepSeek. The company hasn't said anything publicly about V4's capabilities, pricing, or timeline. Everything comes from leaks and source-backed reporting.
Leaked benchmark numbers look strong on paper: 81% on SWE-bench, competitive with the best proprietary models. But self-reported benchmarks from Chinese AI labs (or any AI lab, frankly) deserve a healthy dose of skepticism until independent testing happens.
So what does the tiered interface tell us?
The Fast Mode and Expert Mode split is interesting beyond its role as a V4 teaser. DeepSeek built its reputation on being entirely free with no usage tiers. "We've optimized every layer" is the kind of thing the company's backers at High-Flyer Capital might say, but free AI inference at global scale is expensive no matter how efficient the architecture is.
The tiering suggests DeepSeek is finally confronting that reality. Expert Mode reportedly performs better on complex math and physics reasoning. A rumored "Vision Mode" may follow. Whether this eventually leads to a paid tier remains unclear, but the direction feels obvious.
What happens next
DeepSeek and Huawei have both declined to comment. If V4 performs at a competitive level on domestic chips, it directly challenges the assumption behind US chip export controls: that Chinese companies can't build frontier AI without NVIDIA. If the model underperforms, the geopolitical narrative stays the same, and NVIDIA's moat looks intact. The stakes are binary in a way that most model launches aren't.
Independent benchmark results will be the thing to watch. The model is expected before the end of April.