Open-Source AI

Rakuten AI 3.0 Launches as Japan's Biggest LLM, but Config Files Point to DeepSeek V3

Users found DeepSeek V3 references in the config files of Rakuten's government-backed Japanese language model.

Liza Chan
AI & Emerging Tech Correspondent
March 18, 2026 · 3 min read
Abstract visualization of overlapping Japanese and Chinese language data flowing through a neural network architecture

Rakuten Group released Rakuten AI 3.0 on March 17, calling it Japan's largest high-performance AI model. Within hours, users on X found something awkward in the Hugging Face repo: the config.json file lists "model_type": "deepseek_v3", and the model's tags include both "DeepSeek-V3" and "Mistral."

The 671-billion-parameter Mixture-of-Experts architecture, with 37 billion parameters activated per token, matches DeepSeek V3 exactly. That makes the framing in Rakuten's press release worth reading closely.
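Anyone can check the flagged fields without touching the weights. Here is a minimal sketch using the huggingface_hub client; the repo ID is a placeholder for illustration, not the confirmed repository name, and the exact config keys may differ from what Rakuten publishes:

```python
import json

from huggingface_hub import hf_hub_download

# Placeholder repo ID for illustration; substitute the actual Rakuten AI 3.0 repository.
REPO_ID = "Rakuten/rakuten-ai-3.0"

# config.json is a few kilobytes, so this avoids downloading the full weights.
config_path = hf_hub_download(repo_id=REPO_ID, filename="config.json")

with open(config_path) as f:
    config = json.load(f)

# The field users flagged: a from-scratch architecture would not identify itself this way.
print("model_type:", config.get("model_type"))

# Mixture-of-Experts layout fields; in DeepSeek V3 these describe the
# 671B-total / 37B-active design.
for key in sorted(config):
    if "expert" in key or key in ("hidden_size", "num_hidden_layers"):
        print(f"{key} = {config[key]}")
```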

What Rakuten actually says (and doesn't)

The press release describes the model as "developed by leveraging the best from the open source community and building on Rakuten's high-quality, bilingual original data, engineering and research." That's a careful sentence. It doesn't say Rakuten built the architecture. It doesn't say the model was trained from scratch. It says they leveraged open-source models and added their own data on top.

So technically, fine-tuning DeepSeek V3 on Japanese data fits that description. The question is whether most people reading the headline "Japan's Largest High-Performance AI Model" would walk away thinking it was built in Japan from the ground up.

Ting Cai, Rakuten's Chief AI & Data Officer, called it "an outstanding combination of data, engineering and innovative architecture at scale," which seems like a generous description of what may be a fine-tune with bilingual data. Cai hasn't addressed the DeepSeek connection specifically.

The government funding angle

This gets more complicated. Rakuten AI 3.0 was developed under the GENIAC project, a government initiative run by Japan's Ministry of Economy, Trade and Industry (METI) and NEDO that provides computing resources to bolster domestic AI development. Rakuten was selected for the program's third term in July 2025, and part of the training cost came from GENIAC funding.

GENIAC's stated purpose is to raise Japan's capability to develop generative AI. If the resulting model is a fine-tuned version of a Chinese open-source model, that raises uncomfortable questions about what "domestic development" means in practice. The program's whole point is strengthening Japan's AI independence.

The bias problem

Users also reported that when asked politically sensitive questions, the model's responses lean toward Chinese perspectives rather than Japanese ones. If the base model is indeed DeepSeek V3 and the fine-tuning didn't sufficiently adjust those tendencies, that's not surprising. DeepSeek models are trained on Chinese data with Chinese alignment. A Japanese-language fine-tune wouldn't necessarily overwrite those deeper patterns.

Rakuten claims the model outperforms GPT-4o on Japanese cultural knowledge, history, and instruction-following benchmarks. Those claims are plausible for a large model fine-tuned on quality Japanese data, but the benchmark selection matters. Rakuten picked tests where a Japanese-optimized model would naturally excel: cultural Q&A, history, instruction compliance. Nobody is claiming it beats GPT-4o across the board.

What this is really about

Fine-tuning open-source models is standard practice. Nothing wrong with it. Mistral, Llama, and DeepSeek are all used as base models across the industry. But the presentation matters. When you accept government funding earmarked for domestic AI capability, market the result as Japan's largest AI model, and bury the provenance in a config file, people notice.

Rakuten hasn't issued a response to the controversy. The model sits on Hugging Face under an Apache 2.0 license, 673 GB of weights that anyone can inspect. The config file isn't hidden; it just wasn't mentioned in the press release. Whether that counts as transparency or obfuscation depends on your expectations.
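The repository metadata is just as easy to read over the Hub API. A hedged sketch, again with a placeholder repo ID rather than the confirmed one:

```python
from huggingface_hub import model_info

# Placeholder repo ID for illustration; substitute the actual Rakuten AI 3.0 repository.
info = model_info("Rakuten/rakuten-ai-3.0")

# Repo tags are where the "DeepSeek-V3" and "Mistral" labels appear.
print("tags:", info.tags)

# License as declared in the model card metadata (Apache 2.0, per the release).
if info.card_data is not None:
    print("license:", info.card_data.to_dict().get("license"))
```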

Tags: Rakuten, DeepSeek, Japanese AI, open-source LLM, GENIAC, fine-tuning, Hugging Face, Mixture of Experts, METI, AI controversy
Liza Chan

AI & Emerging Tech Correspondent

Liza covers the rapidly evolving world of artificial intelligence, from breakthroughs in research labs to real-world applications reshaping industries. With a background in computer science and journalism, she translates complex technical developments into accessible insights for curious readers.

