Essential AI just dropped Rnj-1, an 8-billion-parameter open-weights model built for code and STEM tasks. The startup was founded by Ashish Vaswani, a co-author of the "Attention Is All You Need" paper that introduced the Transformer architecture. The model's name pays homage to mathematician Srinivasa Ramanujan.
The headline number: Rnj-1 Instruct scores 20.8% on SWE-bench Verified in bash-only mode. That puts it ahead of Gemini 2.0 Flash and Qwen2.5-Coder 32B Instruct under the same agentic framework. For context, Qwen3 8B (with thinking disabled) manages just 4.5% on the same benchmark.
Essential AI took a contrarian approach to training. The team deliberately minimized post-training and reinforcement learning, betting instead on high-quality pre-training. Rnj-1 was pre-trained on 8.4 trillion tokens using the Muon optimizer rather than the standard AdamW. The architecture follows Gemma 3, modified to use global self-attention; YaRN scaling extends the context window to 32K tokens.
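For readers unfamiliar with YaRN: it extends a model's context window by rescaling the rotary-embedding (RoPE) frequencies so positions beyond the pre-training window still map into a familiar range. Rnj-1's exact RoPE settings aren't given here, so the sketch below uses illustrative defaults (a hypothetical 8K original window stretched 4x to 32K) just to show the mechanism: high-frequency dimensions are left alone, low-frequency dimensions are interpolated, with a linear ramp in between.

```python
import math

def rope_inv_freq(dim, base=10000.0):
    """Standard RoPE inverse frequencies for head dimension `dim`."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]

def yarn_inv_freq(dim, scale, orig_ctx, base=10000.0,
                  beta_fast=32.0, beta_slow=1.0):
    """YaRN-style frequency interpolation (illustrative parameters only).

    Dims that complete many rotations within the original window are left
    untouched (extrapolation); dims that complete few rotations are divided
    by `scale` (interpolation); dims in between get a linear blend.
    """
    out = []
    for f in rope_inv_freq(dim, base):
        rotations = orig_ctx * f / (2 * math.pi)  # full turns over original context
        t = (rotations - beta_fast) / (beta_slow - beta_fast)
        t = min(1.0, max(0.0, t))  # 0 = pure extrapolation, 1 = pure interpolation
        out.append(f * ((1.0 - t) + t / scale))
    return out

# Hypothetical example: stretch an assumed 8K window 4x to reach 32K.
scaled = yarn_inv_freq(dim=128, scale=4.0, orig_ctx=8192)
```

The point of the ramp is that fine-grained positional detail (fast-rotating dims) survives unchanged, while only the slow dims that encode long-range position get compressed.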
Both base and instruct versions ship under the Apache 2.0 license. Weights are live on Hugging Face, with hosted inference available through Together AI. The company says it kept post-training minimal so the community can specialize the model further.
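Together AI serves hosted models through an OpenAI-compatible chat-completions API, so trying the instruct model is a single authenticated POST. A minimal sketch follows; note the model slug below is a guess, not the confirmed identifier, so check Together's model catalog for the real one.

```python
import json

# Hypothetical identifier -- look up the actual slug in Together AI's catalog.
MODEL_ID = "essential-ai/Rnj-1-Instruct"

def build_chat_request(prompt: str, max_tokens: int = 512) -> dict:
    """Build an OpenAI-compatible chat-completions payload."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# POST this payload (with an "Authorization: Bearer <TOGETHER_API_KEY>" header)
# to Together's chat-completions endpoint to query the hosted model.
payload = build_chat_request("Write a bash one-liner that counts .py files in a repo.")
print(json.dumps(payload, indent=2))
```

Because the API follows the OpenAI wire format, existing OpenAI client libraries can be pointed at Together's base URL instead of hand-building requests.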
The Bottom Line: An 8B model matching or beating 32B competitors on agentic coding tasks suggests pre-training quality matters more than parameter count.
QUICK FACTS
- Model size: 8.3 billion parameters
- Training data: 8.4 trillion tokens (pre-training) + 380B (context extension) + 150B (SFT)
- SWE-bench Verified score: 20.8% (bash-only)
- Qwen3 8B comparison: 4.5%
- Context length: 32K tokens
- License: Apache 2.0