On March 18, MiniMax released M2.7, a proprietary model the company says actively participated in its own reinforcement learning process. Per the announcement post, the model built dozens of complex skills within its RL harness, updated its own memory, and optimized its training pipeline based on experiment results. MiniMax claims M2.7 handled 30-50% of the research workflow that previously required multiple human researchers.
That self-evolution claim is the headline, but the benchmark numbers tell a more grounded story. M2.7 scored 56.22% on SWE-Pro, which MiniMax says matches GPT-5.3-Codex. On Terminal Bench 2 it hit 57.0%, and on VIBE-Pro (repo-level code generation) it reached 55.6%. Its GDPval-AA Elo of 1495 ranks highest among open-source-lineage models, though it trails Opus 4.6, Sonnet 4.6, and GPT-5.4. All benchmarks are company-reported.
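Elo ratings only carry meaning relative to other entrants. Assuming GDPval-AA uses a standard Elo scheme, here is a minimal sketch of how a rating gap translates into an expected head-to-head preference rate; the 40-point gap in the example is hypothetical, not a published number:

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score for contestant A against contestant B."""
    return 1 / (1 + 10 ** ((r_b - r_a) / 400))

# Hypothetical: a model rated 40 points above M2.7's reported 1495
# would be preferred in roughly 55.7% of head-to-head comparisons.
print(f"{expected_score(1535, 1495):.1%}")  # -> 55.7%
```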
In a separate test, MiniMax ran M2.7 through 22 MLE Bench Lite machine learning competitions with 24-hour autonomous runs. The best attempt earned 9 gold, 5 silver, and 1 bronze medal (15 of the 22 competitions), and the medal rate averaged 66.6% across three trials. That average ties Google's Gemini 3.1 but sits behind Opus 4.6 at 75.7%.
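As a quick check on those figures, the medal-rate arithmetic using only the numbers MiniMax reported:

```python
# Medal-rate arithmetic for MiniMax's reported MLE Bench Lite results.
golds, silvers, bronzes = 9, 5, 1
competitions = 22

best_rate = (golds + silvers + bronzes) / competitions
print(f"best attempt: {best_rate:.1%}")  # -> 68.2%
# The headline 66.6% figure is the mean medal rate across three such
# trials, so the other two runs landed somewhat below 68.2%.
```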
Pricing stays at $0.30 per million input tokens and $1.20 per million output, unchanged from M2.5. Two API versions ship: standard M2.7 and a faster M2.7-highspeed variant. MiniMax also open-sourced OpenRoom, an interactive demo placing AI characters in a web GUI environment. M2.7 is available now on the MiniMax API and in MiniMax Agent.
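For a back-of-envelope sense of what those rates mean in practice, a minimal cost sketch; the token counts in the example are hypothetical:

```python
# Cost estimate at M2.7's published rates ($0.30/$1.20 per million tokens).
INPUT_RATE = 0.30 / 1_000_000   # USD per input token
OUTPUT_RATE = 1.20 / 1_000_000  # USD per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Hypothetical workload: a 50k-token repo context yielding a 4k-token patch.
print(f"${estimate_cost(50_000, 4_000):.4f}")  # -> $0.0198
```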
Bottom Line
M2.7 matches GPT-5.3-Codex on SWE-Pro at $0.30/$1.20 per million tokens, but the self-evolution claims lack independent verification.
Quick Facts
- SWE-Pro: 56.22% (company-reported, matches GPT-5.3-Codex per MiniMax)
- Terminal Bench 2: 57.0%
- GDPval-AA Elo: 1495 (highest among open-source-lineage models)
- MLE Bench Lite medal rate: 66.6% average across 3 trials
- Pricing: $0.30/1M input, $1.20/1M output (unchanged from M2.5)
- Context window: 200k tokens