OpenAI shipped three voice models to its Realtime API on Thursday, all built for live conversation rather than batch transcription. The headline release is GPT-Realtime-2, which the company's announcement calls its first voice model with GPT-5-class reasoning. Context jumps from 32K to 128K tokens. Reasoning effort is adjustable across five tiers from minimal to xhigh, and the model can fire parallel tool calls while saying things like "let me check that" so the line doesn't go dead while it works.
The other two are narrower. GPT-Realtime-Translate translates speech from 70+ input languages into 13 output languages in real time. GPT-Realtime-Whisper is a streaming speech-to-text model meant for producing captions and meeting notes as the speaker talks, rather than for transcribing pre-recorded audio.
Pricing splits by model. Realtime-2 stays on per-token billing at $32 per million input audio tokens and $64 per million output audio tokens, with cached input at $0.40. Translate costs $0.034 per minute and Whisper $0.017 per minute, flat rates that are easier to forecast against than token billing.
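To make the forecasting difference concrete, here is a minimal sketch of both billing styles using the prices quoted above. The function names are hypothetical, the token counts in the example are invented for illustration, and the sketch assumes the $0.40 cached-input rate is also per million tokens. The article does not say how many audio tokens a minute of speech consumes, so token counts are left as inputs rather than derived from call length.

```python
# Prices in USD, as quoted in the article.
REALTIME2_INPUT_PER_M = 32.00         # per 1M input audio tokens
REALTIME2_OUTPUT_PER_M = 64.00        # per 1M output audio tokens
REALTIME2_CACHED_INPUT_PER_M = 0.40   # assumed to be per 1M cached input tokens
TRANSLATE_PER_MIN = 0.034
WHISPER_PER_MIN = 0.017


def per_minute_cost(minutes: float, rate_per_min: float) -> float:
    """Flat-rate billing: cost depends only on audio duration."""
    return minutes * rate_per_min


def token_cost(input_tokens: int, cached_input_tokens: int, output_tokens: int) -> float:
    """Per-token billing: the three token counts must be measured or estimated."""
    total = (
        input_tokens * REALTIME2_INPUT_PER_M
        + cached_input_tokens * REALTIME2_CACHED_INPUT_PER_M
        + output_tokens * REALTIME2_OUTPUT_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 1,000 minutes of audio on each flat-rate model.
    print(f"Whisper:    ${per_minute_cost(1000, WHISPER_PER_MIN):.2f}")    # $17.00
    print(f"Translate:  ${per_minute_cost(1000, TRANSLATE_PER_MIN):.2f}")  # $34.00
    # Illustrative token counts only; real usage varies by workload.
    print(f"Realtime-2: ${token_cost(1_000_000, 2_000_000, 500_000):.2f}")  # $64.80
```

The per-minute models need only a duration estimate, while the Realtime-2 figure depends on three token counts that vary with the conversation.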
OpenAI's self-reported benchmarks show Realtime-2 (high) scoring 15.2% above Realtime-1.5 on Big Bench Audio, and the xhigh variant scoring 13.8% higher on Audio MultiChallenge for instruction-following. Zillow, an early tester, reports a 26-point jump in call success rate (95% vs. 69%) on what it calls its hardest adversarial benchmark, after prompt tuning. All of these figures come from interested parties and have not been independently verified.
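A "26-point jump" is a difference in percentage points, which is not the same as a relative improvement. The short sketch below works through both readings of the Zillow numbers; the function names are hypothetical. Whether OpenAI's 15.2% and 13.8% figures are points or relative gains is not stated here, so they are left out.

```python
def point_gain(before_pct: float, after_pct: float) -> float:
    """Absolute difference in percentage points."""
    return after_pct - before_pct


def relative_gain_pct(before_pct: float, after_pct: float) -> float:
    """Improvement expressed as a percentage of the starting value."""
    return (after_pct - before_pct) / before_pct * 100


if __name__ == "__main__":
    before, after = 69.0, 95.0  # Zillow's reported call success rates
    print(f"{point_gain(before, after):.0f} points")            # 26 points
    print(f"{relative_gain_pct(before, after):.1f}% relative")  # 37.7% relative
```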
All three models are live in the API now, documented in the Realtime developer guide. EU data residency is supported.
Bottom Line
GPT-Realtime-2 quadruples the context window to 128K tokens and stays at $32/$64 per million audio input/output tokens.
Quick Facts
- Three models: GPT-Realtime-2, GPT-Realtime-Translate, GPT-Realtime-Whisper
- Context window: 128K tokens, up from 32K
- GPT-Realtime-2 pricing: $32/1M input audio tokens, $64/1M output, $0.40 cached input
- Translate: $0.034/min; Whisper: $0.017/min
- Translate supports 70+ input languages, 13 output languages (company-reported)
- Zillow reports 95% vs. 69% call success rate on its own benchmark (unverified)




