Elon Musk's xAI released the Grok Voice Agent API on December 17, giving developers access to the same voice stack powering Grok's mobile apps and Tesla vehicles. The company built everything in-house: voice activity detection, audio tokenization, and the underlying speech models.
The pitch centers on speed and price. xAI claims sub-one-second time-to-first-audio, which it says makes Grok nearly five times faster than the next-best performer on Artificial Analysis's Big Bench Audio benchmark. That benchmark measures speech reasoning accuracy across 1,000 audio questions adapted from Big Bench Hard. Grok scored 92.3%, edging out Gemini 2.5 Flash Native Audio and OpenAI's GPT Realtime, according to Artificial Analysis.
Then there's cost. A flat $0.05 per minute of connection time, period. xAI's own comparison puts OpenAI's Realtime API at roughly $0.10/min (a token-based estimate; xAI says the actual cost "typically exceeds" that figure in production), ElevenLabs at $0.088/min, and Deepgram at $0.08/min. Whether those comparisons hold up in real-world usage remains to be seen.
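To put the per-minute rates in perspective, here is a minimal sketch that multiplies each quoted rate by a monthly call volume. The 10,000-minute volume and the function names are illustrative assumptions, not figures from xAI. The OpenAI rate is xAI's token-based estimate rather than a published per-minute price, and the other providers may not bill strictly by connection time.

```python
from decimal import Decimal

# Per-minute rates (USD) as quoted in xAI's own comparison.
# The OpenAI figure is a token-based estimate, not a published flat rate.
RATES_PER_MINUTE = {
    "Grok Voice Agent API": Decimal("0.05"),
    "OpenAI Realtime API (estimate)": Decimal("0.10"),
    "ElevenLabs": Decimal("0.088"),
    "Deepgram": Decimal("0.08"),
}


def monthly_cost(rate_per_minute: Decimal, minutes: int) -> Decimal:
    """Cost of `minutes` of usage at a flat per-minute rate."""
    return rate_per_minute * minutes


def compare(minutes: int) -> dict[str, Decimal]:
    """Map each provider name to its cost for the given number of minutes."""
    return {
        name: monthly_cost(rate, minutes)
        for name, rate in RATES_PER_MINUTE.items()
    }


if __name__ == "__main__":
    # Hypothetical volume: 10,000 minutes of voice traffic per month.
    for name, cost in compare(10_000).items():
        print(f"{name}: ${cost:,.2f}")
```

At that assumed volume, the quoted rates work out to $500 for Grok against $800 to $1,000 for the others, provided the per-minute figures hold in practice.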
The API supports tool calling (web search, X search, custom functions via JSON schema), SIP telephony integration through providers like Twilio and Vonage, and automatic language detection across 100+ languages. Developers get multiple voice options including Ara, Eve, and Leo, with auditory cues like whispers and sighs for added realism. It's compatible with OpenAI's Realtime API spec, which should ease migration.
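As a rough illustration of what a custom function defined via JSON schema could look like, the sketch below builds a `session.update` event in the shape used by OpenAI's Realtime API. It assumes that the Grok API accepts the same event shape, which the compatibility claim suggests but the announcement does not spell out. The `get_order_status` function, its parameters, and the exact voice identifier string are hypothetical.

```python
import json


def build_session_update(voice: str = "Ara") -> dict:
    """Build a session.update event that registers one custom function tool.

    Assumption: the event layout follows OpenAI's Realtime API spec.
    The tool itself (get_order_status) is a made-up example.
    """
    order_status_tool = {
        "type": "function",
        "name": "get_order_status",  # hypothetical function name
        "description": "Look up the shipping status of a customer order.",
        # Arguments are described with a standard JSON Schema object.
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "Identifier of the order to look up.",
                },
            },
            "required": ["order_id"],
        },
    }
    return {
        "type": "session.update",
        "session": {
            "voice": voice,  # assumed identifier; check the provider's docs
            "tools": [order_status_tool],
        },
    }


if __name__ == "__main__":
    # Serialize the event as it would be sent over the realtime connection.
    print(json.dumps(build_session_update(), indent=2))
```

The payload is only constructed and printed here; sending it requires an authenticated realtime connection, whose endpoint and authentication details are not covered in the announcement.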
Standalone speech-to-text and text-to-speech endpoints are coming in the next few weeks.
The Bottom Line: xAI is positioning Grok Voice as the cheapest high-performance voice agent API available. The benchmark score comes from Artificial Analysis, but the latency figure and the price comparisons are xAI's own, and independent production testing will determine whether those claims hold.
QUICK FACTS
- Price: $0.05/min flat rate (connection time)
- Big Bench Audio score: 92.3% (Artificial Analysis verified)
- Time-to-first-audio: under 1 second (company-reported)
- Languages: 100+ with automatic detection
- Launch date: December 17, 2025
- Voices available: Ara, Eve, Leo, Sal, Rex, Mika, Valentin




