xAI Opens Grok Voice API to Developers

Flat $0.05/min pricing undercuts competitors by nearly half.

Andrés Martínez, AI Content Writer
December 19, 2025

Elon Musk's xAI released the Grok Voice Agent API on December 17, giving developers access to the same voice stack powering Grok's mobile apps and Tesla vehicles. The company built everything in-house: voice activity detection, audio tokenization, and the underlying speech models.

The pitch centers on speed and price. xAI claims sub-one-second time-to-first-audio, which it says makes Grok nearly five times faster than the next-best performer on Artificial Analysis's Big Bench Audio benchmark. That benchmark measures speech reasoning accuracy across 1,000 audio questions adapted from Big Bench Hard. Grok scored 92.3%, edging out Gemini 2.5 Flash Native Audio and OpenAI's GPT Realtime, according to Artificial Analysis.

Then there's cost. A flat $0.05 per minute of connection time, period. xAI's own comparison puts OpenAI's Realtime API at roughly $0.10/min (a token-based estimate that "typically exceeds" that in production), ElevenLabs at $0.088/min, and Deepgram at $0.08/min. Whether those comparisons hold up in real-world usage remains to be seen.
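To make the gap concrete, here is a rough sketch that applies the per-minute figures quoted above to a hypothetical workload of 10,000 connection minutes per month. The workload size and the function names are assumptions for illustration only, and the non-Grok figures are xAI's own estimates (OpenAI's pricing is token-based), so treat the output as a back-of-the-envelope comparison rather than a quote.

```python
# Per-minute rates in USD, taken from xAI's own comparison as quoted above.
RATES_PER_MIN = {
    "Grok Voice Agent API": 0.05,
    "OpenAI Realtime API (xAI estimate)": 0.10,
    "ElevenLabs": 0.088,
    "Deepgram": 0.08,
}


def monthly_cost(minutes: int, rate_per_min: float) -> float:
    """Cost in USD for `minutes` of connection time at a flat per-minute rate."""
    return round(minutes * rate_per_min, 2)


def compare(minutes: int) -> dict[str, float]:
    """Apply every quoted rate to the same number of minutes."""
    return {name: monthly_cost(minutes, rate) for name, rate in RATES_PER_MIN.items()}


if __name__ == "__main__":
    # 10,000 minutes per month is a hypothetical workload, not a figure from xAI.
    for name, cost in compare(10_000).items():
        print(f"{name}: ${cost:,.2f}")
```

At that volume the quoted rates work out to $500 for Grok against $800 to $1,000 for the others, which is where the "nearly half" framing comes from.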

The API supports tool calling (web search, X search, custom functions via JSON schema), SIP telephony integration through providers like Twilio and Vonage, and automatic language detection across 100+ languages. Developers get multiple voice options including Ara, Eve, and Leo, with auditory cues like whispers and sighs for added realism. It's compatible with OpenAI's Realtime API spec, which should ease migration.
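For a sense of what a custom function defined via JSON schema might look like, the sketch below builds a session configuration event in the shape used by OpenAI's Realtime API (a `session.update` event carrying a `tools` list). That Grok accepts exactly this shape is an assumption based on the stated compatibility, not something confirmed here. The `lookup_order` tool, its parameters, and the `build_session_update` helper are hypothetical.

```python
import json


def build_session_update(voice: str = "Ara") -> dict:
    """Build a session.update event in the OpenAI Realtime API style.

    Assumption: Grok's API accepts the same event shape. Check xAI's
    documentation before relying on any field name used here.
    """
    # Hypothetical custom function; the name and parameters are illustrative.
    lookup_order_tool = {
        "type": "function",
        "name": "lookup_order",
        "description": "Look up the status of a customer order by its ID.",
        "parameters": {  # Standard JSON Schema describing the arguments.
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The order identifier given by the caller.",
                },
            },
            "required": ["order_id"],
        },
    }
    return {
        "type": "session.update",
        "session": {
            "voice": voice,  # One of the voice names mentioned in the article.
            "tools": [lookup_order_tool],
        },
    }


if __name__ == "__main__":
    # Serialize to the JSON text a client would send over its connection.
    print(json.dumps(build_session_update("Eve"), indent=2))
```

The sketch stops at building the payload on purpose: the connection URL, authentication, and event handling depend on xAI's documentation and are left out.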

Standalone speech-to-text and text-to-speech endpoints are coming in the next few weeks.

The Bottom Line: xAI is positioning Grok Voice as the cheapest high-performance voice agent API available. The accuracy score comes from Artificial Analysis, but the latency figure is company-reported, and independent production testing will determine if the speed claims hold.


QUICK FACTS

  • Price: $0.05/min flat rate (connection time)
  • Big Bench Audio score: 92.3% (Artificial Analysis verified)
  • Time-to-first-audio: under 1 second (company-reported)
  • Languages: 100+ with automatic detection
  • Launch date: December 17, 2025
  • Voices available: Ara, Eve, Leo, Sal, Rex, Mika, Valentin