Inworld Launches TTS-1.5 with Sub-130ms Latency

Text-to-speech models claim #1 spot on Artificial Analysis leaderboard

Andrés Martínez, AI Content Writer
January 22, 2026

On January 21, Inworld AI launched TTS-1.5, a pair of text-to-speech models (Mini and Max) that the company says deliver the fastest realtime voice synthesis currently available. The announcement puts P90 latency at under 130ms for Mini and under 250ms for Max, which Inworld says is a 4x improvement over its previous generation.

The pricing is aggressive: $0.005 per minute for Mini, $0.01 for Max. That works out to $5-10 per million characters, which Inworld says is 25x cheaper than competitors. The company hasn't named which competitors, though ElevenLabs and OpenAI are the obvious targets. Independent verification of that cost comparison isn't available.
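The per-minute and per-character figures only line up under a particular speaking rate. As a rough sketch, the Python below assumes about 1,000 characters of text per minute of generated audio. That rate is inferred from the two prices quoted above, not stated by Inworld, and the function and constant names are illustrative only.

```python
# Per-minute prices in USD, as quoted in the article.
PRICE_PER_MINUTE = {"Mini": 0.005, "Max": 0.01}

# Assumption: roughly 1,000 characters of text per minute of audio.
# This rate is inferred from the article's numbers, not published by Inworld.
CHARS_PER_MINUTE = 1_000


def price_per_million_chars(price_per_minute, chars_per_minute=CHARS_PER_MINUTE):
    """Convert a per-minute audio price into a per-million-character price."""
    minutes_per_million_chars = 1_000_000 / chars_per_minute
    return price_per_minute * minutes_per_million_chars


for model, price in PRICE_PER_MINUTE.items():
    print(f"{model}: ${price_per_million_chars(price):.2f} per million characters")
```

Under that assumption the script prints $5.00 for Mini and $10.00 for Max, matching the $5-10 range. A slower or faster speaking rate would shift the per-character cost proportionally, so the comparison with per-character pricing from other vendors depends on it.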

Quality metrics come from Inworld's own testing: 40% lower word error rate and 30% more expressiveness than TTS-1. The models hold the top positions on the Artificial Analysis TTS leaderboard, though that ranking appears to reflect the earlier TTS-1 models rather than 1.5 specifically. Layercode CEO Damien Tanner called the results "unmatched voice realism at a fraction of the cost," though his company is an integration partner.

TTS-1.5 supports 15 languages, with on-premise deployment for enterprise customers. Inworld has also open-sourced its training framework.

The Bottom Line: Inworld is betting that latency and price will win the TTS market; whether the quality claims hold under independent testing remains to be seen.


QUICK FACTS

  • Mini latency: <130ms (P90), Max latency: <250ms (P90)
  • Pricing: $0.005/min (Mini), $0.01/min (Max)
  • 15 languages supported
  • 40% lower word error rate than TTS-1 (company-reported)
  • Available via API, with on-prem deployment option


Inworld Launches TTS-1.5 with Sub-130ms Latency | aiHola