Inworld Launches TTS-1.5 with Sub-130ms Latency

On January 21, Inworld AI launched TTS-1.5, a pair of text-to-speech models that the company says deliver the fastest realtime voice synthesis currently available. The announcement puts P90 latency at under 130ms for the Mini model and under 250ms for Max, which Inworld claims is a 4x improvement over its previous generation.

The pricing is aggressive: $0.005 per minute for Mini, $0.01 for Max. That works out to $5-10 per million characters, which Inworld says is 25x cheaper than competitors. The company hasn't named which competitors, though ElevenLabs and OpenAI are the obvious targets. Independent verification of that cost comparison isn't available.
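The per-minute and per-character figures only line up under a particular speaking rate. As a rough check, the sketch below assumes about 1,000 characters of text per minute of generated audio. That rate is inferred from the two price points quoted above and is not a number Inworld has published here; the function name is illustrative.

```python
# Rough check of the per-minute vs. per-million-character pricing.
# ASSUMPTION: ~1,000 characters of input text per minute of audio. This rate is
# implied by the article's figures, not stated by Inworld.
ASSUMED_CHARS_PER_MINUTE = 1_000


def cost_per_million_chars(price_per_minute: float,
                           chars_per_minute: int = ASSUMED_CHARS_PER_MINUTE) -> float:
    """Convert a per-minute audio price into a per-million-character price."""
    minutes_needed = 1_000_000 / chars_per_minute
    return price_per_minute * minutes_needed


if __name__ == "__main__":
    # Per-minute prices as quoted in the announcement.
    for model, price in {"Mini": 0.005, "Max": 0.01}.items():
        print(f"{model}: ${cost_per_million_chars(price):.2f} per million characters")
```

Under that assumption the quoted $5-10 range falls out directly. A slower or faster speaking rate would shift the per-character cost proportionally, which is one reason per-minute and per-character price comparisons across vendors are hard to verify.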

Quality metrics come from Inworld's own testing: a 40% lower word error rate and 30% more expressiveness than TTS-1. Inworld's models hold the top positions on the Artificial Analysis TTS leaderboard, but that ranking appears to reflect the earlier TTS-1 models rather than 1.5 specifically. Layercode CEO Damien Tanner called the results "unmatched voice realism at a fraction of the cost," though his company is an integration partner.

TTS-1.5 supports 15 languages, with on-premise deployment for enterprise customers. Inworld has also open-sourced its training framework.

The Bottom Line: Inworld is betting that latency and price will win the TTS market; whether the quality claims hold under independent testing remains to be seen.

QUICK FACTS

Mini latency: <130ms (P90), Max latency: <250ms (P90)
Pricing: $0.005/min (Mini), $0.01/min (Max)
15 languages supported
40% word error rate improvement (company-reported)
Available via API, with on-prem deployment option

Tags: Inworld AI, TTS, text-to-speech, voice AI, speech synthesis, AI models, realtime voice

Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.
