
Microsoft Launches MAI-Transcribe-1.5 Speech Model

Second-gen speech model covers 43 languages and adds domain-aware term recognition.

Andrés Martínez, AI Content Writer
June 3, 2026 (2 min read)

At Build 2026, Microsoft rolled out MAI-Transcribe-1.5, the second version of its in-house speech-to-text model. The model page lists a 2.4% word error rate (WER) on the Artificial Analysis benchmark and a 4.9% average across 43 languages on FLEURS. Both figures are self-reported.
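
For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below is a generic illustration of that definition, not the scoring code used by Artificial Analysis or FLEURS; the function name and sample sentences are made up for the example, and real benchmarks usually normalize case and punctuation before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one drug name split into two wrong words.
wer = word_error_rate(
    "the patient takes metformin daily",
    "the patient takes met forming daily",
)
print(f"{wer:.0%}")
```

On that scale, a 2.4% WER works out to roughly 24 word errors per 1,000 reference words.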

The big addition this round is contextual biasing, which Microsoft calls domain-aware transcription. It nudges the model toward proper names, medical terms, and other industry jargon that generic models tend to mangle. The first version, shipped in April, didn't have it.
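
Microsoft has not described how its biasing works, so the following is only a toy sketch of one well-known approach: rescoring a list of candidate transcripts and giving a bonus to any that contain a term from a user-supplied list. The function name, the scores, and the `bonus` value are all hypothetical and do not reflect the Azure Speech API.

```python
def rescore_with_bias(hypotheses, bias_phrases, bonus=2.0):
    """Return the best transcript after boosting candidates that contain bias terms.

    hypotheses: list of (text, score) pairs, where a higher score means the
    recognizer considered that transcript more likely.
    bias_phrases: domain terms the caller expects to hear.
    """
    def biased_score(item):
        text, score = item
        lowered = text.lower()
        # Count how many bias phrases appear verbatim in this candidate.
        hits = sum(1 for phrase in bias_phrases if phrase.lower() in lowered)
        return score + bonus * hits

    return max(hypotheses, key=biased_score)[0]


# Hypothetical candidates with made-up log-probability-style scores.
candidates = [
    ("the patient takes met forming daily", -4.1),
    ("the patient takes metformin daily", -5.0),
]

print(rescore_with_bias(candidates, bias_phrases=[]))
print(rescore_with_bias(candidates, bias_phrases=["metformin"]))
```

Without a term list, the higher-scoring but wrong candidate wins; with "metformin" supplied, the bonus lifts the correct transcript above it. Production systems typically apply this kind of bias inside the decoder rather than after the fact.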

Language coverage jumped from 25 to 43, adding Bulgarian, Greek, Ukrainian, and a batch of Indian languages including Bengali, Tamil, and Telugu. Pricing stays at $0.36 per hour of audio.
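
To put the price in context, here is a small cost estimate built on the $0.36 per hour figure. It assumes billing is prorated linearly by the second with no minimum charge or rounding increments, which the article does not confirm; the function name is made up for the example.

```python
from decimal import Decimal

PRICE_PER_AUDIO_HOUR = Decimal("0.36")  # USD, the rate quoted in the article


def transcription_cost(audio_seconds: int) -> Decimal:
    """Estimated cost in USD, assuming simple per-second proration (an assumption)."""
    hours = Decimal(audio_seconds) / Decimal(3600)
    return (PRICE_PER_AUDIO_HOUR * hours).quantize(Decimal("0.01"))


print(transcription_cost(90 * 60))       # a 90-minute recording
print(transcription_cost(1000 * 3600))   # 1,000 hours of audio
```

Under that assumption, a 90-minute recording costs $0.54 and 1,000 hours of audio costs $360.00.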

On speed, Microsoft's spec sheet lists a 5.7x latency figure rather than the throughput numbers in the original report. Independent testing tells a more mixed story. Artificial Analysis ranked the previous model, MAI-Transcribe-1, fourth overall on its accuracy leaderboard, where ElevenLabs Scribe v2 (2.2%) and Alibaba's Fun-Realtime-ASR-preview (1.8%) lead the field. The 1.5 release hasn't been independently scored there yet.

Microsoft says streaming support is coming soon but has not given a date. The model is available now in Microsoft Foundry through the Azure Speech API.


Bottom Line

MAI-Transcribe-1.5 now covers 43 languages and adds term biasing, at $0.36 per hour of audio.

Quick Facts

  • 43 languages supported, up from 25 in version 1
  • 2.4% WER on Artificial Analysis benchmark (company-reported)
  • 4.9% average WER across 43 languages on FLEURS (company-reported)
  • Pricing: $0.36 per hour of audio
  • Launched at Microsoft Build 2026; streaming support pending
Tags: Microsoft, speech-to-text, MAI models, transcription, Azure, ASR, Build 2026

Microsoft MAI-Transcribe-1.5 Adds 43 Languages, Term Biasing | aiHola