OpenBMB Releases VoxCPM 1.5 with Studio-Quality Audio Output

[Image: audio waveform visualization showing quality improvement from low to high fidelity]

OpenBMB dropped VoxCPM 1.5 on December 5, and the headline number is the sampling rate: 44.1kHz, up from 16kHz in the original release. That's CD-quality audio from a model you can run locally.

The efficiency gains matter more than they look. VoxCPM 1.5 encodes one second of audio in 6.25 tokens instead of 12.5. Halving the token rate doesn't just speed things up; it opens the door to longer audio generation without blowing up memory. The team notes that RTF (real-time factor, the ratio of synthesis time to audio duration) on an RTX 4090 stays around 0.17, unchanged from the previous release despite the quality bump. The model was trained on 1.8 million hours of Chinese and English audio, which shows in how well it handles context-aware prosody.
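To put those numbers in concrete terms, here is a back-of-the-envelope sketch in Python. It uses only the figures quoted above (12.5 and 6.25 tokens per second of audio, RTF of about 0.17) and assumes the usual definition of RTF as synthesis time divided by audio duration. The function names are illustrative and are not part of VoxCPM's API, and the timing is an estimate from the company-reported RTF, not a benchmark.

```python
# Figures quoted in the article; helper names are hypothetical, not VoxCPM APIs.
OLD_TOKENS_PER_SECOND = 12.5   # original VoxCPM release
NEW_TOKENS_PER_SECOND = 6.25   # VoxCPM 1.5
REPORTED_RTF = 0.17            # company-reported, RTX 4090


def tokens_for(duration_s: float, tokens_per_second: float) -> float:
    """Number of audio tokens needed to represent a clip of the given length."""
    return duration_s * tokens_per_second


def synthesis_time(duration_s: float, rtf: float = REPORTED_RTF) -> float:
    """Estimated seconds of compute, assuming RTF = synthesis time / audio duration."""
    return duration_s * rtf


if __name__ == "__main__":
    for duration_s in (10, 60, 600):
        old = tokens_for(duration_s, OLD_TOKENS_PER_SECOND)
        new = tokens_for(duration_s, NEW_TOKENS_PER_SECOND)
        est = synthesis_time(duration_s)
        print(f"{duration_s:>4}s audio: {old:g} -> {new:g} tokens, "
              f"~{est:.1f}s to synthesize at RTF {REPORTED_RTF}")
```

Under these assumptions, a one-minute clip drops from 750 tokens to 375, which is why the shorter sequences leave more headroom for long-form generation.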

New fine-tuning scripts for LoRA and full parameter training ship with this release. OpenBMB is clearly betting that developers want to customize voice cloning for specific use cases rather than treating TTS as a black box. The Apache 2.0 license stays intact.

VoxCPM-0.5B remains supported for anyone not ready to upgrade. Output quality still depends heavily on reference audio quality, per the release notes, so don't expect miracles from noisy clips.

The Bottom Line: A meaningful upgrade for open-source TTS, with the 44.1kHz sampling rate and halved token requirements addressing the two biggest complaints about VoxCPM's first release.

QUICK FACTS

Sampling rate: 44.1kHz (up from 16kHz)
Token rate: 6.25 tokens per second of audio (previously 12.5)
Training data: 1.8 million hours (bilingual Chinese/English)
RTF: 0.17 on RTX 4090 (company-reported)
License: Apache 2.0

Tags: VoxCPM, TTS, voice cloning, OpenBMB, speech synthesis, open source

Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.
