OpenAI Building Bidirectional Audio Model for Smoother Voice Chat

A new "BiDi" model would let ChatGPT adjust responses mid-sentence when users interrupt.

Andrés Martínez, AI Content Writer
March 6, 2026 · 2 min read
[Image: Abstract visualization of two overlapping audio waveforms representing bidirectional speech processing]

OpenAI is developing a new audio architecture internally called "BiDi" (bidirectional) that processes incoming speech continuously, letting the AI pivot its response on the fly when a user interrupts or changes direction. The Information first reported the effort, which aims to close what OpenAI sees as a stubborn gap between its voice and text systems.

Current ChatGPT voice mode locks into a response once it starts talking. BiDi would instead keep listening while speaking, so a user's mid-sentence correction ("actually, I meant exchange, not return") wouldn't derail the conversation. The model is also reportedly better at calling external tools and applications, a practical requirement for the customer-support scenarios OpenAI is targeting. According to Investing.com's coverage, the prototype still glitches after a few minutes of conversation, producing abnormal-sounding voices. OpenAI had originally aimed for a Q1 2026 release; the timeline has slipped to Q2 or later.

The hardware angle matters here. OpenAI is building an audio-first smart speaker with Jony Ive, priced around $200 to $300, with a reported launch no earlier than February 2027. BiDi is widely seen as the voice engine that device will need. Without a screen, natural conversation handling isn't a nice-to-have; it's the entire interface.

No pricing or API details for BiDi yet. OpenAI hasn't commented publicly.


Bottom Line

OpenAI's BiDi audio model, designed to handle real-time interruptions during voice conversations, has been delayed from Q1 to at least Q2 2026 due to prototype instability issues.

Quick Facts

  • Model name: BiDi (bidirectional)
  • Original target: Q1 2026; now pushed to Q2 or later
  • Prototype issue: glitches and abnormal-sounding voices after a few minutes of conversation (per Investing.com)
  • Connected hardware: Jony Ive smart speaker, $200-$300, earliest February 2027
  • Key capability: continuous audio processing with real-time response adjustment
Tags: OpenAI, voice AI, ChatGPT, audio model, BiDi, Jony Ive, smart speaker
Andrés Martínez

AI Content Writer

Andrés reports on the AI stories that matter right now. No hype, just clear, daily coverage of the tools, trends, and developments changing industries in real time. He makes the complex feel routine.


OpenAI's BiDi Audio Model Targets Real-Time Voice Chat | aiHola