AI Hardware

OpenAI Merges Teams to Fix Its Audio Problem Before the Hardware Launch

The company is racing to close the gap between its text and voice models before shipping a screenless device.

Liza Chan, AI & Emerging Tech Correspondent
January 2, 2026 · 4 min read
[Image: Abstract sound wave visualization representing voice-first AI technology]

OpenAI has quietly combined its engineering, product, and research teams over the past two months to overhaul its audio AI. The push, first reported by The Information, is aimed at a new model expected in Q1 2026, timed to support a voice-first personal device arriving roughly a year from now.

The effort is being led by Kundan Kumar, who worked on audio at Character.AI before joining OpenAI. He knows the space: he co-founded Lyrebird AI, which pioneered voice cloning before Descript acquired it in 2019. This is not some random PM getting thrown at the problem.

The actual problem

Here's what's interesting: OpenAI's current audio model, the one powering ChatGPT's voice features, lags behind its text models in both speed and accuracy. That shortfall has become a key focus, especially as OpenAI prepares to release its first line of voice-first devices.

The new model is supposed to sound more natural, handle interruptions mid-sentence, and even speak while you're talking. That last bit, the ability to actually overlap speech like a real conversation, is something current models can't do. They're all call-and-response. It's why talking to Siri still feels like leaving voicemails.

There's reportedly a new architecture involved, though details are thin. OpenAI's current real-time model uses transformers. Whether the company is pivoting away from that entirely or just building a new variant isn't clear.

The Jony Ive situation

This is all happening because OpenAI spent $6.5 billion to acquire io, Jony Ive's hardware startup, back in May. All-stock deal, so technically no cash changed hands, but still. The io team, about 55 people, is now merged with OpenAI and working on what they're calling a "family" of devices.

What those devices actually are? Still unclear. The company is reportedly grappling with the device's "personality," its approach to data privacy, and its computing infrastructure. But some clues have emerged. Court filings from a trademark dispute revealed that OpenAI and io executives have been testing at least 30 different headphone sets, researching what's on the market.

Ive said in November they'd reveal hardware in two years or less. Altman called the first prototypes "jaw-dropping good," which is what you'd expect him to say.

The graveyard

Worth remembering what's already failed in this space. Humane's AI Pin burned through hundreds of millions before HP bought the remains for $116 million in February, less than half what they'd raised. The devices stopped working entirely by month's end. Returns had outpaced sales by summer 2024.

The Friend AI pendant, a thing that records your life and offers companionship, has raised privacy concerns that probably won't go away.

And now there are AI rings coming. Sandbar just launched one. Eric Migicovsky, the guy who made Pebble smartwatches, is building another. Both expected in 2026. Allowing wearers to, as TechCrunch put it, "literally talk to the hand."

Everyone's betting on audio

OpenAI's model, slated for early 2026, is one bet among many in a broader industry push toward audio-first interfaces.

Meta has Ray-Ban glasses with a five-microphone array that helps isolate conversations in noisy rooms. Google's testing Audio Overviews that turn search results into spoken summaries. Tesla is putting Grok and other LLMs into cars for voice control.

The thesis across all of it: screens are out, audio is in. Every space becomes an interface: your home, your car, your face.

Ive's angle, apparently, is reducing device addiction; he sees audio-first design as a chance to fix past mistakes. Whether a company that wants you talking to AI all day counts as reducing addiction is a question nobody's asking.

What we don't know

The actual form factor of OpenAI's device remains a mystery. Altman told employees it would fit in a pocket or sit on a desk, fully aware of its surroundings. A "third device" alongside phone and laptop. Earlier rumors suggested something like smart glasses or screenless speakers.

Recent GSMArena reporting claims the first product might actually be a pen. A contextually aware pen. Make of that what you will.

The Q1 2026 model launch seems locked in. The hardware is probably late 2026 at earliest. OpenAI reportedly wants to ship 100 million units "faster than any company has ever shipped 100 million of something new before."

That's a very Sam Altman thing to say.

Tags: OpenAI, audio AI, voice interface, Jony Ive, AI hardware, ChatGPT
Liza Chan

AI & Emerging Tech Correspondent

Liza covers the rapidly evolving world of artificial intelligence, from breakthroughs in research labs to real-world applications reshaping industries. With a background in computer science and journalism, she translates complex technical developments into accessible insights for curious readers.


