OpenAI launches three audio models for developers

OpenAI launches three audio models for real-time voice tasks to enhance conversational software agents

Published

May 8, 2026

Harvey Vargas

In Short:
– OpenAI introduced three new audio models for developers to enhance voice software capabilities.
– The models support real-time tasks, including complex requests, translation, and live speech-to-text.

OpenAI has introduced three audio models on its developer platform to expand what voice software can do. The new models are meant to make agents more conversational and able to complete tasks in real time during live interactions.

New audio models

The models include GPT-Realtime-2, GPT-Realtime-Translate and GPT-Realtime-Whisper, available for testing in the developer playground.

GPT-Realtime-2 handles complex requests and maintains context in long voice sessions.

GPT-Realtime-Translate translates from more than 70 input languages into 13 output languages, which suits customer support and educational settings.

GPT-Realtime-Whisper provides live speech-to-text capabilities, generating captions and meeting notes during discussions.

Customers testing the models include online real estate marketplace Zillow, travel agency Priceline, and telecommunications firm Deutsche Telekom.

Pricing for GPT-Realtime-2 starts at $32 per million audio input tokens, while GPT-Realtime-Translate is $0.034 per minute and GPT-Realtime-Whisper is $0.017 per minute.
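To put those figures in perspective, here is a rough cost sketch in Python that uses only the prices quoted above. The function names and the 250,000-token figure are hypothetical, chosen for illustration. The article gives no conversion between minutes of audio and tokens, and no output-token price for GPT-Realtime-2, so the token count is taken as an input and only the input side is estimated.

```python
from decimal import Decimal

# Prices in USD, as quoted in the article.
REALTIME_2_PER_MILLION_INPUT_TOKENS = Decimal("32")
TRANSLATE_PER_MINUTE = Decimal("0.034")
WHISPER_PER_MINUTE = Decimal("0.017")


def per_minute_cost(minutes, rate_per_minute):
    """Cost of a per-minute-billed model for a given audio duration."""
    return Decimal(str(minutes)) * rate_per_minute


def audio_input_token_cost(tokens, rate_per_million=REALTIME_2_PER_MILLION_INPUT_TOKENS):
    """Cost of audio input tokens billed at a per-million-token rate."""
    return Decimal(tokens) * rate_per_million / Decimal(1_000_000)


if __name__ == "__main__":
    # One hour of live translation and one hour of live transcription.
    print("Translate, 60 min:", per_minute_cost(60, TRANSLATE_PER_MINUTE))
    print("Whisper, 60 min:", per_minute_cost(60, WHISPER_PER_MINUTE))
    # Hypothetical token count; the article does not say how many
    # tokens a minute of audio produces.
    print("Realtime-2, 250k input tokens:", audio_input_token_cost(250_000))
```

At the quoted rates, an hour of translation comes to about $2.04 and an hour of transcription to about $1.02, while 250,000 audio input tokens on GPT-Realtime-2 would cost $8.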

Introducing GPT-Realtime-2 in the API: our most intelligent voice model yet, bringing GPT-5-class reasoning to voice agents.

Voice agents are now real-time collaborators that can listen, reason, and solve complex problems as conversations unfold.

Now available in the API… pic.twitter.com/2DY1LU2vO8

— OpenAI (@OpenAI) May 7, 2026

Pricing details

The new models could affect industries that rely on live conversation, such as customer support and education, by making voice interactions faster and more capable.

The focus on real-time capabilities reflects the growing demand for integrated voice technology solutions.

Developers are encouraged to explore these tools to improve their voice-based applications.

The integration of these models could represent a significant step towards more intuitive software agents.
