OpenAI Launches New Voice Intelligence Features in Its API

OpenAI announced Thursday that its API will now include several new voice intelligence features designed to help developers create apps that can talk, transcribe, and translate conversations with users.

GPT-Realtime-2: Advanced Voice Model

  • Built with GPT-5-class reasoning to handle more complicated user requests
  • Creates realistic vocal simulations for user conversations
  • More sophisticated than its predecessor (GPT-Realtime-1.5)
  • Billed by token consumption

GPT-Realtime-Translate: Real-Time Translation

  • Translates speech in real time so that it can "keep pace" with users during a conversation
  • Supports 70+ input languages (languages it can comprehend)
  • Supports 13 output languages (languages it relays to the speaker)
  • Billed by the minute

GPT-Realtime-Whisper: Live Transcription

  • Transcribes speech to text live, as the conversation takes place
  • Billed by the minute

Key Applications

Primary Use Cases:

  • Customer service systems
  • Education platforms
  • Media and events
  • Creator platforms

"Together, the models we are launching move real-time audio from simple call-and-response toward voice interfaces that can actually do work: listen, reason, translate, transcribe, and take action as a conversation unfolds," OpenAI stated.

Safety Guardrails

OpenAI has implemented safeguards to prevent misuse:

  • Protection against spam and fraud
  • Embedded triggers that halt conversations which violate OpenAI's harmful-content guidelines
  • Built-in abuse prevention measures

Availability

All three new voice models are available through OpenAI's Realtime API.

Pricing Structure:

  • Translate and Whisper: Billed by the minute
  • GPT-Realtime-2: Billed by token consumption
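
To make the difference between the two billing models concrete, here is a minimal sketch of how a developer might estimate costs under each. The announcement summarized here does not state any prices, so the rates below are placeholders, and the `Usage` class and both helper functions are hypothetical names invented for illustration. They are not part of OpenAI's API. The sketch also assumes a single flat token rate, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Usage for one session. Hypothetical structure, for illustration only."""
    audio_minutes: float = 0.0
    tokens: int = 0


def per_minute_cost(usage: Usage, rate_per_minute: float) -> float:
    """Estimate cost under per-minute billing (the Translate and Whisper models)."""
    return usage.audio_minutes * rate_per_minute


def per_token_cost(usage: Usage, rate_per_million_tokens: float) -> float:
    """Estimate cost under per-token billing (GPT-Realtime-2).

    Assumes one flat rate for all tokens, which is a simplification.
    """
    return usage.tokens / 1_000_000 * rate_per_million_tokens


# Placeholder rates: NOT real prices. Substitute the published rates.
PLACEHOLDER_RATE_PER_MINUTE = 0.01
PLACEHOLDER_RATE_PER_MILLION_TOKENS = 10.0

if __name__ == "__main__":
    session = Usage(audio_minutes=12.5, tokens=48_000)
    print("per-minute estimate:", per_minute_cost(session, PLACEHOLDER_RATE_PER_MINUTE))
    print("per-token estimate:", per_token_cost(session, PLACEHOLDER_RATE_PER_MILLION_TOKENS))
```

Under per-minute billing, cost depends only on how long the session runs. Under per-token billing, it depends on how much is said and generated, so a long but quiet session and a short but dense one can cost very different amounts.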