OpenAI has recently launched two key API updates for developers worldwide, aimed at significantly improving how AI agents perform in voice interactions and complex task flows.

In terms of models, the new real-time model gpt-realtime-1.5 and its accompanying audio model have been officially released. Their core goal is to make voice commands more reliable. According to OpenAI's internal test data, the new model improves transcription accuracy for numbers and letters by about 10%, accuracy on audio reasoning tasks by 5%, and instruction-following accuracy by 7%, addressing long-standing issues where the AI mishears key phrases or deviates when executing complex voice commands.


In terms of architecture, the Responses API now supports the WebSocket protocol, marking a major shift in how clients communicate with the API. Unlike the previous request/response mode, in which the entire context had to be retransmitted with every call, WebSocket lets developers establish a persistent connection over which the system sends only incremental data as new information is generated.
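The bandwidth difference between the two modes is easy to illustrate with a short sketch. This is a conceptual simulation, not OpenAI's actual API: the function names and payload sizes are hypothetical, and it only models how much data a client would send in each mode.

```python
# Hypothetical illustration: bytes sent by a client under two modes.
# Stateless mode: every request must carry the full conversation context.
# Persistent (WebSocket-style) mode: only the new increment is sent.

def full_retransmit_cost(turn_sizes):
    """Stateless mode: each request resends the entire context so far."""
    total, context = 0, 0
    for size in turn_sizes:
        context += size          # context grows by the new turn
        total += context         # the whole context goes over the wire again
    return total

def incremental_cost(turn_sizes):
    """Persistent-connection mode: only the new turn is transmitted."""
    return sum(turn_sizes)

# Hypothetical payload sizes (bytes) for five conversation turns.
turns = [200, 150, 300, 250, 100]
print(full_retransmit_cost(turns))  # 3100 bytes
print(incremental_cost(turns))      # 1000 bytes
```

As the conversation grows, the stateless total grows roughly quadratically with the number of turns, while the incremental total grows only linearly, which is where the efficiency gain for long-running agents comes from.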

OpenAI noted that this improvement is particularly important for complex AI agents that make frequent calls to large numbers of tools, as it can directly increase their operating speed by 20% to 40%.