OpenAI has officially launched GPT-5.4, a new foundation model that the company aims to make the most capable and efficient frontier model for professional work available today. According to AIbase, the series follows a differentiated release strategy: alongside the standard version, OpenAI is shipping GPT-5.4 Thinking, a reasoning model focused on complex logic, and GPT-5.4 Pro, optimized for high-performance workloads.

On the technical side, the API version of GPT-5.4 offers a context window of up to 1 million tokens, the largest in OpenAI's history. The model is also markedly more token-efficient, solving comparable problems with fewer tokens.
On safety and accuracy, the new model's error rate for individual statements is 33% lower than GPT-5.2's, and its error rate for whole responses is 18% lower. In addition, to address the risk of "chain-of-thought deception" in reasoning models, OpenAI introduced a new safety evaluation. Tests show that GPT-5.4 Thinking is more transparent and has difficulty hiding or fabricating its reasoning process.
In benchmark tests, GPT-5.4 performed strongly: it set new records on the computer-use benchmarks OSWorld-Verified and WebArena Verified, and it scored 83% on GDPval, a test of knowledge-work tasks.
Mercor CEO Brendan Foody noted that the model also leads the APEX-Agents benchmark, which covers professional fields such as finance and law, and that it excels at long-horizon deliverables such as financial models and legal analysis. Paired with the new "tool search" system, the model calls external tools more efficiently, significantly reducing token overhead when a large number of tools are integrated.
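
The article does not describe how "tool search" works internally, so the following is only an illustrative sketch of the general idea under stated assumptions: rather than placing every tool definition in the prompt, a lookup step selects the few tools relevant to the request. The `Tool` class, the keyword-overlap scoring, the sample registry, and the word-count token estimate are all hypothetical and are not OpenAI's implementation or API.

```python
from dataclasses import dataclass

# Hypothetical stop-word list so that filler words do not count as matches.
STOPWORDS = {"a", "an", "the", "for", "to", "by", "this"}


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


def keywords(text: str) -> set[str]:
    """Lower-case the text and drop stop words."""
    return {word for word in text.lower().split() if word not in STOPWORDS}


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: one token per word."""
    return len(text.split())


def prompt_cost(tools: list[Tool]) -> int:
    """Estimated tokens needed to place these tool definitions in a prompt."""
    return sum(estimate_tokens(f"{t.name} {t.description}") for t in tools)


def search_tools(query: str, registry: list[Tool], top_k: int = 2) -> list[Tool]:
    """Return up to top_k tools whose descriptions share keywords with the query."""
    query_words = keywords(query)
    scored = []
    for tool in registry:
        overlap = len(query_words & keywords(tool.description))
        if overlap > 0:
            scored.append((overlap, tool))
    # Highest keyword overlap first; the sort is stable for ties.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:top_k]]


# Hypothetical registry; a large integration might hold hundreds of entries.
REGISTRY = [
    Tool("get_stock_price", "fetch the latest stock price for a ticker"),
    Tool("build_dcf_model", "build a discounted cash flow financial model"),
    Tool("search_case_law", "search legal case law by keyword"),
    Tool("send_email", "send an email message to a recipient"),
    Tool("convert_currency", "convert an amount between two currencies"),
]

query = "build a financial model for this company"
selected = search_tools(query, REGISTRY)

print("selected tools:", [tool.name for tool in selected])
print("tokens with all tools:", prompt_cost(REGISTRY))
print("tokens with selected tools:", prompt_cost(selected))
```

In this toy setup only the matching tool's definition is sent along with the request, so the cost of describing tools grows with the number of relevant tools rather than with the size of the whole registry. A production system would presumably use a stronger retrieval method than keyword overlap.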