OpenAI releases GPT-5.2-Codex: its strongest agentic coding model yet, capable of autonomously identifying vulnerabilities and submitting PRs

OpenAI has officially launched GPT-5.2-Codex, its most advanced agentic coding model to date and a notable step forward for AI in practical software engineering. The model is designed for complex, long-horizon, real-world coding tasks. It sets new records on established benchmarks and can operate autonomously across the whole workflow, from understanding code and setting up environments to discovering vulnerabilities and submitting pull requests.

GPT-5.2-Codex is more than an incremental update: it combines the general reasoning capabilities of GPT-5.2 with the terminal operation skills of GPT-5.1-Codex-Max, and it introduces a "context compression" technique that significantly improves efficiency and accuracy on tasks requiring very long context, such as code refactoring and cross-library migration.
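The article does not describe how "context compression" works internally, so the following is only a generic illustration of the broad idea: when a working history grows past a budget, older entries are collapsed into a short summary while the most recent ones are kept verbatim. The function name `compact_history`, the word-count budget, and the placeholder summarizer are all hypothetical and are not OpenAI's method.

```python
def compact_history(messages, budget, keep_recent=2, summarize=None):
    """Collapse older messages into one summary entry when over budget.

    Illustrative only: size is approximated by whitespace-separated
    word counts, and the default summarizer is a crude placeholder.
    """
    if keep_recent < 1:
        raise ValueError("keep_recent must be at least 1")

    def size(msgs):
        return sum(len(m.split()) for m in msgs)

    # Nothing to do if we are within budget or there is nothing older to fold.
    if size(messages) <= budget or len(messages) <= keep_recent:
        return list(messages)

    older, recent = messages[:-keep_recent], messages[-keep_recent:]

    if summarize is None:
        # Placeholder summarizer: keep the first five words of each old message.
        def summarize(msgs):
            heads = (" ".join(m.split()[:5]) for m in msgs)
            return "SUMMARY: " + " | ".join(heads)

    return [summarize(older)] + list(recent)


history = ["a b c d e f g", "h i j k l m n", "o p", "q r"]
compacted = compact_history(history, budget=5)
print(compacted)
```

In a real agent the summarizer would itself be a model call, and the budget would be measured in tokens rather than words; both are simplified here.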

For real developer workflows, the model is significantly more reliable when running natively on Windows 10/11, moving beyond the Linux-centric limitations of earlier versions. More notably, its visual understanding has taken a major leap: developers can upload UI screenshots, technical diagrams, or hand-drawn sketches, and Codex interprets the design intent and generates clear, executable front-end or full-stack prototype code, greatly shortening the cycle from design to production.

In benchmark evaluations, GPT-5.2-Codex set new records on SWE-Bench Pro (software engineering repair) and Terminal-Bench 2.0 (terminal operations), with significantly higher tool-calling success rates and factual consistency than its predecessors. It can now independently:

- Navigate large codebases

- Automatically write test cases

- Execute fuzz testing

- Generate security patches

- Create complete GitHub Pull Requests
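To make the "fuzz testing" item concrete, here is a minimal sketch of what such a step looks like in general: random inputs are fed to a target function, expected rejections are ignored, and anything else is recorded as a finding. The target `parse_length_prefixed` and the helper `fuzz` are hypothetical examples written for this illustration; they say nothing about how Codex itself runs fuzzing.

```python
import random


def parse_length_prefixed(data: bytes) -> bytes:
    """Hypothetical target: first byte is the payload length, the rest is payload."""
    if not data:
        raise ValueError("empty input")
    length = data[0]
    payload = data[1:1 + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return payload


def fuzz(target, iterations=1000, seed=0):
    """Feed random byte strings to `target` and collect unexpected exceptions."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    findings = []
    for _ in range(iterations):
        size = rng.randrange(0, 16)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            pass  # documented rejection of malformed input, not a bug
        except Exception as exc:
            findings.append((data, exc))  # anything else is worth a look
    return findings


findings = fuzz(parse_length_prefixed)
print(f"{len(findings)} unexpected failures")
```

Production fuzzers are coverage-guided and mutate a corpus of known inputs rather than sampling uniformly at random; this loop only shows the basic shape of the technique.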

This practical value has already been demonstrated in the security field. OpenAI disclosed that Andrew MacPherson, chief engineer at the security company Privy, used the previous generation of Codex models to reproduce and investigate in depth three previously unknown vulnerabilities in React Server Components. The AI agent assisted throughout, setting up the test environment, reasoning about attack surfaces, and running automated tests, which cut the vulnerability verification cycle from days to hours.

Powerful capabilities come with "dual-use" risks, so OpenAI has adopted a cautious deployment strategy: although GPT-5.2-Codex is not classified as a "high-risk" model, it ships with multiple built-in safeguards. The company has also launched a "Trusted Access Pilot" program that grants higher-privilege versions only to strictly vetted security researchers and critical infrastructure teams, for threat simulations and defense exercises in controlled environments.

All paid ChatGPT users can use GPT-5.2-Codex now, and API access will open gradually over the coming weeks. As AI becomes able not only to write code but also to understand business requirements, fix vulnerabilities, and collaborate on development, the programmer's role is shifting from "coder" to "AI commander", and GPT-5.2-Codex is the strongest enabler of that shift so far.
