OpenAI and chip giant Cerebras have announced a deep collaboration and officially launched GPT-5.3-Codex-Spark, a new model optimized for real-time development scenarios. As the first public result of the partnership, the model aims to eliminate the "waiting anxiety" that plagues AI-assisted programming.


The core strengths of Codex-Spark:

  • Ultra-fast inference: Powered by the Cerebras Wafer-Scale Engine, the model's inference speed exceeds 1,000 tokens/s. Code generation keeps pace with thought, delivering true real-time feedback.

  • Developer-driven: OpenAI notes that while today's "agentic coding" tools can work autonomously, they often leave developers feeling they have lost control. Codex-Spark is positioned as a "steerable collaborative tool": it excels at precise code edits and context-aware Q&A, keeping developers at the center of decision-making.

  • Compact yet powerful architecture: As a "high-capability small model" tuned for fast response, Codex-Spark significantly cuts task completion time on software engineering benchmarks such as SWE-Bench Pro, and its answer quality surpasses that of the earlier GPT-5.1-Codex-mini.
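To put the 1,000 tokens/s figure in perspective, a quick back-of-the-envelope calculation shows how long typical code completions would take to stream at that rate. This is a minimal sketch: the throughput comes from the article, while the completion sizes are illustrative assumptions, not published benchmarks.

```python
# Rough streaming-latency estimate at a constant decode rate.
# The 1000 tokens/s throughput is the figure cited in the article;
# the example completion sizes below are assumptions for illustration.

def generation_time(num_tokens: int, tokens_per_second: float = 1000.0) -> float:
    """Seconds needed to stream `num_tokens` at a constant decode rate."""
    return num_tokens / tokens_per_second

for label, tokens in [("small diff", 150), ("full function", 500), ("large file", 2000)]:
    print(f"{label}: ~{generation_time(tokens):.1f} s")
```

At this rate even a 2,000-token file lands in about two seconds, which is why the article frames the model as "real-time" rather than merely fast.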

Application scenarios and release scope:

The model is especially well suited to rapidly visualizing UI layouts, tuning styles, testing interface changes, and adjusting logic in complex codebases. Codex-Spark is currently available as a "research preview" to ChatGPT Pro users, across the dedicated apps, the command-line tool (CLI), and the VS Code extension.