Artificial intelligence company Anthropic has officially released its latest flagship model, Claude Opus 4.5. According to the company's announcement, the model is state of the art in key productivity scenarios such as coding, agentic workflows, and computer use. It also shows significant improvements in everyday tasks such as research, presentations, and spreadsheets.

Core capabilities: Coding, reasoning, and long-term task management

Significant improvement in software engineering capabilities

In real-world software engineering tests, Opus 4.5 scored very highly. Anthropic states that it can work out fixes for complex bugs spanning multiple systems without excessive guidance. On the Terminal Bench test, the model performed about 15% better than Sonnet 4.5. Developer feedback indicates that Opus 4.5 is particularly skilled at code migration and refactoring, and that its reasoning, while more elaborate, remains efficient.

Long-term work and automated agents

Opus 4.5 supports a long context window: according to the official page, it is 200K tokens. (Anthropic)

Anthropic has added an "effort parameter" to its developer platform, allowing developers to adjust how much computation the model spends on a task: they can lower the "thinking" intensity to save time and cost, or raise it to pursue the best possible output.

It also performs exceptionally well in multi-agent tasks. Anthropic's evaluation shows that the model has become better at coordinating sub-agents in complex agent systems, significantly improving the quality and efficiency of task completion.

Enhanced office and productivity tool capabilities

In the Claude application, long conversations no longer easily hit the context limit: the model automatically summarizes earlier content to maintain conversation continuity. The Chrome extension is now fully available to Max users; previously, it was available only as a trial. The Excel integration has also been updated: in internal evaluations, Opus 4.5 improved accuracy by about 20% and efficiency by about 15% in complex financial modeling and automation tasks.

In the desktop version of Claude Code, users can run multiple sessions in parallel (for example, one each for debugging, writing documentation, and testing agent tasks). Plan Mode has also been further enhanced: before execution, the model asks the user clarifying questions and proposes an editable plan file (such as plan.md). (Anthropic)

Performance and efficiency improvements: stronger, more efficient, and more flexible

Opus 4.5 performed excellently in multiple internal benchmark tests covering coding (SWE-bench), agent capabilities (τ²-bench), reasoning, mathematics, and vision. In terms of efficiency, the new model significantly reduces token usage. For example, in certain settings, adjusting the effort parameter lets Opus 4.5 use up to 76% fewer output tokens while matching or exceeding the performance of Sonnet 4.5.

Additionally, through context compaction and improved memory management, it can run more stably over long periods, making it suitable for large-scale, continuous agent workflows.
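To make the idea of context compaction concrete, here is a minimal sketch in Python. It is not Anthropic's implementation, which the announcement does not describe; the function names, the word-count "tokenizer", and the placeholder summarizer are all assumptions made for illustration. The sketch replaces older messages with a single summary once the history exceeds a token budget, while keeping the most recent messages verbatim.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def compact(
    messages: List[str],
    budget: int,
    keep_recent: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Return the history unchanged if it fits the budget; otherwise replace
    everything except the last `keep_recent` messages with one summary."""
    total = sum(count_tokens(m) for m in messages)
    if total <= budget or len(messages) <= keep_recent:
        return list(messages)
    split = len(messages) - keep_recent
    older, recent = messages[:split], messages[split:]
    return [summarize(older)] + recent


def naive_summary(older: List[str]) -> str:
    """Hypothetical summarizer; a real system would ask a model to summarize."""
    return f"[summary of {len(older)} earlier messages]"


history = ["one two three", "four five six", "seven eight", "nine ten"]
print(compact(history, budget=6, keep_recent=2, summarize=naive_summary))
```

In a real agent loop, the summarizer would itself be a model call, and the budget would be set comfortably below the model's context window.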

Safety: Dual enhancement of alignment and robustness

Anthropic states that Opus 4.5 is one of the most aligned and robust frontier models to date. It defends against malicious prompt injection attacks better than previous versions, and Anthropic claims it is harder to mislead than other frontier models in the industry. The safety assessment covers a wide range of active and passive risk paths. The complete evaluation results and methods are documented in the Opus 4.5 "system card".

Pricing, availability, and open platform

Price: The cost of calling Opus 4.5 via the Claude API is $5 per million input tokens and $25 per million output tokens.
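As a rough illustration of what these prices mean per request, the following Python sketch estimates cost from token counts. The helper name and the example token counts are hypothetical; only the two per-million-token prices come from the article, and the calculation assumes no discounts, caching, or batch pricing.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_PRICE_PER_MTOK = 5.00
OUTPUT_PRICE_PER_MTOK = 25.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_PRICE_PER_MTOK
        + output_tokens * OUTPUT_PRICE_PER_MTOK
    )
    return total / 1_000_000


# Example: a request with 200,000 input tokens and 10,000 output tokens.
cost = estimate_cost(200_000, 10_000)
print(f"${cost:.2f}")  # prints "$1.25"
```

Because output tokens cost five times as much as input tokens, any reduction in output length has a disproportionate effect on the bill.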

Availability: The model is already available in Anthropic's own applications and is open to developers via the API. It can also be used on three cloud platforms: Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.

Usage limit changes: For users who have access to Opus (such as Max and Team Premium subscribers), Anthropic has removed the previous Opus-specific usage caps and increased the overall usage quota so that the model can be used more widely in daily work.

Impact

  • Engineers and developers: Opus 4.5 excels in coding, debugging, refactoring, and large-scale multi-agent collaboration, which may significantly improve software development efficiency and reduce manual intervention.

  • Enterprises and office automation: By integrating tools like Excel and Chrome, enterprises can more easily embed AI into daily office processes, accelerating analysis and automation.

  • AI agent ecosystem: Stronger long-horizon reasoning and memory management could help bring complex, long-running agents into production (for example, process automation, customer service, and R&D assistants).

  • Safety and trustworthiness: Anthropic emphasizes the improvement in alignment and robustness against attacks, which helps to enhance trust in high-responsibility scenarios (such as enterprise and critical tasks).

Claude Opus 4.5 represents a major advance for Anthropic in both AI capabilities and safety. It not only demonstrates leading capabilities in coding and agent tasks, but also offers developers and enterprise users a more powerful productivity tool through higher efficiency, more flexible resource usage, and more robust alignment. With its broad availability on cloud platforms, Opus 4.5 is expected to become a cornerstone of next-generation AI-driven workflows.