Major Upgrade! Claude Opus 4.1 Debuts, Raising Its SWE-bench Verified Coding Score to 74.5%

Today, the AI company Anthropic officially released Claude Opus 4.1, an upgraded version of its flagship model, Claude Opus 4. The update aims to improve the model's performance on agentic tasks, real-world programming, and reasoning. The gains in programming and data analysis have drawn the most attention.

According to Anthropic, the biggest highlight of Claude Opus 4.1 is its improved programming performance. On the SWE-bench Verified coding evaluation it scored 74.5%, demonstrating strong capability on complex code problems. Feedback from GitHub supports this: Opus 4.1 is reported to outperform its predecessor on tasks such as multi-file code refactoring. In addition, the Japanese e-commerce giant Rakuten Group noted that the new model locates errors in large codebases more accurately, reducing unnecessary changes and the bugs they can introduce.

Beyond programming, Opus 4.1 has also made significant progress in in-depth research and data analysis, especially in detail tracking and agentic search. Benchmark results from Windsurf show that Opus 4.1 improves on Opus 4 by one standard deviation, an advance comparable to the jump from Sonnet 3.7 to Sonnet 4.

Although the upgrade brings significant performance gains, Anthropic emphasized that Opus 4.1 is an incremental improvement, not a revolutionary update. It will continue to be deployed under the **AI Safety Level 3 (ASL-3)** standard and proved robust across multiple safety assessments. The new model is slightly better at refusing illegal requests, with a harmless response rate of 98.76%. In tests covering child safety, political bias, and agentic capabilities, Opus 4.1's risk levels remain consistent with those of the previous version, and its rate of cooperation in extreme misuse scenarios has fallen by about 25%, indicating stronger safeguards.

Claude Opus 4.1 is now available to all paid Claude users and in Claude Code, as well as through the API, Amazon Bedrock, and Google Cloud Vertex AI, at the same price as Opus 4.

