MIT Startup OpenAGI Launches AI Agent, Claims to Surpass OpenAI and Anthropic

The MIT startup OpenAGI recently made its debut and announced Lux, an AI model that it says outperforms comparable products from OpenAI and Anthropic at computer operation tasks, at only one-tenth of their cost. OpenAGI's CEO, Qin Zengyi, said that Lux is a foundation model designed to carry out operations in desktop applications automatically by analyzing screenshots of the computer screen.

In the latest Online-Mind2Web benchmark test, Lux achieved a success rate of 83.6%, compared with 61.3% for OpenAI's Operator and 56.3% for Anthropic's Claude Computer Use. The size of this gap has raised expectations across the industry for Lux's technical capabilities.
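To put the reported gap in concrete terms, the short sketch below computes the difference in percentage points and the relative ratio between Lux and each competitor. It uses only the three success rates quoted above; the function name `gap_summary` and the dictionary layout are illustrative choices, not part of any OpenAGI tooling.

```python
# Success rates (%) on Online-Mind2Web as quoted in the article.
SCORES = {
    "Lux": 83.6,
    "OpenAI Operator": 61.3,
    "Anthropic Claude Computer Use": 56.3,
}


def gap_summary(scores, leader):
    """Compare every other entry against `leader`.

    Returns {name: {"points": percentage-point gap, "ratio": leader / other}}.
    `gap_summary` is a hypothetical helper written for this illustration.
    """
    top = scores[leader]
    summary = {}
    for name, rate in scores.items():
        if name == leader:
            continue
        summary[name] = {
            "points": round(top - rate, 1),  # absolute gap in percentage points
            "ratio": round(top / rate, 2),   # how many times higher the leader's rate is
        }
    return summary


if __name__ == "__main__":
    for name, gap in gap_summary(SCORES, "Lux").items():
        print(f"Lux vs {name}: +{gap['points']} points, {gap['ratio']}x")
```

By this arithmetic, Lux leads Operator by 22.3 percentage points and Claude Computer Use by 27.3 percentage points on the reported figures.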

Unlike traditional large language models, Lux uses an "agent-based pre-training" approach that focuses on learning from computer screenshots and action sequences. This self-reinforcing training cycle lets Lux improve its capabilities through continuous exploration and so carry out operations more efficiently.

OpenAGI also claims that Lux costs about one-tenth as much to run as the cutting-edge models from OpenAI and Anthropic, and that it is faster. Unlike competitors that can only handle browser tasks, Lux can fully control desktop applications, including Excel and Slack, which greatly expands its market potential. The company has also released a software development kit (SDK) that allows third-party developers to build applications on top of Lux.

In terms of security, OpenAGI designed built-in security mechanisms for Lux. When the model receives requests that may violate security policies, it will refuse to execute and alert the user. This feature is particularly important in the context of the rapid development of AI agents.

Dr. Qin Zengyi has a strong technical background and has participated in the development of several widely used AI models, a track record that highlights the potential of smaller teams to innovate in technology.

Key points:

- 🚀 The Lux AI agent introduced by OpenAGI achieved an 83.6% success rate on the Online-Mind2Web computer operation benchmark, well ahead of OpenAI's Operator (61.3%) and Anthropic's Claude Computer Use (56.3%).

- 💡 Lux uses a distinctive learning method, training on computer screenshots and action sequences in a self-reinforcing cycle.

- 🔒 Security mechanisms are built into Lux, enabling it to refuse requests that may violate security policies and to alert the user.
