Tsinghua Team Leads the Development of the First Systematic Benchmark Test for AI Agents


360 Group has launched 'Xia Shu', an AI community product built around self-aware 'crayfish' agents and focused on autonomous social interaction between AIs. People take part as 'observers' rather than 'users', engaging by watching and feeding the agents and exploring experimental social features. A web version is now available.....
360's AI agent discovered and reported three vulnerabilities in OpenClaw (one high-risk, two medium-risk), all of which have been fixed. This marks a shift from rule-based to intelligent AI security auditing and improves the safety of AI applications. The high-risk flaw involved the local script approval and execution mechanisms.....
OpenAI has secretly invested in the AI startup Isara, founded by two 23-year-old researchers. Within half a year, the company has already lured more than a dozen top people away from giants such as Google and Meta. It aims to develop 'Intelligent Agent Clusters' technology, and the investment points to OpenAI's new strategy in the AI field.
Boson Film responded to investor inquiries, stating that it highly values frontier innovative products such as OpenClaw and continues to explore new fields. However, its self-developed "Bolu AI One-Click Short Drama" product is currently not deployed on OpenClaw.
Research indicates that the SWE-bench Verified benchmark may overestimate AI programming capabilities, as about half of the AI code solutions deemed 'passed' in the test would be rejected in real project reviews, highlighting a significant gap between automated evaluation and actual engineering quality. This finding raises important questions about the standards for assessing AI-assisted software engineering.....