SALMONN Framework: Expanding General Auditory Capabilities of Large Language Models


Tsinghua University and partners released UltraRAG 2.1, the first open-source RAG framework built on the MCP architecture. It supports multi-stage reasoning and evaluation of multimodal retrieval systems through YAML configuration, with no coding required, lowering the barrier to entry for RAG development.
Alipay's "Lingguang" application has entered internal testing and supports login with a phone number or an Alipay account. Its core feature, the "AGI Camera," recognizes real-world scenes through the lens in real time, letting users take a picture, ask questions about it, and interact with the result, demonstrating the potential of multimodal AI applications.
Apple is hiring experts in reasoning models to address major LLM flaws, focusing on developing new architectures for enhanced reasoning, planning, tool use, and agent-based capabilities.
Adobe's AI Foundry offers custom AI models like Firefly for enterprises, enabling tailored solutions through collaborative retraining. It supports multi-concept and multimodal applications, surpassing single-concept limitations.
The new autonomous AI agent feature in Notion 3.0 has attracted significant attention. It is designed to help users automatically draft documents, update databases, and manage workflows. However, a recent report from the cybersecurity company CodeIntegrity revealed a critical security vulnerability in these AI agents: malicious files (such as PDFs) can be used to trick the agent into bypassing security measures and stealing sensitive data. CodeIntegrity attributes this vulnerability to