Apple Releases Open Source Multimodal Machine Learning Model 'Ferret'


At the Beijing Auto Show, Ruan Chong, a former core researcher on DeepSeek's multimodal technology, appeared as chief scientist of Yuanrong Qixing, signaling the company's shift in autonomous driving technology. CEO Zhou Guang said that multimodal large models achieved breakthroughs in early 2026, and that the large-model-based approach to autonomous driving holds significant advantages over earlier techniques.
Xiaomi released the MiMo-V2.5 series of large models on April 23 and opened public testing. The series comprises four models, with the two core models, MiMo-V2.5-Pro and MiMo-V2.5, open-sourced globally, underscoring Xiaomi's commitment to an open AI ecosystem. The update is not only a product iteration but a comprehensive upgrade of the underlying technology stack, with flagship performance supporting context lengths of up to one million tokens and complex task processing.
Xiaohongshu has open-sourced RelaX, a reinforcement learning training engine designed specifically for multimodal and agent scenarios. It supports unified processing of text, images, audio, and video, in line with the industry's shift toward multimodal agents.
ByteDance's Volcano Engine opened public API applications for the Seedance2.0 multimodal video generation model on April 2, moving it from limited testing to broader availability. The model accepts text, image, audio, and video inputs, and supports character consistency, director-level shot control, and physical simulation.
Ant Forest LingBot Technology has open-sourced LingBot-Depth-Dataset, a large-scale RGB-D depth dataset containing 3 million high-quality samples: 2 million collected from real scenes and 1 million rendered. Totaling 2.71 TB and covering 6 mainstream depth cameras, it is currently the largest real-scene RGB-D dataset in the open-source community, providing richer data support for embodied intelligence, spatial perception, and 3D vision.
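To give a sense of how RGB-D data of this kind is typically consumed, here is a minimal sketch of back-projecting a depth map into a 3D point cloud with a pinhole camera model. The intrinsics, depth scale, and image size below are illustrative assumptions, not values from LingBot-Depth-Dataset; real values would come from each camera's calibration metadata.

```python
import numpy as np

# Hypothetical pinhole intrinsics (focal lengths and principal point).
# A real pipeline would load these per camera; the dataset spans 6
# mainstream depth camera models, each with its own calibration.
FX, FY, CX, CY = 600.0, 600.0, 320.0, 240.0
DEPTH_SCALE = 0.001  # assume 16-bit depth stored in millimeters

def depth_to_pointcloud(depth_mm: np.ndarray) -> np.ndarray:
    """Back-project an HxW uint16 depth map into an Nx3 point cloud (meters)."""
    h, w = depth_mm.shape
    z = depth_mm.astype(np.float64) * DEPTH_SCALE
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = z > 0  # zero depth conventionally marks missing measurements
    x = (u[valid] - CX) * z[valid] / FX
    y = (v[valid] - CY) * z[valid] / FY
    return np.stack([x, y, z[valid]], axis=1)

# Synthetic 480x640 depth map: a flat wall 1.5 m away, with a small
# patch of missing (zero) readings, as real depth sensors produce.
depth = np.full((480, 640), 1500, dtype=np.uint16)
depth[:10, :10] = 0
points = depth_to_pointcloud(depth)
```

Invalid (zero-depth) pixels are dropped before back-projection, so the point count is the number of valid measurements rather than the full image resolution.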