Kunlun Tech: Multi-Modal Large Model Has Entered Experimental Training Phase


On June 29, 2025, the Alibaba International AI Team officially released its new multi-modal large model, **Ovis-U1**, marking another advance in multi-modal artificial intelligence. The latest model in the Ovis series, Ovis-U1 combines multi-modal understanding, image generation, and image editing in a single model, opening new possibilities for developers, researchers, and industry applications. This is a detailed report on Ovis-U1 by AIbase.
mPLUG-Owl3, the latest release from the Alibaba team, is a general-purpose multi-modal large model whose core capability is understanding long image sequences. By introducing a hyper attention module, mPLUG-Owl3 processes visual and language information efficiently, supporting in-depth understanding of multi-modal data such as images and videos. The model improves inference efficiency, image processing, and the use of multi-modal knowledge, particularly in video understanding, where it can "watch" a 2-hour movie in 4 seconds and accurately answer related questions.
Scientists from Harvard University and Google DeepMind have collaborated to create a virtual rat driven by an AI brain, a technical breakthrough that may also open up a new field called "virtual neuroscience". The research could help explain how the brain controls complex body movements and could have a profound impact on both neuroscience and robotics.
Researchers from the University of California, San Diego, and the Massachusetts Institute of Technology have developed Open-TeleVision, an open-source remote operation system that lets an operator control robots from 3,000 miles away and manipulate a variety of objects precisely, much like the high-tech scenes in the movie "Avatar". First, consider the adaptability of this system, which is truly unmatched. With any
According to Anthropic CEO Dario Amodei, the current cost of training AI models is as high as $10 billion, and it could rise to $100 billion or even $1 trillion in the next three years. This forecast has sparked concerns about whether the AI bubble is about to burst. Amodei points out that as AI models continue to evolve, hardware demand will also grow exponentially, becoming a major