Zhejiang University Alumni Collaborate with Microsoft to Launch Multimodal Model LLaVA, Challenging GPT-4V


Google has released Gemma 4 12B, a new multimodal model that departs from the traditional architecture by eliminating the separate encoder component, enabling efficient local deployment and inference on consumer-grade hardware. The design significantly reduces the computational complexity of multimodal models, improves processing speed, and marks a new stage for the open-source large-model ecosystem.
Google released the Gemma 4 12B multimodal model, which has 12 billion parameters and eliminates the traditional encoders, processing visual and audio data directly. The model requires only 16GB of VRAM and can run locally on high-end laptops without relying on cloud resources.
MiniMax (Xiyu Technology) has launched the '10x Team' global talent collaboration program, which aims to bring together top experts from various industries and combine their domain expertise with cutting-edge AI technology. The program is intended to promote the application of large models in vertical fields, extend productivity gains from general to specialized scenarios, and achieve a tenfold increase in industry efficiency. MiniMax is also opening up its core multimodal resources to verify the value of industry insights.
NVIDIA announced a significant expansion of its open-source model family at the GTC 2026 conference, with the Nemotron 3 series of multimodal models as the key release. Among them, Nemotron 3 Ultra is optimized for the Blackwell architecture, achieves a fivefold improvement in throughput efficiency, and is designed specifically for complex code assistance and enterprise workflows. The company also showcased its latest achievements in multimodal interaction, aiming to accelerate innovation in intelligent agents, physical AI, and healthcare.
Apple has launched the multimodal model Manzano, which uses an innovative dual-structure architecture to solve a long-standing problem in the AI field: the difficulty of balancing visual understanding with image generation.