ZhiYuan Research Institute Releases Emu2: A New Generation Generative Multimodal Foundation Model


ZhiYuan Research Institute has open-sourced JudgeLM, an evaluation model that can efficiently assess the outputs of various large models and assign scores. Compared with GPT-4, JudgeLM costs only about 1/120 as much, while its agreement with reference answers exceeds 90%, approaching human-level consistency. JudgeLM can be applied in a range of evaluation scenarios, including pure-text and multimodal settings, producing both scores and the reasoning behind them. The institute has also released the training and validation sample datasets to support in-depth research on large model evaluation.
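To make the judge-model workflow described above concrete, here is a minimal sketch of asking a locally loaded judge LLM to score two candidate answers and explain its scores, using Hugging Face transformers. The checkpoint path and the prompt template are illustrative placeholders, not JudgeLM's documented interface.

```python
# Minimal sketch of judge-style scoring with a causal LM. The model path and
# prompt format below are placeholders, not JudgeLM's actual interface.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "path/to/judge-model"  # placeholder: substitute a real judge checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

question = "Explain why the sky is blue."
answer_a = "Rayleigh scattering preferentially scatters shorter wavelengths."
answer_b = "The sky reflects the color of the ocean."

# Hypothetical judge prompt: ask the model to score both answers and justify.
prompt = (
    f"Question: {question}\n"
    f"Answer A: {answer_a}\n"
    f"Answer B: {answer_b}\n"
    "Rate each answer from 1 to 10 and briefly justify the scores.\n"
)

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=256)
# Print only the newly generated tokens (the judge's scores and rationale).
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```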
["ZhiYuan Research Institute has recently open-sourced the Uni3D model with 1 billion parameters, designed for general 3D vision tasks.", "The model can process point cloud data and has achieved breakthroughs in mainstream 3D vision tasks.", "Uni3D employs a unified Transformer architecture and introduces a multimodal alignment training method.", "The model has achieved state-of-the-art results across various 3D vision tasks.", "ZhiYuan Research Institute states that the open-source release of Uni3D will contribute to the future of 3D computing."]
ZhiYuan Research Institute has unveiled Aquila2-34B, a new open-source 34-billion-parameter bilingual model in the Wudao·Tianying series, which excels in reasoning, generalization, and other capabilities. The institute has also released a comprehensive open-source toolkit to promote collaborative innovation in large model research. Aquila2-34B surpasses other open-source foundation models in overall capability, and the ZhiYuan team developed the NLPE method to enhance the model's context-extension capabilities.
The ZhiYuan Research Institute has released MTP, the world's largest open-source training dataset of Chinese-English text pairs for semantic vector models, containing 300 million pairs. Drawn from multiple sources and covering types such as Q&A, comments, and news, the dataset provides an important foundation for training semantic vector models. The institute stated that such data plays a crucial role in training large models and will promote collaborative innovation in artificial intelligence; the release is also expected to ease the shortage of training datasets for Chinese models.
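As a rough illustration of how paired text data of this kind is commonly used, the sketch below trains a bi-encoder embedding model with in-batch negatives: matched pairs are pulled together while other pairs in the batch act as negatives. The multilingual checkpoint and the toy pairs are assumptions for illustration, not ZhiYuan's actual training recipe.

```python
# Minimal sketch of training a semantic vector model on text pairs with
# in-batch negatives. The checkpoint name and example pairs are placeholders.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-multilingual-cased"  # placeholder multilingual encoder

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def embed(texts):
    """Encode texts and mean-pool token states into one vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state          # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)         # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)          # masked mean pooling

# One toy batch of paired texts (e.g. a question and its answer or translation).
left = ["What is the capital of France?", "机器学习是什么？"]
right = ["Paris is the capital of France.",
         "Machine learning studies algorithms that learn from data."]

q = F.normalize(embed(left), dim=-1)
d = F.normalize(embed(right), dim=-1)
logits = q @ d.T / 0.05                # in-batch similarity matrix
labels = torch.arange(q.size(0))       # the i-th left text matches the i-th right text
loss = F.cross_entropy(logits, labels) # pull matched pairs together, push others apart
loss.backward()
```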
Israeli AI platform Wonderful has completed a 100 million USD Series A funding round, bringing its total funding to 134 million USD. Unlike thin GPT wrapper products, the platform has rapidly gained traction in the global enterprise market through deep integration and localized deployment, attracting top-tier venture capital firms and demonstrating strong potential for commercial application.