Xiaomi has officially entered the high-performance open-source large-model arena. The company recently released a new foundational language model, MiMo-V2-Flash, and simultaneously open-sourced its weights and inference code under the MIT license. Positioned around "ultra-fast speed and high efficiency," the model performs strongly on reasoning, code generation, and agent tasks. Test results show its response speed surpassing even popular domestic models such as Doubao, DeepSeek, and Yuanbao, drawing widespread attention from the developer community.

MiMo-V2-Flash adopts a sparse-activation architecture: of its 309 billion total parameters, only about 15 billion are activated per inference, significantly reducing computational cost while preserving strong capability. This design allows it to consistently rank among the top open-source models on public benchmarks while balancing performance and cost efficiency.
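The compute savings behind sparse activation can be illustrated with a back-of-envelope calculation. This is only a sketch using the two model-level figures the article reports (309B total, ~15B active); the internal expert layout of MiMo-V2-Flash is not disclosed here and is not assumed below.

```python
# Back-of-envelope view of sparse activation, using only the figures
# stated in the article. The per-layer routing details are unknown.

TOTAL_PARAMS = 309e9   # total parameters (from the article)
ACTIVE_PARAMS = 15e9   # parameters activated per inference (from the article)

def active_fraction(total: float, active: float) -> float:
    """Fraction of the model's weights doing work on a single inference."""
    return active / total

# Roughly 4.9% of the weights are touched per inference; the remaining
# ~95% sit idle for that pass, which is where the cost advantage over a
# dense 309B model comes from.
fraction = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
```

In other words, the model carries the capacity of a 309B network but pays roughly the per-token compute of a ~15B one.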
User feedback describes MiMo-V2-Flash's response speed as "unbelievably fast": under the same hardware conditions, its generation latency is markedly lower than that of competitors like DeepSeek, with the advantage most pronounced in multi-turn dialogue and complex logical-reasoning scenarios. One developer commented: "It's not just slightly faster, it's an order of magnitude faster."

To accelerate ecosystem adoption, Xiaomi also launched highly competitive API pricing: just $0.1 per million input tokens and $0.3 per million output tokens, along with a limited-time free trial. This is far below mainstream commercial models, offering a cost-effective alternative for small and medium enterprises and independent developers.
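At those rates, the per-request cost is easy to estimate. The helper below is a hypothetical illustration (the function and the example token counts are not from Xiaomi's documentation); only the two prices come from the article.

```python
# Cost estimate at the quoted rates:
# $0.1 per million input tokens, $0.3 per million output tokens.

INPUT_PRICE = 0.1 / 1_000_000   # USD per input token (from the article)
OUTPUT_PRICE = 0.3 / 1_000_000  # USD per output token (from the article)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: USD cost of one API call at the quoted rates."""
    return input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE

# Example: a chat turn with a 2,000-token prompt and an 800-token reply
# costs 2000 * 0.1/1e6 + 800 * 0.3/1e6 = $0.00044, well under a tenth
# of a cent per exchange.
cost = request_cost(2_000, 800)
```

Even a workload of a million such exchanges per month would land in the hundreds of dollars, which is the point of the pricing.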
Notably, MiMo-V2-Flash is not aimed only at developers; its general capabilities also suit everyday AI-assistant scenarios, allowing it to integrate into Xiaomi's "people, cars, and homes" full-ecosystem terminals such as smartphones, smart-home devices, and in-car systems. On the day of the release, Xiaomi is also holding its "All Ecosystem Partner Conference," which is expected to disclose detailed plans for MiMo-V2-Flash in edge-cloud collaboration, on-device deployment, and multimodal expansion.

In today's fiercely competitive domestic large-model market, Xiaomi has entered with a combination of "high performance, true open source, and low barriers," demonstrating its long-term commitment to its AI strategy while potentially resetting performance and cost expectations for open-source models. When a 309-billion-parameter model can also be "lightning fast," the large-model arena has gained another formidable contender.
