Xiaomi has announced the open-sourcing of the full real-world post-training pipeline for its vision-language-action (VLA) large model, Xiaomi-Robotics-0. The move marks an important step for Xiaomi in embodied intelligence, aiming to let robots master complex manipulation skills quickly and with minimal data.
20 Hours of Data to Master a "Needle-Threading" Task
Starting from a pre-trained foundation model, the research team used only about 20 hours of task data for real-robot post-training, enabling the robot to perform the high-difficulty action of accurately inserting earphones into their charging case. The task demands very precise spatial perception, and the smooth, low-friction surfaces of the earphones and case cause them to slip and shift during insertion.
The model must align parts within sub-millimeter tolerances and correct deviations in real time as they arise. This smooth, continuous execution demonstrates the potential of Xiaomi-Robotics-0 for high-precision assembly tasks.
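To make the idea of "post-training on a small set of demonstrations" concrete, here is a deliberately toy sketch, not Xiaomi's actual code or API: post-training in this setting typically means supervised fine-tuning (behavior cloning) of a pretrained policy on observation-action pairs collected from task demonstrations. All names and the linear policy are illustrative assumptions.

```python
import numpy as np

# Hypothetical illustration of behavior-cloning post-training.
# A toy "pretrained" linear policy maps observations to actions: a = W @ o.
rng = np.random.default_rng(0)
obs_dim, act_dim = 8, 4
W = rng.normal(scale=0.1, size=(act_dim, obs_dim))  # stand-in for pretrained weights

# A small demonstration dataset, standing in for ~20 hours of task data.
demos_obs = rng.normal(size=(256, obs_dim))
W_expert = rng.normal(size=(act_dim, obs_dim))      # unknown expert behavior
demos_act = demos_obs @ W_expert.T

def mse(W):
    """Mean squared error between the policy's actions and the demos."""
    pred = demos_obs @ W.T
    return float(np.mean((pred - demos_act) ** 2))

loss_before = mse(W)
lr = 0.01
for _ in range(500):
    # Gradient of the MSE loss with respect to W, then a gradient step.
    pred = demos_obs @ W.T
    grad = 2.0 / len(demos_obs) * (pred - demos_act).T @ demos_obs
    W -= lr * grad
loss_after = mse(W)
```

A real VLA policy is of course a large transformer rather than a linear map, but the structure of the post-training loop (demonstration dataset, imitation loss, gradient updates from pretrained weights) is the same.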

Open Source Ecosystem Drives Productivity Evolution
To make the model a genuinely ready-to-use tool, Xiaomi released not only the model weights but also the technical report and source code. This end-to-end open-source release substantially lowers the barrier to entry for developers working on embodied intelligence.
The model has already ranked among the most-downloaded on major international model-hosting platforms. With the post-training pipeline now public, developers worldwide can jointly refine robots' perception and execution logic, accelerating the adoption of AI robots in production and daily life.
Project Website: https://robotics.xiaomi.com/xiaomi-robotics-0.html
Open Source Code: https://github.com/XiaomiRobotics/Xiaomi-Robotics-0
