Xiaomi has officially launched Xiaomi Miloco (Xiaomi Local Copilot), its exploratory project for the future of the smart home. Miloco integrates large-model technology into the whole-home smart system, aiming to move beyond the limits of traditional preset rules and to offer more intelligent human-computer interaction through natural language and scene understanding.
Unlike earlier home automation built on fixed scenarios, Miloco lets users express complex needs in everyday language. For example, given a request such as "Turn on the desk lamp and music while reading", the system can infer the intent and adjust the state of the relevant devices. At its core is Xiaomi's self-developed on-device vision-language model, Xiaomi MiMo-VL-Miloco-7B, which works with real-time video for dynamic perception, understanding, and response.

To make the system more secure and trustworthy, Miloco processes all visual data locally, so that the data stays on the device and user privacy is protected. The solution also supports collaboration across ecosystems and device brands, further breaking down the barriers between smart home devices.
Miloco adopts a four-layer open architecture
