Zhiyuan Robotics (AgiBot) has open-sourced GO-1 (Genie Operator-1), its general-purpose embodied foundation model, which the company describes as the world's first embodied-intelligence model built on the Vision-Language-Latent-Action (ViLLA) architecture. The release is intended to lower the technical barrier to embodied intelligence so that more developers can build on and apply the technology. It follows the open-sourcing in January this year of AgiBot World, a million-scale real-robot dataset for embodied intelligence.

At the core of GO-1 is the ViLLA architecture, which is designed to help robots better understand human intent and act more precisely. Unlike traditional Vision-Language-Action (VLA) architectures, ViLLA introduces latent action tokens that bridge image and text inputs and the robot's actions. The architecture has three layers. The first is a VLM multimodal understanding layer built on InternVL-2B, which processes several kinds of input, including visual, force, and language signals. The second is the Latent Planner, which provides high-level understanding of complex tasks. The third is the Action Expert, which uses a diffusion model to generate continuous, high-precision action sequences so that the robot can carry out complex manipulation tasks.
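To make the three-layer flow concrete, here is a minimal toy sketch in Python. It is not GO-1's implementation: every name in it (`Observation`, `vlm_encode`, `latent_planner`, `action_expert`) is hypothetical, the "VLM" is a hand-written feature function, and the "diffusion" step is replaced by a simple iterative denoising loop that pulls random noise toward a target derived from the latent tokens. It only illustrates the order in which data moves: observation, then features, then latent action tokens, then a continuous action sequence.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    image: List[float]   # stand-in for camera pixels
    instruction: str     # natural-language command


def vlm_encode(obs: Observation, dim: int = 4) -> List[float]:
    """Layer 1 (toy): fuse image and text into one feature vector."""
    text_feat = [float(len(word)) for word in obs.instruction.split()][:dim]
    text_feat += [0.0] * (dim - len(text_feat))          # pad to fixed size
    img_mean = sum(obs.image) / len(obs.image) if obs.image else 0.0
    return [t + img_mean for t in text_feat]


def latent_planner(features: List[float], num_tokens: int = 3,
                   codebook_size: int = 8) -> List[int]:
    """Layer 2 (toy): map features to discrete latent action tokens."""
    return [int(abs(f) * (i + 1)) % codebook_size
            for i, f in enumerate(features[:num_tokens])]


def action_expert(latent_tokens: List[int], horizon: int = 4,
                  action_dim: int = 2, steps: int = 20,
                  seed: int = 0) -> List[List[float]]:
    """Layer 3 (toy): refine noise into a continuous action sequence.

    A real diffusion model learns its denoiser; here the target is an
    assumed, hand-made function of the latent tokens.
    """
    rng = random.Random(seed)
    cond = sum(latent_tokens) / (len(latent_tokens) or 1) / 10.0
    target = [[cond + 0.1 * t] * action_dim for t in range(horizon)]
    actions = [[rng.gauss(0.0, 1.0) for _ in range(action_dim)]
               for _ in range(horizon)]
    for _ in range(steps):
        # Each step removes half of the remaining gap to the target.
        actions = [[a + 0.5 * (g - a) for a, g in zip(row, goal)]
                   for row, goal in zip(actions, target)]
    return actions


def run_pipeline(obs: Observation) -> List[List[float]]:
    """Chain the three toy layers in the order the article describes."""
    features = vlm_encode(obs)
    tokens = latent_planner(features)
    return action_expert(tokens)


if __name__ == "__main__":
    demo = Observation(image=[0.1, 0.2, 0.3], instruction="pick up the cup")
    for step in run_pipeline(demo):
        print([round(v, 3) for v in step])
```

The point of the intermediate token stage is that the planner commits to a compact, discrete plan before any continuous motion is generated, which is the role the article attributes to latent action tokens.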
Zhiyuan Robotics has also launched the Genie Studio development platform, an end-to-end solution for developers that covers data collection, model training, simulation evaluation, and more. The platform integrates the GO-1 model and also provides video training solutions and a unified training framework, which the company says improves development efficiency and speeds the deployment of embodied-intelligence technology.
Although GO-1 was pre-trained on data from the AgiBot G1 robot, it has been tested and validated on multiple robot platforms, which indicates good portability. According to the company, the model also performs well on several mainstream simulation platforms, demonstrating its ability to adapt to different robots.
Zhiyuan Robotics encourages developers to visit the GitHub repository, download the GO-1 model, and start building embodied-intelligence applications. Whether you are an experienced AI researcher or a beginner, GO-1 is meant to give you a solid technical foundation.
GitHub:
https://github.com/OpenDriveLab/AgiBot-World
Hugging Face:
https://huggingface.co/agibot-world/GO-1
Key Points:
🌟 The world's first open-source ViLLA architecture model GO-1 is officially released.
🔧 The Genie Studio development platform offers developers an end-to-end workflow, from data collection to simulation evaluation.
🤖 The GO-1 model has been tested on multiple platforms, showing good portability.
