On Monday, chip giant NVIDIA announced new infrastructure and artificial intelligence models at the NeurIPS AI conference in San Diego, California. The move aims to accelerate development of the foundational technologies behind Physical AI, the field of robots and autonomous vehicles that can perceive and interact with the real world.
The First Reasoning Vision-Language-Action Model for Autonomous Driving Debuts
NVIDIA released Alpamayo-R1, an open-source reasoning vision-language-action (VLA) model designed for autonomous driving research. The company claims it is the first reasoning VLA model focused on autonomous driving. Models of this kind process text and images together, allowing a vehicle to "see" its surroundings and make driving decisions based on that perceptual information.
Alpamayo-R1 is built on NVIDIA's Cosmos Reason, a reasoning model that can "think" through a problem before responding. NVIDIA stated that technologies like Alpamayo-R1 are essential for companies pursuing Level 4 autonomy, and it hopes such reasoning models will give autonomous vehicles the "common sense" to handle complex driving decisions more like human drivers.
The new model is now available on GitHub and Hugging Face.
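For readers who want to experiment, the sketch below shows the generic Hugging Face pattern for prompting a vision-language model with an image and a driving question. It is illustrative only: the repository identifier and the specific processor/model classes are assumptions, not confirmed details of the Alpamayo-R1 release.

```python
# Illustrative sketch only: the repo id and model classes below are
# assumptions, not confirmed details of how Alpamayo-R1 is packaged.
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

MODEL_ID = "nvidia/Alpamayo-R1"  # hypothetical Hugging Face repo id

# The processor handles both image and text inputs; the model generates text.
processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForVision2Seq.from_pretrained(MODEL_ID, trust_remote_code=True)

# A dashcam-style frame plus a driving question, mirroring the
# "see the scene, then reason about it" workflow described above.
frame = Image.open("dashcam_frame.jpg")
prompt = "A pedestrian is near the crosswalk ahead. What should the vehicle do next?"

inputs = processor(images=frame, text=prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```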

Cosmos Cookbook: Accelerating Developer Application Deployment
In addition to the new model, NVIDIA launched the Cosmos Cookbook, a set of step-by-step guides, inference resources, and post-training workflows on GitHub. The guides cover data preparation, synthetic data generation, and model evaluation, helping developers use and post-train Cosmos models for their specific applications.
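To give a sense of what the data-preparation step in such a workflow might look like, the sketch below pairs a folder of driving frames with human-written scene descriptions and writes them out as JSONL prompt/response records for post-training. The directory layout and field names are assumptions for illustration, not the Cookbook's actual schema.

```python
# Hypothetical data-preparation sketch: the directory layout and JSONL field
# names are illustrative assumptions, not the Cosmos Cookbook's actual schema.
import json
from pathlib import Path

FRAMES_DIR = Path("clips/frames")      # one .jpg per sampled video frame
CAPTIONS_DIR = Path("clips/captions")  # matching .txt with a scene description
OUTPUT_FILE = Path("post_training.jsonl")

records = []
for frame_path in sorted(FRAMES_DIR.glob("*.jpg")):
    caption_path = CAPTIONS_DIR / (frame_path.stem + ".txt")
    if not caption_path.exists():
        continue  # skip frames without a paired description
    records.append({
        "image": str(frame_path),
        "prompt": "Describe the driving scene and the safest next maneuver.",
        "response": caption_path.read_text().strip(),
    })

with OUTPUT_FILE.open("w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")

print(f"Wrote {len(records)} training examples to {OUTPUT_FILE}")
```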
Focusing on the Next Wave of AI: Physical AI
This release comes as NVIDIA pushes hard into Physical AI, which it views as a new application area for its advanced AI GPUs.
