In the world of technology, Google DeepMind has dropped another bombshell: they've put a system built on Gemini 1.5 Pro onto a robot. This isn't just an upgrade; it gives the robot the superpower of remembering and navigating its surroundings, essentially an all-seeing eye for robots.

Imagine a robot operating across an area of nearly 9,000 square feet, completing 57 different navigation tasks with a success rate of around 90%. And these aren't simple tasks. Asked to find "a place to draw," the robot not only understood the request but led the person straight to a large whiteboard, handling the exchange about as reliably as a human guide would.

The amazing part of this system is its multimodal long context window: the robot can not only remember key locations but also interpret human instructions and video walkthroughs, and even apply common-sense reasoning. In the whiteboard example, it didn't just parse "the place to draw"; it inferred that a spot with a large whiteboard would satisfy the request.

Moreover, these robots had already familiarized themselves with the office environment in earlier sessions, learning its spatial layout through "multimodal instruction navigation demonstrations." The DeepMind team also uses a hierarchical vision-language-action (VLA) approach that lets the robots follow written instructions, sketches, and gesture-based commands.
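The hierarchical idea described above can be illustrated with a toy sketch: a high-level step matches a free-form instruction to a location seen during the demonstration tour, and a low-level step plans a route to it over a topological graph built from that tour. Everything here is illustrative, not DeepMind's actual code: the graph, the frame descriptions, and the word-overlap "model" are stand-ins for a real long-context VLM and navigation policy.

```python
from collections import deque

# Topological map from a pretend office tour: frames as nodes,
# edges between physically adjacent frames (all names are made up).
TOUR_GRAPH = {
    "lobby": ["hallway"],
    "hallway": ["lobby", "kitchen", "whiteboard_area"],
    "kitchen": ["hallway"],
    "whiteboard_area": ["hallway"],
}

# Stand-in for what a VLM would "remember" about each tour frame.
FRAME_DESCRIPTIONS = {
    "lobby": "entrance with couches",
    "kitchen": "coffee machine and snacks",
    "whiteboard_area": "large whiteboard for drawing",
}

def pick_goal_frame(instruction):
    """High-level step (toy VLM): pick the tour frame whose description
    shares the most words with the user's instruction."""
    words = set(instruction.lower().split())
    return max(FRAME_DESCRIPTIONS,
               key=lambda f: len(words & set(FRAME_DESCRIPTIONS[f].split())))

def plan_path(start, goal):
    """Low-level step: shortest path over the topological graph via BFS."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in TOUR_GRAPH[path[-1]]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

goal = pick_goal_frame("take me somewhere i can draw on a whiteboard")
print(plan_path("lobby", goal))  # ['lobby', 'hallway', 'whiteboard_area']
```

The split mirrors the article's description: the expensive reasoning (understanding "a place to draw") happens once at the top level, while the navigation itself runs on a cheap, precomputed map of the environment.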

At its core, the system lets robots move freely through complex spaces without constant human guidance. They can remember the environment, understand instructions, and complete tasks on their own, which makes them far more flexible and useful in practical applications.

In summary, Google DeepMind's technology not only makes robots smarter but also enables them to better serve humans in the real world. It's as if a new door has been opened for robots, letting them enter our lives as partners in work and in exploring the world. Perhaps in the future, robots will no longer be cold machines but intelligent companions in our daily lives.