Google DeepMind and Boston Dynamics have joined forces to give Spot, the well-known quadruped robot dog, a significant intelligence upgrade. By incorporating the Gemini Robotics-ER 1.6 model, the companies are moving Spot beyond traditional pre-programmed scripts toward a more sophisticated form of 'embodied reasoning.' This evolution allows Spot to autonomously assess complex environments, such as disorganized rooms or intricate factory floors, and determine the appropriate actions to take.
The new model is described by Google DeepMind as a 'reasoning-first' system, enabling the robot to navigate through various facilities while interpreting physical data. For example, Spot can now read analog gauges, effectively connecting digital artificial intelligence with physical actions in real-world scenarios.
One of the standout features of this upgrade is the introduction of 'agentic vision.' Unlike previous systems that worked only from flat, static images, Spot can now zoom in on fine details, run code-based estimations for measurements, and apply contextual knowledge to interpret its surroundings. This is particularly useful for industrial inspections, where Spot can accurately read gauges or verify the status of chemical sight glasses.
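To make the idea concrete, here is a minimal sketch of what a zoom-and-measure step could look like. The function names, bounding box, and gauge calibration values are illustrative assumptions, not an actual Boston Dynamics or Google interface:

```python
# Minimal sketch of an "agentic vision" inspection step: crop into a
# region of interest for a closer look, then turn a perceived needle
# angle into a numeric reading with a small piece of measurement code.
# All names, coordinates, and calibration values are hypothetical.
from PIL import Image

def zoom_on_region(frame: Image.Image, box: tuple[int, int, int, int]) -> Image.Image:
    """Crop the camera frame to the gauge face so finer detail is visible."""
    return frame.crop(box)

def needle_angle_to_reading(angle_deg: float,
                            min_angle: float = -135.0, max_angle: float = 135.0,
                            min_value: float = 0.0, max_value: float = 10.0) -> float:
    """Linearly interpolate a needle angle into a gauge value.

    The sweep range and value range are per-gauge calibration data;
    the defaults here are placeholders.
    """
    fraction = (angle_deg - min_angle) / (max_angle - min_angle)
    return min_value + fraction * (max_value - min_value)

frame = Image.open("spot_camera_frame.jpg")          # hypothetical saved frame
gauge = zoom_on_region(frame, (420, 310, 640, 530))  # placeholder bounding box
print(f"Estimated reading: {needle_angle_to_reading(72.5):.2f} bar")
```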
Marco da Silva, the Vice President and General Manager of Spot at Boston Dynamics, stated, 'Capabilities like instrument reading and more reliable task reasoning will enable Spot to see, understand, and react to real-world challenges completely autonomously.' This advancement not only enhances Spot's industrial utility but also opens up possibilities for performing more human-like tasks.
Human-like Tasks and Safety Measures
In a recent demonstration, Spot showcased its ability to execute tasks that resemble human actions, such as reading a handwritten to-do list, organizing shoes, and even walking a dog on a leash. These capabilities illustrate the robot's growing versatility beyond its heavy industrial applications, which have primarily focused on detecting gas leaks, counting pallets, and identifying puddles.
To ensure the safety of its operations, Google has integrated a safety benchmark known as ASIMOV, designed to catch potentially dangerous errors, such as placing objects too close to edges, before the robot commits them. This focus on safety is paramount as robots become more autonomous in their tasks.
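The edge example suggests the flavor of such checks. Below is a hedged sketch of a simple pre-placement guardrail; the names, units, and 5 cm margin are assumptions for illustration, not part of the ASIMOV benchmark itself:

```python
# Sketch of a pre-placement guardrail in the spirit of the edge example:
# reject a target position that would leave an object too close to the
# edge of a support surface. Names, units, and the 5 cm margin are
# illustrative assumptions, not part of the ASIMOV benchmark.

def is_safe_placement(x: float, y: float,
                      surface_width: float, surface_depth: float,
                      margin: float = 0.05) -> bool:
    """Return True if (x, y), in meters from the surface's front-left
    corner, keeps at least `margin` meters from every edge."""
    return (margin <= x <= surface_width - margin and
            margin <= y <= surface_depth - margin)

# A point 2 cm from the edge of a 60 cm x 40 cm tabletop fails the check.
print(is_safe_placement(0.02, 0.20, 0.60, 0.40))  # False
print(is_safe_placement(0.30, 0.20, 0.60, 0.40))  # True
```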
The Challenge of Touch
Despite these advancements, a significant challenge remains: the robot's ability to understand touch. Most AI models, including those that power Spot, are predominantly trained on data from the internet, which lacks the tactile information necessary for understanding the physical properties of objects. Consequently, Spot currently relies heavily on its visual input to interact with its environment, which limits its dexterity and nuanced interactions.
Availability of the Gemini Robotics-ER 1.6 Model
The Gemini Robotics-ER 1.6 model is now accessible to developers through the Gemini API and Google AI Studio. Google DeepMind has also published a developer Colab with examples and configurations for applying the model to various embodied reasoning tasks. For existing Boston Dynamics customers, the Gemini-powered AIVI-Learning model has been fully operational since April 8, 2026.
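Querying the model follows the usual Gemini API pattern. The sketch below uses the google-genai Python SDK; the model identifier and image file are assumptions, so developers should take the current Gemini Robotics-ER model name from Google AI Studio or the developer Colab:

```python
# Minimal sketch of calling an embodied-reasoning Gemini model through
# the Gemini API with the google-genai Python SDK. The model identifier
# and image file are assumptions; Google AI Studio and the developer
# Colab list the exact names and configurations.
from google import genai
from PIL import Image

client = genai.Client()  # reads the API key from the environment

frame = Image.open("factory_gauge.jpg")  # hypothetical inspection photo
response = client.models.generate_content(
    model="gemini-robotics-er-1.5-preview",  # assumed identifier; check AI Studio
    contents=[frame,
              "Read the analog pressure gauge in this image and report "
              "the value with units."],
)
print(response.text)
```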
As the technology continues to evolve, it sets the stage for a future where robots like Spot can seamlessly integrate into both industrial and domestic environments, enhancing productivity and safety while performing complex tasks autonomously.
Source: eWEEK News