Google DeepMind and the Rise of Robots: A Challenge for AI Governance

Key Takeaways

1. The International Federation of Robotics forecasts 575,000 industrial robot installations in 2025, highlighting the growing importance of physical AI.

2. Google DeepMind has launched Gemini Robotics and Gemini Robotics-ER to enhance embodied robotics, featuring advanced vision and planning capabilities.

3. Robot safety is a crucial issue, requiring rigorous controls to prevent dangerous behaviors in physical environments.

Why it matters: The increasing integration of AI into physical systems presents safety and governance challenges, impacting both industry and society.

Physical AI and Its Growing Implications

Governance around physical artificial intelligence is becoming increasingly complex as autonomous AI systems integrate into robots, sensors, and industrial equipment. The question is not just whether AI agents can perform tasks, but how their actions are tested, monitored, and interrupted when they interact with real-world systems.

Industrial robots already provide a significant foundation for this discussion. The International Federation of Robotics reported that 542,000 industrial robots were installed worldwide in 2024, more than double the annual level recorded a decade earlier. It forecasts that installations will reach 575,000 units in 2025 and exceed 700,000 units by 2028.

Market researchers are also applying the label physical AI to a broader group of systems, including robotics, edge computing, and autonomous machines. Grand View Research estimated the global physical AI market at $81.64 billion in 2025 and predicts it will reach $960.38 billion by 2033, although this category depends on how vendors define intelligence in physical systems.
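
As a rough check on the figures above, the implied growth rates can be worked out directly. The sketch below is illustrative arithmetic only and is not taken from either report: the function names are hypothetical, and it assumes the Grand View Research estimate compounds annually over the eight years from 2025 to 2033.

```python
def year_over_year_growth(previous: float, current: float) -> float:
    """Fractional change from one year to the next."""
    return current / previous - 1.0


def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Constant yearly rate that turns `start` into `end` after `years` years."""
    return (end / start) ** (1.0 / years) - 1.0


# IFR: 542,000 installations in 2024, 575,000 forecast for 2025.
robot_growth = year_over_year_growth(542_000, 575_000)

# Grand View Research: $81.64B in 2025, $960.38B predicted for 2033.
market_cagr = compound_annual_growth_rate(81.64, 960.38, 2033 - 2025)

print(f"Robot installations, 2024 to 2025: {robot_growth:.1%}")
print(f"Physical AI market, implied yearly growth: {market_cagr:.1%}")
```

Under those assumptions, the robot forecast implies growth of roughly 6% from 2024 to 2025, while the market estimate implies roughly 36% per year.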

From Model Output to Physical Action

The governance challenge differs from that of software-only automation because physical systems operate in workplaces, around infrastructure, and alongside human users. They can also be connected to equipment that requires clear safety boundaries. A model output can become a robot movement or a machine instruction, which makes safety boundaries and escalation paths integral to system design.

Google DeepMind's robotics work is a recent example of how AI models are adapted to this environment. The company introduced Gemini Robotics and Gemini Robotics-ER in March 2025, describing them as models built on Gemini 2.0 for robotics and embodied AI. Gemini Robotics is a vision-language-action model designed to directly control robots, while Gemini Robotics-ER focuses on embodied reasoning, including spatial understanding and task planning.

A robot using this type of model may need to identify an object, understand an instruction, and plan a sequence of movements. It must also assess whether the task has been completed correctly. This creates a control problem that includes both the model's behavior and the mechanical limits of the system.

Google DeepMind stated that useful robots must possess generality, interactivity, and dexterity. Generality covers unknown objects and environments. Interactivity relates to human input and changing conditions. Dexterity refers to physical tasks requiring precise movements.

In its launch documents, Google DeepMind indicated that Gemini Robotics could follow natural language instructions and perform multi-step manipulation tasks. Examples include folding paper, organizing objects into a bag, and manipulating objects it had not seen during training.

The technical requirements for physical AI are broader than language understanding. Systems require visual perception and spatial reasoning. They also need task planning and success detection. In robotics, success detection is crucial because the system must decide whether a task has been completed, whether it should retry, or whether it should stop.
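
The decision between completing, retrying, and stopping can be pictured as a small control loop. The following is a minimal sketch, not Google DeepMind's implementation: `Outcome`, `run_with_success_detection`, and the `attempt` and `detect` callbacks are hypothetical names, and a real detector would be a perception model rather than a stub.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    SUCCESS = "success"
    RETRY = "retry"
    STOP = "stop"


def run_with_success_detection(
    attempt: Callable[[], None],
    detect: Callable[[], Outcome],
    max_attempts: int = 3,
) -> tuple[Outcome, int]:
    """Run a task until the detector reports success or stop.

    Returns the final outcome and the number of attempts made. If every
    attempt ends in RETRY, the loop gives up and reports STOP.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt_number in range(1, max_attempts + 1):
        attempt()           # hypothetical: send the motion plan to the robot
        outcome = detect()  # hypothetical: inspect the scene after the attempt
        if outcome is not Outcome.RETRY:
            return outcome, attempt_number
    return Outcome.STOP, max_attempts


# Simulated run: the first attempt fails, the second succeeds.
simulated = iter([Outcome.RETRY, Outcome.SUCCESS])
result = run_with_success_detection(lambda: None, lambda: next(simulated))
print(result)
```

Capping the number of attempts is a design choice in this sketch: it turns an indefinite retry into an explicit stop that can be escalated to a human.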

Gemini Robotics-ER 1.6, introduced in April 2026, demonstrates how these functions are integrated into new models. The company describes the model as supporting spatial logic, task planning, and success detection, with the ability to reason through intermediate steps and decide whether to proceed or retry.

Google's developer documentation indicates that Gemini Robotics-ER 1.6 is available in preview via the Gemini API. The documentation describes it as a vision-language model that brings the agentic capabilities of Gemini to robotics. These capabilities include visual interpretation, spatial reasoning, and planning from natural language commands.

Google AI Studio provides a development environment for working with Gemini models, while the Gemini API offers a way to integrate these models into applications. For embodied AI, this means developers can test robotics models with the same tools they use to build other agentic applications.

Safety Controls Integrated into System Design

Governance becomes more complex when these systems can call tools, generate code, or trigger actions. Controls must define what data the system can access, what tools it can use, which actions require human approval, and how activity is logged for review.
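
One way to picture such controls is a gate that every tool call must pass before it runs. The sketch below is an assumption-laden illustration rather than any vendor's API: `ActionPolicy`, its fields, and the tool names are hypothetical, and data-access scoping is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ActionPolicy:
    """Hypothetical gate: tool allowlist, approval rule, and audit trail."""

    allowed_tools: set[str]
    needs_approval: set[str]
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, tool: str, approved_by: Optional[str] = None) -> bool:
        """Return True only if the tool may run now; log every decision."""
        if tool not in self.allowed_tools:
            decision = "denied"
        elif tool in self.needs_approval and approved_by is None:
            decision = "pending_approval"
        else:
            decision = "allowed"
        self.audit_log.append(
            {"tool": tool, "approved_by": approved_by, "decision": decision}
        )
        return decision == "allowed"


policy = ActionPolicy(
    allowed_tools={"read_gauge", "move_arm"},
    needs_approval={"move_arm"},
)
policy.authorize("read_gauge")                        # runs without approval
policy.authorize("move_arm")                          # blocked until approved
policy.authorize("move_arm", approved_by="operator")  # runs after sign-off
policy.authorize("open_valve")                        # not on the allowlist
for entry in policy.audit_log:
    print(entry)
```

Logging refusals as well as approvals keeps the audit trail useful for review, since blocked requests are often the most informative entries.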

McKinsey's 2026 research on trust in AI highlights the same issue in enterprise AI more generally. It found that about one-third of organizations reported maturity levels of three or higher in strategy, governance, and agentic AI governance, even as AI systems take on increasingly autonomous functions.

In robotics, safety also includes the physical behavior of the machine. Google DeepMind described robot safety as a multi-level issue, covering low-level controls such as collision avoidance, force limits, and stability, as well as high-level reasoning about the safety of a requested action in a given context.
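
The two levels can be sketched as successive checks on a single command. This is a simplified illustration under stated assumptions, not Google DeepMind's design: `Command`, `apply_safety_layers`, the force limit, and the keyword-based context check are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Command:
    description: str
    force_newtons: float


def apply_safety_layers(
    command: Command,
    is_contextually_safe: Callable[[str], bool],
    max_force_newtons: float,
) -> Optional[Command]:
    """Apply a high-level check, then a low-level force limit.

    Returns None when the request is refused; otherwise returns the
    command with its force clamped to the range [0, max_force_newtons].
    """
    # High level: should this action happen at all in this context?
    if not is_contextually_safe(command.description):
        return None
    # Low level: enforce the mechanical limit regardless of the request.
    limited = max(0.0, min(command.force_newtons, max_force_newtons))
    return Command(command.description, limited)


def toy_context_check(description: str) -> bool:
    # Stand-in for semantic reasoning; a real system would not use keywords.
    return "near person" not in description


print(apply_safety_layers(Command("press lid shut", 40.0), toy_context_check, 25.0))
print(apply_safety_layers(Command("swing arm near person", 5.0), toy_context_check, 25.0))
```

Keeping the force limit independent of the reasoning step means a mistaken high-level judgment still cannot exceed the mechanical bound.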

The company also introduced ASIMOV, a dataset for assessing semantic safety in robotics and embodied AI. Google DeepMind stated that this dataset was designed to test whether systems can understand safety-related instructions and avoid dangerous behaviors in physical environments.

The same controls used for software agents become more challenging to manage when systems are connected to robots, sensors, or industrial equipment. This includes access rights, audit trails, and refusal behaviors. It also includes escalation paths and testing.

Governance frameworks such as the NIST AI Risk Management Framework and ISO/IEC 42001 provide structures for managing risks and responsibilities related to AI throughout the system lifecycle. In physical AI, these controls must account for model behavior, connected machines, and the operational environment.

Google DeepMind has also collaborated with robotics companies as part of its embodied AI development. In March 2025, the company announced a partnership with Apptronik on humanoid robots using Gemini 2.0, and cited Agile Robots, Agility Robotics, Boston Dynamics, and Enchanted Tools among trusted testers for Gemini Robotics-ER.

The 2026 update also mentioned work with Boston Dynamics involving robotic tasks such as reading instruments. This type of use case relies on visual understanding, task planning, and reliable assessment of physical conditions.

Physical AI applies to industrial inspection, manufacturing, and logistics. It also applies to facilities and warehouses. These environments require systems capable of interpreting real-world conditions and acting within defined limits. The governance question is how these limits are established before autonomous systems are allowed to make or execute decisions.
