Beyond The Backflip: Investing In Physical AI

Written by Jan Gerlo | Mar 19, 2026 8:00:00 AM

Executive Summary

The era of traditional robotics is evolving into a new paradigm: Physical AI. This isn't just a machine following a script; it is the integration of intelligence into a physical body, such as a humanoid, allowing it to see, think, and act in the real world. As labor shortages and hazardous environments challenge industrial growth, the question is no longer if you should adopt Physical AI, but how to do so without adding unnecessary overhead and complexity. By moving beyond ‘dumb’ automation to Physical AI, businesses can finally automate the messy, unpredictable tasks that were previously impossible to touch. In the race for efficiency, the winner won't be the one with the flashiest hardware, but the one who builds a foundation for intelligent motion today.

What is Physical AI?

Physical AI (also known as Embodied AI) refers to systems that combine sensor technology, machine learning, and physical actuation to operate autonomously in the real world. Unlike software-only AI, Physical AI systems perceive their physical surroundings, reason about unpredictable variability, and adapt their physical actions in real time.

The simplicity filter

The first rule of investing in Physical AI is resisting the urge to innovate just for the sake of it. Many industrial tasks are still best handled by conventional robots—machines that do one thing with perfect, millisecond precision. If a task is predictable and highly structured, adding AI can actually hurt your bottom line by making the process slower, more expensive, and prone to ‘thinking’ errors. Strategic value comes from applying Physical AI where things are unpredictable. Otherwise, you’re paying a premium to make a reliable process less certain.

Return on Investment: When Does Physical AI Make Sense?

Physical AI delivers a return on investment (ROI) when variability and unpredictability create challenges that conventional automation cannot solve. In stable, repetitive environments such as structured manufacturing floors or fixed production lines, industrial robots and robotic arms often provide faster ROI thanks to their reliability and mechanical simplicity.

However, in dynamic industrial environments, where workflows change, objects vary, and human-like adaptability is required, AI-powered robots can unlock new operational capabilities rather than merely optimize existing ones. The investment case strengthens when Physical AI makes it possible to automate tasks that previously could not be automated at all, rather than making already-automated tasks marginally more efficient.

Market Context: Why the Investment Debate Is Accelerating

Investment decisions around Physical AI are not happening in a vacuum. According to research published by Morgan Stanley, the global humanoid robot market alone could reach $5 trillion by 2050, spanning hardware, software, services, and supply chains. As labor shortages, workforce dynamics, and production limitations intensify across industrial environments, market demand for AI-powered robots continues to rise.

This projected growth does not mean every company should rush to deploy humanoid robots. It reinforces the importance of disciplined experimentation today, so that when variability demands it, organizations are ready with the right capabilities.

Humanoid Robots vs Industrial Robots: Do You Really Need a Humanoid?

Business leaders must balance a humanoid’s versatility against its high maintenance demands. Because our world, from stairs to door handles, was built for people, humanoid robots can step into existing workspaces without requiring a total facility renovation. In the near future, they will likely also be able to learn from decades of human data, such as video and motion capture. However, they are power-hungry and mechanically complex. For high-volume, repetitive tasks, a specialized robotic arm is often the smarter investment: it is more reliable, has fewer parts to break, and often offers a faster ROI.

The body advantage: intelligence is physical

A central idea in the field is the ‘Embodiment Hypothesis’: the idea that true intelligence isn't just code in a cloud, but grows from interacting with the physical world. Unlike a chatbot on a server, a robot’s ‘brain’ is limited by its physical body. When looking for opportunities, seek out systems where the AI is purpose-built for that specific machine. The most successful robots aren't just ‘computers on wheels’; they are integrated systems where the software is closely matched to the physical limits and strengths of the hardware.

While generative artificial intelligence and large language models dominate headlines, Physical AI operates in a fundamentally different domain: the messy physics of the real world. Just as autonomous vehicles must integrate perception, planning, and actuation, AI-powered robots must coordinate hardware and intelligence seamlessly.

The proprietary data moat: why you can’t just ‘download a brain’

It’s a mistake to assume that because AI can write emails or create art, it can easily learn to move. While digital AI typically learns by ‘ingesting’ the internet, there is no ‘internet of movement’ to copy. Physical AI requires domain-specific policies, meaning a robot needs data from your specific use case in your specific environment to be effective. This creates a massive bottleneck: unlike computer vision, which uses billions of existing web images, robotics requires manual, human-led demonstrations. This is why the career pages of leading robotics firms are currently flooded with ‘Robot Operator’ or ‘Teleoperation Engineer’ roles: data collection is the new manual labor. Your true competitive advantage isn’t the hardware you buy, but the library of proprietary demonstrations you build in-house. This local experience is a domain-specific asset that no competitor can simply download or buy.

Moravec’s paradox: rethink what is simple

Industrial leaders must internalize a somewhat counterintuitive rule called ‘Moravec’s paradox’: what is hard for humans is often easy for AI, and what is easy for humans can be incredibly difficult for robots. A computer can solve complex calculus or beat a chess grandmaster in seconds, tasks that take humans years to master. Yet that same AI struggles to do what a toddler does effortlessly: navigate a cluttered room to pick up a single grape.

This gap exists because we can simulate physics really well, making it ‘easy’ to train a robot to do a backflip in a digital world where only gravity and the floor matter. However, we cannot yet simulate the infinite, messy variety of the real world. A backflip is just about the robot’s own body, but clearing a cluttered work area without breaking a component requires generalization—the ability to handle the unpredictable. The real industrial revolution isn't a robot that can perform a choreographed stunt, but one that can reliably navigate the unstructured reality of your workspace.

The memory gap: thinking in long sequences

Robots are great at ‘in the moment’ actions but often struggle with sequences that require memory. These are what we call ‘long-horizon tasks’: tasks that require a system to plan and execute many interdependent actions over time while maintaining memory of past states and actions. For example, a robot might forget a container is full the moment the lid is closed because it isn't ‘remembering’ what it saw five seconds ago. Real intelligence is measured by continuity. When evaluating whether to invest in Physical AI, look for use cases that require ‘temporal awareness’, the ability to string together several steps without human help. This is often what separates a high-tech gimmick from a true industrial solution.
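The container example can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not a description of how any particular robot stack works: the class name `TaskMemory`, the `container_fill` field, and the two-frame scenario are all hypothetical. It contrasts a purely reactive check, which only sees the current camera frame, with one that keeps the last known state of each object.

```python
from dataclasses import dataclass, field


@dataclass
class TaskMemory:
    """Hypothetical store of the last known state of each observed object."""
    last_known: dict = field(default_factory=dict)

    def update(self, observation: dict) -> None:
        # Only overwrite facts that are actually visible in this frame;
        # a value of None means "not visible right now".
        for name, state in observation.items():
            if state is not None:
                self.last_known[name] = state

    def recall(self, name: str, default=None):
        return self.last_known.get(name, default)


def reactive_is_full(observation: dict) -> bool:
    # Memoryless check: if the contents are hidden, it concludes "not full".
    return observation.get("container_fill") == "full"


def memory_is_full(memory: TaskMemory, observation: dict) -> bool:
    # Stateful check: fold in the new frame, then answer from memory.
    memory.update(observation)
    return memory.recall("container_fill") == "full"


# Two consecutive frames (invented for illustration).
frames = [
    {"container_fill": "full"},  # lid open, camera sees the contents
    {"container_fill": None},    # lid closed, contents no longer visible
]

memory = TaskMemory()
reactive_answers = [reactive_is_full(f) for f in frames]
memory_answers = [memory_is_full(memory, f) for f in frames]

print(reactive_answers)  # [True, False]
print(memory_answers)    # [True, True]
```

The reactive check "forgets" the container is full as soon as the lid hides the contents, while the version with memory carries that fact forward into the next step. Real long-horizon systems are far more involved, but the underlying requirement is the same.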

Sensor Technology in Physical AI: Vision, LiDAR, and Tactile Systems

To navigate a busy workspace, a robot must understand where it is and what it’s touching. There is a major debate on how to achieve this: some systems rely only on cameras to mimic human vision, while others add ‘extra senses’ like lasers (LiDAR) to map distances or tactile sensors to feel weight. The takeaway is that a robot’s brain is only as good as its data. Whether you choose a vision-only setup or a mix of sensors, the goal is high-quality information. In any messy environment, a reliable way to ‘sense’ the surroundings is the foundation of a good investment.
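One way to see why extra sensors can improve data quality is a simple fusion of two distance readings. The sketch below uses inverse-variance weighting, a standard way to combine two independent noisy measurements of the same quantity. The function name `fuse_estimates` and all numbers are hypothetical, and it assumes the camera and LiDAR errors are independent; it is not taken from any specific product.

```python
def fuse_estimates(camera_m, camera_var, lidar_m, lidar_var):
    """Combine two distance estimates (in meters) by inverse-variance weighting.

    Assumes independent measurement noise; the less noisy sensor gets
    the larger weight.
    """
    w_cam = 1.0 / camera_var
    w_lidar = 1.0 / lidar_var
    fused = (w_cam * camera_m + w_lidar * lidar_m) / (w_cam + w_lidar)
    fused_var = 1.0 / (w_cam + w_lidar)
    return fused, fused_var


# Hypothetical readings: the camera estimate is noisier than the LiDAR one.
distance, variance = fuse_estimates(
    camera_m=2.10, camera_var=0.04,
    lidar_m=2.00, lidar_var=0.01,
)

print(round(distance, 3), round(variance, 4))  # 2.02 0.008
```

Under these assumptions, the fused estimate has a lower variance than either sensor alone. A vision-only setup avoids the extra hardware, so the trade-off is between cost and complexity on one side and measurement quality on the other.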

Conclusion: the 5-year latency

Physical AI today is exactly where language models were five years ago. The technology is moving rapidly from the lab to the factory floor, and the window to gain a competitive edge is closing. Before diving in, ask yourself: is my problem one of scale (which favors simple robots) or variability (which demands Physical AI)? Avoid ‘pilot purgatory’, that is, testing tech forever without a plan to scale. The companies that fail to experiment with Physical AI today will find themselves at a massive structural disadvantage by 2030. The future is being trained right now. Don't wait until it’s already passed you by.

Source:

World Economic Forum: Physical AI White Paper (2025)
