Let's get physical (again) - Capgemini starts a heavyweight fight with a bantam-weight AI argument


By Chris Middleton April 17, 2026

Excerpt:
Physical AI is a must-buy proposition. But must we really? Putting a new Capgemini report's claims through their paces. Round one, seconds out!


Anyone who is not convinced that physical AI is the focus of a new hype cycle should think again, as a second report in as many days lumbers into the business arena – like a kickboxing robot, jabbing for your eyeballs.

In the ring this time is the 140-page Physical AI – Taking Human-Robot Collaboration to a New Level, from a team of analysts at the Capgemini Research Institute. So, will this heavyweight report land a killer blow? Or will it trip over its own feet to sympathetic applause, like the Juniper Research contender did previously?

The introduction quotes Rebecca Young, Strategic Advisor at enterprise physical AI provider Dexterity – a company whose own tagline is ‘The future is physical: robots that think, arms that move, intelligence that ships’. Young says:

The last decade of AI was about information. The coming decade will be about action.

An effective marcomms message, no doubt, which will reach the ears of many a tired and overstressed leader who has struggled to make traditional AI work, let alone to repay the sunk costs. But the flaw in that message is simple: it should be ‘decades’, in the plural, not ‘decade’, because the latter suggests something immediate and solvable, and implies that enterprises should buy now to get ahead of an urgent trend.

My advice is: save your money for now.

Risky 

As I suggested in my previous report on this subject, the last thing the robotics sector needs right now is a risky mix of hype plus impatient capital that demands a rapid return on investment. That is because creating safe, secure, reliable, dextrous and generally intelligent robots that can interact with the physical world, understand human intentions, and turn verbal commands into autonomous actions will be a complex, expensive, and above all, decades-long journey.

Not least of the reasons is that, while the Web can be scraped for billions of tokens of textual and 2D image data to train a Large Language or Diffusion model, usable data about the physical world – of the kind that can train an intelligent robot – needs to be 3D and gathered at source, or at the very least accurately simulated with a detailed knowledge of physics.

Hence there is a colossal data gap in intelligent robotics, one that is equivalent to 100,000 years – in other words, a volume of data that would take a human 100,000 years to consume.

While some tasks can be simulated – for example, running, jumping, or picking up a simple box – complex manual dexterity is tough to model virtually and tends to create actions that are approximate where they need to be pinpoint accurate. And this is why robot labs worldwide are full of bored people using teleoperation to train robots to pick up and manipulate objects.

Like the Juniper Research paper, this new report confirms that analysts and marketers are no longer saying that robots themselves are ‘physical AIs’ – the means by which AIs of different types will become mobile and interact with the physical world. Instead, they suggest that Physical AI is a new branch of artificial intelligence that many businesses urgently need to embrace – including those in manufacturing, warehousing/logistics, construction, agriculture, healthcare and eldercare, and the energy sector.

But it isn’t. For AI to become physical it needs a body or a housing, after all, and the mix of systems contained within our notional robot, or which will be used to train it, is extensive, diverse, and largely well established. This is why hype of this nature is so unhelpful: it sets buyers off on a fruitless, tactical search to acquire this year’s must-have tool – one that almost certainly doesn’t exist yet, or is only at a primitive stage of its development.

Drilling down

But let’s leave that aside for now and look in more depth at the report itself.

The inclusion of eldercare reveals a key subtext for the report. Capgemini rightly suggests that the demographic time bomb created by the postwar baby boom and today’s ageing populations and falling birthrates will combine to create significant labour gaps in the above industries and beyond.

That much is true, and the statistics have long been available. Let's stick with healthcare and eldercare to underline the point: US citizens aged 65 and over will rise from 15% of the population to 24% by 2060. US health expenditure is likely to hit $5.7 trillion in 2026, almost twice the value of the British economy, up from $3.3 trillion a decade ago.

In the UK, more than 40% of national health spending is already devoted to people over 65, according to estimates produced by healthcare organisation the Nuffield Trust. The number of citizens aged over 65 will increase from 12 million to 17 million by 2035, while one in 12 will be 80 or over by 2039, according to the UK's Office for National Statistics (ONS).

These broad demographic trends are shared by all developed economies and will combine to create significant labour gaps in many industries. For example, Japan faces a projected shortfall of 570,000 care workers alone by 2040. Now apply that scale of absence to farming, manufacturing, and the energy sector, and you can see the oncoming problem.

Thus, robots are proposed to fill the labour gap as that time bomb ticks. But creating them demands a diverse mix of software and hardware systems. As a diginomica report of mine discussed earlier this year with a focus on humanoid robots, those innovations include World Foundation Models, Visual Language Action Models, Large Language Models, Large Behavior Models, simulation, and digital twins, plus advanced sensors, chips, computer vision systems, edge computing, and more.

The Capgemini report accurately explains:

Physical AI represents a fundamental shift: from robots that follow fixed, pre‐programmed paths to robots that can generalize across tasks, perceive and navigate complex environments, make context-aware decisions, and adapt to real-world variation. This enables robots to function in far more diverse and dynamic environments, expanding their reach across nearly every major industry and unlocking solutions to problems that earlier automation couldn’t address.

It also draws an accurate distinction between structured environments and unstructured ones. In the former, the layout, tasks, and conditions are predictable and consistent, allowing robots to follow fixed paths and routines with little variation. Examples of these include assembly lines and controlled warehouse aisles.

By contrast, unstructured environments are variable and unpredictable, and robots must be able to adapt to change and uncertainty to be truly useful human equivalents. Examples include shops, hospitals, farms – which are among the most diverse and unpredictable environments on Earth – and construction sites. Capgemini's thesis: 

To understand the impact of physical AI on robotics and the value it can potentially unlock, this report draws on a global survey of 1,678 executives across 15 industries, complemented by in-depth interviews with experts across the physical AI and robotics ecosystem.

The result is that “physical AI is at an inflection point”, claim the authors, which again makes it sound like an imminent buying decision. But can a process that will, in the real world, take decades really be described as an “inflection point”?

It is more the case that we understand the problems and the long-term need, and we have the right technologies in place to reach the goal of intelligent, general-purpose robots at some point in the future. As a result, we are now embarking on a journey that will take us well into the second half of this century. That is the reality, and most senior figures in robotics would say so – believe me, I have spoken to dozens of them over the past few years.

Must-buy?

But Capgemini persists in its attempt to make this sound like an urgent must-buy:

Multi-modal foundation models are re-defining robot intelligence by enabling generalization across tasks and environments. These advances are allowing robots to adapt to unfamiliar situations without task specific re-programming, extending deployment into unstructured environments – messy, dynamic settings that earlier robotic systems could not handle.

In parallel, advances in simulation are shortening robot training cycles, while an AI‐robot‐data flywheel is accelerating improvement with every real-world deployment. Combined with falling costs of key hardware components such as sensors, actuators, and electric motors, and commercial models such as robotics-as-a-service (RaaS), these shifts are lowering barriers to adoption.

Fine: so, go and adopt that affordable, intelligent, intuitive, general-purpose robot, then, and report back to Capgemini about how useful it currently is – for the cost of a high-end family car. The report adds:

At the same time, demographic and economic pressures – including ageing workforces and persistent labour shortages – are intensifying demand for robotic systems capable of taking on roles that are increasingly hard to staff. Record venture capital investment into physical AI and robotics is adding to the momentum behind these shifts.

Well, the latter is certainly the case, but that is as much a problem as it is a solution. Then the report claims that physical AI is, apparently, already “a game-changer across multiple dimensions”, arguing: 

Physical AI marks a step change from earlier automation. By enabling robots to interpret context, adapt in real time, and operate in unstructured environments, physical AI promotes them from passive tools to active collaborators in the workspace – opening the door to a re-imagined work environment, in which humans, robots, and AI agents work in tandem.

At the same time, physical AI allows robotics to scale as a shared intelligence platform, with learning and capabilities compounding across deployments. In doing so, physical AI extends the agentic paradigm into the real world, enabling robots to act as embodied AI agents capable of planning, orchestrating, and executing complex physical tasks. Over two-thirds (67%) of executives view it as game-changing for their industry and most believe it will become a critical driver of competitiveness.

All this may be true at the far end of the medium term, but it just isn’t true yet. And while in many ways this is an excellent, cogent, well-written analysis of the broad challenges and opportunities in, ahem, ‘physical AI’, its implicit ‘buy now’ message is leading enterprise decision-makers up a very long, muddy, and danger-filled garden path.

On that point, take farming as an example: task-specific robots that can, in an automated fashion, pick specific crops or weed fields accurately have long been available. But the reliable, safe, super-intelligent, general-purpose humanoid that can be told to go and bring the sheep back to their pen, then drive a tractor, milk a cow, and help the farmer irrigate a field just doesn't exist, and nor is it likely to for decades to come.

However, a humanoid that can be tele-operated by a remote overseas worker is already being offered, as an earlier diginomica report of mine explained. But it's unlikely to be an experiment that works for long. Capgemini ploughs ahead regardless, and the tone of the report suggests that such problems have already been solved:

Physical AI’s value is multi-faceted. Executives expect the strongest gains in productivity, efficiency, and quality, alongside greater operational resilience and flexibility as adaptive robots help organizations manage volatility and re-configure operations quickly.

Physical AI also improves workplace safety and reduces physical strain, as robots increasingly take on hazardous and physically demanding tasks. Beyond operational impact, physical AI is opening new growth avenues: nearly four in ten executives expect new revenue opportunities, and 60% believe it will enable robotics in areas that were previously impossible or impractical.

High-impact use cases span hazardous operations, micro‐logistics, pick‐and‐place, and field inspection, alongside sector‐specific applications such as dynamic assembly in manufacturing, healthcare and eldercare in the public sector, and disaster-damage assessment in insurance.

And then comes the inevitable hype and ersatz urgency. The report says:

There is a growing imperative to adopt physical AI. Physical AI adoption is well underway: nearly eight in ten organizations (79%) are already engaging, with 27% deploying or scaling, and 65% expecting to reach scale within five years. The primary catalysts are structural: labor shortages (74%) and rising labor costs (69%).

Hmmm, really? But then it adds:

In the near term, growth will come from familiar, proven form factors for task‐specific applications. As foundation models mature and adoption deepens across industries, entirely new categories of robots are likely to emerge – purpose-built for varied environments, complex tasks, and new modes of human collaboration.

My take

Ah. At the last minute, then, Capgemini’s supposed prize fighter of a report pulls its killer punch and, without saying it too clearly, finally acknowledges the reality.

While the demographic needs are real, much of this is speculative and many years in the future. Meanwhile, general-purpose humanoids, such as Boston Dynamics' electric Atlas, Agility Robotics' Digit, and Hexagon Robotics' AEON, are being experimentally deployed in automotive factories, largely to carry out rudimentary, repeatable manual tasks.
