Why Robotics is a Pre-Paradigm Field

Towards a Grand Unified Theory of Robotics

Jun 06, 2026

Madame Lavoisier (on the right) while assisting her husband and his assistant Armand Seguin (in suit, at left) on his scientific research of human respiration. (Picture in public domain, caption from Wikipedia.)

Thomas Kuhn wrote The Structure of Scientific Revolutions in 1962, introducing the notion of a paradigm shift in science: when a field of science converges around an organizing principle that changes everything. The discovery of DNA. The periodic table of elements. The germ theory of disease. In robotics, we are in the stage of pre-science. We have identified the problem to solve: we want to make an embodied agent act intelligently in the physical world. There are ideas for paradigms swirling around, and there have been since the beginning: Subsumption architecture. Sequential composition. Neural networks. MDPs and POMDPs. And there have been pieces of theories: SLAM and the Rao-Blackwellized Particle Filter. Motion planning. Diffusion Policies. Data+VLA+RL.

But we don’t yet have a grand unified theory of robotics. I’ve been thinking about this since I started my faculty position at Brown in 2013. When I was a postdoc, I had to be laser-focused on becoming the best person in my age group in my area: language understanding for robotics. So my research was all about language understanding. (Our AAAI 2011 paper just won the AAAI 2026 classic paper award!) As a faculty member, I had the privilege of broadening my focus. People want to talk to robots about everything they can see and everything they can do, so we need models that connect to everything they can see and everything they can do. We need a grand unified theory of robotics.

The closest I found was POMDPs: Partially Observable Markov Decision Processes. A POMDP is the simplest model I know that captures everything a robot can do: a latent state of the physical world, observations the robot can see that provide noisy information about that state, actions that change the underlying state of the world, and a goal in the form of a reward function. Unfortunately, solving POMDPs optimally is undecidable in the general case. They are too challenging a problem for our computers to solve. They provide a model, but they do not provide computational leverage. I realized that POMDPs are like Python: Python is undecidable, too. But we can still use it to write useful programs and tools. So a lot of my group’s papers boiled down to writing about how to introduce structure into a POMDP to enable efficient learning and inference. I embarked on a quest to figure out how to integrate SLAM into a POMDP, but got stuck on what the reward function should be. And I framed all of my group’s work around constructing pieces of a Human-Robot Collaborative POMDP.
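To make the four ingredients concrete, here is a minimal sketch of a discrete POMDP and the standard Bayes-filter belief update. Everything specific in it is an assumption made for illustration: the `POMDP` class, the `belief_update` and `expected_reward` helpers, and the toy problem (a cup that is on the left or the right, a single `look` action, and a sensor that is right 80% of the time) are hypothetical and do not come from any particular library or paper. It only tracks a belief over the latent state; it does not solve for a policy, which is the hard part.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Belief = Dict[str, float]  # probability assigned to each latent state


@dataclass(frozen=True)
class POMDP:
    """Hypothetical container for a small, fully discrete POMDP."""
    states: List[str]
    actions: List[str]
    observations: List[str]
    # transition[(s, a)][s_next] = P(s_next | s, a)
    transition: Dict[Tuple[str, str], Dict[str, float]]
    # observation_model[(s_next, a)][o] = P(o | s_next, a)
    observation_model: Dict[Tuple[str, str], Dict[str, float]]
    # reward[(s, a)] = immediate reward for taking a in s
    reward: Dict[Tuple[str, str], float]


def belief_update(pomdp: POMDP, belief: Belief, action: str, observation: str) -> Belief:
    """Bayes filter: predict through the transition model, then weight by the observation."""
    unnormalized = {}
    for s_next in pomdp.states:
        predicted = sum(
            pomdp.transition[(s, action)][s_next] * belief[s] for s in pomdp.states
        )
        unnormalized[s_next] = pomdp.observation_model[(s_next, action)][observation] * predicted
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in unnormalized.items()}


def expected_reward(pomdp: POMDP, belief: Belief, action: str) -> float:
    """Expected immediate reward of an action under the current belief."""
    return sum(belief[s] * pomdp.reward[(s, action)] for s in pomdp.states)


# Toy problem (an assumption for illustration): the cup does not move,
# and looking reports the correct side with probability 0.8.
toy = POMDP(
    states=["cup_left", "cup_right"],
    actions=["look"],
    observations=["see_left", "see_right"],
    transition={
        ("cup_left", "look"): {"cup_left": 1.0, "cup_right": 0.0},
        ("cup_right", "look"): {"cup_left": 0.0, "cup_right": 1.0},
    },
    observation_model={
        ("cup_left", "look"): {"see_left": 0.8, "see_right": 0.2},
        ("cup_right", "look"): {"see_left": 0.2, "see_right": 0.8},
    },
    reward={("cup_left", "look"): -1.0, ("cup_right", "look"): -1.0},
)

b0 = {"cup_left": 0.5, "cup_right": 0.5}            # start out uncertain
b1 = belief_update(toy, b0, "look", "see_left")     # P(cup_left) is about 0.8
b2 = belief_update(toy, b1, "look", "see_left")     # P(cup_left) is about 0.94
print(b1, b2, expected_reward(toy, b0, "look"))
```

The belief update is cheap here because the state space has two elements. The difficulty described above shows up when choosing actions: an optimal policy has to reason over the continuous space of beliefs, which is where introducing structure becomes necessary.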

Meanwhile, Deep Learning happened. Is Deep Learning a paradigm? Maybe. Certainly, it’s an unbelievably effective way to perform function approximation. We might wish there were more theory to explain why it works, how long it will take to train, and what size gradient steps to take. We still have to design training recipes and model structure around what we think is important and what is not: a giant unstructured multi-layer perceptron is not the universal solution to our problems.

The current bet many of us are making is more specific than deep learning alone. The working hypothesis is that data plus Vision-Language-Action models plus reinforcement learning equals a generalist robot: pretrain a VLA on internet-scale vision and language, fine-tune it on as much teleoperated robot data as you can collect, and then let RL close the remaining gap through autonomous practice. Physical Intelligence, Skild, Figure, 1X, Generalist, Google DeepMind, and Tesla are all placing some version of this bet, with different weightings on each term. A secondary bet, gaining traction fast, is that RL eventually eats data and VLA’s lunch entirely: that once you have a policy good enough to collect its own experience, scaling self-generated data through RL dominates whatever imitation learning can offer. This might be the paradigm. It might also be dephlogisticated air.

Kuhn delves into the story of Joseph Priestley and Lavoisier and the discovery of oxygen. Joseph Priestley isolated oxygen in 1774 by heating mercuric oxide. He noticed it made candles burn brighter and kept mice alive longer. But Priestley interpreted what he saw through the lens of phlogiston theory, the prevailing belief that combustion worked by releasing a substance called phlogiston. He discovered oxygen but couldn’t see it for what it was. He called it “dephlogisticated air” and went to his grave defending phlogiston.

Antoine Lavoisier heard about Priestley’s experiments, repeated them, and saw something completely different. Not because the experiments were different. But because Lavoisier was willing to throw out phlogiston entirely. He reframed combustion as a combination with a new element, named it oxygen, and built the modern theory of chemistry around it. Same data. Different paradigm.[1]

This is where robotics is today, with one important caveat. Jessica Hodgins pushed back on me here, and she’s right: Priestley had a wrong theory that prevented him from seeing what his experiments revealed. Most roboticists aren’t operating under wrong theories in that sense. SLAM, function approximation, and motion planning are all pieces of a critical recipe that, as of May 2026, still have an important place in the toolbelt of roboticists looking to solve specific problems.

But the choice of which tool to reach for encodes an implicit theory of what embodied intelligence is. SLAM researchers act as if intelligence centrally requires explicit state estimation. Learning researchers act as if intelligence is function approximation at sufficient scale. Controls researchers act as if intelligence reduces to optimization under known dynamics. These are paradigm-level commitments masquerading as tool choices, and any of them could be as wrong as phlogiston. We are all looking at the same robot, in the same physical world, and the tools we reach for reveal what we think the robot fundamentally is.

The engineering question is what we need to build to make a general robot. The science question is what embodied intelligence actually is. The first is making progress.

For the second, we’re waiting for our Lavoisier.

[1] Lavoisier went to his grave during the French Revolution in 1794, guillotined after his conviction as a tax collector. The mathematician Lagrange remarked, “It took them only an instant to cut off that head, and a hundred years may not produce another like it.”

Nishanth Kumar

Jun 7 (edited)

Really cool article - I loved it! This gives words to something I’ve been thinking about for a while (though in my head I used the example of physics and theories of light before Einstein to serve as an analog to modern robotics).

Two things I find interesting after reading the conclusion:

1. Perhaps we’re waiting for a Lavoisier, but perhaps we also simply don’t have enough evidence/data currently to even approach the right theory (I imagine Lavoisier didn’t come up with the building blocks of the periodic table merely from Priestley’s oxygen experiment, but rather from a whole bunch of other empirical evidence; perhaps we don’t have that “critical mass” yet?). Curious what you think about this.

2. It is interesting that we’re making seemingly rapid progress on engineering and very little understanding/science progress. Do you think engineering progress is fundamentally limited by science progress? I’d probably be inclined to say yes to this, but as a counterpoint, it feels like we’ve made relatively little progress on the science of deep learning for instance, but we’ve made very impressive empirical progress in building vision and language systems that really work and do useful things.


Thomas Riedel (Droid Boy)

Jun 8

I don’t understand yet why the problem of giving the element the right name, oxygen, is the same problem we have with naming physical AI.
