NeuMagic

When you see dancers on TikTok, they’ve choreographed and rehearsed every move. Like windup toys on a slick linoleum floor, just press play on the boombox, roll the cameras, and they’ll bust out some kind of high speed macarena.

Far more enlightened than a TikTok dancer, I am a salsa dancer. We are social dancers. We create in the moment, a different partner for every song. Each measure could bring a new combination of movements, possibly new to both of us. No choreography.

If you dig deep, you can find videos of robots on TikTok doing the macarena and lots of other choreographed sequences, but I can’t dance with them. It isn’t a problem with speed, or dexterity. They even have great vision to see what I do. What they can’t do is something you and I take for granted all the time.

I lift my partner’s hand to her temple on the count of three to send her into two quick spins, but the guy next to me is leading his partner into a dip. Faster than Shakira’s shimmy, my brain imagines how this is going to play out. Halfway through the second spin, this dipshit’s unwitting partner is going get her jugular impaled by my partner’s stiletto. I cut off the second spin and pull my partner in close to make space for the stray head. Everyone looks incredibly suave and competent, as if we had planned it.

We didn’t plan it. What happened, is that my mind raced forward, imagining how the motion would play out for all of the dancers. A multitude of simulations were fired up, rapidly prioritized and the winner was executed. This imagination is exactly what robots are missing.

The language models are successful because they could be trained on everything humans have ever written. The models needed for robots to navigate our world don’t have enough training data to learn every situation. They need imagination. They need to rapidly analyze a lot of possible scenarios for motion, and then execute on one of them.

Incidentally, what we call “muscle memory” is when we have pre-trained specific movements, so that in the moment, our brain can focus on a much tighter set of simulations. Like Shakira rehearsing hip rolls in front of a mirror.

NeuMagic invented spatial imagination for robots. Their system can watch a couple hot Shakira videos and then generate millions of expert-level motions at ludicrous speed. This serves as the raw material for Large Spatial Models (LSMs) that fuel the Synthetic Spatial Imagination in the robot. This gives AI systems an ability to virtually simulate, refine, and generalize behaviors across dynamic, unseen situations in faster-than-real time.

That last bit is what matters the most. If a robot Shakira is going to keep up with me, she’s going to need a fast imagination.

This was shot in one take, no rehearsal, no choreography, I’d never even heard this track before.