When AI Gets a Body: Foundational Models and What It Means for the Future of Robotics
Demystifying Field AI’s $405M and Genesis AI’s $105M rounds through my lens as a robotics engineer
In my last year at Cruise (before we were fully absorbed into General Motors), I encountered a construction zone scenario that crystallized the fundamental limitation of current robotics AI: robots could see isolated elements but missed the big picture.
Our L4 perception system was excellent at detecting individual cones and barriers (with 90%+ accuracy), but it often missed what those objects meant together. To a human driver, a loose line of cones and barriers typically signals a closed lane; we intuitively understand the intent to redirect traffic or block entry. But perception systems don't infer intent. If even one cone is missed, or the gaps appear just wide enough, the system might interpret the scene as a drivable lane.
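
To make that failure mode concrete, here is a minimal, hypothetical sketch of gap-based scene reasoning. Everything in it (the function name, cone positions, and the 3.5 m threshold) is invented for illustration and is not Cruise's actual logic. A full line of cones reads as a closure, but dropping a single detection widens one gap past the threshold and the same scene suddenly reads as open:

```python
# Hypothetical gap-based heuristic, for illustration only.
# Cone positions are in meters along the lane; the threshold is an
# assumed vehicle-width-plus-margin value, not a real system parameter.
MAX_DRIVABLE_GAP_M = 3.5

def lane_is_blocked(cone_positions_m: list[float]) -> bool:
    """Treat the lane as blocked only if detected cones form a
    continuous line with no gap wide enough to drive through."""
    if len(cone_positions_m) < 2:
        return False  # a lone cone is not interpreted as a closure
    ordered = sorted(cone_positions_m)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return max(gaps) < MAX_DRIVABLE_GAP_M

# Five cones spaced 2 m apart: correctly read as a closed lane.
full_line = [0.0, 2.0, 4.0, 6.0, 8.0]
print(lane_is_blocked(full_line))   # True -> lane treated as closed

# Miss one detection (the cone at 4.0 m) and the resulting 4 m gap
# looks "drivable", even though the work zone itself is unchanged.
missed_one = [0.0, 2.0, 6.0, 8.0]
print(lane_is_blocked(missed_one))  # False -> scene misread as open
```

The brittleness is structural: the decision hinges on per-object detections crossing a geometric threshold, with no notion of what a row of cones is *for*.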
This failure mode had catastrophic implications: the vehicle could drive directly into an active work zone, with construction workers, open manholes, heavy machinery, or debris, while simultaneously blocking traffic behind it until human safety operators intervened.
