
Egocentric data

Egocentric data is video recorded from the first-person point of view of the person performing a task (what a robot’s own camera would see), usually accompanied by depth, hand pose and a 6-DoF trajectory. It is the scarce ingredient for teaching robots manipulation, because there is no web-scale corpus of physical actions.
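To make the ingredients listed above concrete, here is a minimal sketch of how one frame of an egocentric recording might be represented. Every name in it (Pose6DoF, EgoFrame, trajectory, the field names) is hypothetical and not taken from any particular dataset. It also assumes the 6-DoF trajectory is the camera pose over time and that orientation is stored as a quaternion.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Pose6DoF:
    """Hypothetical 6-DoF pose: 3 translational + 3 rotational degrees of freedom."""
    position: Tuple[float, float, float]            # x, y, z (assumed metres)
    orientation: Tuple[float, float, float, float]  # assumed unit quaternion (w, x, y, z)


@dataclass
class EgoFrame:
    """Hypothetical record for one first-person video frame and its annotations."""
    timestamp_s: float                                       # capture time in seconds
    rgb_path: str                                            # path to the colour image
    depth_path: Optional[str]                                # depth map, if recorded
    hand_joints: Optional[List[Tuple[float, float, float]]]  # 3D hand keypoints, if recorded
    camera_pose: Pose6DoF                                    # assumed: pose of the camera


def trajectory(frames: List[EgoFrame]) -> List[Tuple[float, Pose6DoF]]:
    """Return the camera trajectory as (timestamp, pose) pairs in time order."""
    ordered = sorted(frames, key=lambda f: f.timestamp_s)
    return [(f.timestamp_s, f.camera_pose) for f in ordered]
```

Depth and hand pose are optional in this sketch because the text says they are usually, not always, present.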

Egocentric (first-person) capture aligns the training viewpoint with a robot’s head- or wrist-mounted sensors, which is why policies trained on it are expected to transfer to the robot. Large research efforts such as Ego4D and Ego-Exo4D were built to capture this viewpoint at scale.