LeRobot vs RLDS vs HDF5: Robotics Dataset Formats Explained
The three formats most robot-learning stacks use - LeRobot, RLDS and HDF5 - explained, with how to choose and convert between them.
TL;DR. LeRobot, RLDS and HDF5 are the three formats most robot-learning pipelines use. LeRobot is Hugging Face's standard (Parquet + MP4 + JSON), RLDS is the TensorFlow-Datasets format behind Open X-Embodiment, and HDF5 is the long-standing scientific container used by ALOHA and robomimic. Pick the one your training code already reads; good vendors ship all three.
The three formats
- LeRobot. Hugging Face's open robotics library and dataset format. Stores episodes as Parquet (state/action) plus MP4 video and JSON metadata. Strong tooling, growing ecosystem, easy to share on the Hugging Face Hub.
- RLDS. "Reinforcement Learning Datasets", built on TFDS. It is the format behind the cross-embodiment Open X-Embodiment / RT-X collection, so it is the natural choice if you train on or alongside that data.
- HDF5. A general-purpose scientific container format, maintained by The HDF Group and used by ALOHA and robomimic. Flexible and self-describing, with mature libraries for most major languages.
How to choose
- Match your training code. If you fine-tune an OpenVLA-style model, RLDS fits; if you use the LeRobot trainer, LeRobot; if you use ALOHA/robomimic pipelines, HDF5.
- Consider sharing. LeRobot integrates with the Hugging Face Hub for distribution.
- Consider scale. All three handle large episode counts; RLDS datasets are stored as sharded files, which suits TFDS input pipelines.
Converting between them
Episodes are conceptually the same in all three formats: observations, actions and rewards or labels for each timestep. Conversion is therefore mostly a schema-mapping exercise. LeRobot and the Open X-Embodiment tooling include converters, and an HDF5 file's hierarchy of groups and datasets is easy to traverse, which makes export straightforward. Most of the effort goes into keeping metadata (camera calibration, control frequency, action space) consistent across formats.
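To make the schema-mapping idea concrete, here is a minimal sketch in plain Python. It assumes an ALOHA-style episode already loaded into a nested dict (`observations/qpos` and `action` as per-timestep arrays) and maps it to RLDS-style steps and to LeRobot-style rows. The function names are hypothetical, the reward and `is_terminal` values are placeholders because the source episode carries no such signal, and real converters also handle images, sharding and dataset metadata.

```python
from typing import Any


def aloha_to_rlds_steps(episode: dict[str, Any]) -> list[dict[str, Any]]:
    """Map an ALOHA-style episode (per-key arrays) to a list of RLDS-style steps."""
    qpos = episode["observations"]["qpos"]
    actions = episode["action"]
    if len(qpos) != len(actions):
        raise ValueError("observation and action lengths differ")
    n = len(actions)
    steps = []
    for t in range(n):
        steps.append({
            "observation": {"state": list(qpos[t])},
            "action": list(actions[t]),
            "reward": 0.0,         # placeholder: assumed absent in the source
            "discount": 1.0,       # placeholder
            "is_first": t == 0,
            "is_last": t == n - 1,
            "is_terminal": False,  # placeholder: termination is not recorded here
        })
    return steps


def aloha_to_lerobot_rows(
    episode: dict[str, Any], episode_index: int, fps: float
) -> list[dict[str, Any]]:
    """Map the same episode to flat per-frame rows, as a Parquet table would hold them."""
    qpos = episode["observations"]["qpos"]
    actions = episode["action"]
    if len(qpos) != len(actions):
        raise ValueError("observation and action lengths differ")
    return [
        {
            "observation.state": list(q),
            "action": list(a),
            "episode_index": episode_index,
            "frame_index": t,
            "timestamp": t / fps,  # derived from the control frequency
        }
        for t, (q, a) in enumerate(zip(qpos, actions))
    ]


# Toy episode: three timesteps, two joints.
episode = {
    "observations": {"qpos": [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5]]},
    "action": [[1.0, 1.1], [1.2, 1.3], [1.4, 1.5]],
}
steps = aloha_to_rlds_steps(episode)
rows = aloha_to_lerobot_rows(episode, episode_index=0, fps=50)
```

Note that the timestamp column depends on `fps`, which is one reason control-frequency metadata has to travel with the data: without it, the converted rows cannot be timed correctly.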
What to ask a data vendor
Ask for episodes in your target format with action segmentation, success/failure labels, camera calibration and control-frequency metadata - not just video. nxted Capture ships in LeRobot, RLDS and HDF5 by default. To plan a purchase, see how to buy robotics training data.
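One way to turn that checklist into an acceptance test is a small script that flags missing metadata on delivery. The sketch below is illustrative only: the key names (`control_frequency_hz`, `action_space`, `camera_calibration`, `success`, `segments`) are hypothetical and should be replaced with whatever keys your vendor's delivery actually uses.

```python
from typing import Any

# Hypothetical key names; adapt them to the delivered schema.
REQUIRED_DATASET_KEYS = ("control_frequency_hz", "action_space", "camera_calibration")
REQUIRED_EPISODE_KEYS = ("success", "segments")


def find_metadata_gaps(
    dataset_meta: dict[str, Any], episodes: list[dict[str, Any]]
) -> list[str]:
    """Return a human-readable list of missing dataset-level and per-episode fields."""
    problems = []
    for key in REQUIRED_DATASET_KEYS:
        if dataset_meta.get(key) is None:
            problems.append(f"dataset: missing {key}")
    for i, ep in enumerate(episodes):
        for key in REQUIRED_EPISODE_KEYS:
            # A False success label is valid, so only absent or None counts as missing.
            if ep.get(key) is None:
                problems.append(f"episode {i}: missing {key}")
    return problems


meta = {"control_frequency_hz": 50, "action_space": "joint_position"}
episodes = [
    {"success": True, "segments": [{"label": "grasp", "start": 0, "end": 40}]},
    {"success": False},
]
for problem in find_metadata_gaps(meta, episodes):
    print(problem)
```

Running a check like this on a sample delivery before committing to a full order makes gaps visible early, when they are cheapest to fix.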
FAQ
What is the difference between LeRobot, RLDS and HDF5? LeRobot is Hugging Face's Parquet+MP4+JSON robotics format, RLDS is the TFDS format behind Open X-Embodiment, and HDF5 is a general scientific container used by ALOHA and robomimic. All store per-timestep observations and actions.
Which robotics dataset format should I use? The one your training stack already reads. If you have no constraint, LeRobot has the friendliest tooling and sharing; RLDS is best alongside Open X-Embodiment data.
Can datasets be converted between these formats? Yes - the underlying episode structure is the same, so conversion is schema mapping. The work is in keeping calibration and action-space metadata consistent.
nxted delivers all three formats: see nxted Capture or request a Test Kit.
Physical-AI data specialists at OFORO LTD (UK). We write about egocentric data, robotics dataset formats, RLHF and data governance.