SLAM Data Pipeline

Process your human-collected egocentric video with modern SLAM. Get 6DOF camera poses and sparse 3D reconstructions ready for robot learning.

Upload Video

  • Supports MP4, MOV, AVI; maximum file size: 500 MB
  • CNN mode is more robust for low-texture indoor scenes
  • A lower setting means faster processing
  • Resize the video to save processing time
  • Upload the camera calibration YAML file produced by the calibration tool
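For illustration only, a camera-intrinsics YAML in the common OpenCV style looks roughly like the sketch below. The key names and values here are assumptions; the authoritative schema is whatever the calibration tool actually emits.

```yaml
# Hypothetical OpenCV-style intrinsics file -- verify key names against
# the calibration tool's real output before relying on them.
image_width: 1280
image_height: 720
camera_matrix:              # 3x3 pinhole intrinsics K (row-major)
  rows: 3
  cols: 3
  data: [600.0,   0.0, 640.0,
           0.0, 600.0, 360.0,
           0.0,   0.0,   1.0]
distortion_coefficients:    # typically k1, k2, p1, p2, k3
  data: [-0.10, 0.05, 0.0, 0.0, 0.0]
```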

Related Tools

Need to calibrate your camera first? Use our camera calibration tool to get accurate intrinsics before running SLAM.

About This Pipeline

This SLAM pipeline processes egocentric video collected from humans (e.g., wearable cameras) and produces a ready-to-use dataset for human-to-robot transfer learning.

What you get:

  • 6DOF camera pose (x, y, z + quaternion) for each frame
  • Sparse 3D point cloud of the environment
  • Output in LeRobot format, compatible with existing robot learning code
  • All original video data preserved with added SLAM metadata
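Since each frame's pose is a translation plus a unit quaternion, downstream code often wants it as a 4x4 homogeneous transform. A minimal, dependency-free sketch follows; the field order `(x, y, z, qx, qy, qz, qw)` is an assumption, so check it against the pipeline's actual output schema.

```python
import math

def pose_to_matrix(x, y, z, qx, qy, qz, qw):
    """Convert a 6DOF pose (translation + unit quaternion) into a 4x4
    homogeneous transform. Field order is illustrative, not guaranteed
    to match the pipeline's output."""
    # Normalise defensively in case the stored quaternion drifted.
    n = math.sqrt(qx * qx + qy * qy + qz * qz + qw * qw)
    qx, qy, qz, qw = qx / n, qy / n, qz / n, qw / n
    # Standard quaternion-to-rotation-matrix expansion, translation in
    # the last column.
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qz * qw),     2 * (qx * qz + qy * qw),     x],
        [2 * (qx * qy + qz * qw),     1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qx * qw),     y],
        [2 * (qx * qz - qy * qw),     2 * (qy * qz + qx * qw),     1 - 2 * (qx * qx + qy * qy), z],
        [0.0, 0.0, 0.0, 1.0],
    ]
```

For the identity quaternion `(0, 0, 0, 1)` this returns pure translation, which is a quick sanity check when wiring up a new dataset.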

Available modes:

🗺️ CNN-based (DINOv2 + LoFTR)

  • ✓ Better for low-texture indoor scenes
  • ✓ More robust matching for repetitive patterns
  • ✓ Recommended for human demonstration videos
  • Requires GPU for reasonable speed

📍 ORB-SLAM3 (Traditional)

  • ✓ Mature open-source implementation
  • ✓ Faster on CPU
  • ✓ Good baseline for comparison
  • Can struggle with low-texture areas

The pipeline is open-source: the code is available on GitHub and can be run locally.