A cost-effective, distributed approach to scaling human demonstration data for robot learning — collecting thousands of trajectories at a fraction of traditional in-lab costs.
Scale · Distributed Collection · Cost Advantage · Imitation Learning · Crowdsourcing
🎯
What is the Human Data Collection Pipeline?
Why we need distributed human data collection
💡HDCP enables distributed collection of human demonstration data from crowd workers around the world, instead of relying solely on expensive in-lab expert demonstrations. This dramatically reduces costs while increasing data volume and behavioral diversity.
💰
Ultra Low Cost
Collect 1,000+ demonstrations for hundreds of dollars. Cost per trajectory is typically 10–100× lower than in-lab collection.
📈
Massive Scale
Parallel collection from hundreds of workers simultaneously. Scale to tens of thousands of trajectories in weeks, not months.
🌍
Greater Diversity
Many demonstrators naturally capture variation in styles and approaches, improving model generalization.
⚡
Fast Turnaround
Launch a data collection task and get completed trajectories back within days. Iterate on task design quickly.
🖥️
No Hardware Required
Workers use their own browser. No robot hardware needed for teleoperation data collection in simulation.
🔄
Continuous Data Flow
Keep the pipeline running to continuously add new data for ongoing model improvement over time.
📊
Cost Advantage: Traditional vs HDCP
Comparing approaches for 1,000 demonstration trajectories
| Factor | Traditional In-Lab | HDCP Distributed |
| --- | --- | --- |
| Expert Labor Cost | $50k – $150k | $500 – $2,000 |
| Hardware Investment | $10,000+ | $0 |
| Time to Complete | 3 – 6 months | 1 – 2 weeks |
| Demonstrator Diversity | 1 – 5 people | 50 – 200 people |
| Scaling to 10k trajectories | Prohibitive | Straightforward |
Example HDCP Cost Breakdown — 1,000 trajectories
| Cost Item | Amount |
| --- | --- |
| Worker payment per trajectory | $0.50 – $1.50 |
| Platform fees (MTurk, Prolific, etc.) | +20% markup |
| Quality filtering (automated) | ~$50 |
| Total Estimated Cost | $600 – $1,800 |
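As a quick sanity check on the totals above, here is a back-of-the-envelope calculation; the per-trajectory rate is an illustrative midpoint of the range, not a platform quote:

```python
# Back-of-the-envelope cost estimate for 1,000 trajectories.
n_trajectories = 1_000
pay_per_trajectory = 1.00    # midpoint of the $0.50-$1.50 range
platform_fee = 0.20          # typical ~20% platform markup
filtering_cost = 50.0        # flat automated quality-filtering cost

worker_cost = n_trajectories * pay_per_trajectory
total = worker_cost * (1 + platform_fee) + filtering_cost
print(f"Estimated total: ${total:,.2f}")  # -> Estimated total: $1,250.00
```

The result lands comfortably inside the $600 – $1,800 range; worker payment dominates, so pricing calibration (step 03 below) is where most of the budget leverage sits.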
100× cost reduction vs. in-lab
10× faster time to dataset
20× more diverse demonstrations
⚙️
The Pipeline Step-by-Step
The complete flow from task design to the final dataset
01
Setup
Task & Environment Design
Define goals, success conditions, and build the simulation environment; see the sketch after this list.
Define task goal, success conditions, and reward function
Build simulation environment with proper camera views and rendering
Create reproducible environment resets for each trajectory
Define action space (joint angles, gripper commands, deltas)
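As a concrete starting point for this step, here is a minimal environment skeleton using the Gymnasium API; the pick-and-place task, observation/action dimensions, success threshold, and placeholder dynamics are all illustrative assumptions, not part of the original pipeline:

```python
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class PickPlaceEnv(gym.Env):
    """Hypothetical single-arm pick-and-place task for teleoperation."""

    def __init__(self):
        # Observation: 7 joint angles + 1 gripper state + 7-DoF object pose.
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(15,))
        # Action: joint-angle deltas (7) + gripper open/close command (1).
        self.action_space = spaces.Box(-1.0, 1.0, shape=(8,))

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)          # seeded -> reproducible resets
        self._state = self.np_random.uniform(-0.1, 0.1, size=15)
        return self._state.astype(np.float32), {}

    def step(self, action):
        self._state[:8] += 0.05 * action           # placeholder dynamics
        success = bool(np.linalg.norm(self._state[8:11]) < 0.02)
        reward = 1.0 if success else 0.0           # sparse success reward
        return self._state.astype(np.float32), reward, success, False, {}
```

Seeding the reset is what makes each trajectory reproducible, which the filtering and processing steps below rely on.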
02
Interface
Worker Interface Development
Web-based teleoperation UI accessible from any browser; a backend sketch follows this list.
Web-based teleoperation UI — keyboard, mouse, and joystick support
Clear instructions, tutorial video, and practice trials before recording
Real-time visual feedback on task progress and success signal
One-click submission when complete; no installation required
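On the server side, the browser UI needs an endpoint to receive finished trajectories. A minimal sketch using Flask, assuming a JSON payload of per-step observations and actions plus worker metadata; the route, field names, and local storage path are illustrative, not a prescribed schema:

```python
import json, time, uuid
from pathlib import Path
from flask import Flask, request, jsonify

app = Flask(__name__)
STORAGE = Path("trajectories")            # swap for cloud storage at scale
STORAGE.mkdir(exist_ok=True)

@app.post("/submit")                      # hypothetical route
def submit_trajectory():
    payload = request.get_json()
    required = {"worker_id", "task_id", "observations", "actions", "success"}
    if payload is None or not required <= payload.keys():
        return jsonify(error="missing fields"), 400
    record = {**payload, "received_at": time.time()}
    path = STORAGE / f"{payload['task_id']}_{uuid.uuid4().hex}.json"
    path.write_text(json.dumps(record))
    return jsonify(status="ok", trajectory_id=path.stem)
```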
03
Calibrate
Pilot & Calibration
Small batch (n=50–100) to validate the interface and calibrate pricing; see the pricing sketch after this list.
Check that workers understand the instructions and complete tasks correctly
Measure average completion time per trajectory
Set a fair price — target ~$10–15 per hour for workers
Identify common failure modes and misunderstandings
Spot early which workers produce consistently high-quality data
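The pricing rule is simple arithmetic: pick a target hourly rate and scale by the median completion time measured in the pilot. A minimal sketch, where the pilot timings and the price floor are made-up numbers:

```python
import statistics

# Completion times (seconds) from a 10-trajectory pilot -- made-up numbers.
pilot_times = [95, 110, 87, 140, 102, 98, 125, 90, 115, 105]

TARGET_HOURLY = 12.0          # aim for roughly $10-15/hour
MIN_PRICE = 0.50              # floor so very short tasks still pay fairly

median_s = statistics.median(pilot_times)
price = max(TARGET_HOURLY * median_s / 3600, MIN_PRICE)
print(f"median {median_s:.0f}s -> pay ${price:.2f} per trajectory")
```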
04
Collect
Large-Scale Parallel Collection
100–500 workers collecting in parallel on crowdsourcing platforms; a batch-release sketch follows this list.
Launch HITs on MTurk, Prolific, or similar crowdsourcing platforms
Release batches gradually (e.g., 100 HITs at a time) to maintain quality control
Auto-save trajectories every few seconds to cloud storage
Store raw video + states + actions separately for flexibility
Monitor a real-time progress dashboard; track per-worker statistics
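For MTurk specifically, gradual batch release can be scripted with boto3's `create_hit`; a hedged sketch, where the teleoperation URL, reward, timeouts, and batch size are placeholders (Prolific and similar platforms expose analogous APIs):

```python
import boto3

mturk = boto3.client("mturk", region_name="us-east-1")

# ExternalQuestion embeds the hosted teleoperation UI (placeholder URL).
QUESTION_XML = """<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.com/teleop</ExternalURL>
  <FrameHeight>800</FrameHeight>
</ExternalQuestion>"""

def release_batch(n_hits, reward="1.00"):
    """Release one batch of HITs; rerun after quality checks pass."""
    for _ in range(n_hits):
        mturk.create_hit(
            Title="Teleoperate a simulated robot arm",
            Description="Complete a short pick-and-place task in your browser.",
            Reward=reward,                         # dollars, passed as a string
            MaxAssignments=1,                      # one worker per trajectory
            LifetimeInSeconds=24 * 3600,           # HIT visible for one day
            AssignmentDurationInSeconds=15 * 60,   # per-attempt time limit
            Question=QUESTION_XML,
        )

release_batch(100)   # e.g., 100 HITs at a time, per the batching guideline
```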
05
Filter
Automated Quality Filtering
Remove failed, outlier, and duplicate trajectories automatically; see the filtering sketch after this list.
Filter out timed-out, failed, and too-short trajectories
Outlier detection based on trajectory length and success rate distributions
Clustering to remove duplicate or near-identical behavior patterns
Keep workers with >60% success rate; block poor performers early
Typical retention: 70–85% of all collected trajectories pass filtering
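A minimal sketch of these filters, assuming each trajectory is a dict with `success`, `actions`, and `worker_id` keys (illustrative names); near-duplicate clustering is omitted for brevity, and length outliers are rejected with a simple z-score:

```python
import numpy as np
from collections import defaultdict

def filter_trajectories(trajs, min_len=10, z_max=3.0, min_worker_success=0.6):
    """Drop failures, too-short runs, length outliers, and weak workers."""
    # Pass 1: per-worker success rates (>60% keeps a worker, per the guideline).
    stats = defaultdict(lambda: [0, 0])       # worker_id -> [successes, total]
    for t in trajs:
        stats[t["worker_id"]][0] += int(t["success"])
        stats[t["worker_id"]][1] += 1
    ok_workers = {w for w, (s, n) in stats.items() if s / n >= min_worker_success}

    # Pass 2: drop failures, too-short runs, and blocked workers up front.
    kept = [t for t in trajs
            if t["success"] and len(t["actions"]) >= min_len
            and t["worker_id"] in ok_workers]

    # Pass 3: z-score outlier rejection on trajectory length.
    lengths = np.array([len(t["actions"]) for t in kept], dtype=float)
    mu, sigma = lengths.mean(), lengths.std() + 1e-8
    return [t for t, l in zip(kept, lengths) if abs(l - mu) / sigma <= z_max]
```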
06
Process
Data Processing & Format Conversion
Normalize and convert raw trajectories into a training-ready dataset; a processing sketch follows this list.
Resample observations to a consistent frequency (e.g., 10 Hz)
Extract action tensors from raw teleoperation input (joint angles, gripper)
Normalize observations and actions to zero-mean, unit-variance
Split into train / validation / test sets (e.g., 80 / 10 / 10)
Convert to framework dataset format (RLDS, HDF5, JSON, etc.)
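A sketch of the core processing utilities, assuming each trajectory's observations and actions are already stacked as (T, D) NumPy arrays; index-striding resampling stands in for proper interpolation:

```python
import numpy as np

def resample(arr, src_hz, dst_hz=10):
    """Subsample a (T, D) array from src_hz down to dst_hz by index striding."""
    idx = np.arange(0, len(arr), src_hz / dst_hz).astype(int)
    return arr[idx]

def normalize(arr, eps=1e-8):
    """Zero-mean, unit-variance per dimension; return stats for de-normalizing."""
    mean, std = arr.mean(axis=0), arr.std(axis=0) + eps
    return (arr - mean) / std, (mean, std)

def split(trajs, seed=0, frac=(0.8, 0.1, 0.1)):
    """Shuffle whole trajectories, then cut into train/val/test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(trajs))
    n_tr, n_va = int(frac[0] * len(trajs)), int(frac[1] * len(trajs))
    return ([trajs[i] for i in order[:n_tr]],
            [trajs[i] for i in order[n_tr:n_tr + n_va]],
            [trajs[i] for i in order[n_tr + n_va:]])
```

Splitting at the trajectory level (rather than the timestep level) matters: frames from one demonstration are highly correlated, so mixing them across splits would leak test information into training.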
07
Train
Train & Iterate
Train your policy, evaluate performance, identify gaps, and repeat; see the training sketch after this list.
Train the policy via imitation learning (BC, Diffusion Policy, ACT, etc.)
Evaluate success rate and generalization on held-out test set
Identify under-represented scenarios and task failure modes
Collect additional targeted data for those gaps
Repeat until target performance is reached — the flywheel compounds
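To close the loop, a minimal behavior-cloning sketch in PyTorch; the MLP policy, random placeholder tensors, and hyperparameters are illustrative stand-ins for your processed dataset:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data standing in for normalized (obs, action) pairs.
obs = torch.randn(5000, 15)      # matches the 15-dim observation sketch above
acts = torch.randn(5000, 8)      # matches the 8-dim action sketch above
loader = DataLoader(TensorDataset(obs, acts), batch_size=256, shuffle=True)

policy = nn.Sequential(
    nn.Linear(15, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 8),
)
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

for epoch in range(10):
    for o, a in loader:
        loss = nn.functional.mse_loss(policy(o), a)  # BC: regress actions
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```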
✅
Key Success Factors
Best practices for getting high-quality data
🎮 Make It Easy to Control
Support multiple input modalities (mouse, keyboard, gamepad) with automatic sensitivity adjustment for each worker's setup.
📝 Clear Instructions
A 60-second video tutorial is worth 1,000 words. Show exactly what to do, what counts as success, and what to avoid.
⚡ Early Quality Filtering
Check quality after the first few trajectories from each worker. Block poor performers early to save money and maintain dataset quality.
💰 Fair Payment
Pay at least $10/hour. Better pay attracts better workers who produce higher quality data — this directly impacts model performance.
🔄 Allow Multiple Attempts
Workers improve with practice. Allowing retries produces better trajectories and reduces frustration-driven abandonment.
🎯 Auto-Reset Environment
One-click reset for failed attempts. Low-friction workflows keep workers engaged and completing more high-quality demonstrations.
⚠️Common pitfall: Collecting overly complex tasks from non-expert workers. Start with simple, atomic tasks completable in 1–2 minutes, and chain simpler skills together rather than recording one complex monolithic task.
🤔
When Should You Use This Approach?
Where this approach fits, and where it does not
✓ Good For
Simulation-based tasks: any task where you need many demonstrations in a sim environment
Imitation learning: behavior cloning needs massive, diverse demonstration data
Multi-task skill collection: collect many different skills from different workers in parallel
Finetuning pre-trained policies: adding diverse trajectories to improve generalization
✗ Less Suitable For
Real-world physical robots: physical hardware still needs in-lab collection
Ultra-precision tasks: tasks requiring expert-level precision may still need domain experts
Safety-critical tasks: when failure is extremely expensive, prefer expert-supervised collection
Very long-horizon tasks: tasks over 10 minutes are hard for crowd workers — split into smaller steps
🚀
Getting Started
Your first data collection project checklist
1
Start small
Begin with a simple task taking 1–2 minutes per demonstration. Don't start with your most complex task.
2
Build the web interface
Use HTML5/JavaScript so workers just click a link — no installation needed. Three.js or Unity WebGL work well.
3
Make a tutorial
Record a 60-second screencast showing how to do the task. This single step dramatically improves data quality.
4
Run a pilot of 50 trajectories
Check results, see where workers struggle, adjust instructions and difficulty before scaling.
5
Scale up in batches
Release 100 trajectories at a time, monitor quality continuously, and retain your best workers.
6
Process and train
Run quality filters, convert to your dataset format, start training, and identify gaps for the next iteration.
💡The XRollout platform provides built-in tools for distributed human data collection. Join our community to get access to the infrastructure.
📄
Original Document
This article summarizes the Human Data Collection Pipeline approach originally published at Physical Intelligence (π).