VITRA: Scalable Vision-Language-Action Model Pretraining for Robotic Manipulation with Real-Life Human Activity Videos
GitHub stars: 53
Rank: 453155