🤖 Video-Centric Approaches to General Robotic Manipulation Learning

Junming Wang

Methods:

| Year | Venue | Paper Title | Link |
|------|-------|-------------|------|
| 2024.11 | arXiv | VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation | ---- |
| 2024.11 | arXiv | Grounding Video Models to Actions through Goal Conditioned Exploration | Project Page |
| 2024.11 | Microsoft | IGOR: Image-GOal Representations | Project Page |
| 2024.11 | Physical Intelligence | π0: A Vision-Language-Action Flow Model for General Robot Control | Project Page |
| 2024.10 | arXiv | VideoAgent: Self-Improving Video Generation | Project Page |
| 2024.10 | arXiv | Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Dataset | Project Page |
| 2024.10 | CoRL | OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation | Project Page |
| 2024.10 | CoRL | Differentiable Robot Rendering | Project Page |
| 2024.10 | arXiv | Latent Action Pretraining from Videos | Project Page |
| 2024.10 | arXiv | Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation | Project Page |
| 2024.10 | arXiv | VLM See, Robot Do: Human Demo Video to Robot Action Plan via Vision Language Model | ---- |
| 2024.10 | arXiv | GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation | Project Page |
| 2024.09 | arXiv | DynaMo: In-Domain Dynamics Pretraining for Visuo-Motor Control | Project Page |
| 2024.09 | arXiv | Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation | Project Page |
| 2024.09 | NeurIPS | Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation | Project Page |
| 2024.07 | CoRL | Flow as the Cross-Domain Manipulation Interface | Project Page |
| 2024.07 | arXiv | This&That: Language-Gesture Controlled Video Generation for Robot Planning | Project Page |
| 2024.06 | arXiv | ARDuP: Active Region Video Diffusion for Universal Policies | ---- |
| 2024.06 | CoRL | Dreamitate: Real-World Visuomotor Policy Learning via Video Generation | Project Page |
| 2024.05 | ECCV | Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation | Project Page |
| 2023.12 | RSS | Any-point Trajectory Modeling for Policy Learning | Project Page |
| 2023.10 | arXiv | Learning to Act from Actionless Videos through Dense Correspondences | Project Page |
| 2023.10 | arXiv | Video Language Planning | Project Page |