# 🤖 Video-Centric Approaches to General Robotic Manipulation Learning

Junming Wang


## Methods

| Year | Venue | Paper Title | Link |
| :---: | :---: | :--- | :---: |
| 2024.11 | arXiv | VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation | ---- |
| 2024.11 | arXiv | Grounding Video Models to Actions through Goal Conditioned Exploration | Project Page |
| 2024.11 | Microsoft | IGOR: Image-GOal Representations | Project Page |
| 2024.11 | Physical Intelligence | π0: A Vision-Language-Action Flow Model for General Robot Control | Project Page |
| 2024.10 | arXiv | VideoAgent: Self-Improving Video Generation | Project Page |
| 2024.10 | arXiv | Robots Pre-train Robots: Manipulation-Centric Robotic Representation from Large-Scale Robot Dataset | Project Page |
| 2024.10 | CoRL | OKAMI: Teaching Humanoid Robots Manipulation Skills through Single Video Imitation | Project Page |
| 2024.10 | CoRL | Differentiable Robot Rendering | Project Page |
| 2024.10 | arXiv | Latent Action Pretraining from Videos | Project Page |
| 2024.10 | arXiv | Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation | Project Page |
| 2024.10 | arXiv | VLM See, Robot Do: Human Demo Video to Robot Action Plan via Vision Language Model | ---- |
| 2024.10 | arXiv | GR-2: A Generative Video-Language-Action Model with Web-Scale Knowledge for Robot Manipulation | Project Page |
| 2024.09 | arXiv | DynaMo: In-Domain Dynamics Pretraining for Visuo-Motor Control | Project Page |
| 2024.09 | arXiv | Gen2Act: Human Video Generation in Novel Scenarios enables Generalizable Robot Manipulation | Project Page |
| 2024.09 | NeurIPS | Closed-Loop Visuomotor Control with Generative Expectation for Robotic Manipulation | Project Page |
| 2024.07 | CoRL | Flow as the Cross-Domain Manipulation Interface | Project Page |
| 2024.07 | arXiv | This&That: Language-Gesture Controlled Video Generation for Robot Planning | Project Page |
| 2024.06 | arXiv | ARDuP: Active Region Video Diffusion for Universal Policies | ---- |
| 2024.06 | CoRL | Dreamitate: Real-World Visuomotor Policy Learning via Video Generation | Project Page |
| 2024.05 | ECCV | Track2Act: Predicting Point Tracks from Internet Videos enables Generalizable Robot Manipulation | Project Page |
| 2023.12 | RSS | Any-point Trajectory Modeling for Policy Learning | Project Page |
| 2023.10 | arXiv | Learning to Act from Actionless Videos through Dense Correspondences | Project Page |
| 2023.10 | arXiv | Video Language Planning | Project Page |