Sonny George & Alex Danilkovas
Noetic
This workspace relies on the following software and libraries:
- ros-noetic-interbotix-xsarm
- ros-noetic-gazebo-ros
- rospy
- gym
- torch
- numpy
- Gazebo
- RViz
Although a ROS package, this repository is meant to be used as a workspace. `object-picker` supports the Interbotix px100 robot arm and includes:
- Training RL policies (PPO or SAC) for picking up a "T"-shaped object.
- Initializing policies with IL (Behavioral Cloning).
- Running trained policies on the real robot.
- Pre-trained models from our experiments.
For more details on our process and general information, see our lab notebook.
Video Walkthrough:
The `src` directory of our package (`object-picker`) is structured as follows:
📁 src
├── 📄 config.py # global constants and parameters
├── 📄 env.py # training-env code (reward logic & training primitives)
├── 📄 gazebo.py # code handling topic and service comms with gazebo sim
├── 📄 run.py # loads and runs policies on real px100
├── 📄 train.py # main entrypoint for training
└── 📄 utils.py # generic helper functions
ℹ️ Run training: To run the training script with the parameters specified in the `if __name__ == '__main__':` block, you must:
- Launch ROS with:
roslaunch
- Start the Gazebo simulation with:
roslaunch interbotix_xsarm_gazebo xsarm_gazebo.launch robot_model:=px100 use_position_controllers:=true gui:=false use_rviz:=true
Or (depending on whether you want the Gazebo GUI or RViz to open):
roslaunch interbotix_xsarm_gazebo xsarm_gazebo.launch robot_model:=px100 use_position_controllers:=true gui:=true
- Start the training script with:
rosrun object-picker train.py
Or, `cd` into the `src` directory and run:
python train.py
ℹ️ Run real px100: To run the trained models on the real px100 robot arm:
- Run:
rosrun object-picker run.py
Or, `cd` into the `src` directory and run:
python run.py
The reward function is stateful and is as follows:
- 'Assuming lifting position' term:
  - 'Get low' term: Z-value of the gripper.
  - 'Get close' term: Only once the gripper is low (and not before), the negative of the distance between the gripper and the object (since the forklift must get low before it can insert).
  - Once the gripper is both low and close (and not before), the 'get low' term is removed and replaced with a constant 1.0 (to prevent the 'get low' term from discouraging the lifting up of the object).
- 'Lifting the object' term: The negative of the distance between the object and the goal position (raised in the air).
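The staged logic of the reward can be illustrated with a minimal sketch. This is not the code in `env.py`. The class name `StatefulReward`, the thresholds `LOW_Z` and `CLOSE_DIST`, and the goal position `GOAL_POS` are hypothetical. The sketch also assumes that the terms are summed, that the 'get low' term enters with a negative sign (so a lower gripper scores higher), and that the "low" and "close" flags stay set once reached until the episode is reset.

```python
import math

# Hypothetical values for illustration only; the real ones would live in config.py.
LOW_Z = 0.05                 # gripper counts as "low" at or below this height
CLOSE_DIST = 0.03            # gripper counts as "close" within this distance
GOAL_POS = (0.2, 0.0, 0.15)  # raised goal position for the object


class StatefulReward:
    """Sketch of a reward that remembers which stages were already reached."""

    def __init__(self, low_z=LOW_Z, close_dist=CLOSE_DIST, goal_pos=GOAL_POS):
        self.low_z = low_z
        self.close_dist = close_dist
        self.goal_pos = goal_pos
        self.reset()

    def reset(self):
        # Call at the start of every episode.
        self.got_low = False
        self.got_close = False

    def __call__(self, gripper_pos, object_pos):
        gripper_to_object = math.dist(gripper_pos, object_pos)

        # Update the latched stage flags (assumed to persist once set).
        if gripper_pos[2] <= self.low_z:
            self.got_low = True
        if self.got_low and gripper_to_object <= self.close_dist:
            self.got_close = True

        # 'Get low' term: replaced by a constant 1.0 once low and close,
        # so it no longer discourages lifting.
        if self.got_low and self.got_close:
            low_term = 1.0
        else:
            low_term = -gripper_pos[2]  # sign is an assumption

        # 'Get close' term: only active once the gripper has been low.
        close_term = -gripper_to_object if self.got_low else 0.0

        # 'Lifting the object' term: negative distance from object to goal.
        lift_term = -math.dist(object_pos, self.goal_pos)

        return low_term + close_term + lift_term
```

Under these assumptions, an episode would call `reset()` once and then evaluate the reward with the current gripper and object positions at every step.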
Node Created | Function |
---|---|
px100-training | Orchestrate policy training by: 1. subscribing to the joint-state topics 2. publishing to joint-control topics |
Topics and Their Messages | Function |
---|---|
/px100/waist_controller/state | publish waist position |
/px100/waist_controller/command | receive target waist position commands |
/px100/shoulder_controller/state | publish shoulder position |
/px100/shoulder_controller/command | receive target shoulder position commands |
/px100/elbow_controller/state | publish elbow position |
/px100/elbow_controller/command | receive target elbow position commands |
/px100/wrist_angle_controller/state | publish wrist angle position |
/px100/wrist_angle_controller/command | receive target wrist angle position commands |
/px100/right_finger_controller/state | publish right finger position |
/px100/right_finger_controller/command | receive target right finger position commands |
/px100/left_finger_controller/state | publish left finger position |
/px100/left_finger_controller/command | receive target left finger position commands |
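
The topic layout above follows one pattern per joint, which a node can exploit when wiring up its subscribers and publishers. The sketch below is illustrative and is not taken from `gazebo.py`. The helper names `controller_topics` and `connect` and the node name are hypothetical. The message types (`control_msgs/JointControllerState` for the state topics and `std_msgs/Float64` for the command topics) are an assumption based on the standard `ros_control` joint position controllers; check them with `rostopic info` before relying on them.

```python
JOINTS = ("waist", "shoulder", "elbow", "wrist_angle", "right_finger", "left_finger")


def controller_topics(robot="px100", joints=JOINTS):
    """Map each joint name to its state and command topic names."""
    return {
        joint: {
            "state": f"/{robot}/{joint}_controller/state",
            "command": f"/{robot}/{joint}_controller/command",
        }
        for joint in joints
    }


def connect(node_name="px100_training_sketch"):
    """Create one subscriber and one publisher per joint (hypothetical helper).

    ROS imports are local so that controller_topics() works without ROS installed.
    The message types used here are assumptions, see the text above.
    """
    import rospy
    from control_msgs.msg import JointControllerState
    from std_msgs.msg import Float64

    rospy.init_node(node_name)
    latest = {}      # joint name -> most recently reported position
    publishers = {}  # joint name -> publisher for target positions
    for joint, topics in controller_topics().items():
        rospy.Subscriber(
            topics["state"],
            JointControllerState,
            # Bind `joint` as a default argument so each callback keeps its own joint.
            lambda msg, joint=joint: latest.__setitem__(joint, msg.process_value),
        )
        publishers[joint] = rospy.Publisher(topics["command"], Float64, queue_size=1)
    return latest, publishers
```

With the Gazebo simulation running, `latest, publishers = connect()` followed by `publishers["waist"].publish(0.5)` would, under the same assumptions, send a target waist position while `latest` fills in with reported joint positions.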