
BUMBLE: Unifying Reasoning and Acting with Vision-Language Models for Building-wide Mobile Manipulation

Rutav Shah, Albert Yu, Yifeng Zhu, Yuke Zhu*, Roberto Martín-Martín*
*Equal Advising

[Paper] [Project Website]

Setup

Installing ROS

Follow the guide provided at the official ROS wiki to install ROS Noetic on your system.
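On Ubuntu 20.04, the steps from that guide condense to roughly the following (a sketch for convenience; defer to the ROS wiki for the authoritative, up-to-date commands):

sudo sh -c 'echo "deb http://packages.ros.org/ros/ubuntu $(lsb_release -sc) main" > /etc/apt/sources.list.d/ros-latest.list'
sudo apt install curl
curl -s https://raw.githubusercontent.com/ros/rosdistro/master/ros.asc | sudo apt-key add -
sudo apt update
sudo apt install ros-noetic-desktop-full
echo "source /opt/ros/noetic/setup.bash" >> ~/.bashrc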

Setting up Python Environment

git clone git@github.com:UT-Austin-RobIn/BUMBLE.git
conda create -y -n bumble python=3.9
conda activate bumble
cd BUMBLE
python -m pip install -r requirements.txt
python -m pip install -r rospy_requirements.txt
git clone https://github.com/mjd3/tracikpy.git
python -m pip install tracikpy/
python -m pip install -e .

Installing GSAM:

Download the SAM-HQ weights from the original repository. We use the ViT-B HQ-SAM model weights.
Set the environment variable:

export SAM_CKPT_PATH=/path/to/sam_hq_vit_b.pth

Then clone and build GSAM itself:

git clone git@github.com:IDEA-Research/Grounded-Segment-Anything.git
cd Grounded-Segment-Anything/GroundingDINO && python setup.py build && python setup.py install
cd ../../
python -m pip install -e Grounded-Segment-Anything/segment_anything/

Setting up VLM API

Set the environment variable OPENAI_API_KEY to your OpenAI API key.
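For example (the key value is a placeholder; the curl call is only an optional sanity check against the standard OpenAI models endpoint):

export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
# Optional: verify the key is accepted by listing available models
curl -s https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY" | head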

Setting up Building Occupancy Map

We use PAL's change_map rosservice to set the map for the Tiago robot (see set_floor_map inside bumble/tiago/ros_restrict). You should add similar functionality to set the 2D occupancy map programmatically for your own ROS setup; a command-line sketch is shown below.
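For reference, on a PAL Tiago the map switch can typically be triggered from the command line as follows. Treat the service path and request field below as assumptions (they vary by software version); check rosservice list on your robot and set_floor_map in bumble/tiago/ros_restrict for what BUMBLE actually calls.

# Find the map-management services exposed on your robot
rosservice list | grep -i map
# Switch the active 2D occupancy map (service path and field name assumed; adjust to your setup)
rosservice call /pal_map_manager/change_map "input: '<MAP_NAME>'"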

After setting the map, configure the landmark locations for the GoToLandmark skill. You can add your own landmarks or use the provided landmarks.
Example of landmark image structure:

bumble/tiago/skills/landmark_images/{BUILDING_NAME}_landmark_images{FLOOR_NUM}/{BUILDING_NAME}{FLOOR_NUM}_{LANDMARK_INDEX}.jpg

Note: The provided landmarks correspond to university buildings used in the experiments and are mapped to the relevant building occupancy maps.

Usage

To launch BUMBLE's main script (rw_eval.py), run the following command:

python rw_eval.py --run_vlm --add_selection_history --add_past --exec --method ours --floor_num <FLOOR_NUM> --bld <BUILDING_NAME> --eval_id 2 --n_eval 1 --run_dir <PATH_TO_EXP_DIR>

To run without long-term memory, remove the --add_past flag.

Citation

@article{shah2024bumble,
   title={BUMBLE: Unifying Reasoning and Acting with Vision-Language Models for Building-wide Mobile Manipulation},
   author={Shah, Rutav and Yu, Albert and Zhu, Yifeng and Zhu, Yuke and Mart{\'\i}n-Mart{\'\i}n, Roberto},
   journal={arXiv preprint arXiv:2410.06237},
   year={2024}
}
