Paper | Video | Project Page
Code release for our ECCV 2022 paper "ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild" by Wang Zhao, Shaohui Liu, Hengkai Guo, Wenping Wang and Yong-Jin Liu.
[Introduction] ParticleSfM is an offline structure-from-motion system for videos (image sequences). Inspired by Particle Video, our method connects pairwise optical flows and optimizes dense point trajectories as long-range video correspondences, which are used in a customized global structure-from-motion framework with similarity averaging and global bundle adjustment. In particular, for dynamic scenes, the acquired dense point trajectories can be fed into a specially designed trajectory-based motion segmentation module to select static point tracks, enabling the system to produce reliable camera trajectories on in-the-wild sequences with complex foreground motion.
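To illustrate what "connecting pairwise optical flows" means, here is a minimal sketch that follows a single point through consecutive flow fields. It is not the ParticleSfM implementation: the real system uses RAFT flows and additionally optimizes the trajectories. The names `chain_trajectory` and `in_bounds`, the nested-list flow layout, and the nearest-pixel lookup are all assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]
# Assumed layout: flow[row][col] = (dx, dy), the displacement of that pixel
# from frame t to frame t+1.
Flow = List[List[Tuple[float, float]]]


def in_bounds(point: Point, width: int, height: int) -> bool:
    """True if the point, rounded to the nearest pixel, lies inside the image."""
    x, y = point
    return 0 <= round(x) < width and 0 <= round(y) < height


def chain_trajectory(start: Point, flows: List[Flow]) -> List[Point]:
    """Follow one point through a list of consecutive flow fields."""
    track = [start]
    x, y = start
    for flow in flows:
        height, width = len(flow), len(flow[0])
        if not in_bounds((x, y), width, height):
            break
        dx, dy = flow[round(y)][round(x)]  # nearest-pixel lookup (simplification)
        x, y = x + dx, y + dy
        if not in_bounds((x, y), width, height):
            break  # the point left the image, so the track ends here
        track.append((x, y))
    return track


# Toy example: a 4x4 image in which every pixel moves one pixel to the right.
uniform_flow = [[(1.0, 0.0)] * 4 for _ in range(4)]
track = chain_trajectory((0.0, 2.0), [uniform_flow] * 5)
print(track)
```

In this toy case the track ends as soon as the point would leave the image, which is why it is shorter than the number of flow fields.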
Contact Wang Zhao ([email protected]), Shaohui Liu ([email protected]) and Hengkai Guo ([email protected]) with questions, comments and bug reports.
If you are interested in potential collaboration or internship at ByteDance, please feel free to contact Hengkai Guo ([email protected]).
- Install dependencies:
- Set up Python environment with Conda:
conda env create -f particlesfm_env.yaml
conda activate particlesfm
- Build our point trajectory optimizer and global structure-from-motion module.
- The path to your customized python executable should be set here.
- (Optional) Add another gcc search path (e.g. gcc 9) here to compile gmapper correctly.
git submodule update --init --recursive
sudo apt-get install libhdf5-dev
bash scripts/build_all.sh
- Download pretrained models for MiDaS, RAFT and our motion segmentation module (download script).
bash scripts/download_all_models.sh
- Download two example in-the-wild sequences [Google Drive] from DAVIS: snowboard and train:
bash ./scripts/download_examples.sh
- Example command to run the reconstruction (e.g. on snowboard):
python run_particlesfm.py --image_dir ./example/snowboard/images --output_dir ./outputs/snowboard/
Alternatively, pass a workspace directory that contains the images folder, as in the command below. With this option, all outputs are written to the same workspace.
python run_particlesfm.py --workspace_dir ./example/snowboard/
- Visualize the outputs with either the COLMAP GUI or your customized visualizer. We also provide a visualization script:
python -m pip install open3d pycolmap
python visualize.py --input_dir ./outputs/snowboard/sfm/model --visualize
The results below are expected (left: snowboard; right: train):
- Given an image sequence, put all the images in the same folder. The sorted order of the file names should match the order of the frames in the sequence.
- Use the following command to run our whole pipeline:
python run_particlesfm.py --image_dir /path/to/the/image/folder/ \
--output_dir /path/to/output/workspace/
This will sequentially run optical flow -> point trajectory -> motion seg -> sfm. The final results will be saved inside the image data folder with COLMAP output format.
If you have the prior information that the scene to be reconstructed is fully static, you can skip the motion segmentation module with --assume_static. Conversely, if you only want to run the motion segmentation, attach --skip_sfm to the command.
To speed up:
- Use "--skip_path_consistency" to skip the path consistency optimization of point trajectories
- Try a higher down-sampling ratio for optimizing point trajectories, e.g. "--sample_ratio 4"
- Visualize the outputs using the COLMAP GUI (download the COLMAP binary and import the data sequence directory) or your own visualizer.
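The steps above depend on two things that are easy to get wrong: file names whose sorted order differs from the frame order (for example frame10 sorting before frame2), and the combination of optional flags. The sketch below is a hypothetical pre-flight helper, not part of the repository. It assumes that the intended frame order is the numeric-aware order of the file names, and it only assembles the flags documented above (--image_dir, --output_dir, --assume_static, --skip_sfm, --skip_path_consistency, --sample_ratio).

```python
import re
from typing import List, Optional


def natural_key(name: str) -> list:
    """Split digits from text so that 'frame10' sorts after 'frame2'."""
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", name)]


def sorted_order_matches_sequence(names: List[str]) -> bool:
    """True if plain string sorting agrees with numeric-aware sorting."""
    return sorted(names) == sorted(names, key=natural_key)


def build_command(
    image_dir: str,
    output_dir: str,
    assume_static: bool = False,
    skip_sfm: bool = False,
    skip_path_consistency: bool = False,
    sample_ratio: Optional[int] = None,
) -> List[str]:
    """Assemble the run_particlesfm.py call from the flags documented above."""
    cmd = ["python", "run_particlesfm.py",
           "--image_dir", image_dir, "--output_dir", output_dir]
    if assume_static:
        cmd.append("--assume_static")  # skip motion segmentation
    if skip_sfm:
        cmd.append("--skip_sfm")  # stop after motion segmentation
    if skip_path_consistency:
        cmd.append("--skip_path_consistency")
    if sample_ratio is not None:
        cmd += ["--sample_ratio", str(sample_ratio)]
    return cmd


# Unpadded frame numbers break plain sorting; zero-padded ones do not.
print(sorted_order_matches_sequence(["frame1.png", "frame2.png", "frame10.png"]))
print(sorted_order_matches_sequence(["frame01.png", "frame02.png", "frame10.png"]))
print(" ".join(build_command("imgs", "out", assume_static=True, sample_ratio=4)))
```

The returned list can be passed to `subprocess.run` from the repository root. If the ordering check fails, renaming the frames with zero-padded indices is one way to make the two orders agree.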
- Download the Sintel dataset. You also need to download the groundtruth camera motion data and the generated motion mask to evaluate the pose and motion segmentation.
- Prepare the sequences:
python scripts/prepare_sintel.py --src_dir /path/to/your/sintel/training/final/ \
--tgt_dir /path/to/the/data/root/dir/want/to/save/
- Run ParticleSfM reconstructions:
python run_particlesfm.py --root_dir /path/to/the/data/root/dir/
- To evaluate the camera poses:
python ./evaluation_evo/eval_sintel.py --input_dir /path/to/the/data/root/dir/ \
--gt_dir /path/to/the/sintel/training/data/camdata_left/ \
--dataset sintel
This will output a txt file with detailed error metrics. Also, the camera trajectories are plotted and saved inside each data sequence folder.
- To evaluate the motion segmentation:
python ./motion_seg/eval_traj_iou.py --root_dir /path/to/the/data/root/dir/ \
--gt_dir /path/to/the/sintel/rigidity/
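The name of the evaluation script suggests an intersection-over-union (IoU) score computed on trajectory labels. As a reminder of how IoU works for binary labels, here is a minimal sketch. The flat list of per-trajectory booleans and the function name `binary_iou` are assumptions for illustration; the actual script may store and aggregate labels differently.

```python
from typing import Sequence


def binary_iou(pred: Sequence[bool], gt: Sequence[bool]) -> float:
    """IoU of the positive (e.g. moving) class for two equally long label lists."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    intersection = sum(1 for p, g in zip(pred, gt) if p and g)
    union = sum(1 for p, g in zip(pred, gt) if p or g)
    # Convention chosen here: two empty label sets count as a perfect match.
    return intersection / union if union else 1.0


# One label agrees as positive, three labels are positive in at least one list.
score = binary_iou([True, True, False, False], [True, False, True, False])
print(score)
```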
- Download the test split of the ScanNet dataset and extract the data from the .sens files using the official script.
- Prepare the sequences:
python scripts/prepare_scannet.py --src_dir /path/to/your/scannet/test/scans_test/ \
--tgt_dir /path/to/the/data/root/dir/want/to/save/
We use the first 20 sequences of the test split, downsample them with a stride of 3, and resize the images to 640x480.
- Run ParticleSfM reconstructions:
python run_particlesfm.py --root_dir /path/to/the/data/root/dir/ \
--flow_check_thres 3.0 --assume_static
- To evaluate the camera poses:
python ./evaluation_evo/eval_scannet.py --input_dir /path/to/the/data/root/dir/ \
--gt_dir /path/to/the/scannet/test/scans_test/ \
--dataset scannet
This will output a txt file with detailed error metrics. Also, the camera trajectories are plotted and saved inside each data sequence folder.
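The ScanNet preparation described above keeps every third frame and resizes the images to 640x480. A rough sketch of those two operations follows. It is not the code of scripts/prepare_scannet.py: the function names are hypothetical, the frame list is assumed to be in temporal order already, and the choice of Pillow with bilinear interpolation is an assumption.

```python
from typing import List, Tuple


def select_frames(frame_names: List[str], stride: int = 3) -> List[str]:
    """Keep every stride-th frame of a list that is already in temporal order."""
    if stride < 1:
        raise ValueError("stride must be a positive integer")
    return frame_names[::stride]


def resize_frame(src_path: str, dst_path: str,
                 size: Tuple[int, int] = (640, 480)) -> None:
    """Resize one image to (width, height) and write it to dst_path."""
    from PIL import Image  # Pillow, imported lazily so the selection step has no dependency

    with Image.open(src_path) as img:
        # Bilinear interpolation is an assumption, not taken from the repository.
        img.resize(size, Image.BILINEAR).save(dst_path)


# Ten frames named by index, as an example of temporally ordered input.
frames = [f"{i}.jpg" for i in range(10)]
kept = select_frames(frames)
print(kept)
```

Note that frames named by bare index, as in this example, do not sort correctly as strings, so they should be ordered numerically before subsampling.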
- Download the FlyingThings3D dataset from the official website. We need the RGB images (finalpass) and optical flow data.
- Download the generated binary motion labels from here or Google Drive, and unpack this archive into the root directory of the FlyingThings3D dataset. We thank the authors of MPNet for kindly sharing it.
- Prepare the training data:
python ./scripts/prepare_flyingthings3d.py --src_dir /path/to/your/flyingthings3d/data/root/
- To launch the training, set up your config file inside ./motion_seg/configs/ and then run:
cd ./motion_seg/
python train_seq.py ./configs/your-config-file
cd ..
@inproceedings{zhao2022particlesfm,
author = {Zhao, Wang and Liu, Shaohui and Guo, Hengkai and Wang, Wenping and Liu, Yong-Jin},
title = {ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild},
booktitle = {European Conference on Computer Vision (ECCV)},
year = {2022}
}
- DynaSLAM. Bescos et al. DynaSLAM: Tracking, Mapping and Inpainting in Dynamic Scenes. IROS 2018.
- TrianFlow. Zhao et al. Towards Better Generalization: Joint Depth-Pose Learning without PoseNet. CVPR 2020.
- VOLDOR. Min et al. VOLDOR-SLAM: For the times when feature-based or direct methods are not good enough. ICRA 2021.
- DROID-SLAM. Teed et al. DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras. NeurIPS 2021.
This project would not have been possible without the great open-source works from COLMAP, Theia, hloc, RAFT, MiDaS and OANet. We sincerely thank them all.