20240219-PointMamba: A Simple State Space Model for Point Cloud Analysis

fowlerovski/PointMamba

 
 

PointMamba

A Simple State Space Model for Point Cloud Analysis

Dingkang Liang¹*, Xin Zhou¹*, Xinyu Wang¹*, Xingkui Zhu¹, Wei Xu¹, Zhikang Zou², Xiaoqing Ye², and Xiang Bai¹

¹ Huazhong University of Science & Technology, ² Baidu Inc.

(*) equal contribution

ArXiv (arXiv:2402.10739)

Abstract

Transformers have become one of the foundational architectures in point cloud analysis because of their excellent global modeling ability. However, the attention mechanism has quadratic complexity in sequence length, which makes it difficult to extend to long sequences under limited computational resources. Recently, state space models (SSMs), a new family of deep sequence models, have shown great potential for sequence modeling in NLP tasks. In this paper, taking inspiration from the success of SSMs in NLP, we propose PointMamba, a framework with global modeling and linear complexity. Specifically, taking embedded point patches as input, we propose a reordering strategy that enhances the SSM's global modeling ability by providing a more logical geometric scanning order. The reordered point tokens are then sent to a series of Mamba blocks to causally capture the point cloud structure. Experimental results show that PointMamba outperforms its transformer-based counterparts on different point cloud analysis datasets while using about 44.3% fewer parameters and 25% fewer FLOPs, demonstrating its potential as an option for constructing foundational 3D vision models. We hope PointMamba can provide a new perspective for point cloud analysis.
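The abstract does not spell out the reordering strategy, so the following is only a minimal sketch of one plausible reading: patch tokens are sorted by the coordinates of their patch centers along each axis, and the three sorted sequences are concatenated before being fed to the sequence model. The function name `reorder_tokens`, the per-axis sort, and the concatenation are all assumptions made for illustration, not the repository's actual implementation.

```python
from typing import List, Sequence, Tuple, TypeVar

Point = Tuple[float, float, float]
Token = TypeVar("Token")


def reorder_tokens(centers: Sequence[Point], tokens: Sequence[Token]) -> List[Token]:
    """Hypothetical reordering: scan tokens along x, then y, then z.

    For each axis, tokens are sorted by the matching coordinate of their
    patch center; the three orderings are concatenated, so the output is
    three times as long as the input.
    """
    if len(centers) != len(tokens):
        raise ValueError("each token needs exactly one patch center")

    reordered: List[Token] = []
    for axis in range(3):  # 0 = x, 1 = y, 2 = z
        order = sorted(range(len(centers)), key=lambda i: centers[i][axis])
        reordered.extend(tokens[i] for i in order)
    return reordered


if __name__ == "__main__":
    centers = [(0.3, 0.1, 0.9), (0.1, 0.5, 0.2), (0.2, 0.9, 0.5)]
    tokens = ["a", "b", "c"]  # stand-ins for embedded point patches
    print(reorder_tokens(centers, tokens))
```

A causal model such as a Mamba block only sees earlier tokens at each step, so a scan order that keeps spatially adjacent patches close together in the sequence is one way to give each token more useful context.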

Overview

Main Results

| Task | Dataset | Config | Acc. (Scratch) | Download (Scratch) | Acc. (Pre-train) | Download (Fine-tune) |
| --- | --- | --- | --- | --- | --- | --- |
| Pre-training | ShapeNet | pretrain.yaml | | | N.A. | here |
| Classification | ScanObjectNN | finetune_scan_objbg.yaml | 88.30% | here | 90.71% | here |
| Classification | ScanObjectNN | finetune_scan_objonly.yaml | 87.78% | here | 88.47% | here |
| Classification | ScanObjectNN | finetune_scan_hardest.yaml | 82.48% | here | 84.87% | here |
| Part Segmentation | ShapeNetPart | part segmentation | 85.8% mIoU | here | 86.0% mIoU | here |
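To make the effect of pre-training easier to read off the ScanObjectNN rows, the short sketch below computes the accuracy gain of the pre-trained model over the from-scratch model for each config. The numbers are copied from the results above; the dictionary keys and the helper name `pretrain_gain` are illustrative only.

```python
# (scratch accuracy %, pre-trained accuracy %) per ScanObjectNN variant,
# copied from the results listed above.
RESULTS = {
    "objbg": (88.30, 90.71),
    "objonly": (87.78, 88.47),
    "hardest": (82.48, 84.87),
}


def pretrain_gain(scratch: float, pretrained: float) -> float:
    """Accuracy gain in percentage points, rounded to two decimals."""
    return round(pretrained - scratch, 2)


gains = {name: pretrain_gain(*accs) for name, accs in RESULTS.items()}

if __name__ == "__main__":
    for name, gain in gains.items():
        print(f"{name:8s} +{gain:.2f} points")
```

The gain is largest on the objbg and hardest variants (about 2.4 points each) and smallest on objonly (about 0.7 points).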

Getting Started

Environment

This codebase was tested with the following environment configurations. It may work with other versions.

  • Ubuntu 20.04
  • CUDA 11.7
  • Python 3.9
  • PyTorch 1.13.1 + cu117

Installation

We recommend using Anaconda for the installation process:

# Create virtual env and install PyTorch
$ conda create -n pointmamba python=3.9
$ conda activate pointmamba
(pointmamba) $ pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117

# Install basic required packages
(pointmamba) $ pip install -r requirements.txt

# Chamfer Distance & EMD (both commands assume you start in the repository root)
(pointmamba) $ cd ./extensions/chamfer_dist && python setup.py install --user
(pointmamba) $ cd ../emd && python setup.py install --user
(pointmamba) $ cd ../..

# PointNet++
(pointmamba) $ pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"

# GPU kNN
(pointmamba) $ pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl

# Mamba
(pointmamba) $ pip install "causal-conv1d>=1.1.0"
(pointmamba) $ pip install mamba-ssm
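
After installation, a quick sanity check can confirm that the main dependencies are visible to the interpreter. The sketch below is not part of the repository; it only tests whether each module can be located, and the import names listed (`torch`, `pointnet2_ops`, `knn_cuda`, `causal_conv1d`, `mamba_ssm`) are assumed to be the ones the packages above install. The two locally built extensions are left out because their module names are not stated here.

```python
import importlib.util
from typing import Dict, Iterable

# Assumed import names for the packages installed in the steps above.
ASSUMED_MODULES = ("torch", "pointnet2_ops", "knn_cuda", "causal_conv1d", "mamba_ssm")


def check_modules(names: Iterable[str] = ASSUMED_MODULES) -> Dict[str, bool]:
    """Map each top-level module name to whether it can be found.

    find_spec only locates the module; it does not import it, so this
    check does not need a GPU and does not verify CUDA kernels.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name:15s} {'found' if found else 'MISSING'}")
```

A module reported as found can still fail at import time, for example when the CUDA version it was built against does not match the installed PyTorch build, so this is only a first check.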

Datasets

See DATASET.md for details.

Usage

Pre-train

CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/pretrain.yaml --exp_name <name>

Classification on ScanObjectNN

Training from scratch.

CUDA_VISIBLE_DEVICES=<GPU> python main.py --scratch_model --config cfgs/finetune_scan_objbg.yaml --exp_name <name>

Fine-tuning from a pre-trained checkpoint.

CUDA_VISIBLE_DEVICES=<GPU> python main.py --finetune_model --config cfgs/finetune_scan_objbg.yaml --ckpts <path/to/pre-trained/model> --exp_name <name>
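
The commands above use `finetune_scan_objbg.yaml`; the other two ScanObjectNN configs listed under Main Results presumably follow the same pattern. As a convenience, the sketch below assembles the command line for any of the three variants, using only the flags shown above. The helper `build_command` and the short variant names are hypothetical, and it assumes all three config files live under `cfgs/`.

```python
import shlex
from typing import List, Optional

# Short variant names (hypothetical) mapped to the config files named in
# Main Results; the cfgs/ prefix is assumed for all three.
VARIANTS = {
    "objbg": "cfgs/finetune_scan_objbg.yaml",
    "objonly": "cfgs/finetune_scan_objonly.yaml",
    "hardest": "cfgs/finetune_scan_hardest.yaml",
}


def build_command(variant: str, exp_name: str, ckpt: Optional[str] = None) -> List[str]:
    """Build the main.py argument list for one ScanObjectNN run.

    Without ckpt the run trains from scratch (--scratch_model); with ckpt
    it fine-tunes from that checkpoint (--finetune_model plus --ckpts).
    """
    cmd = ["python", "main.py"]
    cmd.append("--finetune_model" if ckpt else "--scratch_model")
    cmd += ["--config", VARIANTS[variant]]
    if ckpt:
        cmd += ["--ckpts", ckpt]
    cmd += ["--exp_name", exp_name]
    return cmd


if __name__ == "__main__":
    for variant in VARIANTS:
        # Print only; prepend CUDA_VISIBLE_DEVICES=<GPU> when running.
        print(shlex.join(build_command(variant, exp_name=f"scratch_{variant}")))
```

The printed lines can be pasted into a shell or passed to `subprocess.run` as a list, with `CUDA_VISIBLE_DEVICES` set in the environment.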

Part Segmentation on ShapeNetPart

Training from scratch.

cd part_segmentation
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/config.yaml --log_dir <name>

Fine-tuning from a pre-trained checkpoint.

cd part_segmentation
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config cfgs/config.yaml --ckpts <path/to/pre-trained/model> --log_dir <name>

To Do

  • Release code.
  • Release checkpoints.
  • Semantic segmentation.

Acknowledgement

This project is based on Point-BERT (paper, code), Point-MAE (paper, code), Mamba (paper, code), and Causal-Conv1d (code). We thank the authors for their excellent work.

Citation

If you find this repository useful in your research, please consider giving it a star ⭐ and citing the paper:

@article{liang2024pointmamba,
      title={PointMamba: A Simple State Space Model for Point Cloud Analysis}, 
      author={Dingkang Liang and Xin Zhou and Xinyu Wang and Xingkui Zhu and Wei Xu and Zhikang Zou and Xiaoqing Ye and Xiang Bai},
      journal={arXiv preprint arXiv:2402.10739},
      year={2024}
}
