This repository contains the official implementation of the following paper:
YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-Time Object Detection
Yuming Chen, Xinbin Yuan, Ruiqi Wu, Jiabao Wang, Qibin Hou, Ming-Ming Cheng
Under review
[Homepage (TBD)] [Paper] [Zhihu (TBD)] [集智书童] [Poster (TBD)] [Video (TBD)]
📄 Table of Contents
- ✨ News
- 🛠️ Dependencies and Installation
- 👼 Quick Demo
- 🤖 Training and Evaluation
- 🏡 Model Zoo
- 🏗️ Supported Tasks
- 📖 Citation
- 📜 License
- 📮 Contact
- 🤝 Acknowledgement
✨ News 🔝
Future work can be found in todo.md.
- Aug 2023: Our code is publicly available!
🛠️ Dependencies and Installation 🔝
We provide a simple script `install.sh` for installation; refer to install.md for more details.
- Clone and enter the repo.

  ```shell
  git clone https://github.com/FishAndWasabi/YOLO-MS.git
  cd YOLO-MS
  ```

- Run `install.sh`.

  ```shell
  bash install.sh
  ```

- Activate your environment!

  ```shell
  conda activate YOLO-MS
  ```
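To confirm the installation worked, a quick import check is handy. This is a minimal sketch, assuming `install.sh` installs PyTorch and MMYOLO into the `YOLO-MS` conda environment:

```shell
# Minimal sanity check; assumes install.sh installed torch and mmyolo.
python -c "import torch, mmyolo; print(torch.__version__, mmyolo.__version__)"
```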
👼 Quick Demo 🔝
```shell
python demo/image_demo.py ${IMAGE_PATH} ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]

# for SAM output
python demo/sam_demo.py ${IMAGE_PATH} ${CONFIG_FILE} ${CHECKPOINT_FILE} --sam_size ${SAM_MODEL_SIZE} --sam_model ${SAM_MODEL_PATH}
```
You can run `python demo/image_demo.py --help` for detailed information about this script.
Detailed arguments:

```text
positional arguments:
  img                   Image path, include image file, dir and URL.
  config                Config file
  checkpoint            Checkpoint file

optional arguments:
  -h, --help            show this help message and exit
  --out-dir OUT_DIR     Path to output file
  --device DEVICE       Device used for inference
  --show                Show the detection results
  --deploy              Switch model to deployment mode
  --tta                 Whether to use test time augmentation
  --score-thr SCORE_THR
                        Bbox score threshold
  --class-name CLASS_NAME [CLASS_NAME ...]
                        Only save those classes if set
  --to-labelme          Output labelme style label file
  --sam_size            Default: vit_h, Optional: vit_l, vit_b
  --sam_model           Path of the SAM model checkpoint
```
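For instance, a single-image run could look like the following. The config and checkpoint names here are hypothetical placeholders; substitute the files you actually downloaded from the Model Zoo:

```shell
# Hypothetical config/checkpoint names, for illustration only.
python demo/image_demo.py path/to/image.jpg \
    configs/yoloms/yoloms-xs_syncbn_fast_8xb32-300e_coco.py \
    yoloms-xs.pth \
    --out-dir outputs --score-thr 0.3
```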
🤖 Training and Evaluation 🔝
- Training

  1.1 Single GPU

  ```shell
  python tools/train.py ${CONFIG_FILE} [optional arguments]
  ```

  1.2 Multi GPU

  ```shell
  CUDA_VISIBLE_DEVICES=x bash tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
  ```
  You can run `python tools/train.py --help` for detailed information about this script.

  Detailed arguments:

  ```text
  positional arguments:
    config                train config file path

  optional arguments:
    -h, --help            show this help message and exit
    --work-dir WORK_DIR   the dir to save logs and models
    --amp                 enable automatic-mixed-precision training
    --resume [RESUME]     If specify checkpoint path, resume from it, while if
                          not specify, try to auto resume from the latest
                          checkpoint in the work directory.
    --cfg-options CFG_OPTIONS [CFG_OPTIONS ...]
                          override some settings in the used config, the
                          key-value pair in xxx=yyy format will be merged into
                          config file. If the value to be overwritten is a
                          list, it should be like key="[a,b]" or key=a,b It
                          also allows nested list/tuple values, e.g.
                          key="[(a,b),(c,d)]" Note that the quotation marks
                          are necessary and that no white space is allowed.
    --launcher {none,pytorch,slurm,mpi}
                          job launcher
    --local_rank LOCAL_RANK
  ```
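  As a concrete sketch, a 4-GPU mixed-precision run that also overrides a config key via `--cfg-options` might look like this (the config path and the overridden key are hypothetical placeholders):

  ```shell
  # Hypothetical config path and override key, for illustration only.
  CUDA_VISIBLE_DEVICES=0,1,2,3 bash tools/dist_train.sh \
      configs/yoloms/yoloms-xs_syncbn_fast_8xb32-300e_coco.py 4 \
      --amp --cfg-options train_dataloader.batch_size=16
  ```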
- Evaluation

  2.1 Single GPU

  ```shell
  python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
  ```

  2.2 Multi GPU

  ```shell
  CUDA_VISIBLE_DEVICES=x bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [optional arguments]
  ```
  You can run `python tools/test.py --help` for detailed information about this script.

  Detailed arguments:

  ```text
  positional arguments:
    config                test config file path
    checkpoint            checkpoint file

  optional arguments:
    -h, --help            show this help message and exit
    --work-dir WORK_DIR   the directory to save the file containing evaluation
                          metrics
    --out OUT             output result file (must be a .pkl file) in pickle
                          format
    --json-prefix JSON_PREFIX
                          the prefix of the output json file without perform
                          evaluation, which is useful when you want to format
                          the result to a specific format and submit it to the
                          test server
    --tta                 Whether to use test time augmentation
    --show                show prediction results
    --deploy              Switch model to deployment mode
    --show-dir SHOW_DIR   directory where painted images will be saved. If
                          specified, it will be automatically saved to the
                          work_dir/timestamp/show_dir
    --wait-time WAIT_TIME
                          the interval of show (s)
    --cfg-options CFG_OPTIONS [CFG_OPTIONS ...]
                          override some settings in the used config, the
                          key-value pair in xxx=yyy format will be merged into
                          config file. If the value to be overwritten is a
                          list, it should be like key="[a,b]" or key=a,b It
                          also allows nested list/tuple values, e.g.
                          key="[(a,b),(c,d)]" Note that the quotation marks
                          are necessary and that no white space is allowed.
    --launcher {none,pytorch,slurm,mpi}
                          job launcher
    --local_rank LOCAL_RANK
  ```
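  For example, to evaluate a checkpoint and keep the raw predictions as a pickle file for later analysis (paths are hypothetical placeholders):

  ```shell
  # Hypothetical paths, for illustration only.
  python tools/test.py \
      configs/yoloms/yoloms-xs_syncbn_fast_8xb32-300e_coco.py \
      yoloms-xs.pth \
      --out results/yoloms-xs.pkl --work-dir work_dirs/eval
  ```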
- Deployment

  ```shell
  # Build the docker image
  docker build docker/mmdeploy/ -t mmdeploy:inside --build-arg USE_SRC_INSIDE=true

  # Run the docker container
  docker run --gpus all --name mmdeploy_yoloms -dit mmdeploy:inside

  # Convert ${O_CONFIG_FILE} into a self-contained ${CONFIG_FILE}
  python tools/misc/print_config.py ${O_CONFIG_FILE} --save-path ${CONFIG_FILE}

  # Copy local files into the docker container
  docker cp deploy.sh mmdeploy_yoloms:/root/workspace
  docker cp ${DEPLOY_CONFIG_FILE} mmdeploy_yoloms:/root/workspace/${DEPLOY_CONFIG_FILE}
  docker cp ${CONFIG_FILE} mmdeploy_yoloms:/root/workspace/${CONFIG_FILE}
  docker cp ${CHECKPOINT_FILE} mmdeploy_yoloms:/root/workspace/${CHECKPOINT_FILE}

  # Start the docker container
  docker start mmdeploy_yoloms

  # Attach to the docker container
  docker attach mmdeploy_yoloms

  # Run the deployment shell script (inside the container)
  sh deploy.sh ${DEPLOY_CONFIG_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE} ${SAVE_DIR}

  # Copy the results back to the local machine
  docker cp mmdeploy_yoloms:/root/workspace/${SAVE_DIR} ${SAVE_DIR}
  ```
  - DEPLOY_CONFIG_FILE: Config file for deployment.
  - O_CONFIG_FILE: Original config file of the model.
  - CONFIG_FILE: Converted config file of the model.
  - CHECKPOINT_FILE: Checkpoint of the model.
  - SAVE_DIR: Directory to save the deployment results.
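  Putting the steps together, an end-to-end run could look like the sketch below. Every file name here (the TensorRT deployment config, the converted model config, the checkpoint) is a hypothetical placeholder:

  ```shell
  # All file names below are hypothetical placeholders.
  python tools/misc/print_config.py configs/yoloms/yoloms-xs_syncbn_fast_8xb32-300e_coco.py \
      --save-path yoloms-xs_converted.py
  docker cp deploy.sh mmdeploy_yoloms:/root/workspace
  docker cp detection_tensorrt_static-640x640.py mmdeploy_yoloms:/root/workspace/detection_tensorrt_static-640x640.py
  docker cp yoloms-xs_converted.py mmdeploy_yoloms:/root/workspace/yoloms-xs_converted.py
  docker cp yoloms-xs.pth mmdeploy_yoloms:/root/workspace/yoloms-xs.pth
  docker start mmdeploy_yoloms
  docker attach mmdeploy_yoloms
  # Inside the container:
  sh deploy.sh detection_tensorrt_static-640x640.py yoloms-xs_converted.py yoloms-xs.pth results
  # Back on the host:
  docker cp mmdeploy_yoloms:/root/workspace/results results
  ```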
- Test FPS

  4.1 Deployed Model

  ```shell
  # Copy local files into the docker container
  docker cp ${DATA_DIR} mmdeploy_yoloms:/root/workspace/${DATA_DIR}
  docker cp fps.sh mmdeploy_yoloms:/root/workspace

  # Start the docker container
  docker start mmdeploy_yoloms

  # Attach to the docker container
  docker attach mmdeploy_yoloms

  # In the docker container, run the FPS script
  python mmdeploy/tools/profiler.py ${DEPLOY_CONFIG_FILE} \
      ${CONFIG_FILE} \
      ${DATASET} \
      --model ${PROFILER_MODEL} \
      --device ${DEVICE}
  ```
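  As a concrete sketch, profiling a converted TensorRT engine on COCO validation images might look like this (all paths are hypothetical placeholders, including the `end2end.engine` file name):

  ```shell
  # Hypothetical paths, for illustration only.
  python mmdeploy/tools/profiler.py detection_tensorrt_static-640x640.py \
      yoloms-xs_converted.py \
      data/coco/val2017 \
      --model results/end2end.engine \
      --device cuda:0
  ```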
  4.2 Undeployed Model

  ```shell
  python tools/analysis_tools/benchmark.py ${CONFIG_FILE} --checkpoint ${CHECKPOINT_FILE} [optional arguments]
  ```
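  For example (hypothetical config/checkpoint paths):

  ```shell
  # Hypothetical paths, for illustration only.
  python tools/analysis_tools/benchmark.py \
      configs/yoloms/yoloms-xs_syncbn_fast_8xb32-300e_coco.py \
      --checkpoint yoloms-xs.pth
  ```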
- Test FLOPs and Params

  ```shell
  python tools/analysis_tools/get_flops.py ${CONFIG_FILE} --shape 640 640 [optional arguments]
  ```
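  With the hypothetical XS config used throughout these examples, the output should roughly match the Params(M) and FLOPs(G) columns of the Model Zoo tables below (about 4.5 M parameters and 8.7 GFLOPs at 640x640):

  ```shell
  # Hypothetical config path; expect roughly 4.5M params / 8.7 GFLOPs per the Model Zoo.
  python tools/analysis_tools/get_flops.py \
      configs/yoloms/yoloms-xs_syncbn_fast_8xb32-300e_coco.py \
      --shape 640 640
  ```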
🏡 Model Zoo 🔝
- YOLOv5-MS
- YOLOX-MS
- YOLOv6-MS
- YOLOv7-MS
- PPYOLOE-MS
- YOLOv8-MS
- YOLO-MS (Based on RTMDet)
1. YOLO-MS
| Model | Resolution | Epoch | Params (M) | FLOPs (G) | AP | AP<sub>s</sub> | AP<sub>m</sub> | AP<sub>l</sub> | Config | 🔗 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| XS | 640 | 300 | 4.5 | 8.7 | 43.1 | 24.0 | 47.8 | 59.1 | [config] | [model] |
| XS* | 640 | 300 | 4.5 | 8.7 | 43.4 | 23.7 | 48.3 | 60.3 | [config] | [model] |
| S | 640 | 300 | 8.1 | 15.6 | 46.2 | 27.5 | 50.6 | 62.9 | [config] | [model] |
| S* | 640 | 300 | 8.1 | 15.6 | 46.2 | 26.9 | 50.5 | 63.0 | [config] | [model] |
| - | 640 | 300 | 22.0 | 40.1 | 50.8 | 33.2 | 54.8 | 66.4 | [config] | [model] |
| -* | 640 | 300 | 22.2 | 40.1 | 50.8 | 33.2 | 54.8 | 66.4 | [config] | [model] |
\* denotes the variant with SE attention.
2. YOLOv6
| Model | Resolution | Epoch | Params (M) | FLOPs (G) | AP | AP<sub>s</sub> | AP<sub>m</sub> | AP<sub>l</sub> | Config | 🔗 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| t | 640 | 400 | 9.7 | 12.4 | 41.0 | 21.2 | 45.7 | 57.7 | [config] | [model] |
| t-MS | 640 | 400 | 8.1 | 9.6 | 43.5 (+2.5) | 26.0 | 48.3 | 57.8 | [config] | [model] |
3. YOLOv8
| Model | Resolution | Epoch | Params (M) | FLOPs (G) | AP | AP<sub>s</sub> | AP<sub>m</sub> | AP<sub>l</sub> | Config | 🔗 |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| n | 640 | 500 | 2.9 | 4.4 | 37.2 | 18.9 | 40.5 | 52.5 | [config] | [model] |
| n-MS | 640 | 500 | 2.9 | 4.4 | 40.3 (+3.1) | 22.0 | 44.6 | 53.7 | [config] | [model] |
🏗️ Supported Tasks 🔝
- Object Detection
- Instance Segmentation (TBD)
- Rotated Object Detection (TBD)
- Object Tracking (TBD)
- Detection in Crowded Scene (TBD)
- Small Object Detection (TBD)
📖 Citation 🔝
If you find our repo useful for your research, please cite us:
```bibtex
@misc{chen2023yoloms,
  title={YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection},
  author={Yuming Chen and Xinbin Yuan and Ruiqi Wu and Jiabao Wang and Qibin Hou and Ming-Ming Cheng},
  year={2023},
  eprint={2308.05480},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
```
This project is based on the open source codebase MMYOLO.
```bibtex
@misc{mmyolo2022,
  title={{MMYOLO: OpenMMLab YOLO} series toolbox and benchmark},
  author={MMYOLO Contributors},
  howpublished={\url{https://github.com/open-mmlab/mmyolo}},
  year={2022}
}
```
📜 License 🔝
This project is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license for non-commercial use only. Any commercial use requires formal permission first.
📮 Contact 🔝
For technical questions, please contact chenyuming[AT]mail.nankai.edu.cn.

For commercial licensing, please contact cmm[AT]nankai.edu.cn and andrewhoux[AT]gmail.com.
🤝 Acknowledgement 🔝
This repo is modified from the open-source real-time object detection codebase MMYOLO. The README file is adapted from those of LED and CrossKD.