Mask-RCNN Spotter

This repository contains the implementation of a simple Mask-RCNN based text spotter. Many advanced text spotters are built on this framework, e.g.,

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes (ECCV 2018)
Towards Unconstrained End-to-End Text Spotting (ICCV 2019)
All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting (AAAI 2020)
...

Preparing Dataset

Original images can be downloaded from: Total-Text, ICDAR2013, ICDAR2015, ICDAR2017_MLT.

The formatted training datalists can be found in demo/text_spotting/datalist.

Train On Your Own Dataset

1. Download the pre-trained model, which was trained on SynthText and COCO-Text.

2. Modify the paths (ann_file, img_prefix, work_dir, etc.) in the config files.

3. Modify the paths in the training script and run the following commands:

cd $DAVAR_LAB_OCR_ROOT$/demo/text_spotting/mask_rcnn_spot/
bash dist_train.sh

Notice: Online validation is implemented. To disable it and save training time, modify the startup script to add the --no-validate option.
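Step 2 can be illustrated with a minimal sketch. It assumes an mmdetection-style config in which ann_file and img_prefix sit under data.train and work_dir is a top-level key. The helper name override_paths and every path shown are hypothetical, so check the layout of the actual config file before adapting it.

```python
import copy


def override_paths(cfg, ann_file, img_prefix, work_dir):
    """Return a copy of cfg with dataset and output paths replaced.

    Assumes an mmdetection-style layout: cfg["data"]["train"] holds
    ann_file / img_prefix, and work_dir is a top-level key.
    """
    new_cfg = copy.deepcopy(cfg)  # leave the original config untouched
    new_cfg["data"]["train"]["ann_file"] = ann_file
    new_cfg["data"]["train"]["img_prefix"] = img_prefix
    new_cfg["work_dir"] = work_dir
    return new_cfg


# Hypothetical placeholder config; real configs contain many more keys.
base_cfg = {
    "data": {"train": {"ann_file": "/path/to/datalist.json",
                       "img_prefix": "/path/to/images/"}},
    "work_dir": "/path/to/work_dir/",
}

my_cfg = override_paths(
    base_cfg,
    ann_file="/data/my_dataset/train_datalist.json",
    img_prefix="/data/my_dataset/images/",
    work_dir="/data/my_dataset/work_dir/",
)
print(my_cfg["data"]["train"]["ann_file"])
```

In practice the same three values are usually edited directly in the config file; the helper only makes explicit which keys need to change.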

Train From Scratch

If you want to reproduce the model's performance from scratch, please follow these steps:

1. Run end-to-end pre-training on SynthText and COCO-Text. See demo/text_spotting/mask_rcnn_spot/configs/mask_rcnn_r50_conv6_e2e_pretrain.py for more details.

2. Fine-tune the model on the mixed real datasets (ICDAR2013, ICDAR2015, ICDAR2017-MLT, Total-Text). See demo/text_spotting/mask_rcnn_spot/configs/mask_rcnn_r50_conv6_e2e_finetune_ic13.py for more details.

Notice: Online validation is implemented. To disable it and save training time, modify the startup script to add the --no-validate option.
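The two stages and the optional flag can be summarized in a short sketch. The config paths and --no-validate come from the text above; the helper name build_stage_args is hypothetical, and how the arguments reach the training entry point depends on dist_train.sh, which is not shown here.

```python
# Stage order and config paths come from the steps above.
STAGES = [
    ("pretrain",
     "demo/text_spotting/mask_rcnn_spot/configs/mask_rcnn_r50_conv6_e2e_pretrain.py"),
    ("finetune",
     "demo/text_spotting/mask_rcnn_spot/configs/mask_rcnn_r50_conv6_e2e_finetune_ic13.py"),
]


def build_stage_args(stages, validate=True):
    """Return (stage_name, argument_list) pairs in training order.

    The argument list holds the config path and, when validation is
    switched off, the --no-validate flag mentioned in the notice above.
    """
    plan = []
    for name, config_path in stages:
        args = [config_path]
        if not validate:
            args.append("--no-validate")
        plan.append((name, args))
    return plan


plan = build_stage_args(STAGES, validate=False)
for name, args in plan:
    print(name, " ".join(args))
```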

Offline Inference and Evaluation

We provide a demo of forward inference and evaluation. You can modify the parameters (iou_constraint, lexicon_type, etc.) in the testing script and start testing:

cd $DAVAR_LAB_OCR_ROOT$/demo/text_spotting/mask_rcnn_spot/tools/
bash test_ic13.sh

The offline evaluation tool can be found in davarocr/demo/text_spotting/evaluation/.
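To give a feel for what a parameter such as iou_constraint controls, here is a minimal sketch of IoU-constrained end-to-end matching. It is not the repository's evaluation tool: it assumes axis-aligned boxes (real text-spotting benchmarks generally use polygons), a threshold of 0.5, and case-insensitive text comparison, and all function names are hypothetical.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_e2e_matches(preds, gts, iou_constraint=0.5):
    """Count predictions that match a not-yet-used ground truth.

    preds / gts are lists of (box, text). A prediction matches when its
    IoU with a ground-truth box reaches iou_constraint and the texts are
    equal ignoring case. Each ground truth is matched at most once.
    """
    used = set()
    matched = 0
    for p_box, p_text in preds:
        for i, (g_box, g_text) in enumerate(gts):
            if i in used:
                continue
            if (box_iou(p_box, g_box) >= iou_constraint
                    and p_text.lower() == g_text.lower()):
                used.add(i)
                matched += 1
                break
    return matched


gts = [((0, 0, 10, 10), "HELLO"), ((20, 0, 30, 10), "world")]
preds = [((1, 0, 10, 10), "hello"),    # good box, right text
         ((20, 0, 30, 10), "w0rld"),   # good box, wrong text
         ((50, 50, 60, 60), "extra")]  # no overlapping ground truth

matched = count_e2e_matches(preds, gts, iou_constraint=0.5)
precision = matched / len(preds)
recall = matched / len(gts)
print(matched, precision, recall)
```

Raising iou_constraint makes localization stricter, so fewer predictions count as correct even when their transcription is right.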

Visualization

We provide a script to visualize the intermediate outputs of the model. You can modify the paths (test_dataset, config_file, etc.) in the script and start generating visualization results:

cd $DAVAR_LAB_OCR_ROOT$/demo/text_spotting/mask_rcnn_spot/tools/
python vis.py

Some visualization results are shown:

Trained Model Download

All of the models are re-implemented and trained on top of the open-source framework mmdetection.

Note: The trained model based on mask_rcnn_r50_fpn+res32+bilstm+attention uses only SynthText for pre-training and does not use random crop, color jitter, or the mix-train strategy, so its reported performance is slightly worse than that of mask_rcnn_r50_fpn+conv6+bilstm+attention.

Results on various datasets and trained models download:

| Pipeline | Pretrained-Dataset | Links |
| --- | --- | --- |
| mask_rcnn_r50_fpn+conv6+bilstm+attention | SynthText, COCO-Text | cfg, pth (Access Code: ngPI) |
| mask_rcnn_r50_fpn+res32+bilstm+attention | SynthText | cfg, pth (Access Code: QVYc) |

| Dataset | Backbone | Pretrained | Finetune | Test Scale | End-to-End General | End-to-End Weak | End-to-End Strong | Word Spotting General | Word Spotting Weak | Word Spotting Strong | Links |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ICDAR2013 | ResNet-50 Conv-6x | SynthText, COCO-Text | ICDAR2013, ICDAR2015, ICDAR2017_MLT, Total-Text | L-1440 | 82.1 | 85.6 | 86.1 | 85.6 | 89.9 | 90.5 | cfg, pth (Access Code: Vum3) |
| ICDAR2013 | ResNet-50 ResNet-32 | SynthText | ICDAR2013, ICDAR2015, ICDAR2017_MLT, Total-Text | L-1440 | 82.7 | 86.0 | 86.6 | 86.1 | 90.4 | 91.1 | cfg, pth (Access Code: Y266) |
| ICDAR2015 | ResNet-50 Conv-6x | SynthText, COCO-Text | ICDAR2013, ICDAR2015, ICDAR2017_MLT, Total-Text | L-2000 | 66.3 | 75.3 | 78.4 | 66.7 | 78.1 | 81.7 | cfg, pth (Access Code: Vum3) |
| ICDAR2015 | ResNet-50 ResNet-32 | SynthText | ICDAR2013, ICDAR2015, ICDAR2017_MLT, Total-Text | L-2000 | 62.9 | 72.2 | 75.7 | 63.5 | 75.0 | 79.1 | cfg, pth (Access Code: IdJA) |

| Dataset | Backbone | Pretrained | Finetune | Test Scale | End-to-End None | End-to-End Full | Word Spotting None | Word Spotting Full | Links |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Total-Text | ResNet-50 Conv-6x | SynthText, COCO-Text | ICDAR2013, ICDAR2015, ICDAR2017_MLT, Total-Text | L-1350 | 63.6 | 72.2 | 66.1 | 76.5 | cfg, pth (Access Code: Vum3) |
| Total-Text | ResNet-50 ResNet-32 | SynthText | ICDAR2013, ICDAR2015, ICDAR2017_MLT, Total-Text | L-1350 | 62.8 | 71.5 | 65.2 | 75.8 | cfg, pth (Access Code: CyB3) |
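The General, Weak, and Strong columns (and None / Full for Total-Text) correspond to different lexicon settings. A common way to apply a lexicon is to snap each recognized word to its nearest lexicon entry by edit distance. The sketch below shows only that idea; it is not taken from the repository's evaluation tool, and the function names and sample vocabulary are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def correct_with_lexicon(word, lexicon):
    """Return the lexicon entry closest to word (case-insensitive).

    Falls back to the unchanged word when the lexicon is empty, which
    corresponds to evaluating without a lexicon.
    """
    if not lexicon:
        return word
    return min(lexicon,
               key=lambda entry: edit_distance(word.lower(), entry.lower()))


sample_lexicon = ["world", "word", "hello"]  # hypothetical vocabulary
print(correct_with_lexicon("w0rld", sample_lexicon))
```

A smaller, image-specific lexicon leaves fewer candidates to confuse, which is consistent with the Strong columns scoring highest in the tables above.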

Citation:

@inproceedings{He_2017,
  title={Mask R-CNN},
  author={He, Kaiming and Gkioxari, Georgia and Dollar, Piotr and Girshick, Ross},
  booktitle={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017}
}

License

This project is released under the Apache 2.0 license.

Copyright

If you have any suggestions or problems, please feel free to contact the authors at [email protected] or [email protected].
