This code repository contains the implementation of the paper Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting (AAAI 2020).
Original images can be downloaded from: Total-Text, ICDAR2013, ICDAR2015, ICDAR2017_MLT.
The formatted training datalist and test datalist can be found in demo/text_spotting/datalist/.
1. Download the pre-trained model, which was trained on SynthText and COCO-Text.
2. Modify the paths (ann_file, img_prefix, work_dir, etc.) in the config files.
3. Modify the paths in the training script and run the following commands:
cd $DAVAR_LAB_OCR_ROOT$/demo/text_spotting/text_perceptron_spot/
bash dist_train.sh
Notice: Online validation is provided. To disable it and save training time, add the --no-validate flag to the startup script.
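The path changes in step 2 can be sketched as a config fragment. This is a minimal illustration that assumes the mmdetection-style config layout; the nesting under data and train, the load_from key, and every path and file name below are placeholders rather than values taken from this repository's config files.

```python
# Hypothetical mmdetection-style config fragment; all paths are placeholders.
data_root = "/path/to/datasets/"  # assumed location of the downloaded images

data = dict(
    train=dict(
        # Formatted training datalist (file name is an assumption).
        ann_file=data_root + "datalist/train_datalist.json",
        # Directory that the image paths in the datalist are relative to.
        img_prefix=data_root + "Images/",
    ),
    val=dict(
        ann_file=data_root + "datalist/test_datalist.json",
        img_prefix=data_root + "Images/",
    ),
)

# Directory where checkpoints and logs are written.
work_dir = "/path/to/work_dir/"

# Assumed key for the downloaded pre-trained checkpoint.
load_from = "/path/to/pretrained_model.pth"
```

Keeping a single data_root variable means a dataset move only requires one edit, but the real config files may organize their paths differently.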
If you want to reproduce the model's performance from scratch, please follow these steps:
1. End-to-end pre-training using SynthText and COCO-Text. See demo/text_spotting/text_perceptron_spot/configs/tp_r50_e2e_pretrain.py for more details.
2. Fine-tune the model on the mixed real dataset (including ICDAR2013, ICDAR2015, ICDAR2017-MLT, and Total-Text). See demo/text_spotting/text_perceptron_spot/configs/tp_r50_e2e_finetune_ic13.py for more details.
Notice: As with training from the pre-trained model, online validation can be disabled by adding the --no-validate flag to the startup script, which saves training time.
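The effect of a flag like --no-validate can be sketched with argparse. This is only an illustration of how such a switch is commonly parsed; the parser, the positional config argument, and the should_validate helper are hypothetical and are not the repository's training entry point.

```python
import argparse


def build_parser():
    """Build a hypothetical parser for a training entry point."""
    parser = argparse.ArgumentParser(description="Hypothetical training script")
    parser.add_argument("config", help="path to the training config file")
    parser.add_argument(
        "--no-validate",
        action="store_true",
        help="skip online validation during training",
    )
    return parser


def should_validate(argv):
    """Return True when online validation should run for these arguments."""
    args = build_parser().parse_args(argv)
    # argparse exposes --no-validate as the attribute no_validate.
    return not args.no_validate
```

With this sketch, validation stays on by default and is switched off only when the flag is present on the command line.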
We provide a demo of forward inference and evaluation. You can modify the parameters (iou_constraint, lexicon_type, etc.) in the testing script and start testing:
cd $DAVAR_LAB_OCR_ROOT$/demo/text_spotting/text_perceptron_spot/tools/
bash test_ic13.sh
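To make the role of a threshold such as iou_constraint concrete, the sketch below matches a predicted box to a ground-truth box by intersection over union. It is a simplification under stated assumptions: it uses axis-aligned rectangles, whereas text spotting evaluation generally works on polygons, and the function names and the 0.5 default are illustrative rather than taken from the testing script.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap region, clamped at zero.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, iou_constraint=0.5):
    """A detection counts as matched when its IoU reaches the threshold."""
    return box_iou(pred, gt) >= iou_constraint
```

Raising the threshold makes matching stricter, so fewer detections count as correct before the recognized text is even compared.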
The offline evaluation tool can be found in davarocr/demo/text_spotting/evaluation/.
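Lexicon-based evaluation (the lexicon_type parameter, and the General, Weak, and Strong settings in the results) usually means snapping each recognized word to an entry of a given word list. The sketch below shows one common way to do that, nearest entry by edit distance with case ignored; it is an assumption for illustration and not the evaluation tool's actual matching rule.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def correct_with_lexicon(word, lexicon):
    """Return the lexicon entry closest to word, or word if the lexicon is empty."""
    if not lexicon:
        return word
    return min(lexicon, key=lambda entry: edit_distance(word.upper(), entry.upper()))
```

A smaller lexicon leaves fewer candidates to confuse with the true word, which is one reason results tend to improve as the lexicon becomes more specific to the image.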
We provide a script to visualize the intermediate outputs of the model. You can modify the paths (test_dataset, config_file, etc.) in the script and start generating visualization results:
cd $DAVAR_LAB_OCR_ROOT$/demo/text_spotting/text_perceptron_spot/tools/
python vis.py
Some visualization results are shown:
All of the models are re-implemented and trained on top of the open-source framework mmdetection.
Results on various datasets and links to the trained models:
Pipeline | Pretrained-Dataset | Links |
tp_r50_fpn+conv6+bilstm+attention | SynthText, COCO-Text | |
Dataset | Backbone | Pretrained | Finetune | Test Scale | End-to-End (General) | End-to-End (Weak) | End-to-End (Strong) | Word Spotting (General) | Word Spotting (Weak) | Word Spotting (Strong) | Links |
ICDAR2013 (Reported) | ResNet-50-3stages-enlarge | SynthText | - | L-1440 | 85.8 | 90.7 | 91.4 | 88.5 | 94.0 | 94.9 | - |
ICDAR2013 | ResNet-50 | SynthText, COCO-Text | ICDAR2013, ICDAR2015, ICDAR2017_MLT, Total-Text | L-1440 | 87.4 | 90.6 | 91.2 | 90.9 | 93.8 | 94.2 | |
ICDAR2015 (Reported) | ResNet-50-3stages-enlarge | SynthText | - | L-2000 | 65.1 | 76.6 | 80.5 | 67.9 | 79.4 | 84.1 | - |
ICDAR2015 | ResNet-50 | SynthText, COCO-Text | ICDAR2013, ICDAR2015, ICDAR2017_MLT, Total-Text | L-2000 | 70.3 | 77.0 | 80.0 | 70.8 | 79.8 | 83.2 | |
Dataset | Backbone | Pretrained | Finetune | Test Scale | End-to-End (None) | End-to-End (Full) | Word Spotting (None) | Word Spotting (Full) | Links |
Total-Text (Reported) | ResNet-50 | SynthText | - | L-1350 | - | - | 69.7 | 78.3 | - |
Total-Text | ResNet-50 | SynthText, COCO-Text | ICDAR2013, ICDAR2015, ICDAR2017_MLT, Total-Text | L-1350 | 70.7 | 77.3 | 73.9 | 81.8 | |
@inproceedings{qiao2020text,
title={Text Perceptron: Towards End-to-End Arbitrary-Shaped Text Spotting},
author={Qiao, Liang and Tang, Sanli and Cheng, Zhanzhan and Xu, Yunlu and Niu, Yi and Pu, Shiliang and Wu, Fei},
booktitle={Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI)},
pages={11899-11907},
year={2020}
}
This project is released under the Apache 2.0 license.
If you have any suggestions or run into problems, please feel free to contact the authors at [email protected] or [email protected].