A PyTorch implementation of Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks
Accuracy: 95.32% on the test dataset after 721,000 training steps
- Python 2.7
- PyTorch
- h5py

  In Ubuntu:

      $ sudo apt-get install libhdf5-dev
      $ sudo pip install h5py

- Protocol Buffers 3
- LMDB
- Visdom
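
Before moving on, it can help to confirm that the requirements above are importable. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the import names (`torch` for PyTorch, `google.protobuf` for Protocol Buffers) are assumptions about how these packages are installed.

```python
import importlib

# Import names are assumptions: PyTorch is assumed to be importable as
# `torch` and Protocol Buffers as `google.protobuf`.
REQUIRED_MODULES = ["torch", "h5py", "google.protobuf", "lmdb", "visdom"]


def find_missing(module_names):
    """Return the names in module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is not installed (or not on the import path).
            missing.append(name)
    return missing
```

Calling `find_missing(REQUIRED_MODULES)` returns an empty list when every package can be imported, and otherwise the names that still need to be installed.
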
- Clone the source code

      $ git clone https://github.com/potterhsu/SVHNClassifier-PyTorch
      $ cd SVHNClassifier-PyTorch

- Download the SVHN dataset in Format 1 (full numbers)
- Extract the archives into the `data` folder. The folder structure should then look like this:

      SVHNClassifier
          - data
              - extra
                  - 1.png
                  - 2.png
                  - ...
                  - digitStruct.mat
              - test
                  - 1.png
                  - 2.png
                  - ...
                  - digitStruct.mat
              - train
                  - 1.png
                  - 2.png
                  - ...
                  - digitStruct.mat

- (Optional) Inspect the original images with their bounding boxes

  Open `draw_bbox.ipynb` in Jupyter
- Convert to LMDB format

      $ python convert_to_lmdb.py --data_dir ./data

- (Optional) Test reading the LMDBs

  Open `read_lmdb_sample.ipynb` in Jupyter
- Train

      $ python train.py --data_dir ./data --logdir ./logs

- Retrain from a checkpoint if needed

      $ python train.py --data_dir ./data --logdir ./logs_retrain --restore_checkpoint ./logs/model-100.tar

- Evaluate

      $ python eval.py --data_dir ./data ./logs/model-100.tar

- Visualize

      $ python -m visdom.server
      $ python visualize.py --logdir ./logs

- Clean

      $ rm -rf ./logs

  or

      $ rm -rf ./logs_retrain

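
The conversion and training steps above depend on the extracted dataset having the folder structure listed earlier. The following is a minimal sketch for checking that layout before running `convert_to_lmdb.py`. It is not part of the repository: the helper name `check_svhn_layout` is hypothetical, and it only checks that each split folder exists, contains `digitStruct.mat`, and holds at least one `.png` file.

```python
import os

# Split names and the annotation file name are taken from the folder
# structure listed in the setup steps.
SPLITS = ("train", "test", "extra")
ANNOTATION_FILE = "digitStruct.mat"


def check_svhn_layout(data_dir):
    """Return a list of problems found under data_dir (empty if none)."""
    problems = []
    for split in SPLITS:
        split_dir = os.path.join(data_dir, split)
        if not os.path.isdir(split_dir):
            problems.append("missing folder: " + split_dir)
            continue
        if not os.path.isfile(os.path.join(split_dir, ANNOTATION_FILE)):
            problems.append("missing " + ANNOTATION_FILE + " in " + split_dir)
        # Only the presence of at least one image is checked, not the count.
        has_png = any(name.endswith(".png") for name in os.listdir(split_dir))
        if not has_png:
            problems.append("no .png images in " + split_dir)
    return problems
```

For example, `check_svhn_layout("./data")` returns an empty list when all three splits are in place, and otherwise one message per problem found.
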
## How to recognize one image