Optical character recognition software written in C.
Building it requires:
- gcc
- GTK3
- SDL2
- SDL2_image
If you plan to train the network you will also need:
- python
- xelatex
- pdftoppm
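The requirements above can be checked before building with a short script. This is only a sketch under a few assumptions: the helper names `missing_tools` and `missing_libs` are made up for illustration, the libraries are assumed to be discoverable through pkg-config under the module names `gtk+-3.0`, `sdl2` and `SDL2_image`, and the Python interpreter is assumed to be installed as `python` (on some systems it is `python3`). If pkg-config itself is not installed, every library is reported as missing.

```shell
#!/bin/sh
# Print each command-line tool from the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Print each pkg-config module from the argument list that cannot be found.
# Assumption: the libraries ship .pc files under these module names.
missing_libs() {
    for lib in "$@"; do
        pkg-config --exists "$lib" 2>/dev/null || printf '%s\n' "$lib"
    done
}

# Needed to build (make is included because the build uses it).
missing_tools gcc make
missing_libs gtk+-3.0 sdl2 SDL2_image

# Only needed to train the network.
missing_tools python xelatex pdftoppm
```

The script prints one name per line for each item it cannot find and prints nothing when everything is present.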
To build and run the OCR:
    make
    ./ocr
- Generate the custom dataset used to train the neural network:
    cd dataset
    # This can take a while
    ./generate_dataset.sh
- (optional) Adjust training parameters in src/ocr_train.c
- Use the --train option when launching the OCR:
    ./ocr --train
This will output the neural network to output/ocr_network_eX after each epoch.
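Since one file is written per epoch, it can be handy to pick out the most recent one. The sketch below assumes that the X in ocr_network_eX is the epoch number written as a plain decimal integer, which this README does not state explicitly. The function name `latest_network` is hypothetical and not part of the project.

```shell
#!/bin/sh
# Print the path of the network file with the highest epoch number in the
# given directory (default: output). Prints nothing if no file matches.
# Assumption: files are named ocr_network_eX with X a decimal integer.
latest_network() {
    dir=${1:-output}
    best=""
    best_epoch=-1
    for f in "$dir"/ocr_network_e*; do
        [ -e "$f" ] || continue          # glob matched nothing
        epoch=${f##*_e}                  # keep the text after the last "_e"
        case $epoch in
            ''|*[!0-9]*) continue ;;     # skip names with a non-numeric suffix
        esac
        if [ "$epoch" -gt "$best_epoch" ]; then
            best_epoch=$epoch
            best=$f
        fi
    done
    if [ -n "$best" ]; then
        printf '%s\n' "$best"
    fi
}

latest_network output
```

The comparison is numeric, so ocr_network_e10 is preferred over ocr_network_e9, which a plain alphabetical sort would get wrong.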
You can save the pre-computed dataset so that no time is wasted recomputing it before each training run (see src/ocr_train.c for details).