Scripts for IMPACT ground-truth generation.
Install go
, make sure that you have both python2 and python3
installed.
- Make sure that
$HOME/go/bin
is in yourPATH
- install the
impgtt
helper tools:cd impgtt && go install
- Install python3-venv if you are on debian or ubuntu
- Install libtk
Use make setup py2=my-python2 py3=my-python3
to setup the
tools. This will
- install ocropus into the
env/2
virtual environment, - install calamari into the
env/3
virtual environment and - download the calamari OCR-models into the
model
folder.
If impgtt is for any reason not installed at its default location
$HOME/go/bin
you can set it: make setup py2=my-python2 py3=my-python3
.
There are various scripts in the scripts
directory:
scripts/run.bash
runs the whole segmentation and alignment process (ie. runs the following three scripts in the right order)scripts/segment.bash
segments the GT into regions and the regions into lines usingocropus-nlbin
scripts/predict.bash
runs the ocr-recognition (usingcalamari-predict
and the linesscripts/align.bash
algins the ocred lines with the ground-truth linesimpgtt/...
IMPACTGT-tools: helper tools for the scripts.
General usage: script/run.bash [-nobin] [-imgext EXT] IN [OUT]
From this repositorie's root directory run the segmentation over the
data in the IN
directory using bash scripts/run.bash IN
. The
result will be written to the segmented/IN
directory. You can use
the -imgext EXT
option to set the extension of the input images,
i.e. bash script/run.bash -imgext .sau.png IN
runs the segmentation
over all the .sau.png
image files.
Currently it is not possible to run the scripts outside of this
repositorie's root directory. This is due to the fact that the
run.bash
script assumes to find ocorpus
and calamari
installed
in the env/2
and env/3
directories.