- 🔗 ICFHR2018 Competition on Vietnamese Online Handwritten Text Recognition using VNOnDB
- 📁 ICFHR2018 Competition on Vietnamese Online Handwritten Text Recognition Database (HANDS-VNOnDB2018)
The ICFHR2018 Competition on Vietnamese Online Handwritten Text Recognition using the HANDS-VNOnDB database (VNOnDB for short) is the first attempt to bring together researchers working on handwritten text recognition and give them a common benchmark for comparing their approaches to transcribing Vietnamese online handwritten text. The goal of the competition is to encourage research on Vietnamese online handwritten text recognition and to analyze the participants' different approaches.
This competition (VNOnDB2018) is organized within the framework of the ICFHR 2018 competitions by the Nakagawa Laboratory, Department of Computer and Information Sciences, Tokyo University of Agriculture and Technology.
To share ideas and systems with other researchers, we encourage all participants to present their approaches in a conference paper at ICFHR 2018 and to publish their source code after the competition results have been announced.
Task 1: Word level (VNOnDB-Word)
In task 1, segmented handwritten words and their ground truth are provided. We verified the data and removed words that contain long-distance delayed strokes, that is, delayed strokes written only after other words, or even a whole sentence, have been finished. Task 1 therefore contains only short-distance delayed strokes and evaluates how well recognizers handle them.
Task 2: Text line level (VNOnDB-Line)
In task 2, text lines and their ground truth are provided. This task contains both long-distance and short-distance delayed strokes, which makes it appropriate for evaluating the robustness of systems to different kinds of delayed strokes.
Task 3: Paragraph level (VNOnDB-Paragraph)
In task 3, the handwritten text, which usually contains multiple text lines, is provided together with the paragraph-level ground truth, which is a long sequence of characters. Task 3 is suitable for measuring the limits of recognition systems on long sequences with many delayed strokes.
Task 1: Word level (VNOnDB-Word)
System | Method | Public test set CER | Public test set WER | Secret test set CER | Secret test set WER
---|---|---|---|---|---
MyScriptTask1 | Segmentation+Feedforward Neural Network (FNN) & BLSTM+CTC Syllable-based unigram VTB + others | 2.91 | 6.46 | 6.01 | 12.66
IVTOVTask1 | 2 BLSTM layers + CTC/Dictionary/VTB | 2.92 | 6.47 | 7.31 | 15.38
GoogleTask1 | Multi LSTM layers + CTC/Character & word n-gram | 6.09 | 13.18 | 9.81 | 20.45
Task 2: Text line level (VNOnDB-Line)
System | Method | Public test set CER | Public test set WER | Secret test set CER | Secret test set WER
---|---|---|---|---|---
MyScriptTask2_1 | Segmentation+FNN & BLSTM+CTC Syllable-based trigram/VTB | 1.02 | 2.02 | 1.02 | 3.39
MyScriptTask2_2 | Segmentation+FNN & BLSTM+CTC Syllable-based trigram/VTB + others | 1.57 | 4.02 | 1.71 | 5.16
IVTOVTask2 | 2 BLSTM layers + CTC/Dictionary/VTB | 3.24 | 14.11 | 5.65 | 21.07
GoogleTask2 | Multi LSTM layers + CTC Character & word n-gram/Other | 6.86 | 19 | 10.26 | 27.05
Task 3: Paragraph level (VNOnDB-Paragraph)
System | Method | Public test set CER | Public test set WER | Secret test set CER | Secret test set WER
---|---|---|---|---|---
MyScriptTask3_1 | Segmentation+FNN & BLSTM+CTC word-based trigram/VTB | 0.78 | 1.38 | 1.92 | 5.81
MyScriptTask3_2 | Segmentation+FNN & BLSTM+CTC syllable-based trigram/VTB + others | 1.32 | 3.4 | 2.62 | 7.74
MyScriptTask3_3 | Segmentation+FNN & BLSTM+CTC with Post-processing for Paragraph word-based trigram/VTB | 0.4 | 1.05 | 3.69 | 7.84
IVTOVTask3 | 2 BLSTM layers + CTC/VTB/Dictionary | 3.75 | 16.09 | 7.31 | 24.07
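
The results above are reported as CER (character error rate) and WER (word error rate). As a rough illustration of how such rates are commonly computed, here is a minimal sketch based on Levenshtein edit distance. It is not the competition's official scoring script: the function names are hypothetical, the NFC Unicode normalization and whitespace tokenization are assumptions, and the rates are returned as fractions (multiply by 100 if percentages are wanted).

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def _normalize(text):
    # Assumption: compare in NFC so that a precomposed Vietnamese letter and
    # its base-letter-plus-combining-mark form count as the same character.
    return unicodedata.normalize("NFC", text)


def cer(reference, hypothesis):
    """Character error rate: character edits / reference length."""
    ref, hyp = _normalize(reference), _normalize(hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(reference, hypothesis):
    """Word error rate: word edits / number of reference words."""
    ref = _normalize(reference).split()
    hyp = _normalize(hypothesis).split()
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

For example, `cer("xin chào", "xin chao")` counts one substituted character out of eight, while `wer` on the same pair counts one wrong word out of two. This is why WER is usually much higher than CER for the same system: a single misrecognized diacritic makes the whole word wrong.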
Given an image of a handwritten line, participants are required to create an OCR model to transcribe the image into text.
Model | WER | Method | Reference | Code
---|---|---|---|---
CRNN | 0.1 | | Blog Post | Official
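
CRNN models are usually trained with CTC, as are the BLSTM+CTC systems listed earlier. As a hedged illustration of the final transcription step, the sketch below shows greedy (best-path) CTC decoding in plain Python. It assumes the network emits one score vector per time step, that index 0 is the CTC blank, and that `alphabet` maps class indices to characters. The function name and these conventions are hypothetical and not taken from any of the listed systems, which also apply dictionaries or language models on top.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Best-path CTC decoding.

    frame_scores: list of per-time-step score lists (one score per class).
    alphabet: list mapping class index -> character (the blank entry is unused).
    """
    # 1. Pick the highest-scoring class at every time step.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]

    # 2. Collapse consecutive repeats, then drop blanks.
    decoded = []
    previous = None
    for index in best_path:
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)
```

A blank between two identical labels keeps them as two separate characters, so a best path of `c, blank, c` decodes to `cc`, while `c, c` decodes to a single `c`.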
📁 Open sources
- pbcquoc/vietocr
- miendinh/VietnameseOCR (2018): python, tensorflow