Some words are not recognized correctly with the language file tessdata_best\eng.traineddata #4318

ProgramacionDk · 2024-09-18T08:36:27Z

Current Behavior

FGO073

FGO037

FG101

FG114

FGO037
FG184

FG095
FG184

resultado.txt

Expected Behavior

FG073

FG037

FG101

FG114

FG037
FG184

FG095
FG184

Suggested Fix

No response

tesseract -v

tesseract v5.4.0.20240606
leptonica-1.84.1
libgif 5.2.1 : libjpeg 8d (libjpeg-turbo 3.0.1) : libpng 1.6.43 : libtiff 4.6.0 : zlib 1.3 : libwebp 1.4.0 : libopenjp2 2.5.2
Found AVX2
Found AVX
Found FMA
Found SSE4.1
Found libarchive 3.7.4 zlib/1.3.1 liblzma/5.6.1 bz2lib/1.0.8 liblz4/1.9.4 libzstd/1.5.6

Operating System

Windows 10

Other Operating System

No response

uname -a

No response

Compiler

No response

CPU

No response

Virtualization / Containers

No response

Other Information

Some words are not recognized correctly. For example, FG073 is recognized as FGO073: a spurious letter O is inserted before the digits.

I ran tesseract on the attached image with the English trained data tessdata_best\eng.traineddata, downloaded from https://github.com/tesseract-ocr/tessdata_best.
With the trained data tessdata_fast\eng.traineddata, it works fine.

tesseract v5.4.0.20240606 compiled by UB Mannheim
https://github.com/UB-Mannheim/tesseract/wiki
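
For anyone trying to reproduce the comparison, here is a minimal Python sketch that runs the tesseract CLI once per model directory and lists the lines that differ. The image name `sample.png` and the directory names `tessdata_best` and `tessdata_fast` are assumptions: substitute the attached image and wherever the two `eng.traineddata` files were actually saved. The helper names are made up for illustration, and pairing lines by position is a simplification that only holds when both models return the same number of lines.

```python
import subprocess


def build_command(image, tessdata_dir, lang="eng"):
    """Build the tesseract argument list; the 'stdout' output base prints the text."""
    return ["tesseract", str(image), "stdout",
            "--tessdata-dir", str(tessdata_dir), "-l", lang]


def run_ocr(image, tessdata_dir, lang="eng"):
    """Run tesseract (must be on PATH) and return the non-empty recognized lines."""
    result = subprocess.run(build_command(image, tessdata_dir, lang),
                            capture_output=True, text=True, check=True)
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def differing_lines(best, fast):
    """Pair lines by position and return the (best, fast) pairs that differ."""
    return [(b, f) for b, f in zip(best, fast) if b != f]
```

Calling `differing_lines(run_ocr("sample.png", "tessdata_best"), run_ocr("sample.png", "tessdata_fast"))` should then surface pairs such as `('FGO073', 'FG073')` if the behavior described above reproduces.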

amitdo added the traineddata label Oct 15, 2024
