This repository has been archived by the owner on Dec 18, 2019. It is now read-only.

OCR leaves me with a blank page and a couple of black bars #783

Closed
danielroehrig opened this issue Jun 13, 2018 · 3 comments

Comments


danielroehrig commented Jun 13, 2018

After upgrading to version 1.2.4, scanning seems to work fine, but once OCR is done I'm left with just a blank page and a couple of black bars. Every time I issue the "redo ocr" command, it shows me the document before returning to the blank image. If I turn OCR off, everything works fine. My OS is Ubuntu 18.04 and my OCR language is German.

Edit: Paperwork also can no longer figure out the orientation of the document and always assumes landscape.

Author

danielroehrig commented Jun 13, 2018

Some diagnostics:
INFO paperwork.frontend.mainwindow.scan Failed to use OCR tool heuristic for orientation detection: 'KeyError' object has no attribute 'message'
INFO paperwork.frontend.mainwindow Redoing OCR on 20180613_0737_06 p1
INFO paperwork.frontend.mainwindow.scan Will use tool 'Tesseract (sh)'
WARNING paperwork.frontend.util.jobs Job OCR:0 took 2208ms and is unstoppable ! (maximum allowed: 500ms)
Edit: tesseract version: 4.00+git24-0e00fe6-1.2
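
The first log line above ('KeyError' object has no attribute 'message') matches a common Python 3 pitfall: exceptions no longer have a `.message` attribute, so error-handling code that reads it fails and hides the original error. The following is only a minimal sketch of that failure mode, not Paperwork's or PyOCR's actual code; the function name `describe_error` and the dictionary key are hypothetical.

```python
def describe_error(exc):
    """Return a printable description of an exception (hypothetical helper)."""
    # Python 3 exceptions have no .message attribute; str(exc) is portable.
    return "{}: {}".format(type(exc).__name__, exc)


try:
    {}["lang"]  # stand-in lookup that raises KeyError
except KeyError as exc:
    has_message = hasattr(exc, "message")  # False on Python 3
    description = describe_error(exc)
```

Formatting the exception with `str(exc)` instead of `exc.message` would at least let the underlying KeyError show up in the log.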

jflesch added this to the support milestone Jun 13, 2018
Member

jflesch commented Jun 13, 2018

AFAIK, Tesseract 4.00 is still in beta. It is also not currently supported by PyOCR (see bug openpaperwork/pyocr#99).
I have no idea why some distributions (like Debian testing for instance) have already packaged Tesseract 4.

jflesch closed this as completed Jun 13, 2018
Member

jflesch commented Jun 13, 2018

My advice: for now, try switching back to Tesseract 3.05. Everything should work fine.
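
To check which major version of Tesseract is installed before or after downgrading, something like the sketch below may help. It is not part of Paperwork or PyOCR; the helper names `parse_major_version` and `installed_tesseract_major` are made up for illustration, and it assumes the first line of `tesseract --version` looks like `tesseract 3.05.01`.

```python
import re
import shutil
import subprocess


def parse_major_version(version_output):
    """Extract the major version from `tesseract --version` output, or None."""
    lines = version_output.strip().splitlines()
    first_line = lines[0] if lines else ""
    # Assumed format: "tesseract 3.05.01" (some builds prefix the number with "v").
    match = re.search(r"tesseract\s+v?(\d+)", first_line)
    return int(match.group(1)) if match else None


def installed_tesseract_major():
    """Return the installed Tesseract major version, or None if not found."""
    if shutil.which("tesseract") is None:
        return None
    # Some builds print the version to stderr, so merge both streams.
    result = subprocess.run(
        ["tesseract", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_major_version(result.stdout)


if __name__ == "__main__":
    major = installed_tesseract_major()
    if major is None:
        print("Tesseract not found on PATH")
    elif major >= 4:
        print("Tesseract {} detected; consider switching back to 3.05".format(major))
    else:
        print("Tesseract {} detected".format(major))
```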
