This repository has been archived by the owner on Jan 9, 2024. It is now read-only.

add tests and CI #7

Open
bertsky opened this issue Nov 29, 2019 · 2 comments
@bertsky
Contributor

bertsky commented Nov 29, 2019

I imagine this can fail in many ways. Do you have good example data? Or should we rather create it artificially, by re-ordering segments in good GT ad hoc?

As for negative tests, we could probably use kant_aufklaerung_1784 from OCR-D/assets because of its bad tokenization, plus some bags/filegrps without text or with missing text.
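A minimal sketch of the artificial re-ordering idea, assuming the GT lines of one region are already available as a plain list of strings. The helper names `shuffle_segments` and `restore` are hypothetical and not part of any OCR-D tool; reading and writing the actual GT files is left out.

```python
import random


def shuffle_segments(segments, seed=0):
    """Return (shuffled, order), where shuffled[i] == segments[order[i]].

    The seed makes the test data reproducible. For two or more segments
    the returned order is guaranteed to differ from the original order.
    """
    rng = random.Random(seed)
    order = list(range(len(segments)))
    if len(order) > 1:
        # Shuffle at least once, and repeat if the result is the identity.
        while order == sorted(order):
            rng.shuffle(order)
    shuffled = [segments[i] for i in order]
    return shuffled, order


def restore(shuffled, order):
    """Undo shuffle_segments, giving the expected result for a test."""
    restored = [None] * len(shuffled)
    for position, original_index in enumerate(order):
        restored[original_index] = shuffled[position]
    return restored
```

A positive test could then feed the shuffled lines to the tool and compare its output against `restore(shuffled, order)`, which is the original GT order.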

@mikegerber mikegerber self-assigned this Dec 5, 2019
@mikegerber
Member

Sorry for not responding to this. This came up again because @stweil fixed Python 3.10 compatibility in #13.

@bertsky Do you use this or is this "just" interest in overall quality of OCR-D tools?

@bertsky
Contributor Author

bertsky commented Jun 22, 2023

The latter. I never found a use-case for myself. Messy line orderings are not rare, but they do not seem to come with correct region text.

Also, I have since written a general-purpose tool for (purely textual) forced alignment: https://github.com/bertsky/nmalign
