Commit

🐛 bug fix on OCR generation
eddableheath committed Dec 6, 2024
1 parent ad634b2 commit 3f6c10d
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/arc_spice/data/multieurlex_utils.py
@@ -67,7 +67,7 @@ def extract_articles(

 def _make_ocr_data(text: str) -> list[tuple[Image.Image, str]]:
     text_split = text.split()
-    text_split = [text for text in text_split if text not in ("", " ")]
+    text_split = [text for text in text_split if text not in ("", " ", None)]
     generator = GeneratorFromStrings(text_split, count=len(text_split))
     return list(generator)
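
The token-filtering step in this hunk can be tried in isolation. The following is a minimal sketch, not the repository's code: `clean_tokens` is a hypothetical helper name introduced here for illustration, and the image generation via `GeneratorFromStrings` is deliberately left out so the snippet runs without extra dependencies. It also shows that `str.split()` called with no arguments already splits on runs of whitespace and discards empty strings, so the membership check acts as a defensive guard on top of that.

```python
def clean_tokens(text: str) -> list[str]:
    """Hypothetical helper mirroring the filtering step of _make_ocr_data."""
    # split() with no separator splits on any run of whitespace and never
    # yields empty strings, so tokens are already non-empty here.
    tokens = text.split()
    # Same filter as the changed line: drop empty, single-space, or None entries.
    return [token for token in tokens if token not in ("", " ", None)]


if __name__ == "__main__":
    print(clean_tokens("  Article 1 \n\n applies  "))  # ['Article', '1', 'applies']
    print(clean_tokens("   "))  # []
```

If the filtered list is empty, as in the second call, the downstream generator would be asked for `count=0` items; whether that case needs separate handling depends on the caller and is not addressed by this commit.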
