Releases: hplt-project/OpusCleaner
v0.4.2
v0.4.1
Version 0.4.0
The release that should have been released way earlier.
What's Changed
- Extract all (two…) files from the zip archive in parallel by @jelmervdl in #118
- Cancelable parallel by @jelmervdl in #117
- Add missing docs & types and --time support by @jelmervdl in #120
- Update remove_empty_lines.json by @jindrahelcl in #125
- Improve num_mismatch filter by @jelmervdl in #123
- Revert "Update remove_empty_lines.json" by @jelmervdl in #126
- Whitespace normalization filter by @jindrahelcl in #128
- delete redundant def, import instead by @jindrahelcl in #131
- Fix mismatching of sentence final punctuation by @XapaJIaMnu in #100
- Calling datasets.main() without arguments uses default data path by @jindrahelcl in #136
- Fixing remove_empty_lines for good by @jindrahelcl in #137
Full Changelog: v0.3.1...v0.4.0
v0.3.1
Bugfixes.
Full Changelog: v0.3.0...v0.3.1
Version 0.3
- Monolingual support