Switch to ICU tokenizer #939
Open
firefoxci-taskcluster / clean-corpus-opus-ELRC-3075-wikipedia_health_v1-ru-en
succeeded
Nov 22, 2024 in 5m 50s
FirefoxCI (pull_request)
Clean opus ELRC-3075-wikipedia_health_v1 dataset ru-en use OpusCleaner true
Task Status
Started: 2024-11-22T22:55:14.735Z
Resolved: 2024-11-22T22:56:20.880Z
Task Execution Time: 1 minute, 6 seconds, 145 milliseconds
Task Status: completed
Reason Resolved: completed
RunId: 0
Artifacts
- public/build/ELRC-3075-wikipedia_health_v1.en.zst
- public/build/ELRC-3075-wikipedia_health_v1.ru-en.filters.json
- public/build/ELRC-3075-wikipedia_health_v1.ru.zst
- public/logs/live_backing.log
- public/logs/live.log
[taskcluster 2024-11-22 22:55:14.903Z] Task ID: VfXCBJlxTkKlBWzbDw53qQ
[taskcluster 2024-11-22 22:55:14.903Z] Worker ID: 7251626393458226542
[taskcluster 2024-11-22 22:55:14.903Z] Worker Group: us-central1-b
[taskcluster 2024-11-22 22:55:14.903Z] Worker Node Type: projects/887720501152/machineTypes/n2-highmem-32
[taskcluster 2024-11-22 22:55:14.903Z] Worker Pool: translations-1/b-linux-large-gcp-300gb
[taskcluster 2024-11-22 22:55:14.903Z] Worker Version: 38.0.5
[taskcluster 2024-11-22 22:55:14.903Z] Public IP: 35.193.57.105
[taskcluster 2024-11-22 22:55:14.903Z] Hostname: translations-1-b-linux-large-gcp-300gb-xkvo4p88ttqw9znev2idoq
[taskcluster 2024-11-22 22:55:14.903Z] using cache "translations-level-1-checkouts-v3-7afeb851dd97df8f3607-KnyIE1GvSz67R9mjL97Now" -> /builds/worker/checkouts
[taskcluster 2024-11-22 22:55:18.043Z] Downloading artifact "public/image.tar.zst" from task ID: KnyIE1GvSz67R9mjL97Now.
[taskcluster 2024-11-22 22:55:22.136Z] Downloaded artifact successfully.
[taskcluster 2024-11-22 22:55:22.137Z] Downloaded 287.207 mb
[taskcluster 2024-11-22 22:55:22.138Z] Decompressing downloaded image
[taskcluster 2024-11-22 22:55:24.227Z] Loading docker image from downloaded archive.
[taskcluster 2024-11-22 22:55:39.388Z] Image 'public/image.tar.zst' from task 'KnyIE1GvSz67R9mjL97Now' loaded. Using image ID sha256:d31e1900b8212f46ff27eab4217df610f5d7a124bb4975b4b8ea07a64443f3ba.
[taskcluster 2024-11-22 22:55:39.552Z] === Task Starting ===
[setup 2024-11-22T22:55:47.790Z] run-task started in /builds/worker
[setup 2024-11-22T22:55:47.790Z] Invoked by command: --firefox_translations_training-checkout=/builds/worker/checkouts/vcs/ -- bash -c pip install -r $VCS_PATH/pipeline/clean/requirements/clean.txt && if [ ${USE_OPUSCLEANER} == "true" ]; then dir="clean/opuscleaner"; else dir="clean"; fi && $VCS_PATH/pipeline/${dir}/clean-corpus.sh $MOZ_FETCHES_DIR/ELRC-3075-wikipedia_health_v1 $TASK_WORKDIR/artifacts/ELRC-3075-wikipedia_health_v1 auto opus_ELRC-3075-wikipedia_health/v1 ${OPUSCLEANER_MODE} 2>&1
[setup 2024-11-22T22:55:47.790Z] Python version: 3.10.12
...(2757 lines hidden)...
[task 2024-11-22T22:56:13.074Z] 128150K .......... .......... .......... ......... 100% 172M=0.7s
[task 2024-11-22T22:56:13.074Z]
[task 2024-11-22T22:56:13.074Z] 2024-11-22 22:56:13 (188 MB/s) - ‘/builds/worker/.local/lib/python3.10/site-packages/opuscleaner/filters/large.bin’ saved [131266198/131266198]
[task 2024-11-22T22:56:13.074Z]
[task 2024-11-22T22:56:13.075Z] + echo '### Generating cleaning config: opus_ELRC-3075-wikipedia_health/v1.ru-en.filters.json'
[task 2024-11-22T22:56:13.075Z] ### Generating cleaning config: opus_ELRC-3075-wikipedia_health/v1.ru-en.filters.json
[task 2024-11-22T22:56:13.075Z] + filter_path=/builds/worker/artifacts/ELRC-3075-wikipedia_health_v1.ru-en.filters.json
[task 2024-11-22T22:56:13.075Z] + python3 generate_filters.py /builds/worker/fetches/ELRC-3075-wikipedia_health_v1 ru en opus_ELRC-3075-wikipedia_health/v1 /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1.ru-en.filters.json custom
[task 2024-11-22T22:56:13.106Z] Using filter /builds/worker/checkouts/vcs/pipeline/clean/opuscleaner/configs/ru-en/opus_ELRC-3075-wikipedia_health-v1.filters.json
[task 2024-11-22T22:56:13.110Z] + test -s /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1.ru-en.filters.json
[task 2024-11-22T22:56:13.110Z] + echo '### Cleaning /builds/worker/fetches/ELRC-3075-wikipedia_health_v1 with filter /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1.ru-en.filters.json'
[task 2024-11-22T22:56:13.110Z] ### Cleaning /builds/worker/fetches/ELRC-3075-wikipedia_health_v1 with filter /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1.ru-en.filters.json
[task 2024-11-22T22:56:13.111Z] + opuscleaner-clean --parallel 32 --batch-size=50000 --input=- /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1.ru-en.filters.json ru en
[task 2024-11-22T22:56:13.111Z] ++ zstdmt -dc /builds/worker/fetches/ELRC-3075-wikipedia_health_v1.ru.zst
[task 2024-11-22T22:56:13.111Z] + paste /dev/fd/63 /dev/fd/62
[task 2024-11-22T22:56:13.111Z] + cut -f2
[task 2024-11-22T22:56:13.111Z] + tee /dev/fd/63
[task 2024-11-22T22:56:13.111Z] ++ zstdmt -dc /builds/worker/fetches/ELRC-3075-wikipedia_health_v1.en.zst
[task 2024-11-22T22:56:13.111Z] + zstdmt
[task 2024-11-22T22:56:13.112Z] ++ cut -f1
[task 2024-11-22T22:56:13.112Z] ++ zstdmt
[task 2024-11-22T22:56:14.005Z] + echo '### Checking length of the files'
[task 2024-11-22T22:56:14.005Z] ### Checking length of the files
[task 2024-11-22T22:56:14.005Z] + test -s /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1.ru.zst
[task 2024-11-22T22:56:14.005Z] + test -s /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1.en.zst
[task 2024-11-22T22:56:14.005Z] ++ zstdmt -dc /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1.ru.zst
[task 2024-11-22T22:56:14.005Z] ++ wc -l
[task 2024-11-22T22:56:14.009Z] + new_len_src=3423
[task 2024-11-22T22:56:14.010Z] ++ zstdmt -dc /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1.en.zst
[task 2024-11-22T22:56:14.010Z] ++ wc -l
[task 2024-11-22T22:56:14.013Z] + new_len_trg=3423
[task 2024-11-22T22:56:14.014Z] ++ zstdmt -dc /builds/worker/fetches/ELRC-3075-wikipedia_health_v1.ru.zst
[task 2024-11-22T22:56:14.014Z] ++ wc -l
[task 2024-11-22T22:56:14.018Z] + orig_len_src=4073
[task 2024-11-22T22:56:14.018Z] + [[ 3423 -ge 1 ]]
[task 2024-11-22T22:56:14.018Z] + [[ 3423 -ge 1 ]]
[task 2024-11-22T22:56:14.018Z] + [[ 3423 = \3\4\2\3 ]]
[task 2024-11-22T22:56:14.018Z] + echo '### Filtered length: 3423 / 4073'
[task 2024-11-22T22:56:14.018Z] ### Filtered length: 3423 / 4073
[task 2024-11-22T22:56:14.018Z] + echo '### Clean /builds/worker/fetches/ELRC-3075-wikipedia_health_v1 is written to /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1'
[task 2024-11-22T22:56:14.018Z] ### Clean /builds/worker/fetches/ELRC-3075-wikipedia_health_v1 is written to /builds/worker/artifacts/ELRC-3075-wikipedia_health_v1
[task 2024-11-22T22:56:14.018Z] + echo '###### Done: Cleaning corpus with OpusCleaner'
[task 2024-11-22T22:56:14.018Z] ###### Done: Cleaning corpus with OpusCleaner
[fetches 2024-11-22T22:56:14.019Z] removing /builds/worker/fetches
[fetches 2024-11-22T22:56:14.019Z] finished
[taskcluster 2024-11-22 22:56:19.544Z] === Task Finished ===
[taskcluster 2024-11-22 22:56:20.160Z] Successful task run with exit code: 0 completed in 65.258 seconds
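
The length checks traced in the log above (both cleaned files non-empty, source and target line counts equal, then the "Filtered length" report) can be sketched as the bash functions below. This is a minimal illustration reconstructed from the trace, not the actual contents of clean-corpus.sh; the function names `check_lengths` and `retention_pct` are hypothetical, and the percentage is an extra that the script itself does not print.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeed only if both cleaned sides are non-empty
# and have the same number of lines, mirroring the traced tests
# [[ N -ge 1 ]], [[ M -ge 1 ]] and [[ N = M ]].
check_lengths() {
  local new_len_src=$1 new_len_trg=$2
  [[ "$new_len_src" -ge 1 ]] || return 1
  [[ "$new_len_trg" -ge 1 ]] || return 1
  [[ "$new_len_src" = "$new_len_trg" ]] || return 1
}

# Hypothetical helper: print the share of lines kept, to two decimals.
retention_pct() {
  awk -v kept="$1" -v total="$2" 'BEGIN { printf "%.2f", 100 * kept / total }'
}

# Values taken from this run: 3423 lines kept out of 4073.
if check_lengths 3423 3423; then
  echo "### Filtered length: 3423 / 4073 ($(retention_pct 3423 4073)% kept)"
fi
```

With the numbers from this run, the filter removed 4073 - 3423 = 650 sentence pairs, so roughly 84% of the corpus survived cleaning.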