
Actions: huggingface/datatrove


Showing runs from all workflows
1,297 workflow runs


fix missing language tokenizer
Secret Leaks #166: Commit ea3adf9 pushed by guipenedo
November 28, 2024 14:50 22s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #337: Pull request #285 synchronize by guipenedo
November 28, 2024 13:00 2m 32s multilingual
fix for no .remove file
Secret Leaks #165: Commit 42b1e10 pushed by guipenedo
November 28, 2024 13:00 19s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #336: Pull request #285 synchronize by guipenedo
November 27, 2024 23:45 3m 48s multilingual
remove progress message
Secret Leaks #164: Commit f7a0267 pushed by guipenedo
November 27, 2024 23:45 21s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #335: Pull request #285 synchronize by guipenedo
November 27, 2024 23:43 2m 49s multilingual
add local version
Secret Leaks #163: Commit 6412f31 pushed by guipenedo
November 27, 2024 23:43 22s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #334: Pull request #285 synchronize by guipenedo
November 27, 2024 15:15 2m 39s multilingual
add dependency
Secret Leaks #162: Commit 0b5591a pushed by guipenedo
November 27, 2024 15:15 22s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #333: Pull request #285 synchronize by guipenedo
November 27, 2024 15:12 2m 24s multilingual
updated work_tokenizer assignments and added burmese
Secret Leaks #161: Commit a2ceb48 pushed by guipenedo
November 27, 2024 15:11 22s multilingual
[fixbug]: Fixed the issue in MinhashBuildIndex where get_datafolder w…
Secret Leaks #160: Commit fe81883 pushed by guipenedo
November 27, 2024 14:55 24s main
[fixbug]: Fixed the issue in MinhashBuildIndex where get_datafolder w…
Test & Check Code Quality #332: Commit fe81883 pushed by guipenedo
November 27, 2024 14:55 1m 55s main
[fixbug]: Fixed the issue in MinhashBuildIndex where get_datafolder w…
Test & Check Code Quality #331: Pull request #307 opened by Youggls
November 27, 2024 14:53 1m 54s Youggls:main
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #330: Pull request #285 synchronize by guipenedo
November 27, 2024 09:30 2m 34s multilingual
network limiting
Secret Leaks #159: Commit cf4668a pushed by guipenedo
November 27, 2024 09:30 17s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #329: Pull request #285 synchronize by guipenedo
November 27, 2024 09:28 2m 54s multilingual
network limiting
Secret Leaks #158: Commit ee313a4 pushed by guipenedo
November 27, 2024 09:28 17s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #328: Pull request #285 synchronize by guipenedo
November 26, 2024 18:22 2m 47s multilingual
giving up. just printing now
Secret Leaks #157: Commit c1ba400 pushed by guipenedo
November 26, 2024 18:22 19s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #327: Pull request #285 synchronize by guipenedo
November 26, 2024 18:18 2m 41s multilingual
giving up. just printing now
Secret Leaks #156: Commit 04ca4d5 pushed by guipenedo
November 26, 2024 18:18 18s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #326: Pull request #285 synchronize by guipenedo
November 26, 2024 18:02 2m 55s multilingual
stupid logspath
Secret Leaks #155: Commit da5e004 pushed by guipenedo
November 26, 2024 18:02 17s multilingual
FineWeb-2: multilingual, numpy 2.0, minhash improvements
Test & Check Code Quality #325: Pull request #285 synchronize by guipenedo
November 26, 2024 17:56 2m 40s multilingual