Using local clean for large-scale data prediction #45

sjy-eyujun · 2024-03-23T05:51:51Z

Hello, thank you for your excellent work. I want to predict EC numbers locally for a large dataset of approximately 100 million sequences. How can I speed up prediction? On our cluster, a test run with 10,000 sequences took about one hour on a single CPU. I also tried splitting the input into files of 200,000 lines each, but prediction slowed down significantly for every file except the first.
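For reference, the splitting step could be sketched as below. This is only an illustration under stated assumptions: it assumes the input is a FASTA file (the post does not say so explicitly), and it splits on record boundaries rather than raw line counts so that a multi-line sequence is never cut in half. The names `read_fasta_records`, `split_fasta`, and the `.partNNNNN.fasta` naming scheme are hypothetical and are not part of CLEAN.

```python
from pathlib import Path
from typing import Iterator, List


def read_fasta_records(path: Path) -> Iterator[List[str]]:
    """Yield each FASTA record as a list of lines (header line first)."""
    record: List[str] = []
    with path.open() as handle:
        for line in handle:
            # A new header closes the record collected so far.
            if line.startswith(">") and record:
                yield record
                record = []
            if line.strip():  # skip blank lines
                record.append(line if line.endswith("\n") else line + "\n")
    if record:
        yield record


def split_fasta(path: Path, out_dir: Path, records_per_chunk: int) -> List[Path]:
    """Write chunks of at most records_per_chunk records; return chunk paths."""
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths: List[Path] = []
    out = None
    for index, record in enumerate(read_fasta_records(path)):
        # Start a new chunk file every records_per_chunk records.
        if index % records_per_chunk == 0:
            if out is not None:
                out.close()
            # Hypothetical naming scheme, chosen only for this sketch.
            chunk_path = out_dir / f"{path.stem}.part{len(chunk_paths):05d}.fasta"
            chunk_paths.append(chunk_path)
            out = chunk_path.open("w")
        out.writelines(record)
    if out is not None:
        out.close()
    return chunk_paths
```

Calling `split_fasta(Path("input.fasta"), Path("chunks"), 100_000)` would write files such as `chunks/input.part00000.fasta`, each of which could then be submitted as a separate cluster job. Whether this removes the slowdown seen after the first file is not something the sketch addresses.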

Best,
SJY
