Using local clean for large-scale data prediction #45

Open
sjy-eyujun opened this issue Mar 23, 2024 · 0 comments

@sjy-eyujun

Hello, and thank you for your excellent work. I want to predict EC numbers locally for a large dataset of approximately 100 million sequences. How can I speed up prediction? On our cluster, a test run with 10,000 sequences took one hour of single-CPU time. I also tried splitting the input into files of 200,000 lines each, but prediction slowed down significantly for every file after the first.

Best,
SJY
