Request for databases for MSA generation #3

Open
rakeshr10 opened this issue Oct 19, 2024 · 1 comment

Comments

@rakeshr10

Hi, nice work! Can you provide links to the databases as well as the MSA generation scripts?

Regards
Rakesh

@CongLabCode
Owner

Dear Rakesh,

Thank you for your interest in our work. We do not yet have organized MSA generation scripts for users, but we are preparing them and plan to release them in the future. For databases, we used SRA data and assembled genomes downloaded from NCBI; the total is around 100 TB. We aligned the SRA reads directly to the human protein sequences instead of assembling them. We do not think we can host the SRA reads for download, but you can download them from NCBI using the code provided in the supplementary material.
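
As a rough illustration of what downloading runs from NCBI can look like, here is a minimal sketch that assumes the SRA Toolkit (`prefetch` and `fasterq-dump`) is installed. It is not the code from the supplementary material. The helper name `sra_download_commands`, the output directory, and the accession `SRR0000000` are placeholders chosen for this example. The function only builds the command lines and does not run them.

```python
def sra_download_commands(accessions, outdir="sra_data", threads=4):
    """Build (but do not run) SRA Toolkit command lines for each run accession.

    Hypothetical helper for illustration; not part of the authors' pipeline.
    """
    commands = []
    for acc in accessions:
        # prefetch downloads the run archive into outdir/<accession>/
        commands.append(["prefetch", acc, "--output-directory", outdir])
        # fasterq-dump converts the downloaded run to FASTQ,
        # writing paired reads to separate files
        commands.append([
            "fasterq-dump", f"{outdir}/{acc}",
            "--outdir", outdir,
            "--threads", str(threads),
            "--split-files",
        ])
    return commands


if __name__ == "__main__":
    # "SRR0000000" is a placeholder accession, not one used in the paper.
    for cmd in sra_download_commands(["SRR0000000"]):
        print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once real accessions are filled in. At the roughly 100 TB scale mentioned above, storage and bandwidth will likely matter more than the download script itself.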
