Add human (and animal) reference genome to prepared databases #2717
Comments
On the "raw" side there are both GRCh38.p14 and T2T-CHM13v2.0 signatures in wort, would that work?
Yep! Those should be plenty.
Repo to sketch hg38, including all unmapped chromosomes: https://github.com/ctb/2024-human-sketch
note: decontaminating human WGS samples, #3151
download at:
added here - #3422 - should add the t2t ones, too, though.
@ccbaumler suggests adding more animal genomes over in #3422 (comment). Rather than doing these piecemeal, I think we should come up with a set of accessions we care about and then use directsketch to get them. So for now I'm punting on that suggestion, but it is definitely the way we want to go!
Adds common hosts and also hg38. Tackles #2717.
Hi Titus et al,
Given the recent fiasco related to mapping reads to microbial databases without human references (links at bottom), it might be a good time to create a small human genome database for use with sourmash. A standalone database on the databases page would be ideal, so that researchers can include it alongside the other databases of interest.
Thanks for considering!
social media discussion: https://twitter.com/StevenSalzberg1/status/1686350449069244416
pre-print: https://doi.org/10.1101/2023.07.28.550993
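To make the concern concrete, here is a toy sketch of why a host reference helps. It does not use sourmash's API; it compares plain k-mer sets with a containment score, and the sequences, the "contaminated assembly" scenario, and the helper names (`kmers`, `containment`, `best_match`) are all hypothetical choices made for illustration. The idea it shows: if a database holds only microbial entries, host-derived sequence can only be assigned to whichever microbial entry happens to share k-mers with it, whereas adding a host entry gives that sequence a better place to land.

```python
import random

def kmers(seq, k=21):
    """Return the set of all k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def containment(query, ref):
    """Fraction of the query's k-mers that are also present in ref."""
    return len(query & ref) / len(query) if query else 0.0

def best_match(query, database):
    """Return (name, containment) of the database entry that best contains query."""
    name = max(database, key=lambda n: containment(query, database[n]))
    return name, containment(query, database[name])

rng = random.Random(42)

def random_seq(n):
    """Random DNA string; stands in for a real genome in this toy example."""
    return "".join(rng.choice("ACGT") for _ in range(n))

host = random_seq(5000)
microbe = random_seq(5000)

# Hypothetical scenario: a microbial assembly that accidentally
# includes a stretch of host sequence.
contaminated_microbe = microbe + host[1000:1300]

# A host-derived fragment from the sample being analyzed.
sample = kmers(host[900:1500])

microbial_only = {"microbe (contaminated assembly)": kmers(contaminated_microbe)}
with_host = dict(microbial_only, host=kmers(host))

# Without a host entry, the only possible assignment is the microbial one.
print("microbial-only database:", best_match(sample, microbial_only))

# With a host entry, the host fragment is fully contained by the host reference.
print("database with host:     ", best_match(sample, with_host))
```

In practice the same comparison would be made with sketches of real genomes rather than full k-mer sets, which is what a prepared human database would provide.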