In #2154, we've been talking about how to include taxonomic information in zipfiles, and I've been trying to figure out how that would work at the command line.
But all the discussion happened in a now-closed issue and a now-merged PR ;). So here's a new issue!
Comments copied over from various other issues and PRs -
re: SOURMASH-TAXONOMY — would you consider GTDB-TAXONOMY and NCBI-TAXONOMY instead, with the default being gtdb?
OR, somewhere in database info/metadata (which we don’t have yet, but have talked about), add the default for that database? In this case, I'm thinking about database info/metadata as database version (e.g. gtdb-rs207), sourmash signature version, creation date, etc -- and then adding default-taxonomy.
which would load NCBI-TAXONOMY.csv from gtdb-xyz.zip.
Then we could potentially add --lins later on for #1813.
Alternative command-line switches would be --tax-type ncbi or something but I feel like --ncbi and --gtdb are probably simplest and easiest to remember.
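To make the switch idea a bit more concrete, here is a rough sketch of how `--gtdb` / `--ncbi` could resolve to a file name inside the zip. This is not sourmash code: `build_parser`, `taxonomy_member`, and the `db_default` argument (standing in for a default-taxonomy entry in database metadata) are hypothetical names, and the `<NAME>-TAXONOMY.csv` naming is assumed from the file names suggested above.

```python
import argparse


def build_parser():
    """Hypothetical parser: --gtdb and --ncbi are mutually exclusive switches."""
    parser = argparse.ArgumentParser(prog="tax-demo")
    parser.add_argument("database", help="database zip file, e.g. gtdb-xyz.zip")
    group = parser.add_mutually_exclusive_group()
    # Both switches write to the same destination; it stays None if neither is given.
    group.add_argument("--gtdb", dest="taxonomy", action="store_const", const="gtdb")
    group.add_argument("--ncbi", dest="taxonomy", action="store_const", const="ncbi")
    return parser


def taxonomy_member(taxonomy, db_default="gtdb"):
    """Return the assumed member name, falling back to the database's default."""
    name = taxonomy or db_default
    return f"{name.upper()}-TAXONOMY.csv"


if __name__ == "__main__":
    args = build_parser().parse_args(["gtdb-xyz.zip", "--ncbi"])
    print(args.database, taxonomy_member(args.taxonomy))
```

Using one shared destination keeps the door open for adding something like `--lins` later as just another constant in the same group.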
ctb changed the title from "supporting multiple taxonomies with command-line switches" to "tax in zip files; supporting multiple taxonomies with command-line switches" on Aug 31, 2022.
found this comment from @bluegenes, buried in a different issue - it appears to be the original we-should-have-tax-in-zip idea -
Additional thought: It would be handy to include the taxonomy file inside each database file (possible with zip, sbt.zip, and sqldb and not needed for lca, right?). That would reduce extra download code and the need to link the correct taxonomy file with each database. For taxonomy functions with official databases, users could provide the database on the command line (instead of needing to find/download the taxonomy file), and we could automatically find it. I would imagine TAXONOMY.csv, complementary to manifest file. We would still allow alternate taxonomies, of course, but at least each db would come with the official set for that db?
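As a sketch of the "automatically find it" part of that comment, the lookup could be as simple as checking the zip for a taxonomy member and reading it as CSV. This uses only the Python standard library and is not sourmash's loading code: `load_taxonomy` is a hypothetical helper, and the column names in the usage example are made up for illustration.

```python
import csv
import io
import zipfile


def load_taxonomy(zip_path, member="TAXONOMY.csv"):
    """Return taxonomy rows from a zip as a list of dicts, or None if absent."""
    with zipfile.ZipFile(zip_path) as zf:
        if member not in zf.namelist():
            # No bundled taxonomy; the caller can fall back to a user-supplied file.
            return None
        with zf.open(member) as fp:
            text = io.TextIOWrapper(fp, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


if __name__ == "__main__":
    # Build a tiny example zip with illustrative columns, then read it back.
    with zipfile.ZipFile("example-db.zip", "w") as zf:
        zf.writestr("TAXONOMY.csv", "ident,lineage\nGCF_000000001,d__Bacteria\n")
    print(load_taxonomy("example-db.zip"))
```

Returning `None` when the member is missing keeps alternate taxonomies possible: the caller can still take a taxonomy file given explicitly on the command line.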
The comments copied above are from #2195 (comment), by @bluegenes, and from #2012 (comment), which I wrote and which @bluegenes endorsed.
Also sorta connects with #2186, searching/selecting on taxonomic lineages?