[MRG] support local genome collections (including private genomes) #130
update: it's aliiiiiive! I can now successfully run the
(taxonomies don't yet work, but I believe that will be easy.)
Whew, this is converging on working. And to assuage @taylorreiter there is now a lot less Entertainingly, the only part that went super smoothly was the inclusion of private taxonomies. @bluegenes
🎉 and tests pass!
* change column names
* remove old notebooks
* fix mistake
So it turns out that there's no current need for separate I could leave it as it is, or I could switch things up and do something like
update: I switched things around and I think it looks good this way. (There's now simply
New! Shiny mkdocs docs.
🎉 merging!
This PR enables the use of collections of local genomes with genome-grist. See the documentation for details!
Documentation is being written here, and is usually synced to this branch from hackmd. There is new mkdocs-formatted documentation available on GitHub Pages at dib-lab.github.io/genome-grist/.
There are many changes and incompatibilities with previous versions of genome-grist. Read on!
major changes and incompatibilities
* `sourmash_databases:`, instead of `sourmash_database_glob_pattern:`;
* `sample` must be changed to `samples`;
* `database_taxonomy` must be changed to `taxonomies`;
* `acc` has been changed to `ident` and `ncbi_tax_name` has been changed to `display_name`
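
The renames above are mechanical, so an older setup could in principle be updated with a small script. The following is only a sketch under several assumptions, not part of genome-grist: `migrate_config` and `rename_columns` are hypothetical helper names, the config is treated as a plain dict, `sourmash_databases` and `taxonomies` are assumed to hold lists, and `acc` / `ncbi_tax_name` are assumed to be column headers in the genome info CSVs.

```python
import csv
import glob
import io

# Old-to-new column names, taken from the change list above.
COLUMN_RENAMES = {"acc": "ident", "ncbi_tax_name": "display_name"}


def migrate_config(old, expand_glob=glob.glob):
    """Return a copy of an old-style config dict using the new key names."""
    new = dict(old)
    if "sample" in new:
        new["samples"] = new.pop("sample")
    if "database_taxonomy" in new:
        # Assumption: the new key holds a list of taxonomy files.
        new["taxonomies"] = [new.pop("database_taxonomy")]
    if "sourmash_database_glob_pattern" in new:
        # Assumption: the new key holds an explicit list of database paths,
        # so the old glob pattern is expanded once here.
        pattern = new.pop("sourmash_database_glob_pattern")
        new["sourmash_databases"] = sorted(expand_glob(pattern))
    return new


def rename_columns(csv_text):
    """Rewrite the header row of a CSV, leaving the data rows untouched."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if rows:
        rows[0] = [COLUMN_RENAMES.get(name, name) for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

Keys and columns that are not in the change list pass through unchanged, so the helpers can be run on files that have already been migrated.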

minor changes that don't require explicit intervention:
* `minimap/` to `mapping/`;
* `genbank/` output subdir has been renamed to `gather/`;
* `gathertax/` output subdirectory has been removed;

This PR also:
* `process` CLI entry point
* `genbank_cache`
* `genomes/` output subdirectory that holds the private+genbank genomes and genome info CSVs.

related issues
Fixes #91 - supports custom genomes.
Fixes #13 - enabling private identifiers for genomes.
Fixes #9 - specifying genome list
Fixes #79 - `genbank_genomes` directory is now configurable
Fixes #132 - adds picklist to the gather step
notes to self and checklists
This means:
TODO:
* `acc` to `identifier`, and `ncbi_tax_name` to `name`.
* `database_taxonomy` and replace with `taxonomies` list

cc @jessicalumian