add translation support to sourmash sketch fromfile #1912

Open
ctb opened this issue Mar 30, 2022 · 1 comment

ctb commented Mar 30, 2022

Right now, the fromfile format doesn't support a simple way to produce translated sequence. Presumably we'd need to add a CDS column or something, or else build workflows (elsewhere) to do prodigal-style coding sequence extraction. That would only work for bacteria and archaea, though, so a CDS column might still be necessary.

See @bluegenes's comment too.

@bluegenes

I find myself using fromfile for everything these days, because it makes naming sketches properly so easy!!

So I would like us to support translate if we can -- perhaps as an additional param, e.g. -p k=10,k=7,scaled=200,protein,translate? Note that it needs both protein and translate, because we could alternatively translate into dayhoff.
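To make the proposal concrete, here is a rough sketch of how a parameter string with an extra translate flag might be interpreted. This is not sourmash's actual parser: the function name `parse_param_string` and the returned dictionary layout are made up for illustration, and the `translate` keyword is only the suggestion from this comment.

```python
# Hypothetical sketch, NOT sourmash's real parameter parser.
# The "translate" keyword is the proposal from this thread, not an existing option.
MOLTYPES = ("dna", "protein", "dayhoff", "hp")


def parse_param_string(param_str):
    """Split a '-p'-style string into ksizes, scaled, moltype and a translate flag."""
    ksizes = []
    scaled = None
    moltype = None
    translate = False
    for item in param_str.split(","):
        item = item.strip()
        if item.startswith("k="):
            ksizes.append(int(item[len("k="):]))
        elif item.startswith("scaled="):
            scaled = int(item[len("scaled="):])
        elif item in MOLTYPES:
            if moltype is not None and moltype != item:
                raise ValueError(f"conflicting moltypes: {moltype!r} and {item!r}")
            moltype = item
        elif item == "translate":
            translate = True
        else:
            raise ValueError(f"unknown parameter: {item!r}")
    # 'translate' alone is ambiguous: the target encoding must be named as well.
    if translate and moltype not in ("protein", "dayhoff", "hp"):
        raise ValueError("translate needs a target moltype (protein, dayhoff or hp)")
    return {"ksizes": ksizes, "scaled": scaled, "moltype": moltype, "translate": translate}


params = parse_param_string("k=10,k=7,scaled=200,protein,translate")
print(params)
```

Requiring an explicit moltype next to translate is what keeps protein and dayhoff translation distinguishable.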

I see your point about eukaryotes -- I would be happy to use a cds_filename column for this functionality.
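A minimal sketch of what that could look like, assuming the existing name, genome_filename and protein_filename columns plus a new, hypothetical cds_filename column. The file names are placeholders, and the fallback to 6-frame translation of the genome when no CDS file is listed is just one possible policy, not something decided here.

```python
import csv
import io

# Example fromfile-style CSV; 'cds_filename' is the proposed (hypothetical) column
# and the file names are placeholders.
EXAMPLE_CSV = """\
name,genome_filename,protein_filename,cds_filename
MAG_001,mag_001.fna,,mag_001.cds.fna
MAG_002,mag_002.fna,,
"""


def pick_translate_source(row):
    """Return (filename, mode) for building translated sketches from one row.

    Assumed policy: use the CDS file if one is given, otherwise fall back to
    6-frame translation of the genome file.
    """
    cds = (row.get("cds_filename") or "").strip()
    if cds:
        return cds, "cds"
    genome = (row.get("genome_filename") or "").strip()
    if genome:
        return genome, "six-frame"
    return None, None


rows = list(csv.DictReader(io.StringIO(EXAMPLE_CSV)))
for row in rows:
    print(row["name"], pick_translate_source(row))
```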

Current use case: a bunch of MAGs. Yes, I'll run prodigal-style translation separately, but for reasons I also want to build some 6-frame signatures.
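For anyone unsure what "6-frame" means here, a small standalone illustration: translate the three reading frames of the forward strand and the three of the reverse complement. It uses the standard genetic code and plain Python only; it is not how sourmash does this internally, and the function names are made up for the example.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse-complement an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate_frame(seq, offset):
    """Translate one reading frame; codons with ambiguous bases become 'X'."""
    codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translate(seq):
    """Return the 3 forward frames followed by the 3 reverse-complement frames."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    return [translate_frame(strand, off) for strand in (seq, rc) for off in range(3)]


print(six_frame_translate("ATGGCC"))
# → ['MA', 'W', 'G', 'GH', 'A', 'P']
```

Stop codons are kept as `*` in the output; a real sketching pipeline would need to decide how to treat them.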
