.sam to amino acids – can I just translate? #1556
-
Dear Nextclade Community, I have nucleotide sequences that I need to translate. So far, I have been running Nextclade via the CLI. In my workflow, my nucleotide sequences are already aligned in a .sam file. I understand Nextclade assumes the nucleotide sequences are not aligned and will align them as a first step. Can I tell Nextclade to skip this step and use the existing alignment information? Looking forward to your responses, Gordon
Replies: 4 comments 1 reply
-
Hi Gordon, Can you tell us a little more about your use case? Why do you want to skip the alignment step? Just to save compute cycles, or is there another reason? In any case, this is not currently possible. If we imagine how it could be implemented, note that Nextclade performs pairwise (reference) alignment and, after alignment, also strips insertions so that it can operate on the sequences in reference coordinates. Many of the underlying algorithms currently depend on this. So if you feed it, say, a Multiple Sequence Alignment (MSA), it may or may not work.
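To illustrate what "reference coordinates" means for an already aligned read, here is a minimal sketch that projects one SAM record onto the reference using its CIGAR string. It is not Nextclade's implementation; the function name `project_to_reference` is hypothetical, and the only assumption is the standard SAM CIGAR semantics (M/=/X consume read and reference, I/S consume only the read, D/N consume only the reference).

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def project_to_reference(seq: str, cigar: str) -> str:
    """Return the read in reference coordinates.

    Insertions and soft-clipped bases are dropped; deletions and
    skipped regions are filled with '-'. (Hypothetical helper.)
    """
    out = []
    i = 0  # current position in the read sequence
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":      # aligned bases: keep them
            out.append(seq[i:i + n])
            i += n
        elif op in "IS":     # insertion / soft clip: skip read bases
            i += n
        elif op in "DN":     # deletion / skip: pad with gaps
            out.append("-" * n)
        # H and P consume neither read nor reference
    return "".join(out)


print(project_to_reference("ACGTTTGA", "3M2I3M"))  # insertion "TT" is stripped
```

The result has exactly one character per reference position covered by the read, which is what makes a later codon lookup by position straightforward.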
-
Hi Ivan, thanks for the quick reply. I'm working on V-Pipe, which currently outputs large (100 MB+) aligned nucleotide files. What I need is the aligned amino acid files derived from my aligned nucleotides, so ideally the output of Nextclade. I've been running Nextclade on small test data, ignoring my alignments, by first getting the reference and other files with:
And then replacing the Then run
This works perfectly fine on small files, yet in my current setup, it takes quite a lot of memory/time to process whole files. I just assumed this was due to the realignment, and I hoped I could circumvent this. My . Nextclade's
Disclaimer: I recently joined Bioinformatics as a software engineer, so I may overlook some biological trivia. For context, we are trying to import short-read wastewater data into Loculus Database. So greetings from next door, Kind Regards, Gordon
-
Hi Gordon, I think it would be quite twisted to use Nextclade to translate already aligned short reads. A simple script could translate those already aligned reads. You'd need to figure out which ORFs your read falls into and the reading frame, and then use something like Bio.Seq.translate from the Biopython package to translate the sequence. Richard
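A rough sketch of such a script, under several assumptions: the read is already in reference coordinates (one character per reference position, '-' for deletions), the ORF lies on the forward strand, and positions are 0-based with a half-open ORF interval. The function name `translate_read` and its parameters are hypothetical. The codon table is built by hand so the example is self-contained; Biopython's `Bio.Seq.translate` could replace the table lookup.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_read(read: str, read_start: int, orf_start: int, orf_end: int):
    """Translate the part of a read that overlaps one ORF.

    read       -- read in reference coordinates
    read_start -- 0-based reference position of the read's first base
    orf_start, orf_end -- 0-based, half-open ORF interval on the reference

    Returns (index of first translated codon within the ORF, peptide),
    or None if the read covers no complete codon of the ORF.
    """
    lo = max(read_start, orf_start)
    hi = min(read_start + len(read), orf_end)
    # Advance lo to the next codon boundary of the ORF.
    offset = (lo - orf_start) % 3
    if offset:
        lo += 3 - offset
    # Pull hi back to the previous codon boundary.
    hi -= (hi - orf_start) % 3
    if hi <= lo:
        return None
    peptide = []
    for pos in range(lo, hi, 3):
        codon = read[pos - read_start:pos - read_start + 3].upper()
        if codon == "---":
            peptide.append("-")  # whole codon deleted
        else:
            # Codons with gaps or ambiguous bases become 'X'.
            peptide.append(CODON_TABLE.get(codon, "X"))
    return (lo - orf_start) // 3, "".join(peptide)


# Read starts one base into an ORF that begins at reference position 10.
print(translate_read("TGGCTAAATGA", 11, 10, 100))
```

Partial codons at the read ends are discarded rather than guessed, and frameshifting deletions are not handled specially; both are design choices of this sketch that real wastewater data may require revisiting.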
-
Thank you for your guidance! Much appreciated! Yes, that sounds like exactly what I want. I was cautious about writing a custom solution for this, expecting I'd run into corner cases beyond my biological understanding. Hence, I have so far looked for well-tested tools for the task. Thank you for the recommendation. I'll try this route, then. I appreciate any further suggestions you may have. Kind Regards,