MACSE(?)-introduced exclamations in some alignments #112
-
Hello! I'm playing around with the codon alignments found here, and I am finding that a small number of them contain "!" characters. It appears that these characters were introduced through some pre-processing with MACSE, but I am not sure what the best way to handle these characters/columns in the alignment would be. I can't leave them alone, because most other programs will treat "!" as an illegal character and kill the run. My temptation is to simply replace the exclamation marks with Ns, but I was unsure whether someone else might have a more justified way of handling these characters. Thanks in advance to anyone who might be able to provide some insight here!
Replies: 5 comments
-
Hi Morgan, the '!' symbol originates from MACSE v2.0, which uses it to denote frameshifts within codon alignments. I'll add a note to the README tomorrow. Michael
-
Hi @mchaney1, as Michael already mentioned, you can safely replace them with "N".
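For anyone who wants to apply that replacement locally, here is a minimal sketch. It assumes the alignments are plain FASTA text; the function name `mask_frameshift_marks` and the example sequences are hypothetical and not part of TOGA or MACSE. Header lines are skipped so that a "!" in a sequence name, if one ever occurred, would be left alone.

```python
def mask_frameshift_marks(fasta_text: str, replacement: str = "N") -> str:
    """Replace '!' in sequence lines only; '>' header lines are kept as-is."""
    cleaned_lines = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Header line: do not touch the sequence name.
            cleaned_lines.append(line)
        else:
            # Sequence line: one-for-one character swap, length is unchanged.
            cleaned_lines.append(line.replace("!", replacement))
    return "\n".join(cleaned_lines) + "\n"


# Made-up example alignment (not real TOGA output).
example = ">seq1\nATGA!!CTTTGA\n>seq2\nATGAACCTTTGA\n"
cleaned = mask_frameshift_marks(example)
print(cleaned, end="")
```

Passing `replacement="-"` would produce gaps instead of Ns; which of the two is more appropriate depends on the downstream tool.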
-
Thanks for your response on this, Michael! That sounds familiar. Given that the character denotes a frameshift, do you think it would be better to replace '!' with '-' rather than with 'N'?
Morgan
On Thursday, October 26, 2023 at 3:43 PM, Michael Hiller wrote:
Hi Morgan,
The '!' symbol originates from MACSE v2.0, which uses it to denote frameshifts within codon alignments.
It's a bit surprising to us that MACSE finds frameshifts, as TOGA already recognizes and masks frameshifted codons.
I'll add a note tomorrow in the README.
Michael
-
@mchaney1 I believe the MACSE v2.0 developers decided to use "!" instead of "-" to mark frameshifts so that sequence lengths stay divisible by 3. This is crucial, as many (potential) downstream processes assume that every sequence in a codon alignment satisfies length % 3 == 0.
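A quick way to check that assumption before running downstream tools is sketched below. It assumes plain FASTA input; `parse_fasta` and `check_codon_alignment` are hypothetical helper names written for this illustration, and the sequences are made up. The check reports sequences whose length is not a multiple of 3 and the zero-based codon positions that contain a "!".

```python
def parse_fasta(text: str) -> dict:
    """Parse FASTA text into {name: sequence}, joining wrapped sequence lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def check_codon_alignment(records: dict):
    """Return (names with length % 3 != 0, {name: codon indices containing '!'})."""
    bad_length = [name for name, seq in records.items() if len(seq) % 3 != 0]
    marked = {}
    for name, seq in records.items():
        usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
        hits = [i // 3 for i in range(0, usable, 3) if "!" in seq[i:i + 3]]
        if hits:
            marked[name] = hits
    return bad_length, marked


# Made-up example: 'a' has a frameshift mark in its second codon,
# 'b' is one base short of a whole number of codons.
records = parse_fasta(">a\nATGA!!CTTTGA\n>b\nATGAACCTTTG\n")
bad_length, marked = check_codon_alignment(records)
print(bad_length, marked)
```

Note that swapping each "!" for a single "N" or a single "-" is a one-for-one character substitution, so neither choice changes the sequence length.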
-
Hi Morgan, I have now replaced "!" with "N" in all codon alignments. That should fix this problem once and for all. Thanks