Definition of Private Mutation/Deletion in Nextclade CLI ndjson #1523

Open
ryhisner opened this issue Sep 14, 2024 · 2 comments
Labels: needs triage (mark for review and label assignment), t:ask (type: question, request of information)

Comments

@ryhisner

I'm trying to get some clarity on how Nextclade CLI defines a "private mutation". I'm trying to document certain private deletions (an amazing addition to Nextclade, by the way), but it seems to be counting deletions that are shared by entire Pango lineages as private deletions.

To give an example: counting only sequences with Nextclade QC scores ≤ 5, I get 9599 sequences assigned to R.1 (B.1.1.316.1) that list ∆A28271, a defining mutation of R.1, as a private deletion. There have been 12875 R.1 sequences ever according to CovSpectrum, so this probably means just about all R.1 sequences have ∆A28271 listed as a private deletion by Nextclade. There are also 12736 Delta sequences with ∆A28271 listed as a private deletion.

Other lineages that have ∆A28271 as a defining mutation, yet have many sequences listing it as a private deletion, include B.1.619 (601), B.1.214 (239), and B.1.629 (58).
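For anyone who wants to reproduce per-lineage tallies like the ones above, here is a minimal Python sketch. It is not how the counts in this issue were actually produced. The key paths (`customNodeAttributes.Nextclade_pango`, `qc.overallScore`, `privateNucMutations.privateDeletions`) and the `pos` key on each deletion are assumptions about the NDJSON layout, and the helper names `dig` and `count_private_deletion` are hypothetical. Check them against your own output file before relying on the result.

```python
import json
from collections import Counter

# ASSUMPTION: these key paths are illustrative guesses at the NDJSON layout,
# not confirmed in this thread; adjust them to match your own output file.
LINEAGE_PATH = ("customNodeAttributes", "Nextclade_pango")
QC_PATH = ("qc", "overallScore")
PRIVATE_DELETIONS_PATH = ("privateNucMutations", "privateDeletions")


def dig(record, path):
    """Follow a sequence of keys into nested dicts; return None if any is missing."""
    node = record
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def count_private_deletion(lines, position, max_qc=5.0):
    """Count, per lineage, the records listing a deletion at `position` as private.

    `lines` is any iterable of NDJSON strings (one JSON object per line).
    `position` must use the same coordinate convention as the file itself
    (ASSUMPTION: each private deletion is an object with a `pos` key).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        score = dig(record, QC_PATH)
        if score is None or score > max_qc:
            continue  # skip records without a QC score or above the threshold
        deletions = dig(record, PRIVATE_DELETIONS_PATH) or []
        if any(d.get("pos") == position for d in deletions):
            lineage = dig(record, LINEAGE_PATH) or "unassigned"
            counts[lineage] += 1
    return counts
```

Passing an open file handle for the NDJSON output as `lines` returns a `Counter` keyed by lineage. Whether ∆A28271 corresponds to `position=28271` or `position=28270` depends on whether the file stores positions 1-based or 0-based, so verify that on a known record first.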

The Nextclade documentation describes private mutations like this:

[image: screenshot of the Nextclade documentation's description of private mutations]

Does this mean that there are no R.1 sequences at all on the reference tree? Is there a way to fix this? And how is it that so many Delta sequences end up registering ∆A28271 as a private deletion? There are zero Alpha sequences with ∆A28271 listed as a private deletion, so Nextclade usually gets this deletion exactly right. I'm not sure why it fails on some lineages.

I don't know if it matters, but I use Wuhan-1 as the reference strain when creating an ndjson file in Nextclade CLI.

@ryhisner
Author

After looking into it a bit more, I think the problem is that ∆A28271 isn't listed as being a defining mutation of R.1, B.1.619, B.1.214, or B.1.629. Is there a way this could be corrected?

I still have no idea why ∆A28271 appears so often as a private mutation in Delta sequences.

@ivan-aksamentov
Member

ivan-aksamentov commented Sep 23, 2024

Hey Ryan, it looks like everyone is busy at the moment, and I am not familiar enough with these lineages to give a useful answer.

I'll ping @corneliusroemer in case he's got a minute.
