Commit bd13876 (1 parent: 9f370d4)
Showing 8 changed files with 6,354 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
## Unreleased | ||
|
||
Initial release for Nextclade v3! | ||
|
||
This dataset is converted from the corresponding older dataset for Nextclade v2. You can find old versions of datasets here: https://github.com/nextstrain/nextclade_data/tree/2023-08-17--15-51-24--UTC/data/datasets | ||
|
||
Read more about Nextclade datasets in the documentation: https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
# Nextclade dataset for "SARS-CoV-2 relative to BA.2" based on reference "BA.2" (sars-cov-2-21L/BA.2) | ||
|
||
|
||
## Dataset attributes | ||
|
||
| attribute | value | value friendly | | ||
| -------------------- | -------------------- | ---------------------------------------- | | ||
| name | sars-cov-2-21L | SARS-CoV-2 relative to BA.2 | | ||
| reference | BA.2 | BA.2 | | ||
|
||
|
||
## What is a Nextclade dataset? | ||
|
||
Read more about Nextclade datasets in the Nextclade documentation: https://docs.nextstrain.org/projects/nextclade/en/stable/user/datasets.html | ||
|
||
|
||
### What are the SARS-CoV-2 clades? | ||
|
||
Nextclade was originally developed during the COVID-19 pandemic, with a primary focus on SARS-CoV-2. This section describes clades as they apply to SARS-CoV-2, but Nextclade can analyse other pathogens too. | ||
|
||
<figure> | ||
<a href="https://raw.githubusercontent.com/nextstrain/ncov-clades-schema/master/clades.svg"> | ||
<picture> | ||
<img | ||
src="https://raw.githubusercontent.com/nextstrain/ncov-clades-schema/master/clades.svg" | ||
alt="Illustration of phylogenetic relationships of SARS-CoV-2 clades, as defined by Nextstrain" | ||
/> | ||
</picture> | ||
</a> | ||
<figcaption> | ||
<small> | ||
Fig.1. Illustration of phylogenetic relationships of SARS-CoV-2 clades, as defined by Nextstrain (<a href="https://github.com/nextstrain/ncov-clades-schema/">source</a>) | ||
</small> | ||
</figcaption> | ||
</figure> | ||
|
||
Since its emergence in late 2019, SARS-CoV-2 has diversified into several different co-circulating variants. To facilitate discussion of these variants, we have grouped them into __clades__ which are defined by specific signature mutations. | ||
|
||
We currently define more than 30 clades (see [this blog post](https://nextstrain.org/blog/2021-01-06-updated-SARS-CoV-2-clade-naming) for details): | ||
|
||
- 19A and 19B emerged in Wuhan and dominated the early outbreak. | ||
- 20A emerged from 19A, dominated the European outbreak in March 2020, and has since spread globally. | ||
- 20B and 20C are large, genetically distinct subclades of 20A that emerged in early 2020. | ||
- 20D to 20J emerged over the summer of 2020 and include three "Variants of Concern" (VoC). | ||
- 21A to 21F include the VoC __delta__ and several Variants of Interest (VoI). | ||
- 21K onwards are different clades within the diverse VoC __omicron__. | ||
|
||
Within Nextstrain, we define each clade by its combination of signature mutations. You can find the exact clade definition in [github.com/nextstrain/ncov/defaults/clades.tsv](https://github.com/nextstrain/ncov/blob/master/defaults/clades.tsv). When available, we will include [WHO labels for VoCs and VoIs](https://www.who.int/en/activities/tracking-SARS-CoV-2-variants/). | ||
|
||
Learn more about how Nextclade assigns clades in the [documentation](https://docs.nextstrain.org/projects/nextclade/en/stable/user/algorithm/). |
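
The idea of defining a clade by its combination of signature mutations can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the clade names, positions, and nucleotides are made-up placeholders rather than entries from `clades.tsv`, and the helper functions are hypothetical. It is not the procedure Nextclade itself uses to assign clades; see the linked documentation for that.

```python
from typing import Dict, List, Optional

# Hypothetical signature table: clade name -> {genome position: nucleotide}.
# These values are placeholders, NOT real SARS-CoV-2 clade definitions.
SIGNATURES: Dict[str, Dict[int, str]] = {
    "cladeA": {100: "T"},
    "cladeB": {100: "T", 200: "G"},
}


def matching_clades(sample: Dict[int, str],
                    signatures: Dict[str, Dict[int, str]]) -> List[str]:
    """Return every clade whose signature mutations are all present in the sample."""
    return sorted(
        clade
        for clade, signature in signatures.items()
        if all(sample.get(pos) == nuc for pos, nuc in signature.items())
    )


def most_specific_clade(sample: Dict[int, str],
                        signatures: Dict[str, Dict[int, str]]) -> Optional[str]:
    """Pick the matching clade with the largest signature, or None if nothing matches."""
    matches = matching_clades(sample, signatures)
    if not matches:
        return None
    return max(matches, key=lambda clade: len(signatures[clade]))


# A sample carrying both placeholder mutations matches both clades;
# the one with the longer signature is treated as the more specific.
sample = {100: "T", 200: "G", 300: "A"}
print(matching_clades(sample, SIGNATURES))
print(most_specific_clade(sample, SIGNATURES))
```

Preferring the clade with the most signature mutations is one simple way to handle nested clades, where a child clade carries all of its parent's mutations plus its own.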
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
##gff-version 3 | ||
##sequence-region MN908947 1 29903 | ||
# Gene map (genome annotation) of SARS-CoV-2 in GFF format. | ||
# For gene map purposes we only need some of the columns. We substitute unused values with "." as per GFF spec.
# See GFF format reference at https://www.ensembl.org/info/website/upload/gff.html | ||
# seqname source feature start end score strand frame attribute | ||
MN908947 GenBank gene 266 13468 . + . gene_name=ORF1a | ||
MN908947 GenBank gene 13468 21555 . + . gene_name=ORF1b | ||
MN908947 GenBank gene 25393 26220 . + . gene_name=ORF3a | ||
MN908947 GenBank gene 21563 25384 . + . gene_name=S | ||
MN908947 GenBank gene 26245 26472 . + . gene_name=E | ||
MN908947 GenBank gene 26523 27191 . + . gene_name=M | ||
MN908947 GenBank gene 27202 27387 . + . gene_name=ORF6 | ||
MN908947 GenBank gene 27394 27759 . + . gene_name=ORF7a | ||
MN908947 GenBank gene 27756 27887 . + . gene_name=ORF7b | ||
MN908947 GenBank gene 27894 28259 . + . gene_name=ORF8 | ||
MN908947 GenBank gene 28274 29533 . + . gene_name=N | ||
MN908947 GenBank gene 28284 28577 . + . gene_name=ORF9b |
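
As a rough illustration of how a gene map like the one above could be read, here is a minimal Python sketch. The `Gene` type and `parse_gene_map` function are hypothetical names introduced for this example and are not part of Nextclade. GFF columns are tab-separated per the format reference, but the sketch splits on any whitespace so that it also tolerates copies in which tabs were converted to spaces.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str


def parse_gene_map(text: str) -> List[Gene]:
    """Extract 'gene' features from GFF text, skipping comments and directives."""
    genes = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, comment, or ## directive
        # Nine columns; the last (attributes) is kept whole by maxsplit=8.
        fields = line.split(None, 8)
        if len(fields) != 9 or fields[2] != "gene":
            continue
        _seq, _source, _feature, start, end, _score, strand, _frame, attrs = fields
        # Attributes are semicolon-separated key=value pairs.
        attributes = dict(
            item.split("=", 1) for item in attrs.split(";") if "=" in item
        )
        genes.append(Gene(attributes.get("gene_name", ""), int(start), int(end), strand))
    return genes


# Two rows taken from the gene map above, written with tab separators.
EXAMPLE = """\
##gff-version 3
MN908947\tGenBank\tgene\t21563\t25384\t.\t+\t.\tgene_name=S
MN908947\tGenBank\tgene\t26245\t26472\t.\t+\t.\tgene_name=E
"""

genes = parse_gene_map(EXAMPLE)
for gene in genes:
    # Coordinates are inclusive, so length is end - start + 1.
    print(gene.name, gene.end - gene.start + 1)
```

Because both `start` and `end` are inclusive, the length of a feature is `end - start + 1`, which for the S row gives 3822 nucleotides.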