From e0447868fc3192aec9ab5b359946328f7dd8f943 Mon Sep 17 00:00:00 2001
From: Cornelius Roemer
Date: Tue, 19 Dec 2023 17:08:54 +0100
Subject: [PATCH] fix: Update description to point to new clade nomenclature

Follow-up of #45
Fixes #47
---
 config/description.md | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/config/description.md b/config/description.md
index 5602d58..1e7bb94 100644
--- a/config/description.md
+++ b/config/description.md
@@ -1,6 +1,5 @@
 We gratefully acknowledge the authors, originating and submitting laboratories of the genetic sequences and metadata for sharing their work. Please note that although data generators have generously shared data in an open fashion, that does not mean there should be free license to publish on this data. Data generators should be cited where possible and collaborations should be sought in some circumstances. Please try to avoid scooping someone else's work. Reach out if uncertain.
 
-
 We maintain three views of human respiratory syncytial virus evolution for each RSV subtype:
 
 The first is ['rsv/a/genome'](https://nextstrain.org/staging/rsv/a/genome) and ['rsv/b/genome'](https://nextstrain.org/staging/rsv/b/genome), which show evolution of full genome sequences.
@@ -9,21 +8,20 @@ The second is ['rsv/a/G'](https://nextstrain.org/staging/rsv/a/G) and ['rsv/b/G'
 The second is ['rsv/a/F'](https://nextstrain.org/staging/rsv/a/F) and ['rsv/b/F'](https://nextstrain.org/staging/rsv/b/G), which show evolution of only the F gene.
 The F gene builds currently don't contain any clade annotations.
 
-
 #### Analysis
+
 Our bioinformatic processing workflow can be found at [github.com/nextstrain/rsv](https://github.com/nextstrain/rsv) and includes:
 
-- sequence alignment by a combination of [nextalign](https://docs.nextstrain.org/projects/nextclade/en/stable/user/nextalign-cli.html) and [MAFFT](https://mafft.cbrc.jp/alignment/software/).
+- sequence alignment by a combination of [Nextclade](https://docs.nextstrain.org/projects/nextclade/en/stable/user/nextclade-cli.html) and [MAFFT](https://mafft.cbrc.jp/alignment/software/).
 - phylogenetic reconstruction using [IQTREE](http://www.iqtree.org/)
 - ancestral state reconstruction and temporal inference using [TreeTime](https://github.com/neherlab/treetime)
-- clade assignment via clade definitions defined here for
- [RSV-A/genome](https://github.com/nextstrain/rsv/blob/master/config/clades_genome_a.tsv),
- [RSV-B/genome](https://github.com/nextstrain/rsv/blob/master/config/clades_genome_b.tsv),
- [RSV-A/G gene](https://github.com/nextstrain/rsv/blob/master/config/clades_G_a.tsv), and
- [RSV-B/G gene](https://github.com/nextstrain/rsv/blob/master/config/clades_G_b.tsv) to label RSV clades based on the entire genome and for just the G gene.
- These clade definitions are based on the proposed nomenclatures by [Goya et al](https://onlinelibrary.wiley.com/doi/abs/10.1111/irv.12715) and [Ramaekers et al](https://doi.org/10.1093/ve/veaa052).
+- clade assignment via clade definitions defined here:
+ [RSV-A](https://raw.githubusercontent.com/rsv-lineages/lineage-designation-A/main/.auto-generated/lineage.tsv)
+ [RSV-B](https://raw.githubusercontent.com/rsv-lineages/lineage-designation-A/main/.auto-generated/lineage.tsv)
+ These clade definitions are based on the not-yet-published nomenclature of the RSV Genotyping Consensus Consortium.
 
 #### Underlying data
+
 We curate sequence data and metadata from the [NCBI Datasets command line tools](https://www.ncbi.nlm.nih.gov/datasets/docs/v2/download-and-install/) as starting point for these analyses.
 See our [ingest configuration file](https://github.com/nextstrain/rsv/blob/master/ingest/config/config.yaml) for the NCBI Taxonomy IDs used to fetch the virus genomes.
 
@@ -34,6 +32,3 @@ Curated sequences and metadata are available as flat files at:
 
 - [RSV-B sequences](https://data.nextstrain.org/files/workflows/rsv/b/sequences.fasta.xz)
 - [RSV-B metadata](https://data.nextstrain.org/files/workflows/rsv/b/metadata.tsv.gz)
-
-
-