flu: update datasets
rneher committed Nov 18, 2023
1 parent 0592d46 commit c356a0a
Showing 65 changed files with 11,001 additions and 0 deletions.
7 changes: 7 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,12 @@
# CHANGELOG

## 2023-11-18

### New Influenza datasets version (tag `2023-11-18T12:00:00Z`)

- new subclades for several lineages
- new alias for A/H3N2 HA

## 2023-10-26

### New SARS-CoV-2 dataset version (tag `2023-10-26T12:00:00Z`)
@@ -0,0 +1,5 @@
##gff-version 3
##sequence-region CY121680.1 1 1752
CY121680.1 feature gene 21 71 . + . gene_name="SigPep"
CY121680.1 feature gene 72 1052 . + . gene_name="HA1"
CY121680.1 feature gene 1053 1718 . + . gene_name="HA2"
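The three features tile the coding region back to back. A minimal Python sketch of a sanity check follows; the `parse_gene_map` helper is hypothetical, and it assumes the columns are whitespace-separated as rendered here. It parses the records, then checks that each feature spans a whole number of codons and that consecutive features are adjacent:

```python
GENE_MAP = """\
##gff-version 3
##sequence-region CY121680.1 1 1752
CY121680.1 feature gene 21 71 . + . gene_name="SigPep"
CY121680.1 feature gene 72 1052 . + . gene_name="HA1"
CY121680.1 feature gene 1053 1718 . + . gene_name="HA2"
"""


def parse_gene_map(text):
    """Return (name, start, end) tuples with 1-based inclusive coordinates."""
    genes = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF directives
        cols = line.split(None, 8)  # nine GFF columns
        start, end = int(cols[3]), int(cols[4])
        name = cols[8].split("=", 1)[1].strip('"')
        genes.append((name, start, end))
    return genes


genes = parse_gene_map(GENE_MAP)
for name, start, end in genes:
    length = end - start + 1
    print(name, length, "nt,", length // 3, "codons, in frame:", length % 3 == 0)

# Each feature should start one base after the previous one ends.
adjacent = all(prev[2] + 1 == cur[1] for prev, cur in zip(genes, genes[1:]))
print("adjacent:", adjacent)
```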
@@ -0,0 +1 @@
Country (Institute),Target,Oligonucleotide,Sequence
@@ -0,0 +1,33 @@
{
  "schemaVersion": "1.2.0",
  "privateMutations": {
    "enabled": true,
    "typical": 5,
    "cutoff": 15,
    "weightLabeledSubstitutions": 2,
    "weightReversionSubstitutions": 1,
    "weightUnlabeledSubstitutions": 1
  },
  "missingData": {
    "enabled": false,
    "missingDataThreshold": 100,
    "scoreBias": 10
  },
  "snpClusters": {
    "enabled": false,
    "windowSize": 100,
    "clusterCutOff": 5,
    "scoreWeight": 50
  },
  "mixedSites": {
    "enabled": true,
    "mixedSitesThreshold": 4
  },
  "frameShifts": {
    "enabled": true
  },
  "stopCodons": {
    "enabled": true,
    "ignoredStopCodons": []
  }
}
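To illustrate how the `privateMutations` parameters might interact, here is a hedged Python sketch. It assumes a linear rule in which the weighted mutation count in excess of `typical` is scaled so that an excess equal to `cutoff` gives a score of 100. That formula is an assumption about how the consuming tool reads these fields, not something stated in this commit, and the sketch ignores deletions. The function name `private_mutations_score` is hypothetical.

```python
import json

# Only the rule used in this sketch is copied from the config above.
QC_CONFIG = json.loads("""
{
  "privateMutations": {
    "enabled": true,
    "typical": 5,
    "cutoff": 15,
    "weightLabeledSubstitutions": 2,
    "weightReversionSubstitutions": 1,
    "weightUnlabeledSubstitutions": 1
  }
}
""")


def private_mutations_score(cfg, labeled, reversions, unlabeled):
    """Assumed scoring rule: weighted excess over `typical`, scaled by `cutoff`."""
    rule = cfg["privateMutations"]
    if not rule["enabled"]:
        return 0.0
    weighted = (
        rule["weightLabeledSubstitutions"] * labeled
        + rule["weightReversionSubstitutions"] * reversions
        + rule["weightUnlabeledSubstitutions"] * unlabeled
    )
    excess = max(0.0, weighted - rule["typical"])
    return excess * 100.0 / rule["cutoff"]


print(private_mutations_score(QC_CONFIG, labeled=0, reversions=0, unlabeled=5))
print(private_mutations_score(QC_CONFIG, labeled=3, reversions=1, unlabeled=2))
print(private_mutations_score(QC_CONFIG, labeled=0, reversions=0, unlabeled=20))
```

Under this assumed rule, labeled substitutions count double, so a sequence reaches a given score with fewer of them than of unlabeled ones.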
@@ -0,0 +1,2 @@
>CY121680.1 Influenza A virus (A/California/07/2009(H1N1)) hemagglutinin (HA) gene, complete cds
GGAAAACAAAAGCAACAAAAATGAAGGCAATACTAGTAGTTCTGCTATATACATTTGCAACCGCAAATGCAGACACATTATGTATAGGTTATCATGCGAACAATTCAACAGACACTGTAGACACAGTACTAGAAAAGAATGTAACAGTAACACACTCTGTTAACCTTCTAGAAGACAAGCATAACGGGAAACTATGCAAACTAAGAGGGGTAGCCCCATTGCATTTGGGTAAATGTAACATTGCTGGCTGGATCCTGGGAAATCCAGAGTGTGAATCACTCTCCACAGCAAGCTCATGGTCCTACATTGTGGAAACACCTAGTTCAGACAATGGAACGTGTTACCCAGGAGATTTCATCGATTATGAGGAGCTAAGAGAGCAATTGAGCTCAGTGTCATCATTTGAAAGGTTTGAGATATTCCCCAAGACAAGTTCATGGCCCAATCATGACTCGAACAAAGGTGTAACGGCAGCATGTCCTCATGCTGGAGCAAAAAGCTTCTACAAAAATTTAATATGGCTAGTTAAAAAAGGAAATTCATACCCAAAGCTCAGCAAATCCTACATTAATGATAAAGGGAAAGAAGTCCTCGTGCTATGGGGCATTCACCATCCATCTACTAGTGCTGACCAACAAAGTCTCTATCAGAATGCAGATGCATATGTTTTTGTGGGGTCATCAAGATACAGCAAGAAGTTCAAGCCGGAAATAGCAATAAGACCCAAAGTGAGGGATCGAGAAGGGAGAATGAACTATTACTGGACACTAGTAGAGCCGGGAGACAAAATAACATTCGAAGCAACTGGAAATCTAGTGGTACCGAGATATGCATTCGCAATGGAAAGAAATGCTGGATCTGGTATTATCATTTCAGATACACCAGTCCACGATTGCAATACAACTTGTCAAACACCCAAGGGTGCTATAAACACCAGCCTCCCATTTCAGAATATACATCCGATCACAATTGGAAAATGTCCAAAATATGTAAAAAGCACAAAATTGAGACTGGCCACAGGATTGAGGAATATCCCGTCTATTCAATCTAGAGGCCTATTTGGGGCCATTGCCGGTTTCATTGAAGGGGGGTGGACAGGGATGGTAGATGGATGGTACGGTTATCACCATCAAAATGAGCAGGGGTCAGGATATGCAGCCGACCTGAAGAGCACACAGAATGCCATTGACGAGATTACTAACAAAGTAAATTCTGTTATTGAAAAGATGAATACACAGTTCACAGCAGTAGGTAAAGAGTTCAACCACCTGGAAAAAAGAATAGAGAATTTAAATAAAAAAGTTGATGATGGTTTCCTGGACATTTGGACTTACAATGCCGAACTGTTGGTTCTATTGGAAAATGAAAGAACTTTGGACTACCACGATTCAAATGTGAAGAACTTATATGAAAAGGTAAGAAGCCAGCTAAAAAACAATGCCAAGGAAATTGGAAACGGCTGCTTTGAATTTTACCACAAATGCGATAACACGTGCATGGAAAGTGTCAAAAATGGGACTTATGACTACCCAAAATACTCAGAGGAAGCAAAATTAAACAGAGAAGAAATAGATGGGGTAAAGCTGGAATCAACAAGGATTTACCAGATTTTGGCGATCTATTCAACTGTCGCCAGTTCATTGGTACTGGTAGTCTCCCTGGGGGCAATCAGTTTCTGGATGTGCTCTAATGGGTCTCTACAGTGTAGAATATGTATTTAACATTAGGATTTCAGAAGCATGAGAAAAACAC

Large diffs are not rendered by default.

@@ -0,0 +1,25 @@
{
  "tag": "2023-11-18T12:00:00Z",
  "comment": "Addition of experimental lineages",
  "compatibility": {
    "nextcladeCli": {
      "min": "1.3.0",
      "max": null
    },
    "nextcladeWeb": {
      "min": "1.6.0",
      "max": null
    }
  },
  "enabled": true,
  "files": {
    "geneMap": "genemap.gff",
    "primers": "primers.csv",
    "qc": "qc.json",
    "reference": "reference.fasta",
    "sequences": "sequences.fasta",
    "tree": "tree.json",
    "virusPropertiesJson": "virus_properties.json"
  },
  "metadata": {}
}
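The `compatibility` block can be read as a version range per client. The Python sketch below shows one way to evaluate it, assuming that bounds are inclusive, that `null` means "no bound", and that versions are plain dotted integers; none of this is confirmed by the commit. The helpers `parse_version` and `is_compatible` are hypothetical names.

```python
COMPATIBILITY = {
    "nextcladeCli": {"min": "1.3.0", "max": None},
    "nextcladeWeb": {"min": "1.6.0", "max": None},
}


def parse_version(version):
    """Turn '1.10.2' into (1, 10, 2) so comparison is numeric, not lexical."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(version, bounds):
    """Check `version` against inclusive min/max bounds; None means unbounded."""
    v = parse_version(version)
    if bounds["min"] is not None and v < parse_version(bounds["min"]):
        return False
    if bounds["max"] is not None and v > parse_version(bounds["max"]):
        return False
    return True


print(is_compatible("1.11.0", COMPATIBILITY["nextcladeCli"]))
print(is_compatible("1.5.4", COMPATIBILITY["nextcladeWeb"]))
```

Comparing tuples rather than strings matters here: as strings, `"1.11.0"` sorts before `"1.3.0"`.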

Large diffs are not rendered by default.

@@ -0,0 +1,26 @@
{
  "schemaVersion": "1.10.0",
  "nucMutLabelMap": {},
  "nucMutLabelMapReverse": {},
  "aaMotifs": [
    {
      "name": "glycosylation",
      "nameShort": "Glyc.",
      "nameFriendly": "Glycosylation",
      "description": "N-linked glycosylation motifs (N-X-S/T with X any amino acid other than P)",
      "includeGenes": [
        {
          "gene":"HA1",
          "ranges":[]
        },
        {
          "gene":"HA2",
          "ranges":[{"begin":0, "end":186}]
        }
      ],
      "motifs": [
        "N[^P][ST]"
      ]
    }
  ]
}
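The `motifs` entry is an ordinary regular expression over amino acids. The Python sketch below shows how such a pattern could be applied; the toy peptide is made up, and the reading of `ranges` as 0-based half-open intervals (with an empty list meaning the whole gene) is an assumption. A lookahead is used so that overlapping motifs are all reported.

```python
import re

# Zero-width lookahead so overlapping motifs such as "NNTS" yield two hits.
MOTIF = re.compile(r"(?=(N[^P][ST]))")


def find_motifs(aa_seq, ranges=()):
    """Return (position, motif) pairs; positions are 0-based.

    `ranges` is assumed to hold {"begin", "end"} dicts describing half-open
    intervals; an empty sequence of ranges means the whole peptide is searched.
    """
    hits = []
    for match in MOTIF.finditer(aa_seq):
        pos = match.start()
        if not ranges or any(r["begin"] <= pos < r["end"] for r in ranges):
            hits.append((pos, match.group(1)))
    return hits


toy_peptide = "MNGTANPSNAS"  # made-up sequence, not taken from the dataset
print(find_motifs(toy_peptide))
print(find_motifs(toy_peptide, [{"begin": 0, "end": 5}]))
```

In the toy peptide, `NPS` is skipped because proline occupies the middle position, which is the exclusion the `[^P]` class encodes.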
@@ -0,0 +1,5 @@
##gff-version 3
##sequence-region MW626062.1 1 1752
MW626062.1 feature gene 21 71 . + . gene_name="SigPep"
MW626062.1 feature gene 72 1052 . + . gene_name="HA1"
MW626062.1 feature gene 1053 1718 . + . gene_name="HA2"
@@ -0,0 +1 @@
Country (Institute),Target,Oligonucleotide,Sequence
@@ -0,0 +1,33 @@
{
  "schemaVersion": "1.2.0",
  "privateMutations": {
    "enabled": true,
    "typical": 5,
    "cutoff": 15,
    "weightLabeledSubstitutions": 2,
    "weightReversionSubstitutions": 1,
    "weightUnlabeledSubstitutions": 1
  },
  "missingData": {
    "enabled": false,
    "missingDataThreshold": 100,
    "scoreBias": 10
  },
  "snpClusters": {
    "enabled": false,
    "windowSize": 100,
    "clusterCutOff": 5,
    "scoreWeight": 50
  },
  "mixedSites": {
    "enabled": true,
    "mixedSitesThreshold": 4
  },
  "frameShifts": {
    "enabled": true
  },
  "stopCodons": {
    "enabled": true,
    "ignoredStopCodons": []
  }
}
@@ -0,0 +1,27 @@
>MW626062.1 Influenza A virus (A/Wisconsin/588/2019(H1N1)) segment 4 hemagglutinin (HA) gene, complete cds
GGAAAACAAAAGCAACAAAAATGAAGGCAATACTAGTAGTTATGCTGTATACATTTACAACCGCAAATGC
AGACACATTATGTATAGGTTATCATGCGAACAATTCAACAGACACTGTGGACACAGTACTAGAAAAGAAT
GTAACAGTAACACACTCTGTCAATCTTCTGGAAGACAAGCATAACGGAAAACTATGCAAACTAAGAGGGG
TAGCCCCATTGCATTTGGGTAAATGTAACATTGCTGGCTGGATCCTGGGAAATCCAGAGTGTGAATCACT
CTCCACAGCAAGATCATGGTCCTACATTGTGGAAACATCTAATTCAGACAATGGAACGTGTTACCCAGGA
GATTTCATCAATTATGAGGAGCTAAGAGAGCAATTGAGCTCAGTGTCATCATTTGAAAGGTTTGAAATAT
TCCCCAAGACAAGTTCATGGCCTAATCATGACTCGGACAATGGTGTAACGGCAGCATGTCCTCACGCTGG
AGCAAAAAGCTTCTACAAAAACTTGATATGGCTGGTTAAAAAAGGAAAATCATACCCAAAGATCAACCAA
ACCTACATTAATGATAAAGGGAAAGAAGTCCTCGTGCTGTGGGGCATTCACCATCCACCTACTATTGCTG
ACCAACAAAGTCTCTATCAGAATGCAGATGCATATGTTTTTGTGGGGACATCAAGATACAGCAAGAAGTT
CAAGCCGGAAATAGCAACAAGACCCAAAGTGAGGGATCAAGAAGGGAGAATGAACTATTACTGGACACTA
GTAGAACCGGGAGACAAAATAACATTCGAAGCAACTGGTAATCTAGTGGCACCGAGATATGCATTCACAA
TGGAAAGAGATGCTGGATCTGGTATTATCATTTCAGATACACCAGTCCACGATTGCAATACAACTTGTCA
GACACCCGAGGGTGCTATAAACACCAGCCTCCCATTTCAGAATGTACATCCGATCACAATTGGGAAATGT
CCAAAGTATGTAAAAAGCACAAAATTGAGACTGGCCACAGGATTGAGGAATGTCCCGTCTATTCAATCTA
GAGGCCTATTCGGGGCCATTGCTGGCTTCATCGAAGGGGGGTGGACAGGGATGGTAGATGGATGGTACGG
TTATCACCATCAAAATGAGCAGGGGTCAGGATATGCAGCCGATCTGAAGAGCACACAAAATGCCATTGAT
AAGATTACTAACAAAGTAAATTCTGTTATTGAAAAGATGAATACACAGTTCACAGCAGTTGGTAAAGAGT
TCAACCACCTTGAAAAAAGAATAGAGAATCTAAATAAAAAGGTTGATGATGGTTTCCTGGACATTTGGAC
TTACAATGCCGAACTGTTGGTTCTACTGGAAAACGAAAGAACTTTGGACTATCACGATTCAAATGTGAAG
AACTTGTATGAAAAAGTAAGAAACCAGTTAAAAAACAATGCCAAGGAAATTGGAAACGGCTGCTTTGAAT
TTTACCACAAATGCGACAACACATGCATGGAAAGTGTCAAGAATGGGACTTATGACTACCCAAAATACTC
AGAGGAAGCAAAATTAAACAGAGAAAAAATAGATGGAGTAAAGCTGGACTCAACAAGGATCTACCAGATT
TTGGCGATCTATTCAACTGTTGCCAGTTCATTGGTACTGGTAGTCTCCCTGGGGGCAATCAGCTTCTGGA
TGTGCTCTAATGGGTCTCTACAGTGTAGAATATGTATTTAACATTAGGATTTCAGAATCATGAGAAAAAC
AC
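This reference is wrapped across lines, so a reader has to join them before applying the gene-map coordinates. The Python sketch below uses a hypothetical `read_fasta` helper and includes only the first two sequence lines of the record to stay short. It joins the wrapped lines and checks that the 1-based start of `SigPep` (position 21 in the gene map) lands on an `ATG` start codon.

```python
def read_fasta(text):
    """Parse FASTA text into {record_id: sequence}, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # the ID is the first token of the header
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


# Truncated excerpt: header plus the first two sequence lines only.
EXCERPT = """\
>MW626062.1 Influenza A virus (A/Wisconsin/588/2019(H1N1)) segment 4 hemagglutinin (HA) gene, complete cds
GGAAAACAAAAGCAACAAAAATGAAGGCAATACTAGTAGTTATGCTGTATACATTTACAACCGCAAATGC
AGACACATTATGTATAGGTTATCATGCGAACAATTCAACAGACACTGTGGACACAGTACTAGAAAAGAAT
"""

seq = read_fasta(EXCERPT)["MW626062.1"]
sigpep_start = 21  # 1-based start of SigPep from the gene map
start_codon = seq[sigpep_start - 1 : sigpep_start + 2]
print(len(seq), start_codon)
```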