Cleanup datasets
Here I clean up the datasets:
- If I don't recognize a dataset, or if I know that it is incorrect, I remove it.
- I give friendly names to the remaining datasets, mostly according to what's written in their reference FASTA file.
- I try to make the directory structure and naming a little more consistent (this is now reflected in the dataset `path`, which is used as an identifier of the dataset in many places exposed to the end user).

This is mostly a "cosmetic" change, needed for testing the usability of the current Nextclade Web UI and for making it prettier.

I have little knowledge of most of these pathogens, so this PR can be rolled back and the reorganization can be done properly by a scientist later.
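
Since the dataset `path` doubles as an identifier, a rename like this one can leave stale entries in a collection's `dataset_order`. The following is a minimal sketch of a consistency check, not part of this commit. It assumes that each `dataset_order` entry maps to a directory of the same name under the `data/` root (as `nextstrain/dummy1` maps to `data/nextstrain/dummy1/`); the function name `missing_datasets` is hypothetical.

```python
import json
from pathlib import Path


def missing_datasets(collection_json: Path, data_root: Path) -> list[str]:
    """Return `dataset_order` entries that have no directory under data_root.

    Assumption (not confirmed by the repository): every dataset path listed
    in `dataset_order` corresponds to a directory `data_root / path`.
    """
    collection = json.loads(collection_json.read_text(encoding="utf-8"))
    return [
        path
        for path in collection.get("dataset_order", [])
        if not (data_root / path).is_dir()
    ]
```

Under that assumption, it could be run as `missing_datasets(Path("data/nextstrain/collection.json"), Path("data"))`; an empty list would mean every listed path still resolves to a directory.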
ivan-aksamentov committed Oct 2, 2023
1 parent e1a5a1b commit 4bff341
Showing 368 changed files with 39 additions and 457,922 deletions.
File renamed without changes.
@@ -1,12 +1,12 @@
 {
   "attributes": {
     "name": {
-      "value": "adenovirus/hadv-a",
-      "valueFriendly": "UNKNOWN"
+      "value": "hadv-a-12",
+      "valueFriendly": "Human mastadenovirus A (serotype 12)"
     },
     "reference": {
-      "value": "UNKNOWN",
-      "valueFriendly": "UNKNOWN"
+      "value": "NC_001460.1",
+      "valueFriendly": "HAdV-A serotype 12"
     }
   },
   "compatibility": {
2 changes: 1 addition & 1 deletion data/community/collection.json
@@ -15,6 +15,6 @@
     ]
   },
   "dataset_order": [
-    "community/adenovirus/hadv-a"
+    "community/adenovirus/hadv-a/12"
   ]
 }
77 changes: 19 additions & 58 deletions data/nextstrain/collection.json
@@ -22,63 +22,24 @@
   },
   "dataset_order": [
     "nextstrain/sars-cov-2/MN908947",
-    "nextstrain/sars-cov-2-21L/BA.2",
-    "nextstrain/MPXV/ancestral",
-    "nextstrain/hMPXV/NC_063383.1",
-    "nextstrain/hMPXV_B1/pseudo_ON563414",
-    "nextstrain/rsv_a/EPI_ISL_412866",
-    "nextstrain/rsv_b/EPI_ISL_1653999",
-    "nextstrain/flu_h1n1pdm_ha/CY121680",
-    "nextstrain/flu_h1n1pdm_ha/MW626062",
-    "nextstrain/flu_h1n1pdm_na/MW626056",
-    "nextstrain/flu_h3n2_ha/CY163680",
-    "nextstrain/flu_h3n2_ha/EPI1857216",
-    "nextstrain/flu_h3n2_na/EPI1857215",
-    "nextstrain/flu_vic_ha/KX058884",
-    "nextstrain/flu_vic_na/CY073894",
-    "nextstrain/flu_yam_ha/JN993010",
-    "nextstrain/sars-cov-2-no-recomb/MN908947",
-    "nextstrain/dummy1",
-    "nextstrain/h1_na",
-    "nextstrain/hbv",
-    "nextstrain/hiv",
-    "nextstrain/hiv_mat",
-    "nextstrain/rsv-a",
-    "nextstrain/sc2",
-    "nextstrain/sc2-big",
-    "nextstrain/sc2_mat",
-    "nextstrain/vic_na",
-    "nextstrain/ebola",
-    "nextstrain/enterovirus/d68",
-    "nextstrain/flu/h1n1pdm/ha",
-    "nextstrain/flu/h1n1pdm/na",
-    "nextstrain/flu/h1n1pdm/np",
-    "nextstrain/flu/h1n1pdm/ns",
-    "nextstrain/flu/h1n1pdm/pa",
-    "nextstrain/flu/h1n1pdm/pb1",
-    "nextstrain/flu/h1n1pdm/pb2",
-    "nextstrain/flu/h3n2/ha",
-    "nextstrain/flu/h3n2/ma",
-    "nextstrain/flu/h3n2/na",
-    "nextstrain/flu/h3n2/np",
-    "nextstrain/flu/h3n2/ns",
-    "nextstrain/flu/h3n2/pa",
-    "nextstrain/flu/h3n2/pb1",
-    "nextstrain/flu/h3n2/pb2",
-    "nextstrain/flu/vic/ha",
-    "nextstrain/flu/vic/na",
-    "nextstrain/flu/vic/np",
-    "nextstrain/flu/vic/ns",
-    "nextstrain/flu/vic/pa",
-    "nextstrain/flu/vic/pb1",
-    "nextstrain/flu/vic/pb2",
-    "nextstrain/flu/yam/ha",
-    "nextstrain/flu/yam/na",
-    "nextstrain/flu/yam/np",
-    "nextstrain/flu/yam/ns",
-    "nextstrain/flu/yam/pa",
-    "nextstrain/flu/yam/pb1",
-    "nextstrain/flu/yam/pb2",
-    "nextstrain/zika"
+    "nextstrain/sars-cov-2/BA.2",
+    "nextstrain/flu/h1n1pdm/ha/CY121680",
+    "nextstrain/flu/h1n1pdm/ha/MW626062",
+    "nextstrain/flu/h1n1pdm/na/MW626056",
+    "nextstrain/flu/h3n2/ha/CY163680",
+    "nextstrain/flu/h3n2/ha/EPI1857216",
+    "nextstrain/flu/h3n2/na/EPI1857215",
+    "nextstrain/flu/vic/ha/KX058884",
+    "nextstrain/flu/vic/na/CY073894",
+    "nextstrain/flu/yam/ha/JN993010",
+    "nextstrain/rsv/a/EPI_ISL_412866",
+    "nextstrain/rsv/b/EPI_ISL_1653999",
+    "nextstrain/mpx/hmpxv-b1/pseudo_ON563414",
+    "nextstrain/mpx/hmpxv/NC_063383.1",
+    "nextstrain/mpx/mpxv/ancestral",
+    "nextstrain/ebola/zaire",
+    "nextstrain/enterovirus/d68/fermon",
+    "nextstrain/hiv/1",
+    "nextstrain/zika/KX369547.1"
   ]
 }
14 changes: 0 additions & 14 deletions data/nextstrain/dummy1/README.md

This file was deleted.

3 changes: 0 additions & 3 deletions data/nextstrain/dummy1/genome_annotation.gff3

This file was deleted.

2 changes: 0 additions & 2 deletions data/nextstrain/dummy1/reference.fasta

This file was deleted.

18 changes: 0 additions & 18 deletions data/nextstrain/dummy1/sequences.fasta

This file was deleted.

108 changes: 0 additions & 108 deletions data/nextstrain/dummy1/tree.json

This file was deleted.

File renamed without changes.
File renamed without changes.
@@ -2,11 +2,11 @@
   "attributes": {
     "name": {
       "value": "ebola",
-      "valueFriendly": "UNKNOWN"
+      "valueFriendly": "Zaire ebolavirus"
     },
     "reference": {
-      "value": "UNKNOWN",
-      "valueFriendly": "UNKNOWN"
+      "value": "KR075003.1",
+      "valueFriendly": "LBR/2014/Makona-Liberia-DQE14"
     }
   },
   "compatibility": {
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -1,12 +1,12 @@
 {
   "attributes": {
     "name": {
-      "value": "flu/h1n1pdm/na",
-      "valueFriendly": "UNKNOWN"
+      "value": "ev-d68-fermon",
+      "valueFriendly": "Human enterovirus D68"
     },
     "reference": {
-      "value": "UNKNOWN",
-      "valueFriendly": "UNKNOWN"
+      "value": "AY426531.1",
+      "valueFriendly": "EV-D68 strain Fermon"
     }
   },
   "compatibility": {
31 changes: 0 additions & 31 deletions data/nextstrain/enterovirus/d68/pathogen.json

This file was deleted.

File renamed without changes.
14 changes: 0 additions & 14 deletions data/nextstrain/flu/h1n1pdm/ha/README.md

This file was deleted.

3 changes: 0 additions & 3 deletions data/nextstrain/flu/h1n1pdm/ha/genome_annotation.gff3

This file was deleted.

31 changes: 0 additions & 31 deletions data/nextstrain/flu/h1n1pdm/ha/pathogen.json

This file was deleted.
