Add a configurable include_{serotype}.txt
Adds a configurable include_{serotype}.txt to force-include key strains
(e.g. vaccine-related, lineage-defining) in each serotype tree.
j23414 committed May 24, 2024
1 parent 9431655 commit 5af15a1
Showing 7 changed files with 8 additions and 0 deletions.
1 change: 1 addition & 0 deletions phylogenetic/config/config_dengue.yaml
@@ -9,6 +9,7 @@ display_strain_field: "strain"

 filter:
   exclude: "config/exclude.txt"
+  include: "config/include_{serotype}.txt"
   group_by: "year region"
   min_length:
     genome: 5000
1 change: 1 addition & 0 deletions phylogenetic/config/include_all.txt
@@ -0,0 +1 @@
+# Format: <GenBank accession> [# <documentation on reason for inclusion>]
1 change: 1 addition & 0 deletions phylogenetic/config/include_denv1.txt
@@ -0,0 +1 @@
+# Format: <GenBank accession> [# <documentation on reason for inclusion>]
1 change: 1 addition & 0 deletions phylogenetic/config/include_denv2.txt
@@ -0,0 +1 @@
+# Format: <GenBank accession> [# <documentation on reason for inclusion>]
1 change: 1 addition & 0 deletions phylogenetic/config/include_denv3.txt
@@ -0,0 +1 @@
+# Format: <GenBank accession> [# <documentation on reason for inclusion>]
1 change: 1 addition & 0 deletions phylogenetic/config/include_denv4.txt
@@ -0,0 +1 @@
+# Format: <GenBank accession> [# <documentation on reason for inclusion>]
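The include files above share one line format: an accession, optionally followed by a `#` comment giving the reason for inclusion. As a rough illustration of how such a file could be read, here is a minimal Python sketch. The function name `parse_include_lines` and the accessions `XX000001` and `XX000002` are hypothetical, and this is not how `augur filter` itself parses the file.

```python
def parse_include_lines(lines):
    """Return accessions from lines in the '<accession> [# <reason>]' format."""
    accessions = []
    for line in lines:
        # Drop everything from the first '#' onward, then trim whitespace.
        entry = line.split("#", 1)[0].strip()
        if entry:  # skip blank lines and comment-only lines
            accessions.append(entry)
    return accessions


# Hypothetical file contents; the accessions are placeholders, not real records.
example = [
    "# Format: <GenBank accession> [# <documentation on reason for inclusion>]",
    "XX000001 # vaccine-related strain",
    "",
    "XX000002",
]
print(parse_include_lines(example))  # ['XX000001', 'XX000002']
```

The header comment line yields no entry, so the files added in this commit start out as empty include lists.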
2 changes: 2 additions & 0 deletions phylogenetic/rules/prepare_sequences.smk
@@ -55,6 +55,7 @@ rule filter:
         sequences = lambda wildcard: "data/sequences_{serotype}.fasta" if wildcard.gene in ['genome'] else "results/{gene}/sequences_{serotype}.fasta",
         metadata = "data/metadata_{serotype}.tsv",
         exclude = config["filter"]["exclude"],
+        include = config["filter"]["include"],
     output:
         sequences = "results/{gene}/filtered_{serotype}.fasta"
     params:
@@ -69,6 +70,7 @@ rule filter:
             --metadata {input.metadata} \
             --metadata-id-columns {params.strain_id} \
             --exclude {input.exclude} \
+            --include {input.include} \
             --output {output.sequences} \
             --group-by {params.group_by} \
             --sequences-per-group {params.sequences_per_group} \
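Because the configured include path contains the `{serotype}` placeholder, each run of the `filter` rule should pick up the include file matching its serotype wildcard. The Python sketch below imitates that substitution with `str.format`. It is an illustration rather than Snakemake's own code, the helper name `include_path` is hypothetical, and the serotype list is inferred from the files added in this commit.

```python
# Template as set under filter.include in config_dengue.yaml.
include_template = "config/include_{serotype}.txt"

# Assumed serotype values, inferred from the include files added in this commit.
serotypes = ["all", "denv1", "denv2", "denv3", "denv4"]


def include_path(serotype):
    """Fill the {serotype} placeholder, as a wildcard substitution would."""
    return include_template.format(serotype=serotype)


for serotype in serotypes:
    print(serotype, "->", include_path(serotype))
```

Each resolved path corresponds to one of the five include files in the diff, so every serotype build has an input file to pass to `--include`.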
