v2.0.0
This major release renames workflows to utilize the TheiaCoV tag (previously Titan) and adds five new workflows for public health viral genomics.
Workflow names changed and modifications made:
- Titan_Augur_Prep → TheiaCoV_Augur_Prep
- Titan_Augur_Run → TheiaCoV_Augur_Run
  - Allow subsampling via user-defined builds.yml file
  - Update default nextstrain docker images (`nextstrain/base:build-20210127T135203Z` → `nextstrain/base:build-20210218T081251`)
- Titan_ClearLabs → TheiaCoV_ClearLabs
  - Update default consensus task docker container image (`quay.io/staphb/artic-ncov2019:1.3.0` → `quay.io/staphb/artic-ncov2019:1.3.0-medaka-1.4.3`)
    - Note: `quay.io/staphb/artic-ncov2019:1.3.0` & `quay.io/staphb/artic-ncov2019-epi2me` are both compatible alternative docker images
  - Use of `fastq-scan` rather than `fastqc` to calculate number of reads and pairs
  - Allow for use of a user-defined reference genome for consensus genome assembly
    - `reference_genome` consensus task input variable
- Titan_Illumina_PE → TheiaCoV_Illumina_PE
  - Default minimum coverage changed from 20x to 100x (`ivar consensus` and `ivar variants` tasks)
  - Use of `fastq-scan` rather than `fastqc` to calculate number of reads and pairs
  - Allow for use of a user-defined reference genome for consensus genome assembly
    - `reference_genome` workflow input variable
- Titan_Illumina_SE → TheiaCoV_Illumina_SE
  - Default minimum coverage changed from 20x to 100x (`ivar consensus` and `ivar variants` tasks)
  - Use of `fastq-scan` rather than `fastqc` to calculate number of reads and pairs
  - Allow for use of a user-defined reference genome for consensus genome assembly
    - `reference_genome` workflow input variable
- Titan_ONT → TheiaCoV_ONT
  - Update default consensus task docker container image (`quay.io/staphb/artic-ncov2019:1.3.0-medaka-1.4.3` → `quay.io/staphb/artic-ncov2019-epi2me`)
    - Note: `quay.io/staphb/artic-ncov2019:1.3.0` & `quay.io/staphb/artic-ncov2019:1.3.0-medaka-1.4.3` are both compatible alternative docker images
  - Use of `fastq-scan` rather than `fastqc` to calculate number of reads and pairs
  - Allow for use of a user-defined reference genome for consensus genome assembly
    - `reference_genome` consensus task input variable
- Titan_FASTA → TheiaCoV_FASTA
- Titan-GC → TheiaCoV-GC
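
For scripts or tables that still refer to the old names, the renames listed above can be captured in a small lookup table. The following is a minimal Python sketch and is not part of the release: the `migrate_workflow_name` helper is hypothetical, and the `Titan_ClearLabs` entry assumes that workflow follows the same renaming pattern as the others.

```python
# Old-to-new workflow names, taken from the rename list above.
# Assumption: Titan_ClearLabs is renamed to TheiaCoV_ClearLabs like the others.
RENAMED_WORKFLOWS = {
    "Titan_Augur_Prep": "TheiaCoV_Augur_Prep",
    "Titan_Augur_Run": "TheiaCoV_Augur_Run",
    "Titan_ClearLabs": "TheiaCoV_ClearLabs",
    "Titan_Illumina_PE": "TheiaCoV_Illumina_PE",
    "Titan_Illumina_SE": "TheiaCoV_Illumina_SE",
    "Titan_ONT": "TheiaCoV_ONT",
    "Titan_FASTA": "TheiaCoV_FASTA",
    "Titan-GC": "TheiaCoV-GC",
}


def migrate_workflow_name(name: str) -> str:
    """Return the v2.0.0 name for a workflow, or the name unchanged if it was not renamed."""
    return RENAMED_WORKFLOWS.get(name, name)
```

Names that were not renamed, such as the newly added workflows, pass through unchanged.
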
Workflows Added:
- TheiaCoV_Validate
  - Workflow for rapidly comparing critical output values generated by different versions of the TheiaCoV workflows for SARS-CoV-2 genomic characterization, for bioinformatics validation purposes
- TheiaCoV_DistanceTree
  - Workflow that allows for Augur distance trees to be generated without refinement
- Workflows for SARS-CoV-2 Wastewater Data Analysis
  - Freyja_FASTQ
    - Workflow that allows running of the Freyja software with raw paired-end fastq files
      - This workflow generates the alignment required as input to the `freyja variants` command, whose output is then analyzed with `freyja demix`
  - Freyja_Plot
    - Workflow to visualize Freyja outputs using the `freyja plot` command
  - TheiaCoV_WWVC
    - Workflow for wastewater variant calling that incorporates a modified version of the CDPHE's WasteWaterVariantCalling WDL Workflow
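
The two-step Freyja analysis described for Freyja_FASTQ can be sketched as a pair of command lines. This is an illustrative Python sketch only, not the workflow's actual task code: the `freyja_commands` helper and all file names are hypothetical, and the flags shown (`--variants`, `--depths`, `--ref`, `--output`) follow the Freyja command-line documentation as best understood, so check them against the installed Freyja version.

```python
from typing import List, Tuple


def freyja_commands(bam: str, reference: str, sample: str) -> Tuple[List[str], List[str]]:
    """Build (but do not run) the `freyja variants` and `freyja demix` commands for one sample."""
    # Hypothetical intermediate file names; the real workflow may name them differently.
    variants_tsv = f"{sample}_variants.tsv"
    depths_tsv = f"{sample}_depths.tsv"

    # Step 1: call variants and per-position depths from the alignment.
    variants_cmd = [
        "freyja", "variants", bam,
        "--variants", variants_tsv,
        "--depths", depths_tsv,
        "--ref", reference,
    ]
    # Step 2: estimate lineage abundances from the variants and depths files.
    demix_cmd = [
        "freyja", "demix", variants_tsv, depths_tsv,
        "--output", f"{sample}_demixed.tsv",
    ]
    return variants_cmd, demix_cmd
```

With Freyja installed, each list could be passed to `subprocess.run(cmd, check=True)` in order; the second command consumes the two files written by the first.
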
Other modifications:
- Default docker images updated for Pangolin (`staphb/pangolin:3.1.11-pangolearn-2021-08-24` → `quay.io/staphb/3.1.20-pangolearn-2022-02-02`), VADR (`staphb/vadr:1.3` → `quay.io/staphb/1.4.1-models-1.3-2`), and Nextclade (`nextstrain/nextclade:1.3.0` → `nextstrain/nextclade:1.10.3`), along with the Nextclade dataset tag (`2021-06-25T00:00:00Z` → `2022-02-07T12:00:00Z`), in all TheiaCoV workflows for SARS-CoV-2 genomic characterization (TheiaCoV_ClearLabs, TheiaCoV_FASTA, TheiaCoV_Illumina_PE, TheiaCoV_Illumina_SE, and TheiaCoV_ONT)
  - NOTE: In order to incorporate Nextclade ≥v1.10.0, modifications were made to the `nextclade_one_sample` task that render it incompatible with older versions of Nextclade.
- Inclusion of S-gene coverage calculation in all TheiaCoV workflows for SARS-CoV-2 genomic characterization that incorporate an alignment step (TheiaCoV_ClearLabs, TheiaCoV_Illumina_PE, TheiaCoV_Illumina_SE, and TheiaCoV_ONT)
- Mercury_Batch now requires `Array[String]` (i.e. gcp_uri) for the `sra_reads` input (was `Array[File]`); this change avoids the need to localize reads into the VM before transferring them to the transfer bucket for SRA read submission, drastically decreasing runtime
  - This modification means that a zipped file of reads for web portal submission is no longer produced if a gcp_bucket is not specified; instead, users are encouraged to use the `zip_column_content` workflow from the Theiagen Terra_Utilities repository to generate these files.
- Implementation of a repository style guide
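
Because the Mercury_Batch `sra_reads` input is now an `Array[String]` of bucket URIs rather than an `Array[File]`, it can help to check the values before launching the workflow. Below is a minimal Python sketch, assuming the URIs are Google Cloud Storage paths beginning with `gs://`; the `non_gcp_uris` helper and the example paths are hypothetical and not part of the repository.

```python
from typing import Iterable, List


def non_gcp_uris(sra_reads: Iterable[str]) -> List[str]:
    """Return the entries that do not look like gs://bucket/object URIs."""
    bad = []
    for uri in sra_reads:
        if not uri.startswith("gs://"):
            # Local paths or other schemes would not work as string URIs here.
            bad.append(uri)
            continue
        # Require both a bucket name and an object path after the scheme.
        bucket, _, obj = uri[len("gs://"):].partition("/")
        if not bucket or not obj:
            bad.append(uri)
    return bad


# Hypothetical example values.
reads = ["gs://my-bucket/sample1_R1.fastq.gz", "/local/sample2_R1.fastq.gz"]
problems = non_gcp_uris(reads)
```

In this example `problems` contains only the local path, which is the kind of value that the previous `Array[File]` input would have localized into the VM.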