v2.3.0
PHVG v2.3.0 Release Notes
This minor release introduces organism updates for the TheiaCoV workflow series as well as a new workflow for preparing and submitting metadata to public repositories (Mercury_Prep_N_Batch).
Updates to the TheiaCoV Workflow Series
Organism track updates:
- "MPXV" for monkeypox analysis: VADR annotation assessment enabled (was previously not supported)
- "WNV" for West Nile Virus analysis: VADR annotation assessment enabled (was previously not supported)
- "flu" for influenza analysis: will initiate genome assembly with IRMA and characterization with ABRicate against InsaFlu database and NextClade; available in TheiaCoV_Illumina_PE only
- "HIV" for Human Immunodeficiency Virus analysis: will initiate consensus assembly by alignment (BWA + iVar or minimap2 + Medaka for Illumina and ONT read data, respectively) and characterization with Quasitools HyDRA for antiretroviral drug resistance detection
Note: The default value for the organism variable is "sars-cov-2".
QC and read processing modules updates:
- Option to utilize fastp rather than trimmomatic for read processing
- BBduk now outputs ordered reads, which helps ensure that downstream alignments are consistent
Mercury Prep-N-Batch Workflow
The Mercury_Prep_N_Batch workflow combines the previously separate Mercury_PE/SE_Prep and Mercury_Batch workflows into one.
This workflow functions as follows:
Step 1: Performs supermassive metadata wrangling (task sm_metadata_wrangling in task_mercury_file_wrangling.wdl)
- downloads the entire origin Terra table where the data, analysis results, metadata, etc. are stored.
- extracts the samples that the user intends to upload
- creates some standard variables that are used multiple times (such as year, isolate, etc.)
- determines which organism is being run (currently only supports sars-cov-2 and mpox) and sets the required and optional variables for each file that is being created (e.g., BioSample vs SRA vs GISAID vs GenBank/BankIt)
- removes any entries that do not meet predetermined quality thresholds (vadr_num_alerts and number_N)
- removes any entries that do not have all required fields present, and writes the samples that were removed to a table that also lists what fields were missing
- renames columns as appropriate
- reformats columns as appropriate
- compiles all required and optional information in TSV files
- renames files with the submission_id and edits fasta headers as appropriate
- uploads read files to the Theiagen SRA GCP Google bucket
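The exclusion logic in Step 1 can be illustrated with a minimal Python sketch. This is not the code in sm_metadata_wrangling: the threshold values, the required field names, the `sample` column, and the function names below are all assumptions made for illustration. Only the column names vadr_num_alerts and number_N come from the description above.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]

# Hypothetical values, for illustration only; the workflow's real
# thresholds and required fields are not stated in these notes.
MAX_VADR_NUM_ALERTS = 0
MAX_NUMBER_N = 5000
REQUIRED_FIELDS = ["submission_id", "collection_date"]


def passes_quality(row: Row) -> bool:
    """Return True if both quality metrics are present and within limits."""
    try:
        alerts = int(row["vadr_num_alerts"])
        n_count = int(row["number_N"])
    except (KeyError, ValueError):
        # A missing or non-numeric metric is treated as a failure.
        return False
    return alerts <= MAX_VADR_NUM_ALERTS and n_count <= MAX_NUMBER_N


def filter_samples(rows: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Split rows into kept samples and an exclusion table with reasons."""
    kept: List[Row] = []
    excluded: List[Row] = []
    for row in rows:
        sample = row.get("sample", "")
        # First filter: quality thresholds.
        if not passes_quality(row):
            excluded.append({"sample": sample,
                             "reason": "failed quality thresholds",
                             "missing_fields": ""})
            continue
        # Second filter: all required fields must be non-empty.
        missing = [f for f in REQUIRED_FIELDS if not row.get(f, "").strip()]
        if missing:
            excluded.append({"sample": sample,
                             "reason": "missing required fields",
                             "missing_fields": ", ".join(missing)})
            continue
        kept.append(row)
    return kept, excluded
```

The second returned list plays the role of the table of removed samples, with one column naming the fields that were missing.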
Step 2: If sars-cov-2, trim GenBank fasta files of terminal Ns (task trim_genbank_fastas in task_mercury_file_wrangling.wdl)
- uses VADR to trim terminal ambiguous nucleotides
- returns the edited fasta file
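To make the effect of Step 2 concrete, the sketch below trims leading and trailing N characters from each record of a FASTA string in plain Python. The workflow itself uses VADR for this step; the code here is a stand-in that only illustrates the outcome, and the function names are hypothetical.

```python
def trim_terminal_ns(sequence: str) -> str:
    """Remove leading and trailing N/n characters; internal Ns are kept."""
    return sequence.strip("Nn")


def trim_fasta(fasta_text: str) -> str:
    """Trim terminal Ns from every record in a FASTA-formatted string."""
    records = []  # each entry is [header, sequence]
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            records.append([line, ""])
        elif records:
            # Join wrapped sequence lines before trimming so that only
            # the true ends of the sequence are affected.
            records[-1][1] += line
    return "".join(f"{header}\n{trim_terminal_ns(seq)}\n"
                   for header, seq in records)
```

For example, a record whose sequence is NNNACGTNACGNN would come back as ACGTNACG, with the internal N left in place.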
Step 3: If mpox, put metadata into sqn format (task table2asn in task_mercury_file_wrangling.wdl)
- soft links the .sbt, .fsa, and .src files to have a common name
- converts the data into a sqn file with table2asn so it can be emailed to NCBI
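The soft-linking part of Step 3 might look roughly like the following sketch, which gives the three input files a shared base name inside a working directory. The function name, the base name, and the directory layout are assumptions, and the table2asn call itself is deliberately left out because its exact arguments in the workflow are not described here.

```python
import tempfile
from pathlib import Path
from typing import Dict


def link_with_common_name(sbt: str, fsa: str, src: str,
                          workdir: str, basename: str) -> Dict[str, Path]:
    """Create symlinks <basename>.sbt/.fsa/.src in workdir pointing at the inputs."""
    links: Dict[str, Path] = {}
    for source in (sbt, fsa, src):
        target = Path(source).resolve()
        link = Path(workdir) / f"{basename}{target.suffix}"
        # Replace a stale link or file from an earlier run.
        if link.is_symlink() or link.exists():
            link.unlink()
        link.symlink_to(target)
        links[target.suffix] = link
    return links


if __name__ == "__main__":
    # Small demonstration using throwaway files with hypothetical names.
    with tempfile.TemporaryDirectory() as tmp:
        inputs = []
        for name in ("template.sbt", "sequences.fsa", "source_table.src"):
            path = Path(tmp) / name
            path.write_text("placeholder\n")
            inputs.append(str(path))
        for suffix, link in link_with_common_name(*inputs, tmp, "mpox_batch").items():
            print(suffix, "->", link.name)
```

With all three files sharing one base name in one directory, a downstream tool can be pointed at them as a set.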
New Documentation
Detailed documentation has been created for all workflows in the PHVG v2.3.0 repository.
What's Changed
- citation.cff update by @kapsakcj in #172
- New VADR output: .zip of output fasta files by @kapsakcj in #171
- VADR updates for MPXV; update default nextclade_dataset_tag and docker by @kapsakcj in #175
- Add optional arguments input to trimmomatic task and add fastp task by @michellescribner in #182
- Rp3 add support for adapter files in bbduk, update ci test by @kapsakcj in #186
- adds support for running VADR on WNV samples by @kapsakcj in #190
- Azure compatibility by @sage-wright in #193
- Adding flu organism track by @kevinlibuit in #194
- Fja hiv merge dev by @frankambrosio3 in #198
- the Mercury_Prep_N_Batch workflow by @sage-wright in #196
- Fix conditional logic in Mercury Prep N Batch by @sage-wright in #199
- Fix lowercase things by @sage-wright in #201
- Ensure bbduk outputs are ordered by @kevinlibuit in #202
- Update version and SC2 references by @kevinlibuit in #203
- quality exclusion write out by @sage-wright in #204
- Smw excluded mercury dev by @sage-wright in #205
New Contributors
- @michellescribner made their first contribution in #182
Full Changelog: v2.2.0...v2.3.0