
v1.3.0

@cimendes released this 17 Jan 18:03
· 275 commits to main since this release
c3f3b70

Public Health Bioinformatics v1.3.0 Release Notes

This minor release introduces two new workflows, improves several existing workflows, and resolves various bugs.

Full release notes can be found here.

🆕 New workflows:

🚀 Changes to existing workflows:

  • TheiaCoV_ONT_PHB

    • Influenza is now supported. Set the optional organism input (String) to "flu".
      • "sars-cov-2" and "HIV" tracks are unchanged.
  • TheiaProk Workflow Series

    • If the user-provided taxon (expected_taxon) or the taxon predicted by Gambit belongs to the Shigella genus, the Extensively Drug-Resistant (XDR) phenotype is predicted using the new ResFinder/PointFinder database.
    • If the user-provided taxon (expected_taxon) or the taxon predicted by Gambit is Mycobacterium tuberculosis, bcftools indexes and merges all VCF files created by TbProfiler (both .bcf and .gz files).
    • Kraken2 has been added as an optional module (except for TheiaProk_ONT_PHB). If call_kraken is true, a database must be provided through kraken_db.
    • Two new optional inputs were added to control ANIm behaviour: ani_threshold (default 85.00) and percent_bases_aligned_threshold (default 70.00).
  • TheiaCoV_FASTA_PHB

    • The allowed values for the organism input are now "sars-cov-2" (default), "rsv_a", "rsv_b", "WNV", "MPXV" and "flu".
  • TheiaCoV_Illumina_PE_PHB

    • If organism is set to "flu", the workflow searches for antiviral resistance mutations in the HA, NA, PA, PB1 and PB2 assembly segments, targeting the following 10 antivirals: A_315675, compound_367, Favipiravir, Fludase, L_742_001, Laninamivir, Peramivir, Pimodivir, Xofluza and Zanamivir.
  • All Illumina SE and PE Workflows

    • A new optional input, read_qc, lets the user choose between fastq_scan and fastqc for evaluating read quality. The affected workflows are: TheiaCoV_Illumina_PE_PHB, TheiaCoV_Illumina_SE_PHB, TheiaProk_Illumina_SE_PHB, TheiaProk_Illumina_PE_PHB, TheiaMeta_Illumina_PE_PHB and Freyja_FASTQ_PHB.
  • CZGenEpi_Prep_PHB

    • The sample_is_private_column_name and gisaid_id_column_name columns are no longer extracted from the input table. Instead, the workflow generates them from already-provided inputs and from the new is_private Boolean variable, which sets the value for all samples in the set. The field "GISAID ID (Public ID) - Optional" now follows the GISAID syntax for Virus Name.
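
The input rules listed above can be sketched as a small pre-submission check. This is an illustration only, not code from the PHB repository: the function names are hypothetical, the input keys are written without the workflow-name prefix that an inputs JSON normally carries, and the database path is a placeholder. The allowed organisms and the ANI defaults are the values stated in these notes.

```python
import json

# Values taken from the release notes above.
ALLOWED_FASTA_ORGANISMS = {"sars-cov-2", "rsv_a", "rsv_b", "WNV", "MPXV", "flu"}
ANI_DEFAULTS = {"ani_threshold": 85.0, "percent_bases_aligned_threshold": 70.0}


def check_theiaprok_inputs(inputs: dict) -> dict:
    """Return a copy of `inputs` with the documented ANI defaults filled in.

    Raises ValueError when call_kraken is true but kraken_db is missing,
    mirroring the rule described for the optional Kraken2 module.
    Key names are unprefixed here for brevity (an assumption).
    """
    checked = {**ANI_DEFAULTS, **inputs}
    if checked.get("call_kraken", False) and not checked.get("kraken_db"):
        raise ValueError("call_kraken is true, so kraken_db must be provided")
    return checked


def check_fasta_organism(organism: str = "sars-cov-2") -> str:
    """Validate the organism value against the list given for TheiaCoV_FASTA_PHB."""
    if organism not in ALLOWED_FASTA_ORGANISMS:
        allowed = ", ".join(sorted(ALLOWED_FASTA_ORGANISMS))
        raise ValueError(f"unsupported organism {organism!r}; expected one of: {allowed}")
    return organism


if __name__ == "__main__":
    inputs = check_theiaprok_inputs(
        {
            "call_kraken": True,
            # Hypothetical placeholder path, not a real database location.
            "kraken_db": "gs://example-bucket/kraken2_db.tar.gz",
        }
    )
    print(json.dumps(inputs, indent=2))
    print(check_fasta_organism("flu"))
```

Running a check like this before submitting a job surfaces a missing kraken_db or a misspelled organism locally rather than after the workflow has started.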

Docker container updates:

  • AMRFinderPlus has been updated to version v3.11.20 and database 2023-09-26.1
  • tbp-parser has been updated to version 1.2.0
  • Freyja has been updated to version 1.4.8
  • ts_mlst database has been updated as of January 2024
  • Gambit has been updated to version 1.3.0, including its database files
  • Pangolin4 has been updated to version 4.3.1-pdata-1.23.1
  • IRMA has been updated to version 1.1.3

Tag updates:

  • SARS-CoV-2 Nextclade Dataset Tag has been updated to 2023-12-03T12:00:00Z

🐛 Bug fixes and small improvements:

  • kSNP3_PHB: The ksnp3_core_vcf output has been renamed to ksnp3_vcf_ref_genome for readability. Additionally, two new outputs are provided: ksnp3_vcf_snps_not_in_ref and ksnp3_vcf_ref_samplename.
  • TheiaProk Workflow Series: The MIDAS task was adjusted to reduce logging, and therefore the size of the log file, aiding debugging & reducing storage costs.
  • TheiaMeta_Illumina_PE_PHB: A new task Krona was added for the visualization of the Kraken2 reports.
  • Mercury_Prep_N_Batch: The excluded_samples.tsv is now printed to the execution log file, aiding debugging.
  • TheiaCoV Workflow Series: The nextclade_lineage output now populates correctly for SARS-CoV-2. Additionally, the nextclade_qc field is now exposed as an output.
  • Augur_PHB: The AUGUR refine input clock_filter_iqd has been reverted to the previous default value of 4.
  • Kraken Standalone Workflows: A new task Krona was added for the visualization of the Kraken2 reports.
  • TheiaValidate_PHB: TheiaValidate now outputs a table with validation-criteria failures only. Additionally, a new input was added that can translate different column names between tables to enable comparison.
  • TheiaCoV_ONT_PHB: A sample that fails the read-screening quality check no longer causes the workflow to fail. Instead, the workflow finishes with an appropriate message.
  • Samples_To_Ref_Tree_PHB: The organism input has been renamed to nextclade_dataset_name for better clarity.
  • Various workflows: Call caching was disabled in the following workflows: BaseSpace_Fetch_PHB, Transfer_Column_Content_PHB, Assembly_Fetch_PHB, Snippy_Streamline_PHB and TheiaValidate_PHB.
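
Downstream scripts or tables that still use the old field names may need updating after this release. The sketch below shows one way to apply the two renames described above, scoped per workflow so that the organism input of other workflows is left alone. The helper name and the record layout are hypothetical, and the old kSNP3 name is assumed to be ksnp3_core_vcf.

```python
# Renames described in the release notes, keyed by workflow.
# The old kSNP3 output name is assumed to be "ksnp3_core_vcf".
RENAMES = {
    "kSNP3_PHB": {"ksnp3_core_vcf": "ksnp3_vcf_ref_genome"},
    "Samples_To_Ref_Tree_PHB": {"organism": "nextclade_dataset_name"},
}


def migrate_fields(workflow: str, record: dict) -> dict:
    """Return a copy of `record` with renamed fields for the given workflow.

    Fields that were not renamed, and workflows with no renames,
    pass through unchanged.
    """
    mapping = RENAMES.get(workflow, {})
    return {mapping.get(key, key): value for key, value in record.items()}


if __name__ == "__main__":
    old_row = {"ksnp3_core_vcf": "set1_core.vcf", "cluster_name": "set1"}
    print(migrate_fields("kSNP3_PHB", old_row))
```

Keeping the mapping per workflow matters here because organism remains a valid input name in the TheiaCoV workflows.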

What's Changed

  • updated VCF output file renaming in kSNP3 task by @kapsakcj in #207
  • reduce unnecessary logging in MIDAS task by @kapsakcj in #210
  • update default amrfinderplus docker image to v3.11.20 and db 2023-09-26.1 by @kapsakcj in #229
  • TheiaCoV_ONT_PHB Influenza Track by @jrotieno in #233
  • TheiaCoV_FASTA_Batch: TheiaCoV_FASTA, for many samples at once by @sage-wright in #238
  • Add krona task to TheiaMeta_Illumina_PE by @cimendes in #213
  • added 2 QC thresholds to ANI task to reduce false positives by @kapsakcj in #168
  • Resfinder improvements, added support for Shigella spp., added XDR Shigella prediction by @kapsakcj in #159
  • disable call caching for various workflows by @kapsakcj in #251
  • Mercury_Prep_N_Batch: print the excluded_samples.tsv and update Docker to avoid Google SDK warning by @sage-wright in #220
  • Nextclade Output Added by @DOH-HNH0303 in #239
  • TheiaCoV_FASTA: Adding five new organisms by @jrotieno in #194
  • Update task_augur_refine iqd back to 4 by @jrotieno in #268
  • TheiaCoV Illumina PE: Identify Influenza Antiviral Resistance Mutations in Assemblies by @jrotieno in #252
  • [New Utility] Workflow to rename FASTQ files (non-destructive) by @cimendes in #267
  • [TheiaCoV_Fasta_Batch] Substitute FASTA concatenating task to ensure proper sample_id propagation by @cimendes in #274
  • Kraken2 Standalone: add krona visualisation by @cimendes in #225
  • TheiaValidate_PHB: new features and new Docker image from TheiaValidate repository by @sage-wright in #255
  • TheiaProk TB: new VCF output and modification to the coverage report by @sage-wright in #245
  • TheiaCoV_ONT: prevent failure by coercing files into strings by @sage-wright in #288
  • update default freyja docker image to 1.4.8 for multiple tasks by @kapsakcj in #289
  • FastQC added as an optional module in all Illumina_PE and Illumina_SE workflows by @sage-wright in #260
  • update docker to version tag 2.23.0-2024-01 by @cimendes in #293
  • [TheiaProk Workflows] Add Kraken2 as optional module by @cimendes in #286
  • CZGenEpi_Prep_PHB: implementing user-requested changes by @sage-wright in #244
  • Update Gambit database files to version 1.3.0 by @kevinlibuit in #292
  • [PHB Release 1.3.0] update version and docker tags (nexclade sc2, pangolin, tbp-parser 1.1.7) by @cimendes in #296
  • [PR Template Update] Updating template per identified dev process improvements by @kelseykropp in #300
  • [TheiaProk suite] Patch fix: change type of kraken2_report to be string in taxon_table task by @cimendes in #297
  • Samples_To_Ref_Tree_PHB: changed "organism" input to "nextclade_dataset_name" by @jrotieno in #303
  • theiacov_fasta wf logic change for flu by @kapsakcj in #305
  • restore vadr_num_alerts string output to theiacov_fasta workflow by @kapsakcj in #307

New Contributors

Full Changelog: v1.2.1...v1.3.0