diff --git a/docs/assets/figures/Freyja_FASTQ.png b/docs/assets/figures/Freyja_FASTQ.png index 1789c8c53..75eb466c2 100644 Binary files a/docs/assets/figures/Freyja_FASTQ.png and b/docs/assets/figures/Freyja_FASTQ.png differ diff --git a/docs/workflows/genomic_characterization/freyja.md b/docs/workflows/genomic_characterization/freyja.md index c5b15c54d..fc1094204 100644 --- a/docs/workflows/genomic_characterization/freyja.md +++ b/docs/workflows/genomic_characterization/freyja.md @@ -1,16 +1,10 @@ # Freyja Workflow Series -!!! dna inline end "Wastewater and more" - The typical use case of Freyja is to **analyze mixed SARS-CoV-2 samples** from a sequencing dataset, most often **wastewater**. - - !!! warning "Default Values" - The defaults included in the Freyja workflows reflect this use case but **can be adjusted for other pathogens**. See the [**Running Freyja on other pathogens**](freyja.md#running-freyja-on-other-pathogens) section for more information. - ## Quick Facts | **Workflow Type** | **Applicable Kingdom** | **Last Known Changes** | **Command-line Compatibility** | **Workflow Level** | |---|---|---|---|---| -| [Genomic Characterization](../../workflows_overview/workflows_type.md/#genomic-characterization) | [Viral](../../workflows_overview/workflows_kingdom.md/#viral) | PHB v2.2.0 | Yes | Sample-level, Set-level | +| [Genomic Characterization](../../workflows_overview/workflows_type.md/#genomic-characterization) | [Viral](../../workflows_overview/workflows_kingdom.md/#viral) | PHB v2.3.0 | Yes | Sample-level, Set-level | ## Freyja Overview @@ -21,9 +15,15 @@ Additional post-processing steps can produce visualizations of aggregated samples. +!!! dna "Wastewater and more" + The typical use case of Freyja is to **analyze mixed SARS-CoV-2 samples** from a sequencing dataset, most often **wastewater**. + + !!! warning "Default Values" + The defaults included in the Freyja workflows reflect this use case but **can be adjusted for other pathogens**. 
See the [**Running Freyja on other pathogens**](freyja.md#running-freyja-on-other-pathogens) section for more information. + !!! caption "Figure 1: Workflow Diagram for Freyja_FASTQ_PHB workflow" ##### Figure 1 { #figure1 } - ![**Figure 1: Workflow diagram for Freyja_FASTQ_PHB workflow.**](../../assets/figures/Freyja_FASTQ.png){width=25%} + ![**Figure 1: Workflow diagram for Freyja_FASTQ_PHB workflow.**](../../assets/figures/Freyja_FASTQ.png){width=100%} Depending on the type of data (Illumina or Oxford Nanopore), the Read QC and Filtering steps, as well as the Read Alignment steps use different software. The user can specify if the barcodes and lineages file should be updated with `freyja update` before running Freyja or if bootstrapping is to be performed with `freyja boot`. @@ -63,7 +63,7 @@ We recommend running this workflow with **"Run inputs defined by file paths"** s | freyja_update | **gcp_uri** | String | The path where you want the Freyja reference files to be stored. Include gs:// at the beginning of the string. 
Full example with a Terra workspace bucket: "gs://fc-87ddd67a-c674-45a8-9651-f91e3d2f6bb7" | | Required | | freyja_update_refs | **cpu** | Int | Number of CPUs to allocate to the task | 4 | Optional | | freyja_update_refs | **disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional | -| freyja_update_refs | **docker** | String | The Docker container to use for the task | "us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.1-07_02_2024-01-27-2024-07-22" | Optional | +| freyja_update_refs | **docker** | String | The Docker container to use for the task | "us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.2-11_30_2024-02-00-2024-12-02" | Optional | | freyja_update_refs | **memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 16 | Optional | | transfer_files | **cpu** | Int | Number of CPUs to allocate to the task | 2 | Optional | | transfer_files | **disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional | @@ -110,12 +110,14 @@ This workflow runs on the sample level. 
| freyja | **confirmed_only** | Boolean | Include only confirmed SARS-CoV-2 lineages | FALSE | Optional | | freyja | **cpu** | Int | Number of CPUs to allocate to the task | 2 | Optional | | freyja | **disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional | -| freyja | **docker** | String | The Docker container to use for the task | "us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.1-07_02_2024-01-27-2024-07-22" | Optional | +| freyja | **docker** | String | The Docker container to use for the task | "us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.2-11_30_2024-02-00-2024-12-02" | Optional | | freyja | **eps** | Float | The minimum lineage abundance cut-off value | 0.001 | Optional | -| freyja | **freyja_lineage_metadata** | File | (found in the optional section, but is required) File containing the lineage metadata; the "curated_lineages.json" file found can be used for this variable. Does not need to be provided if update_db is true. | None | Optional, Required | +| freyja | **freyja_barcodes** | File | Custom barcode file. Does not need to be provided if update_db is true or if freyja_pathogen is provided. | None | Optional | +| freyja | **freyja_lineage_metadata** | File | File containing the lineage metadata; the "curated_lineages.json" file can be used for this variable. Does not need to be provided if update_db is true or if freyja_pathogen is provided. | None | Optional, Required | +| freyja | **freyja_pathogen** | String | Pathogen of interest, used if not providing the barcodes and lineage metadata files.
Options: SARS-CoV-2, MPXV, H5NX, H1N1pdm, FLU-B-VIC, MEASLESN450, MEASLES, RSVa, RSVb | None | Optional | | freyja | **memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 4 | Optional | | freyja | **number_bootstraps** | Int | The number of bootstraps to perform (only used if bootstrap = true) | 100 | Optional | -| freyja | **update_db** | Boolean | Updates the Freyja reference files (the usher barcodes and lineage metadata files) but will not save them as output (use Freyja_Update for that purpose). If set to true, the `freyja_lineage_metadata` and `freyja_usher_barcodes` files are not required. | FALSE | Optional | +| freyja | **update_db** | Boolean | Updates the Freyja reference files (the usher barcodes and lineage metadata files) but will not save them as output (use Freyja_Update for that purpose). If set to true, the `freyja_lineage_metadata` and `freyja_barcodes` files are not required. | FALSE | Optional | | freyja_fastq | **depth_cutoff** | Int | The minimum coverage depth with which to exclude sites below this value and group identical barcodes | 10 | Optional | | freyja_fastq | **kraken2_target_organism** | String | The organism whose abundance the user wants to check in their reads. This should be a proper taxonomic name recognized by the Kraken database. | "Severe acute respiratory syndrome coronavirus 2" | Optional | | freyja_fastq | **ont** | Boolean | Indicates if the input data is derived from an ONT instrument. | FALSE | Optional | @@ -364,7 +366,7 @@ The main output file used in subsequent Freyja workflows is found under the `fre | freyja_fastq_wf_version | String | The version of the Public Health Bioinformatics (PHB) repository used | ONT, PE, SE | | freyja_lineage_metadata_file | File | Lineage metadata JSON file used. 
Can be the one provided as input or downloaded by Freyja if update_db is true | ONT, PE, SE | | freyja_metadata_version | String | Name of lineage metadata file used, or the date if update_db is true | ONT, PE, SE | -| freyja_usher_barcode_file | File | USHER barcode feather file used. Can be the one provided as input or downloaded by Freyja if update_db is true | ONT, PE, SE | +| freyja_barcode_file | File | Barcode feather file used. Can be the one provided as input or downloaded by Freyja if update_db is true | ONT, PE, SE | | freyja_variants | File | The TSV file containing the variants identified by Freyja | ONT, PE, SE | | freyja_version | String | version of Freyja used | ONT, PE, SE | | ivar_version_primtrim | String | Version of iVar for running the iVar trim command | ONT, PE, SE | @@ -431,7 +433,7 @@ This workflow runs on the set level. | freyja_plot | **collection_date** | Array[String] | An array containing the collection dates for the sample (YYYY-MM-DD format) | | Optional | | freyja_plot_task | **cpu** | Int | Number of CPUs to allocate to the task | 2 | Optional | | freyja_plot_task | **disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional | -| freyja_plot_task | **docker** | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.1-07_02_2024-01-27-2024-07-22 | Optional | +| freyja_plot_task | **docker** | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.2-11_30_2024-02-00-2024-12-02 | Optional | | freyja_plot_task | **memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 4 | Optional | | freyja_plot_task | **mincov** | Int | The minimum genome coverage used as a cut-off of data to include in the plot | 60 | Optional | | freyja_plot_task | **plot_day_window** | Int | The width of the rolling average window; only used if plot_time_interval is "D" | 14 | Optional | @@ -492,7 +494,7 @@ This 
workflow runs on the set level. | freyja_dashboard | **dashboard_intro_text** | File | A file containing the text to be contained at the top of the dashboard. | SARS-CoV-2 lineage de-convolution performed by the Freyja workflow (). | Optional | | freyja_dashboard_task | **config** | File | (found in the optional section, but is required) A yaml file that applies various configurations to the dashboard, such as grouping lineages together, applying colorings, etc. See also . | None | Optional, Required | | freyja_dashboard_task | **cpu** | Int | Number of CPUs to allocate to the task | 2 | Optional | -| freyja_dashboard_task | **docker** | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.1-07_02_2024-01-27-2024-07-22 | Optional | +| freyja_dashboard_task | **docker** | String | The Docker container to use for the task | us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.2-11_30_2024-02-00-2024-12-02 | Optional | | freyja_dashboard_task | **headerColor** | String | A hex color code to change the color of the header | | Optional | | freyja_dashboard_task | **memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 4 | Optional | | freyja_dashboard_task | **mincov** | Float | The minimum genome coverage used as a cut-off of data to include in the dashboard. Default is set to 60 by the freyja command-line tool (not a WDL task default, per se) | None | Optional | @@ -532,26 +534,33 @@ This workflow runs on the set level. The main requirement to run Freyja on other pathogens is **the existence of a barcode file for your pathogen of interest**. Currently, barcodes exist for the following organisms -- MEASLES +- SARS-CoV-2 (default) - MPXV +- H5NX +- H1N1pdm +- FLU-B-VIC +- MEASLESN450 +- MEASLES - RSVa - RSVb -The appropriate barcode file and reference sequence need to be downloaded and uploaded to your [Terra.bio](http://Terra.bio) workspace. - !!! 
warning "Freyja barcodes for other pathogens" Data for various pathogens can be found in the following repository: [Freyja Barcodes](https://github.com/gp201/Freyja-barcodes) Folders are organized by pathogen, with each subfolder named after the date the barcode was generated, using the format YYYY-MM-DD. Barcode files are named `barcode.csv`, and reference genome files are named `reference.fasta`. +The appropriate barcode file and reference sequence need to be downloaded and uploaded to your [Terra.bio](http://Terra.bio) workspace. + + + When running **Freyja_FASTQ_PHB**, the appropriate reference and barcodes file need to be passed as inputs. The first is a required input and will show up at the top of the workflows inputs page on [Terra.bio](http://Terra.bio) ([Figure 2](freyja.md/#figure2)). !!! caption "Figure 2: Required input for Freyja_FASTQ_PHB to provide the reference genome to be used by Freyja" ##### Figure 2 { #figure2 } ![**Figure 2: Required input for Freyja_FASTQ_PHB to provide the reference genome to be used by Freyja.**](../../assets/figures/Freyja_figure2.png) -The barcodes file can be passed directly to Freyja by the `freyja_usher_barcodes` optional input ([Figure 3](freyja.md/#figure3)). +The barcodes file can be passed directly to Freyja by the `freyja_barcodes` optional input ([Figure 3](freyja.md/#figure3)). !!! caption "Figure 3: Optional input for Freyja_FASTQ_PHB to provide the barcodes file to be used by Freyja" ##### Figure 3 {#figure3} diff --git a/tasks/taxon_id/freyja/task_freyja.wdl b/tasks/taxon_id/freyja/task_freyja.wdl index a0894e55e..b3a7ed2c2 100644 --- a/tasks/taxon_id/freyja/task_freyja.wdl +++ b/tasks/taxon_id/freyja/task_freyja.wdl @@ -5,7 +5,8 @@ task freyja_one_sample { File primer_trimmed_bam String samplename File reference_genome - File? freyja_usher_barcodes + String? freyja_pathogen + File? freyja_barcodes File? freyja_lineage_metadata Float? eps Float? adapt @@ -16,7 +17,7 @@ task freyja_one_sample { Int? 
depth_cutoff Int memory = 8 Int cpu = 2 - String docker = "us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.1-07_02_2024-01-27-2024-07-22" + String docker = "us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.2-11_30_2024-02-00-2024-12-02" Int disk_size = 100 } command <<< @@ -44,9 +45,9 @@ task freyja_one_sample { freyja_metadata_version="freyja update: $(date +"%Y-%m-%d")" else # configure barcode - if [[ ! -z "~{freyja_usher_barcodes}" ]]; then - echo "User freyja usher barcodes identified; ~{freyja_usher_barcodes} will be utilized for freyja demixing" - freyja_usher_barcode_version=$(basename -- "~{freyja_usher_barcodes}") + if [[ ! -z "~{freyja_barcodes}" ]]; then + echo "User freyja usher barcodes identified; ~{freyja_barcodes} will be utilized for freyja demixing" + freyja_usher_barcode_version=$(basename -- "~{freyja_barcodes}") else freyja_usher_barcode_version="unmodified from freyja container: ~{docker}" fi @@ -74,9 +75,10 @@ task freyja_one_sample { # Calculate Boostraps, if specified if ~{bootstrap}; then freyja boot \ + ~{"--pathogen " + freyja_pathogen} \ ~{"--eps " + eps} \ ~{"--meta " + freyja_lineage_metadata} \ - ~{"--barcodes " + freyja_usher_barcodes} \ + ~{"--barcodes " + freyja_barcodes} \ ~{"--depthcutoff " + depth_cutoff} \ ~{"--nb " + number_bootstraps } \ ~{true='--confirmedonly' false='' confirmed_only} \ @@ -91,7 +93,7 @@ task freyja_one_sample { freyja demix \ ~{'--eps ' + eps} \ ~{'--meta ' + freyja_lineage_metadata} \ - ~{'--barcodes ' + freyja_usher_barcodes} \ + ~{'--barcodes ' + freyja_barcodes} \ ~{'--depthcutoff ' + depth_cutoff} \ ~{true='--confirmedonly' false='' confirmed_only} \ ~{'--adapt ' + adapt} \ @@ -144,7 +146,7 @@ task freyja_one_sample { File? freyja_bootstrap_summary = "~{samplename}_summarized.csv" File?
freyja_bootstrap_summary_pdf = "~{samplename}_summarized.pdf" # capture barcode file - first is user supplied, second appears if the user did not supply a barcode file - File freyja_usher_barcode_file = select_first([freyja_usher_barcodes, "usher_barcodes.feather"]) + File freyja_barcode_file = select_first([freyja_barcodes, "usher_barcodes.feather"]) File freyja_lineage_metadata_file = select_first([freyja_lineage_metadata, "curated_lineages.json"]) String freyja_barcode_version = read_string("FREYJA_BARCODES") String freyja_metadata_version = read_string("FREYJA_METADATA") diff --git a/tasks/taxon_id/freyja/task_freyja_dashboard.wdl b/tasks/taxon_id/freyja/task_freyja_dashboard.wdl index a463a4cf6..24be429a9 100644 --- a/tasks/taxon_id/freyja/task_freyja_dashboard.wdl +++ b/tasks/taxon_id/freyja/task_freyja_dashboard.wdl @@ -13,7 +13,7 @@ task freyja_dashboard_task { Boolean scale_by_viral_load = false String freyja_dashboard_title File? dashboard_intro_text - String docker = "us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.1-07_02_2024-01-27-2024-07-22" + String docker = "us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.2-11_30_2024-02-00-2024-12-02" Int disk_size = 100 Int memory = 4 Int cpu = 2 diff --git a/tasks/taxon_id/freyja/task_freyja_plot.wdl b/tasks/taxon_id/freyja/task_freyja_plot.wdl index 82735e1a4..7c02572cb 100644 --- a/tasks/taxon_id/freyja/task_freyja_plot.wdl +++ b/tasks/taxon_id/freyja/task_freyja_plot.wdl @@ -10,7 +10,7 @@ task freyja_plot_task { String plot_time_interval="MS" Int plot_day_window=14 String freyja_plot_name - String docker = "us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.1-07_02_2024-01-27-2024-07-22" + String docker = "us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.2-11_30_2024-02-00-2024-12-02" Int disk_size = 100 Int mincov = 60 Int memory = 4 diff --git a/tasks/taxon_id/freyja/task_freyja_update.wdl b/tasks/taxon_id/freyja/task_freyja_update.wdl index d877ba282..14bf716b2 100644 --- 
a/tasks/taxon_id/freyja/task_freyja_update.wdl +++ b/tasks/taxon_id/freyja/task_freyja_update.wdl @@ -2,7 +2,7 @@ version 1.0 task freyja_update_refs { input { - String docker = "us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.1-07_02_2024-01-27-2024-07-22" + String docker = "us-docker.pkg.dev/general-theiagen/staphb/freyja:1.5.2-11_30_2024-02-00-2024-12-02" Int disk_size = 100 Int memory = 16 Int cpu = 4 diff --git a/workflows/freyja/wf_freyja_fastq.wdl b/workflows/freyja/wf_freyja_fastq.wdl index 2e0fe755e..b758d3ca4 100644 --- a/workflows/freyja/wf_freyja_fastq.wdl +++ b/workflows/freyja/wf_freyja_fastq.wdl @@ -208,7 +208,7 @@ workflow freyja_fastq { File freyja_depths = freyja.freyja_depths File freyja_demixed = freyja.freyja_demixed Float freyja_coverage = freyja.freyja_coverage - File freyja_usher_barcode_file = freyja.freyja_usher_barcode_file + File freyja_barcode_file = freyja.freyja_barcode_file File freyja_lineage_metadata_file = freyja.freyja_lineage_metadata_file String freyja_barcode_version = freyja.freyja_barcode_version String freyja_metadata_version = freyja.freyja_metadata_version