Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding Kraken to Krona conversion to blastoff.wdl #544

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

golu099
Copy link
Contributor

@golu099 golu099 commented Jun 25, 2024

Kraken to Krona Conversion: Utilizes the krona command to convert Kraken output into Krona HTML visualizations, which allows for dynamic visualization of blastoff Kraken outputs.

@@ -414,3 +414,13 @@ workflows:
primaryDescriptorPath: /pipes/WDL/workflows/create_enterics_qc_viz_general.wdl
testParameterFiles:
- /empty.json
- name: blastoff
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you insert these added blocks above, so they're in alphabetical order with the other task names?

@@ -0,0 +1,390 @@
version 1.0

task trim_rmdup_subsamp {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for exposing assembly.py trim_rmdup_subsamp in a WDL task. Since the functionality isn't metablast-specific, could you move the task to tasks_read_utils.wdl? (and then change the location it's imported from in the workflows using it)

assembly.py trim_rmdup_subsamp \
"~{inBam}" \
"~{clipDb}" \
"$(pwd)/outbam.bam" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you expose the output bam among outputs of this task, so this task can be used independently for applications expecting bam-format?


#samtools [OutBam -> FASTA]
#-f 4 (f = include only) (4 = unmapped reads) https://broadinstitute.github.io/picard/explain-flags.html
samtools fasta "$(pwd)/outbam.bam" > "~{bam_basename}.fasta"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe wrap the fasta output in a WDL conditional so it is only produced if a task input (emit_fasta, say) is set to true? (the task output would then need to have an optional File? type, since it may or may not exist)

This fasta has the potential to be quite large. Consider compressing it (with xz) here, and then decompress it in the task consuming it (iff it is compressed).

version 1.0

task trim_rmdup_subsamp {
meta {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we may want to mark this task as volatile: true, but it looks like the samtools-based downsampling in viral-core uses a constant seed, so it should be deterministic and we don't need to mark it as volatile.

echo "--END STAGE 2"
echo "---START STAGE 3"

#Stage 3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot is going on in this stage. Can inline comments be added to walk through what's happening?
(and maybe also split up the long lines across multiple lines)

}
parameter_meta {
krona_taxonomy_tab: {
description: "Krona taxonomy database containing a single file: 'taxonomy.tab' (exact name), or possibly just a compressed 'taxonomy.tab"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would likely help users if the description says something about how to create or where to obtain a taxonomy.tab file (ex. "[...] as created by ktUpdateTaxonomy")

@@ -0,0 +1,56 @@
version 1.0

import "../tasks/tasks_megablast.wdl" as tools
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe change the input alias to something more descriptive like metablast rather than tools (& change usage below to match)

@@ -0,0 +1,30 @@
version 1.0

import "../tasks/tasks_megablast.wdl" as tools
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above about renaming the import alias to as megablast

allowNestedInputs: true
}
input {
File inFasta
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be consistent, underscores rather than camel-case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants