Resfinder improvements, added support for Shigella spp., added XDR Shigella prediction #159

Merged
merged 37 commits into from
Dec 5, 2023

Conversation

kapsakcj

@kapsakcj kapsakcj commented Aug 21, 2023

⚠️ NOTE: I have rebased this branch onto origin/main twice, which is why this PR appears to include so many changed files and commits. Do not fear: most of these changes have already been incorporated into main, and I will highlight the changed files that are relevant to this PR.

It seems that resolving merge conflicts has undone the rebasing I did, so the "Files changed" section is now much simpler and easier to review.

This was set as a draft until we and the CDPH folks were happy with the changes and outputs in Terra. CDPH folks are happy, so I will update everything and start running tests.

TODOs:

  • update CI once final resfinder task changes are complete.
  • test 4 workflows in Terra:
    • TheiaProk_Illumina_PE
    • TheiaProk_Illumina_SE
    • TheiaProk_ONT
    • TheiaProk_FASTA

Successful test workflows in Terra

For all test samples in the amrfinderplus_testing_sample data table, I expect similar, if not identical, results across the ILMN PE, ILMN SE, and FASTA workflows. There may be minor differences because assemblies differ between single-end and paired-end inputs, but nothing drastic.

🛠️ Changes Being Made

tasks/gene_typing/task_resfinder.wdl

  • Added support for running Shigella species through the E. coli PointFinder database
  • changed output file extensions for TSVs
  • exposed cpu and memory optional inputs
  • renamed 2 input params to be more recognizable (new names are min_id and call_PointFinder)
  • added 8 new String outputs:
    • resfinder_predicted_pheno_resistance: Semicolon-delimited list of antimicrobial drugs and their associated genes and/or point mutations. <drug1>: <gene1>, <gene2>, <point_mutation1>; <drug2>: <gene3>, <gene4>;
    • resfinder_predicted_xdr_shigella which has 3 potential output strings:
      • Not Shigella based on gambit_predicted_taxon or user input
      • Not XDR Shigella for samples identified as Shigella by GAMBIT or user input, but for which resfinder did not predict resistance to all 6 drugs in the XDR definition
      • XDR Shigella which is based on predicted resistance to ceftriaxone, azithromycin, ciprofloxacin, trimethoprim, sulfamethoxazole, and ampicillin.
    • resfinder_predicted_resistance_Amp which states either Resistance or No Resistance predicted based on resfinder results
    • resfinder_predicted_resistance_Azm (same as above)
    • resfinder_predicted_resistance_Axo (same as above)
    • resfinder_predicted_resistance_Cip (same as above)
    • resfinder_predicted_resistance_Smx (same as above)
    • resfinder_predicted_resistance_Tmp (same as above)
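
As a rough illustration of the string outputs listed above, here is a minimal Python sketch. It is not the code in task_resfinder.wdl. The function names, the `hits` dictionary, and the suffix-to-drug mapping are assumptions made for illustration; the mapping is inferred from the output names (Axo is taken to mean ceftriaxone). The sketch omits the trailing semicolon shown in the example format.

```python
# Hypothetical mapping from output-name suffix to drug name, inferred from the
# output names listed above (not taken from the actual WDL task).
XDR_DRUG_SUFFIXES = {
    "Amp": "ampicillin",
    "Azm": "azithromycin",
    "Axo": "ceftriaxone",
    "Cip": "ciprofloxacin",
    "Smx": "sulfamethoxazole",
    "Tmp": "trimethoprim",
}


def format_predicted_resistance(hits):
    """Build '<drug1>: <gene1>, <gene2>; <drug2>: <gene3>' from a dict.

    `hits` maps a drug name to a list of genes and/or point mutations.
    Drugs with an empty list are skipped.
    """
    return "; ".join(
        f"{drug}: {', '.join(determinants)}"
        for drug, determinants in hits.items()
        if determinants
    )


def per_drug_outputs(hits):
    """Return the six per-drug strings: 'Resistance' or 'No Resistance'."""
    return {
        f"resfinder_predicted_resistance_{suffix}": (
            "Resistance" if hits.get(drug) else "No Resistance"
        )
        for suffix, drug in XDR_DRUG_SUFFIXES.items()
    }
```

Calling `per_drug_outputs` on a dictionary that lists determinants only for ampicillin would report Resistance for the Amp output and No Resistance for the other five.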

tasks/utilities/task_broad_terra_tools.wdl

  • Added 8 new resfinder String outputs as inputs to export_taxon_tables task

4 workflows: TheiaProk_Illumina_PE, SE, ONT, and FASTA

  • added 8 new resfinder String outputs to workflows
  • added 8 new resfinder String outputs as inputs to export_taxon_tables call blocks

🧠 Context and Rationale

Mainly wanted to add support for running all Shigella species through the PointFinder E. coli database. Also wanted the ability to easily detect XDR Shigella samples based on the CDC definition here: https://emergency.cdc.gov/han/2023/han00486.asp

CDC defines XDR Shigella bacteria as strains that are resistant to all commonly recommended empiric and alternative antibiotics — azithromycin, ciprofloxacin, ceftriaxone, trimethoprim-sulfamethoxazole (TMP-SMX), and ampicillin.
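
Read as logic, the definition above is a set-containment check: a Shigella sample is XDR only if the set of drugs with predicted resistance contains all six drugs (TMP-SMX counted as its two components). A minimal Python sketch follows, assuming the organism name comes from GAMBIT or user input and that the "Not Shigella" string matches the wording in the output list. The function name and the substring test on the organism name are hypothetical, not the task's actual implementation.

```python
# The six drugs in the XDR definition, with TMP-SMX split into its components.
XDR_DRUGS = frozenset({
    "ampicillin",
    "azithromycin",
    "ceftriaxone",
    "ciprofloxacin",
    "sulfamethoxazole",
    "trimethoprim",
})


def classify_xdr_shigella(organism, resistant_drugs):
    """Return one of the three status strings for a sample.

    organism: taxon name, e.g. from GAMBIT or user input (assumed free text).
    resistant_drugs: iterable of drug names with predicted resistance.
    """
    if "shigella" not in organism.lower():
        return "Not Shigella based on gambit_predicted_taxon or user input"
    predicted = {drug.lower() for drug in resistant_drugs}
    # XDR requires predicted resistance to every drug in the definition.
    if XDR_DRUGS <= predicted:
        return "XDR Shigella"
    return "Not XDR Shigella"
```

Resistance to extra drugs beyond the six does not change the result; only a missing drug from the definition downgrades the call to Not XDR Shigella.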

The WDL task now parses the predicted phenotypes TSV file (NOT the species-specific predicted phenotypes TSV) to look for genes conferring resistance to these drugs, and will output a string telling the user if it meets all parts of the definition.
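
The parsing step can be sketched in Python as shown here. This is an illustration only and makes several assumptions: that the predicted phenotypes TSV has header lines beginning with `#`, that its first five tab-separated columns are antimicrobial, class, predicted phenotype, match, and genetic background, that a resistant row is marked with the word "Resistant", and that the genetic background column is comma-separated. The function name is hypothetical, and the real task may parse the file differently.

```python
import csv
import io


def parse_pheno_table(text):
    """Return {antimicrobial: [genetic background entries]} for resistant rows.

    Assumes '#'-prefixed header lines and five tab-separated columns:
    antimicrobial, class, predicted phenotype, match, genetic background.
    """
    data_lines = (
        line for line in io.StringIO(text)
        if line.strip() and not line.startswith("#")
    )
    hits = {}
    for row in csv.reader(data_lines, delimiter="\t"):
        if len(row) < 5:
            continue  # skip malformed or truncated rows
        drug, _drug_class, phenotype, _match, background = (
            cell.strip() for cell in row[:5]
        )
        if phenotype.lower() != "resistant":
            continue
        hits[drug.lower()] = [
            entry.strip() for entry in background.split(",") if entry.strip()
        ]
    return hits
```

The keys of the returned dictionary are the drugs with predicted resistance, which is the information needed to check a sample against each part of the XDR definition.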

📋 Workflow/Task Steps

N/A

Inputs

N/A

Outputs

N/A

🧪 Testing

Locally

Tested the WDL task without issue using miniwdl

Terra

Will follow up with more thorough tests once we get feedback from CDPH and make final changes

🔬 Quality checks

Pull Request (PR) checklist:

  • Include a description of what is in this pull request in this message.
  • The workflow/task has been tested locally and on Terra
  • The CI/CD has been adjusted and tests are passing
  • Everything follows the style guide

… to be run through E. coli pointfinder database, changed output file extensions for TSVs, exposed cpu and memory optional inputs, changed name of 2 input params to be more recognizable
…_resistance which is semicolon delimited output of "<antimicrobial>: <Genetic background>"
…xdr_shigella" which checks for predicted resistance to ceftriaxone, azithromycin, ciprofloxacin, trimethoprim, sulfamethoxazole, and ampicillin
…E, ONT). also added to export_taxon_tables inputs. renamed call block alias for resfinder in theiaprok_ONT to match other workflows
…DR status output strings; added 6 new string outputs with predicted resistance to 6 drugs in XDR classification
…ave XDR non-Shigella samples (like E. coli) that would still be relevant to user
…mn. Either "Not Shigella..." or "XDR Shigella" or "Not XDR Shigella"
… same to export_taxon_tables task to input call block within theiaprok_illumina_pe workflow
@kapsakcj kapsakcj marked this pull request as ready for review November 28, 2023 22:16
@kapsakcj kapsakcj changed the title Resfinder improvements and added support for Shigella spp. Resfinder improvements, added support for Shigella spp., added XDR Shigella prediction Nov 29, 2023

@sage-wright sage-wright left a comment


@sage-wright sage-wright left a comment

@sage-wright sage-wright merged commit d46c26e into main Dec 5, 2023
12 checks passed
@kapsakcj kapsakcj deleted the cjk-pointfinder branch December 22, 2023 21:16

2 participants