Docker workshop: Updated dnaapler to 0.2.0 #769

Closed
wants to merge 12 commits into from
4 changes: 2 additions & 2 deletions README.md
@@ -131,7 +131,7 @@ To learn more about the docker pull rate limits and the open source software pro
| [colorid](https://hub.docker.com/r/staphb/colorid) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/colorid)](https://hub.docker.com/r/staphb/colorid) | <ul><li>0.1.4.3</li></ul> | https://github.com/hcdenbakker/colorid |
| [cutshaw-report-env](https://hub.docker.com/r/staphb/cutshaw-report-env) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/cutshaw-report-env)](https://hub.docker.com/r/staphb/cutshaw-report-env) | <ul><li>1.0.0</li></ul> | https://github.com/VADGS/CutShaw |
| [datasets-sars-cov-2](https://github.com/CDCgov/datasets-sars-cov-2) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/datasets-sars-cov-2)](https://hub.docker.com/r/staphb/datasets-sars-cov-2) | <ul><li>0.6.2</li><li>0.6.3</li><li>0.7.2</li></ul> | https://github.com/CDCgov/datasets-sars-cov-2 |
| [dnaapler](https://hub.docker.com/r/staphb/dnaapler) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/dnaapler)](https://hub.docker.com/r/staphb/dnaapler) | <ul><li>[0.1.0](dnaapler/0.1.0/)</li></ul> | https://github.com/gbouras13/dnaapler |
| [dnaapler](https://hub.docker.com/r/staphb/dnaapler) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/dnaapler)](https://hub.docker.com/r/staphb/dnaapler) | <ul><li>[0.1.0](dnaapler/0.1.0/)</li><li>[0.2.0](dnaapler/0.2.0/)</li></ul> | https://github.com/gbouras13/dnaapler |
| [dragonflye](https://hub.docker.com/r/staphb/dragonflye) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/dragonflye)](https://hub.docker.com/r/staphb/dragonflye) | <ul><li>1.0.14</li><li>[1.1.1](dragonflye/1.1.1/)</li></ul> | https://github.com/rpetit3/dragonflye |
| [DSK](https://hub.docker.com/r/staphb/dsk) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/dsk)](https://hub.docker.com/r/staphb/dsk) | <ul><li>0.0.100</li></ul> | https://gatb.inria.fr/software/dsk/ |
| [emboss](https://hub.docker.com/r/staphb/emboss) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/emboss)](https://hub.docker.com/r/staphb/emboss) | <ul><li>6.6.0 (no version)</li></ul> | http://emboss.sourceforge.net |
@@ -181,7 +181,7 @@ To learn more about the docker pull rate limits and the open source software pro
| [Mugsy](https://hub.docker.com/r/staphb/mugsy) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/mugsy)](https://hub.docker.com/r/staphb/mugsy) | <ul><li>1r2.3</li></ul> | http://mugsy.sourceforge.net/ |
| [MultiQC](https://hub.docker.com/r/staphb/multiqc) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/multiqc)](https://hub.docker.com/r/staphb/multiqc) | <ul><li>1.7</li><li>1.8</li></ul> | https://github.com/ewels/MultiQC |
| [Mummer](https://hub.docker.com/r/staphb/mummer) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/mummer)](https://hub.docker.com/r/staphb/mummer) | <ul><li>4.0.0</li><li>4.0.0 + RGDv2</li><li>4.0.0 + RGDv2 + gnuplot</li></ul> | https://github.com/mummer4/mummer |
| [Mykrobe + Genotyphi + sonneityping](https://hub.docker.com/r/staphb/mykrobe) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/mykrobe)](https://hub.docker.com/r/staphb/mykrobe) | <ul><li>0.11.0 (Mykrobe) & 1.9.1 (Genotyphi) </li><li>0.12.1 (Mykrobe) & 1.9.1 (Genotyphi) & v20210201 (sonneityping) </li><li>0.12.1 (Mykrobe) & 2.0 (Genotyphi) & v20210201 (sonneityping) </li></ul> | https://github.com/Mykrobe-tools/mykrobe <br/> https://github.com/typhoidgenomics/genotyphi <br/> https://github.com/katholt/sonneityping |
| [Mykrobe + Genotyphi + sonneityping](https://hub.docker.com/r/staphb/mykrobe) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/mykrobe)](https://hub.docker.com/r/staphb/mykrobe) | <ul><li>0.11.0 (Mykrobe) & 1.9.1 (Genotyphi) </li><li>0.12.1 (Mykrobe) & 1.9.1 (Genotyphi) & v20210201 (sonneityping) </li><li>0.12.1 (Mykrobe) & 2.0 (Genotyphi) & v20210201 (sonneityping) </li><li>0.12.2 (Mykrobe) & 2.0 (Genotyphi) & v20210201 (sonneityping) </li></ul> | https://github.com/Mykrobe-tools/mykrobe <br/> https://github.com/typhoidgenomics/genotyphi <br/> https://github.com/katholt/sonneityping |
| [NanoPlot](https://hub.docker.com/r/staphb/nanoplot) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/nanoplot)](https://hub.docker.com/r/staphb/nanoplot) | <ul><li>1.27.0</li><li>1.29.0</li><li>1.30.1</li><li>1.32.0</li><li>1.33.0</li><li>1.40.0</li></ul> | https://github.com/wdecoster/NanoPlot |
| [ngmaster](https://hub.docker.com/r/staphb/ngmaster) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/ngmaster)](https://hub.docker.com/r/staphb/ngmaster) | <ul><li>0.5.8</li><li>1.0.0</li></ul> | https://github.com/MDU-PHL/ngmaster |
| [NCBI Datasets](https://hub.docker.com/r/staphb/ncbi-datasets) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/ncbi-datasets)](https://hub.docker.com/r/staphb/ncbi-datasets) | <details><summary> Click to see all datasets versions </summary> **datasets versions** <ul><li>13.31.0</li><li>13.35.0</li><li>13.43.2</li><li>14.0.0</li><li>14.3.0</li><li>14.7.0</li><li>14.13.2</li><li>14.20.0</li><li>[14.27.0](ncbi-datasets/14.27.0/)</li><li>[15.1.0](ncbi-datasets/15.1.0/)</li><li>[15.2.0](ncbi-datasets/15.2.0/)</li><li>[15.11.0](ncbi-datasets/15.11.0/)</li></ul> | [https://github.com/ncbi/datasets](https://github.com/ncbi/datasets) <br/>[https://www.ncbi.nlm.nih.gov/datasets/docs/v1/](https://www.ncbi.nlm.nih.gov/datasets/docs/v1/) |
62 changes: 62 additions & 0 deletions dnaapler/0.2.0/Dockerfile
@@ -0,0 +1,62 @@
FROM mambaorg/micromamba:1.4.1 as app

USER root

WORKDIR /

ARG DNAAPLER_VER="0.2.0"

# metadata labels
LABEL base.image="mambaorg/micromamba:1.4.1"
LABEL dockerfile.version="1"
LABEL software="dnaapler"
LABEL software.version="${DNAAPLER_VER}"
LABEL description="Rotates chromosomes and more"
LABEL website="https://github.com/gbouras13/dnaapler"
LABEL license="MIT"
LABEL license.url="https://github.com/gbouras13/dnaapler/blob/main/LICENSE"
LABEL maintainer="Erin Young"
LABEL maintainer.email="[email protected]"

# install dependencies; cleanup apt garbage
RUN apt-get update && apt-get install -y --no-install-recommends \
    wget \
    ca-certificates \
    procps && \
    apt-get autoclean && rm -rf /var/lib/apt/lists/*

# create the conda environment, install dnaapler via bioconda package; cleanup conda garbage
RUN micromamba create -n dnaapler -y -c bioconda -c defaults -c conda-forge dnaapler=${DNAAPLER_VER} && \
    micromamba clean -a -y

# set the PATH and LC_ALL for singularity compatibility
ENV PATH="/opt/conda/envs/dnaapler/bin/:${PATH}" \
LC_ALL=C.UTF-8

# so that mamba/conda env is active when running below commands
ENV ENV_NAME="dnaapler"
ARG MAMBA_DOCKERFILE_ACTIVATE=1

# set final working directory as /data
WORKDIR /data

# default command is to print help options
CMD [ "dnaapler", "--help" ]

# new base for testing
FROM app as test

# set working directory to /test
WORKDIR /test

# so that mamba/conda env is active when running below commands
ENV ENV_NAME="dnaapler"
ARG MAMBA_DOCKERFILE_ACTIVATE=1

# download a genome assembly, then extract plasmid CP104365.1 with grep -A;
# this shortcut only works because the plasmid is the last record in the file
RUN wget https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/025/259/185/GCA_025259185.1_ASM2525918v1/GCA_025259185.1_ASM2525918v1_genomic.fna.gz && \
    gunzip GCA_025259185.1_ASM2525918v1_genomic.fna.gz && \
    grep "CP104365.1" GCA_025259185.1_ASM2525918v1_genomic.fna -A 50000 > CP104365.1.fasta && \
    dnaapler mystery --prefix mystery_test --output mystery_test -i CP104365.1.fasta && \
    dnaapler plasmid --prefix plasmid_test --output plasmid_test -i CP104365.1.fasta && \
    ls mystery_test plasmid_test
41 changes: 41 additions & 0 deletions dnaapler/0.2.0/README.md
@@ -0,0 +1,41 @@
# dnaapler container

Main tool: [dnaapler](https://github.com/gbouras13/dnaapler)

Additional tools:

- [blast](https://blast.ncbi.nlm.nih.gov/Blast.cgi) 2.14.0

Full documentation: [https://github.com/gbouras13/dnaapler](https://github.com/gbouras13/dnaapler)

> `dnaapler` is a simple python program that takes a single nucleotide input sequence (in FASTA format), finds the desired start gene using blastx against an amino acid sequence database, checks that the start codon of this gene is found, and if so, then reorients the chromosome to begin with this gene on the forward strand.

dnaapler has several commands for chromosomes, plasmids, and more.

```
Usage: dnaapler [OPTIONS] COMMAND [ARGS]...

Options:
  -h, --help     Show this message and exit.
  -V, --version  Show the version and exit.

Commands:
  chromosome  Reorients your sequence to begin with the dnaA chromosomal...
  citation    Print the citation(s) for this tool
  custom      Reorients your sequence with a custom database
  mystery     Reorients your sequence with a random gene
  phage       Reorients your sequence to begin with the terL large...
  plasmid     Reorients your sequence to begin with the repA replication...
```

WARNING: Does not support multifasta files. Each sequence must be processed individually.
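
Because of this limitation, a multi-record FASTA has to be split into single-record files before running dnaapler. The following is a minimal sketch of one way to do that with `awk`. The function name `split_multifasta`, the output directory, and the file naming (first word of each header) are assumptions made for illustration and are not part of dnaapler.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of dnaapler): write each record of a
# multi-FASTA file to its own single-record FASTA file.
# Usage: split_multifasta <input.fasta> <output_dir>
# Note: records that share the same ID would overwrite each other.
split_multifasta() {
  local input="$1" outdir="$2"
  mkdir -p "$outdir"
  awk -v outdir="$outdir" '
    /^>/ {
      # close the previous record file before opening the next one
      if (out != "") close(out)
      # name the file after the first word of the header, without ">"
      id = substr($1, 2)
      # replace characters that are awkward in file names
      gsub(/[^A-Za-z0-9._-]/, "_", id)
      out = outdir "/" id ".fasta"
    }
    # write the header and sequence lines to the current record file
    out != "" { print > out }
  ' "$input"
}

# Example (assumes dnaapler is on PATH, e.g. inside the container, and
# that the plasmid subcommand is the right choice for every record):
#   split_multifasta assembly.fasta split
#   for f in split/*.fasta; do
#     dnaapler plasmid --input "$f" --output "dnaapler_$(basename "$f" .fasta)"
#   done
```

Each resulting file can then be passed to the subcommand that suits that record, for example `chromosome` for the chromosome and `plasmid` for plasmids.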

## Example Usage

```bash
# for a fasta of a chromosome sequence
dnaapler chromosome --input chromosome.fasta --output dnaapler_chr

# for a fasta of a plasmid sequence
dnaapler plasmid --input plasmid.fasta --output dnaapler_plasmid
```
127 changes: 127 additions & 0 deletions mykrobe/0.12.2/Dockerfile
@@ -0,0 +1,127 @@
FROM mambaorg/micromamba:0.27.0 as app

# build and run as the root user since the micromamba image has 'mambauser' set as the $USER
USER root
# set workdir to default for building; set to /data at the end
WORKDIR /

# ARG variables only persist during build time
ARG MYKROBE_VER="0.12.2"
ARG SONNEITYPING_VER="20210201"
# see below for why we aren't using this. Keeping as a comment for when we can switch to versioned releases
#ARG GENOTYPHI_VER="1.9.1"

# metadata labels
LABEL base.image="mambaorg/micromamba:0.27.0"
LABEL dockerfile.version="1"
LABEL software="Mykrobe & Genotyphi & Sonneityping"
LABEL software.version=${MYKROBE_VER}
LABEL description="Conda environment for Mykrobe, particularly for Genotyphi"
LABEL website1="https://github.com/Mykrobe-tools/mykrobe"
LABEL license1="MIT"
LABEL license1.url="https://github.com/Mykrobe-tools/mykrobe/blob/master/LICENSE"
LABEL website2="https://github.com/katholt/genotyphi"
LABEL license2="GNU General Public License v3.0"
LABEL license2.url="https://github.com/katholt/genotyphi/blob/main/LICENSE"
LABEL website3="https://github.com/katholt/sonneityping/"
LABEL license3="GNU General Public License v3.0"
LABEL license3.url="https://github.com/katholt/sonneityping/blob/master/LICENSE.txt"
LABEL maintainer1="Curtis Kapsak"
LABEL maintainer1.email="[email protected]"

# install dependencies; cleanup apt garbage
RUN apt-get update && apt-get install -y --no-install-recommends \
    wget \
    ca-certificates \
    git \
    procps \
    jq && \
    apt-get autoclean && rm -rf /var/lib/apt/lists/*

# get the genotyphi code; make /data
# cloning this commit: 98a6e9ccdf069bb86fcf41035b8c5fa92952aa9e
# url: https://github.com/katholt/genotyphi/commit/98a6e9ccdf069bb86fcf41035b8c5fa92952aa9e
# because genotyphi v1.9.1 does NOT include parse_typhi_mykrobe.py script for parsing mykrobe results
RUN git clone https://github.com/katholt/genotyphi.git && \
    cd genotyphi && \
    git checkout 98a6e9ccdf069bb86fcf41035b8c5fa92952aa9e && \
    chmod +x /genotyphi/parse_typhi_mykrobe.py && \
    mkdir -v /data

# Get the sonneityping code
RUN wget https://github.com/katholt/sonneityping/archive/refs/tags/v${SONNEITYPING_VER}.tar.gz && \
    tar -xzf v${SONNEITYPING_VER}.tar.gz && \
    rm -vf v${SONNEITYPING_VER}.tar.gz && \
    mv -v sonneityping-${SONNEITYPING_VER}/ /sonneityping/ && \
    chmod +x /sonneityping/parse_mykrobe_predict.py

# set the PATH and LC_ALL for singularity compatibility
ENV PATH="${PATH}:/opt/conda/envs/mykrobe/bin/:/genotyphi:/sonneityping" \
LC_ALL=C.UTF-8

# create the conda environment, install mykrobe via bioconda package; cleanup conda garbage
# pandas is installed into the same environment here
RUN micromamba create -n mykrobe -y -c conda-forge -c bioconda -c defaults \
    mykrobe=${MYKROBE_VER} \
    python \
    pip \
    pandas && \
    micromamba clean -a -y

# so that mamba/conda env is active when running below commands
ENV ENV_NAME="mykrobe"
ARG MAMBA_DOCKERFILE_ACTIVATE=1

# get the latest databases (AKA "panels")
RUN mykrobe panels update_metadata && \
    mykrobe panels update_species all && \
    mykrobe panels describe

WORKDIR /data

# new base for testing
FROM app as test

# so that mamba/conda env is active when running below commands
ENV ENV_NAME="mykrobe"
ARG MAMBA_DOCKERFILE_ACTIVATE=1

# test with TB FASTQs
RUN wget -O test_reads.fq.gz https://ndownloader.figshare.com/files/21059229 && \
    mykrobe predict -s SAMPLE -S tb -o out.json --format json -i test_reads.fq.gz && \
    cat out.json && \
    mykrobe panels describe && \
    mykrobe --version

### OUTPUT FROM mykrobe panels describe run on 2022-11-01: ###
# Species summary:

# Species  Update_available  Installed_version  Installed_url                                    Latest_version  Latest_url
# sonnei   no                20210201           https://ndownloader.figshare.com/files/26274424  20210201        https://ndownloader.figshare.com/files/26274424
# staph    no                20201001           https://ndownloader.figshare.com/files/24914930  20201001        https://ndownloader.figshare.com/files/24914930
# tb       no                20201014           https://ndownloader.figshare.com/files/25103438  20201014        https://ndownloader.figshare.com/files/25103438
# typhi    no                20210323           https://ndownloader.figshare.com/files/28533549  20210323        https://ndownloader.figshare.com/files/28533549

# sonnei default panel: 20210201
# sonnei panels:
# Panel Reference Description
# 20201012 NC_016822.1 Genotyping panel for Shigella sonnei based on scheme defined in Hawkey 2020, and panel for variants in the quinolone resistance determining regions in gyrA and parC
# 20210201 NC_016822.1 Genotyping panel for Shigella sonnei based on scheme defined in Hawkey 2020, and panel for variants in the quinolone resistance determining regions in gyrA and parC (same as 20201012, but with lineage3.7.30 added)

# staph default panel: 20170217
# staph panels:
# Panel Reference Description
# 20170217 BX571856.1 AMR panel described in Bradley, P et al. Rapid antibiotic-resistance predictions from genome sequence data for Staphylococcus aureus and Mycobacterium tuberculosis. Nat. Commun. 6:10063 doi: 10.1038/ncomms10063 (2015)

# tb default panel: 202010
# tb panels:
# Panel Reference Description
# 201901 NC_000962.3 AMR panel based on first line drugs from NEJM-2018 variants (DOI 10.1056/NEJMoa1800474), and second line drugs from Walker 2015 panel
# 202010 NC_000962.3 AMR panel based on first line drugs from NEJM-2018 variants (DOI 10.1056/NEJMoa1800474), second line drugs from Walker 2015 panel, and lineage scheme from Chiner-Oms 2020
# bradley-2015 NC_000962.3 AMR panel described in Bradley, P et al. Rapid antibiotic-resistance predictions from genome sequence data for Staphylococcus aureus and Mycobacterium tuberculosis. Nat. Commun. 6:10063 doi: 10.1038/ncomms10063 (2015)
# walker-2015 NC_000962.3 AMR panel described in Walker, Timothy M et al. Whole-genome sequencing for prediction of Mycobacterium tuberculosis drug susceptibility and resistance: a retrospective cohort study. The Lancet Infectious Diseases , Volume 15 , Issue 10 , 1193 - 1202

# typhi default panel: 20210323
# typhi panels:
# Panel Reference Description
# 20210323 AL513382.1 GenoTyphi genotyping scheme and AMR calling using Wong et al 2016 (https://doi.org/10.1038/ncomms12827) and updates as described in Dyson & Holt 2021 (https://doi.org/10.1101/2021.04.28.441766)