* add genoflu
* add readme
* add to license
* add --force-pkgs-dirs
* krakentools add
* update readmes
* silence warnings about AS casing; fix typo
* added comment in dockerfile & added additional dependencies and versions to readme for krakentools
* pinning biopython and pandas versions and removing unnecessary jq and gawk
* fixed version of python installed listed in krakentools readme
* add tbprofiler 6.4.0
* update readme
* add list_db
* exec format, capitalize AS
* update versions
* pinning delly version due to conflict resolution with usher 0.6.3 which is required by tb-profiler
* added delly to list of deps for tb-profiler. Plus link to GH issue describing troubles w delly
* add 6.4.1
* update readme
* correct versions of a few dependencies in tbprofiler readme
* Update Dockerfile
* Update README.md

Co-authored-by: Curtis Kapsak <[email protected]>
1 parent 97ac25d, commit 312443c
Showing 3 changed files with 127 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,66 @@ | ||
FROM mambaorg/micromamba:1.5.8 AS app | ||
|
||
USER root | ||
WORKDIR / | ||
|
||
ARG TBPROFILER_VER="6.4.1" | ||
|
||
# this is the full commit hash from the `master` branch of https://github.com/jodyphelan/tbdb/ | ||
# commits are found on https://github.com/jodyphelan/tbdb/commits/master | ||
# this was the latest commit as of 2024-10-31 | ||
ARG TBDB_COMMIT="2c92475219416a449e89782f2b768149d26f7979" | ||
|
||
# LABEL instructions tag the image with metadata that might be important to the user | ||
LABEL base.image="mambaorg/micromamba:1.5.8" | ||
LABEL dockerfile.version="1" | ||
LABEL software="tbprofiler" | ||
LABEL software.version="${TBPROFILER_VER}" | ||
LABEL description="The pipeline aligns reads to the H37Rv reference using bowtie2, BWA or minimap2 and then calls variants using bcftools. These variants are then compared to a drug-resistance database." | ||
LABEL website="https://github.com/jodyphelan/TBProfiler/" | ||
LABEL license="https://github.com/jodyphelan/TBProfiler/blob/master/LICENSE" | ||
LABEL maintainer="John Arnn" | ||
LABEL maintainer.email="[email protected]" | ||
LABEL maintainer2="Curtis Kapsak" | ||
LABEL maintainer2.email="[email protected]" | ||
LABEL maintainer3="Sage Wright" | ||
LABEL maintainer3.email="[email protected]" | ||
|
||
# Install dependencies via apt-get; cleanup apt garbage | ||
RUN apt-get update && apt-get install -y --no-install-recommends \ | ||
wget \ | ||
ca-certificates \ | ||
procps && \ | ||
apt-get autoclean && rm -rf /var/lib/apt/lists/* | ||
|
||
# install tb-profiler via bioconda; install into 'base' conda env | ||
RUN micromamba install --yes --name base --channel conda-forge --channel bioconda \ | ||
tb-profiler=${TBPROFILER_VER} && \ | ||
micromamba clean --all --yes -f && micromamba list | ||
|
||
# hardcode 'base' env bin into PATH, so conda env does not have to be "activated" at run time | ||
ENV PATH="/opt/conda/bin:${PATH}" | ||
|
||
# Version of database can be confirmed at /opt/conda/share/tbprofiler/tbdb.variables.json | ||
# can also run 'tb-profiler list_db' to find the same version info | ||
|
||
# https://github.com/jodyphelan/tbdb | ||
RUN tb-profiler update_tbdb --commit ${TBDB_COMMIT} && \ | ||
tb-profiler list_db | ||
|
||
WORKDIR /data | ||
|
||
# Default command: print the tb-profiler help menu when the container is run without arguments | ||
CMD ["tb-profiler"] | ||
|
||
# test stage | ||
FROM app AS test | ||
|
||
# checking if tool is in PATH | ||
RUN tb-profiler && tb-profiler version | ||
|
||
WORKDIR /tests | ||
|
||
# download some TB FASTQs and run through tb-profiler | ||
RUN wget -q ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR166/009/ERR1664619/ERR1664619_1.fastq.gz && \ | ||
wget -q ftp://ftp.sra.ebi.ac.uk/vol1/fastq/ERR166/009/ERR1664619/ERR1664619_2.fastq.gz && \ | ||
tb-profiler profile -1 ERR1664619_1.fastq.gz -2 ERR1664619_2.fastq.gz -t 2 -p ERR1664619 --txt |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
# TBProfiler Container | ||
|
||
Main tool: [TBProfiler](https://github.com/jodyphelan/TBProfiler) | ||
|
||
The pipeline aligns reads to the H37Rv reference using bowtie2, BWA or minimap2 and then calls variants using bcftools. These variants are then compared to a drug-resistance database. It also predicts the number of reads supporting drug resistance variants as an insight into hetero-resistance. | ||
|
||
## Database | ||
|
||
This tool relies on a database to run. The database version (i.e., the git commit hash) included in the docker image is `2c92475`, taken from the GitHub repository https://github.com/jodyphelan/tbdb. This can be confirmed in the JSON file `/opt/conda/share/tbprofiler/tbdb.variables.json`: | ||
|
||
```bash | ||
$ grep 'commit' /opt/conda/share/tbprofiler/tbdb.variables.json | ||
{"db-schema-version": "1.0.0", "snpEff_db": "Mycobacterium_tuberculosis_h37rv", "drugs": ["rifampicin", "isoniazid", "ethambutol", "pyrazinamide", "moxifloxacin", "levofloxacin", "bedaquiline", "delamanid", "pretomanid", "linezolid", "streptomycin", "amikacin", "kanamycin", "capreomycin", "clofazimine", "ethionamide", "para-aminosalicylic_acid", "cycloserine"], "tb-profiler-version": ">=6.0.0,<7.0.0", "version": {"name": "tbdb", "commit": "2c92475", "Merge": "8918884 2a51937", "Author": "Jody Phelan <[email protected]>", "Date": "Mon Oct 7 17:06:42 2024 +0100", "db-schema-version": "1.0.0"}, "amplicon": false, "files": {"ref": "tbdb.fasta", "gff": "tbdb.gff", "bed": "tbdb.bed", "json_db": "tbdb.dr.json", "variables": "tbdb.variables.json", "spoligotype_spacers": "tbdb.spoligotype_spacers.txt", "spoligotype_annotations": "tbdb.spoligotype_list.csv", "bedmask": "tbdb.mask.bed", "barcode": "tbdb.barcode.bed", "rules": "tbdb.rules.txt"}} | ||
``` | ||
|
||
Additionally, you can run `tb-profiler list_db` to list the same information: | ||
|
||
```bash | ||
$ tb-profiler list_db | ||
tbdb 2c92475 Jody Phelan <[email protected]> Mon Oct 7 17:06:42 2024 +0100 /opt/conda/share/tbprofiler/tbdb | ||
``` | ||
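
The two checks above can also be scripted. The following is a minimal sketch, not part of the image: it extracts the `commit` value from the variables file and checks that the full hash pinned in the Dockerfile (`TBDB_COMMIT`) begins with it. To stay runnable outside the container it writes a trimmed stand-in for `tbdb.variables.json` to a temporary file; the variable names `PINNED_COMMIT`, `VARS_JSON`, `db_commit`, and `result` are illustrative assumptions.

```bash
#!/bin/sh
# Sketch: confirm that the database commit recorded in tbdb.variables.json
# matches the commit pinned at build time.

# Full hash pinned via TBDB_COMMIT in the Dockerfile.
PINNED_COMMIT="2c92475219416a449e89782f2b768149d26f7979"

# Inside the container, point VARS_JSON at the real file instead:
#   /opt/conda/share/tbprofiler/tbdb.variables.json
# Here a trimmed stand-in is written to a temp file so the sketch runs anywhere.
VARS_JSON="$(mktemp)"
cat > "$VARS_JSON" <<'EOF'
{"db-schema-version": "1.0.0", "version": {"name": "tbdb", "commit": "2c92475", "Date": "Mon Oct 7 17:06:42 2024 +0100"}}
EOF

# Pull out the value of the "commit" key (first match only).
db_commit="$(grep -o '"commit": "[^"]*"' "$VARS_JSON" | head -n 1 | cut -d '"' -f 4)"
rm -f "$VARS_JSON"

# An empty value would trivially match any prefix, so fail early.
if [ -z "$db_commit" ]; then
  echo "commit not found in variables file" >&2
  exit 1
fi

# The JSON stores a short hash, so check that the pinned full hash starts with it.
case "$PINNED_COMMIT" in
  "$db_commit"*) result="match" ;;
  *)             result="mismatch" ;;
esac

echo "database commit: $db_commit ($result)"
```

A plain `grep`/`cut` pipeline is used here only because `jq` is not among the listed dependencies; a JSON-aware parser would be more robust if one is available.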
|
||
## Additional included tools/dependencies | ||
|
||
- bedtools 2.31.1 | ||
- gatk4 4.6.1.0 | ||
- kmc 3.2.4 | ||
- pathogen-profiler 4.5.1 | ||
- perl 5.32.1 | ||
- python 3.12.7 | ||
- trimmomatic 0.39 | ||
- bwa 0.7.18 | ||
- minimap2 2.28 | ||
- samtools 1.21 | ||
- bcftools 1.21 | ||
- freebayes 1.3.6 | ||
- tqdm 4.67.0 | ||
- parallel 20240922 | ||
- samclip 0.4.0 | ||
- snpeff 5.2 | ||
- delly 1.2.6 (the conda environment could not be resolved with the more recent version 1.3.1, and TBProfiler specifically pins v1.2.6. More info here: https://github.com/jodyphelan/TBProfiler/issues/393#issuecomment-2452076859) | ||
|
||
## Example Usage | ||
|
||
Run whole pipeline on Illumina paired-end reads: | ||
|
||
```bash | ||
tb-profiler profile -1 ERR1664619_1.fastq.gz -2 ERR1664619_2.fastq.gz -t 4 -p ERR1664619 --txt | ||
``` | ||
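
When running from the host rather than from a shell inside the container, the same command is typically wrapped in `docker run`. The sketch below only assembles and prints such a command so it can be inspected first. It assumes the FASTQ files sit in the current directory and mounts that directory at `/data`, the image's `WORKDIR`. The image name `tbprofiler:6.4.1` and the helper `build_cmd` are placeholders, not names taken from this repository.

```bash
#!/bin/sh
# Sketch: assemble a `docker run` command that profiles paired-end reads
# located in the current directory.

# ASSUMPTION: placeholder image name/tag; replace with the image you built or pulled.
IMAGE="tbprofiler:6.4.1"
SAMPLE="ERR1664619"
THREADS=4

# Print the command instead of running it.
# The host working directory is mounted at /data, the image's WORKDIR.
build_cmd() {
  printf 'docker run --rm -v "%s":/data %s tb-profiler profile -1 %s_1.fastq.gz -2 %s_2.fastq.gz -t %s -p %s --txt\n' \
    "$PWD" "$IMAGE" "$SAMPLE" "$SAMPLE" "$THREADS" "$SAMPLE"
}

cmd="$(build_cmd)"
echo "$cmd"
# To execute after reviewing: eval "$cmd"
```

Because the Dockerfile sets `USER root`, files written to the mounted directory will be owned by root on the host unless a different user is passed to `docker run`.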
|
||
Make alternative database: | ||
|
||
```bash | ||
tb-profiler create_db --prefix <new_library_name> | ||
tb-profiler load_library --prefix <new_library_name> | ||
``` | ||
|
||
## Updates | ||
|
||
Release 5.0.1 implemented sqlite3 database locking with [filelock](https://py-filelock.readthedocs.io/en/latest/index.html). This should fix issues when running TBProfiler over network file systems (NFS). For more information, see the [official documentation](https://jodyphelan.gitbook.io/tb-profiler/).