adding cat_pack version 6.0.1
erinyoung committed Dec 19, 2024
1 parent 9684e88 commit a5aa4e7
Showing 4 changed files with 172 additions and 2 deletions.
2 changes: 1 addition & 1 deletion Program_Licenses.md
@@ -27,7 +27,7 @@ The licenses of the open-source software that is contained in these Docker image
| BUSCO | MIT | https://gitlab.com/ezlab/busco/-/raw/master/LICENSE |
| BWA | GNU GPLv3 | https://github.com/lh3/bwa/blob/master/COPYING |
| Canu <br/> Racon <br/> Minimap2 | GNU GPLv3 (Canu), <br/> MIT (Racon), <br/> MIT (Minimap2) | https://github.com/marbl/canu/blob/master/README.license.GPL https://github.com/isovic/racon/blob/master/LICENSE https://github.com/lh3/minimap2/blob/master/LICENSE.txt |
-| CAT | MIT | https://github.com/MGXlab/CAT_pack?tab=MIT-1-ov-file#readme |
+| CAT | MIT | https://github.com/MGXlab/CAT_pack?tab=MIT-1-ov-file#readme and https://github.com/MGXlab/CAT_pack/blob/master/LICENSE.md |
| centroid | GitHub No License | https://github.com/stjacqrm/centroid |
| CDC-SPN | GitHub No License | https://github.com/BenJamesMetcalf/Spn_Scripts_Reference |
| cfsan-snp-pipeline | non-standard license see --> | https://github.com/CFSAN-Biostatistics/snp-pipeline/blob/master/LICENSE.txt |
2 changes: 1 addition & 1 deletion README.md
@@ -135,7 +135,7 @@ To learn more about the docker pull rate limits and the open source software pro
| [BWA](https://hub.docker.com/r/staphb/bwa) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/bwa)](https://hub.docker.com/r/staphb/bwa) | <ul><li>0.7.17</li><li>[0.7.18](./bwa/0.7.18/)</li></ul> | https://github.com/lh3/bwa |
| [Canu](https://hub.docker.com/r/staphb/canu) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/canu?)](https://hub.docker.com/r/staphb/canu)| <ul><li>2.0</li><li>2.1.1</li><li>2.2</li></ul> | https://canu.readthedocs.io/en/latest/ <BR/> https://github.com/marbl/canu |
| [Canu-Racon](https://hub.docker.com/r/staphb/canu-racon/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/canu-racon)](https://hub.docker.com/r/staphb/canu-racon) | <ul><li>1.7.1 (Canu), 1.3.1 (Racon), 2.13 (minimap2)</li><li>1.9 (Canu), 1.4.3 (Racon), 2.17 (minimap2)</li><li>1.9i (Canu), 1.4.3 (Racon), 2.17 (minimap2), (+racon_preprocess.py)</li><li>2.0 (Canu), 1.4.3 (Racon), 2.17 (minimap2)</li></ul> | https://canu.readthedocs.io/en/latest/ <br/> https://github.com/lbcb-sci/racon <br/> https://github.com/isovic/racon (ARCHIVED) <br/> https://lh3.github.io/minimap2/ |
-| [CAT](https://github.com/dutilh/CAT) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/cat)](https://hub.docker.com/r/staphb/cat) | <ul><li>[5.3](./cat/5.3)</li></ul> | https://github.com/dutilh/CAT |
+| [CAT](https://hub.docker.com/r/staphb/CAT) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/cat)](https://hub.docker.com/r/staphb/cat) | <ul><li>[5.3](./cat/5.3)</li><li>[6.0.1](./cat/6.0.1/)</li></ul> | https://github.com/dutilh/CAT / https://github.com/MGXlab/CAT_pack |
| [centroid](https://hub.docker.com/r/staphb/centroid/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/centroid)](https://hub.docker.com/r/staphb/centroid) | <ul><li>1.0.0</li></ul> | https://github.com/stjacqrm/centroid |
| [CDC-SPN](https://hub.docker.com/r/staphb/cdc-spn/) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/cdc-spn)](https://hub.docker.com/r/staphb/cdc-spn) | <ul><li>0.1 (no version)</li></ul> | https://github.com/BenJamesMetcalf/Spn_Scripts_Reference |
| [cfsan-snp-pipeline](https://hub.docker.com/r/staphb/cfsan-snp-pipeline) <br/> [![docker pulls](https://badgen.net/docker/pulls/staphb/cfsan-snp-pipeline)](https://hub.docker.com/r/staphb/cfsan-snp-pipeline) | <ul><li>2.0.2</li> <li>2.2.1</li> </ul> | https://github.com/CFSAN-Biostatistics/snp-pipeline |
126 changes: 126 additions & 0 deletions cat/6.0.1/Dockerfile
@@ -0,0 +1,126 @@
# Set global variables
ARG CAT_VER="6.0.1"
ARG SAMTOOLS_VER="1.21"
ARG BWA_VER="0.7.18"
ARG DIAMOND_VER="2.1.10"


FROM ubuntu:jammy AS builder

ARG SAMTOOLS_VER
ARG BWA_VER
ARG DIAMOND_VER

# install dependencies required for compiling samtools
RUN apt-get update && apt-get install --no-install-recommends -y \
    libncurses5-dev \
    libbz2-dev \
    liblzma-dev \
    libcurl4-gnutls-dev \
    zlib1g-dev \
    libssl-dev \
    libdeflate-dev \
    gcc \
    wget \
    make \
    perl \
    bzip2 \
    gnuplot \
    ca-certificates

# download, compile, and install samtools
RUN wget -q https://github.com/samtools/samtools/releases/download/${SAMTOOLS_VER}/samtools-${SAMTOOLS_VER}.tar.bz2 && \
    tar -xjf samtools-${SAMTOOLS_VER}.tar.bz2 && \
    cd samtools-${SAMTOOLS_VER} && \
    ./configure && \
    make && \
    make install


RUN wget -q https://github.com/lh3/bwa/archive/refs/tags/v${BWA_VER}.tar.gz && \
    tar -xvf v${BWA_VER}.tar.gz && \
    cd bwa-${BWA_VER} && \
    make && \
    mv bwa /usr/local/bin/

RUN wget -q https://github.com/bbuchfink/diamond/releases/download/v${DIAMOND_VER}/diamond-linux64.tar.gz && \
    tar -C /usr/local/bin -xvf diamond-linux64.tar.gz && \
    rm diamond-linux64.tar.gz


# Application Stage
FROM ubuntu:jammy AS app
ARG CAT_VER


LABEL base.image="ubuntu:jammy"
LABEL dockerfile.version="1"
LABEL software="CAT"
LABEL software.version=${CAT_VER}
LABEL description="CAT: a tool for taxonomic classification of contigs and metagenome-assembled genomes (MAGs)."
LABEL website="https://github.com/MGXlab/CAT_pack"
LABEL license.url="https://github.com/MGXlab/CAT_pack/blob/master/LICENSE.md"
LABEL maintainer="Taylor K. Paisie"
LABEL maintainer.email='[email protected]'

RUN apt-get update && apt-get install -y --no-install-recommends \
    wget \
    unzip \
    ca-certificates \
    python3 \
    python3-pip \
    prodigal && \
    apt-get autoclean && rm -rf /var/lib/apt/lists/*

COPY --from=builder /usr/local/bin/* /usr/local/bin/



RUN wget -q https://github.com/MGXlab/CAT_pack/archive/refs/tags/v${CAT_VER}.tar.gz && \
    tar -xvzf v${CAT_VER}.tar.gz && \
    chmod +x /CAT_pack-${CAT_VER}/CAT_pack/CAT_pack && \
    rm v${CAT_VER}.tar.gz


# Add CAT to PATH
ENV PATH="${PATH}:/CAT_pack-${CAT_VER}/CAT_pack"

CMD ["CAT_pack", "--help"]
WORKDIR /data

# Optional stage: Test data
FROM app AS test

ARG CAT_VER

WORKDIR /data/test

RUN CAT_pack --help && CAT_pack --version

RUN wget -nv --no-check-certificate \
    https://raw.githubusercontent.com/taylorpaisie/docker_containers/main/checkm2/1.0.2/burk_wgs.fa \
    -O burk_wgs_pos_ctrl.fa && \
    wget -nv --no-check-certificate \
    https://merenlab.org/data/refining-mags/files/GN02_MAG_IV_B_1-contigs.fa \
    -O GN02_MAG_IV_B_1-contigs.fa

# Prepare testing database
RUN mkdir -p db_tests && \
    gzip -d /CAT_pack-${CAT_VER}/tests/data/prepare/small.fa.gz && \
    CAT_pack prepare --db_fasta /CAT_pack-${CAT_VER}/tests/data/prepare/small.fa \
    --acc2tax /CAT_pack-${CAT_VER}/tests/data/prepare/prot2acc.txt \
    --names /CAT_pack-${CAT_VER}/tests/data/prepare/names.dmp \
    --nodes /CAT_pack-${CAT_VER}/tests/data/prepare/nodes.dmp \
    --db_dir db_tests/

# Running CAT on contigs
RUN CAT_pack contigs -c burk_wgs_pos_ctrl.fa \
    -d db_tests/db \
    -t db_tests/tax

# Running BAT on a set of MAGs
RUN CAT_pack bins -b GN02_MAG_IV_B_1-contigs.fa \
    -d db_tests/db \
    -t db_tests/tax

WORKDIR /data
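
Note on building: the Dockerfile above defines three stages (`builder`, `app`, `test`), so the image can be built with or without the bundled checks. The following is a minimal sketch that only prints the `docker build` commands rather than running them. The image name `cat` is a placeholder assumption, and the context path assumes the command is run from the repository root.

```shell
#!/usr/bin/env bash
# Sketch: print docker build commands for stages of cat/6.0.1/Dockerfile.
# Assumptions: IMAGE is a local placeholder name, not a published tag;
# CONTEXT is the Dockerfile directory relative to the repository root.
CAT_VER="6.0.1"
IMAGE="cat"
CONTEXT="cat/${CAT_VER}"

# Print the build command for one stage ("app" or "test").
build_cmd() {
  local target="$1"
  printf 'docker build --target %s -t %s:%s-%s %s\n' \
    "$target" "$IMAGE" "$CAT_VER" "$target" "$CONTEXT"
}

build_cmd app   # runtime image only
build_cmd test  # also runs the CAT_pack prepare/contigs/bins checks
```

Piping either printed line to `sh` runs the build. Building the `test` target should fail if any `RUN` check in that stage exits non-zero, which is what makes it usable as a smoke test.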
44 changes: 44 additions & 0 deletions cat/6.0.1/README.md
@@ -0,0 +1,44 @@
# CAT

Main tool: [CAT](https://github.com/MGXlab/CAT_pack)

Code repository: https://github.com/MGXlab/CAT_pack

Basic information on how to use this tool:
- executable: CAT_pack
- help: --help
- version: --version
- description: |
> Contig Annotation Tool (CAT) and Bin Annotation Tool (BAT) are pipelines for the taxonomic classification of long DNA sequences and metagenome assembled genomes (MAGs/bins) of both known and (highly) unknown microorganisms, as generated by contemporary metagenomics studies

Full documentation: https://github.com/MGXlab/CAT_pack


# Testing CAT:
```
# Download test data
wget -nv --no-check-certificate https://raw.githubusercontent.com/taylorpaisie/docker_containers/main/checkm2/1.0.2/burk_wgs.fa -O burk_wgs_pos_ctrl.fa
wget -nv --no-check-certificate https://merenlab.org/data/refining-mags/files/GN02_MAG_IV_B_1-contigs.fa -O GN02_MAG_IV_B_1-contigs.fa
# Prepare testing database
mkdir -p db_tests && \
  gzip -d /CAT_pack-6.0.1/tests/data/prepare/small.fa.gz && \
  CAT_pack prepare --db_fasta /CAT_pack-6.0.1/tests/data/prepare/small.fa \
  --acc2tax /CAT_pack-6.0.1/tests/data/prepare/prot2acc.txt \
  --names /CAT_pack-6.0.1/tests/data/prepare/names.dmp \
  --nodes /CAT_pack-6.0.1/tests/data/prepare/nodes.dmp \
  --db_dir db_tests/
# Use CAT and BAT for taxonomic classification of both test datasets
# Running CAT on contigs
CAT_pack contigs -c burk_wgs_pos_ctrl.fa \
-d db_tests/db \
-t db_tests/tax
# Running BAT on a set of MAGs
CAT_pack bins -b GN02_MAG_IV_B_1-contigs.fa \
-d db_tests/db \
-t db_tests/tax
```
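
The commands above are written to run inside the container. As a rough sketch of calling the same `CAT_pack contigs` step from the host, the helper below prints a `docker run` command. The image tag `staphb/cat:6.0.1` is an assumption based on the Docker Hub link in the repository README, and the file and database paths are placeholders. The mount target `/data` matches the `WORKDIR` set in the Dockerfile.

```shell
#!/usr/bin/env bash
# Sketch: print a docker run command that classifies contigs found in
# the current directory. IMAGE is an assumed tag, not confirmed by this commit.
IMAGE="staphb/cat:6.0.1"

# Arguments: contigs FASTA, database directory, taxonomy directory,
# all relative to the directory mounted at /data.
run_cmd() {
  local contigs="$1" db_dir="$2" tax_dir="$3"
  printf 'docker run --rm -v "$PWD":/data %s CAT_pack contigs -c %s -d %s -t %s\n' \
    "$IMAGE" "$contigs" "$db_dir" "$tax_dir"
}

run_cmd burk_wgs_pos_ctrl.fa db_tests/db db_tests/tax
```

The printed line keeps `"$PWD"` unexpanded, so it resolves to the working directory of whichever shell eventually runs it.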
