Ryan Bartelme, PhD
If you want to test anvi'o on an HPC system, here are a few strategies:
Start by using singularity to pull the latest version of the anvi'o image from dockerhub:
singularity pull docker://meren/anvio
After the standard output from pulling the Docker image layers, Singularity will print out something like:
INFO: Creating SIF file...
And the *.sif file should appear in the directory:
$ ls
anvio_latest.sif
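Before going further, it can help to confirm that the image actually runs. The following is a minimal sketch, not a tested recipe: it assumes the image exposes anvi-self-test on its PATH and that the program accepts --version, and it falls back to printing the command when either Singularity or the image file is missing.

```shell
#!/usr/bin/env bash
# Sketch: confirm the pulled image exists, then run an anvi'o program inside it.
set -euo pipefail

# File name produced by the pull above; override with SIF=/path/to/image.sif
SIF="${SIF:-anvio_latest.sif}"

# Build the command as an array so paths with spaces survive quoting.
# Assumption: anvi-self-test is on the PATH inside the container.
cmd=(singularity exec "$SIF" anvi-self-test --version)

if [[ -f "$SIF" ]] && command -v singularity >/dev/null 2>&1; then
    "${cmd[@]}"
else
    # Dry run: show what would be executed when the image or Singularity is missing.
    echo "would run: ${cmd[*]}"
fi
```

On many clusters Singularity is only available after loading a module, so check your site's documentation if the command is not found.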
The latest docker image of anvi'o will NOT have the databases configured. This is also an opportune time to create your own customized docker image based on the meren/anvio:latest docker image tag.
See an example Dockerfile, which runs through the database configurations for anvi'o. (As of 03-25-21 this does not properly compile the 3D structure databases.)
In this case I used a Dockerfile that builds off the anvio-dbconfig image. The modifications include installing ncbi-genome-download with the pip of the anvio conda environment, and setting the entrypoint to the anvio conda environment for the docker runtime. Note that profile is included to make sure the container sources the .bashrc for the conda path.
Our local cluster singularity version:
singularity-ce version 3.8.0
Building from the Docker image above:
NOTE: This required sudo su on our local cluster, which I have access to. This has not been tested with --fakeroot yet.
sudo su
Singularity build statement, using the Singularity recipe:
singularity build anvio-pangenomics.sif Singularity
Get ownership of the Singularity *.sif file and set group permissions:
sudo chown rbartelme:iplant-everyone anvio-pangenomics.sif
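If you do not have root on your cluster, the --fakeroot route mentioned in the note above may be worth trying. This is an untested sketch under several assumptions: the recipe contents are hypothetical (the real Singularity recipe used above is not shown here), the file name Singularity.example is made up so it does not overwrite your own recipe, and your administrators must have enabled fakeroot for your account. By default the script only prints the build command.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal definition file and build it without sudo.
set -euo pipefail

# Hypothetical recipe: bootstrap from a Docker image.
# Replace the From: line with your own customized image.
cat > Singularity.example <<'EOF'
Bootstrap: docker
From: meren/anvio:latest
EOF

# --fakeroot builds as an unprivileged user; it only works if the cluster
# administrators have configured it for your account.
build_cmd=(singularity build --fakeroot anvio-pangenomics.sif Singularity.example)

# Opt in with RUN_BUILD=1; otherwise just show the command.
if [[ "${RUN_BUILD:-0}" == "1" ]]; then
    "${build_cmd[@]}"
else
    echo "dry run: ${build_cmd[*]}"
fi
```

A build done this way is already owned by your user, so the chown step should not be needed, though group permissions may still have to be set for shared use.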
Read up on job scheduling with your HPC's IT team documentation
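As one illustration of what a scheduled job might look like, here is a sketch of a batch script. It assumes your cluster uses SLURM, which may not be the case, and the job name, resource requests, and script name are placeholders to adapt. The anvi-self-test --suite mini call is only a stand-in for your real analysis. Outside of a SLURM job the script prints the command instead of running it.

```shell
#!/usr/bin/env bash
#SBATCH --job-name=anvio-test
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --time=02:00:00
#SBATCH --output=anvio-test.%j.log
# Sketch of a SLURM batch script; all resource values above are placeholders.
set -euo pipefail

# Image built in the previous steps; override with SIF=/path/to/image.sif
SIF="${SIF:-anvio-pangenomics.sif}"

# SLURM sets this when --cpus-per-task is requested; default to 1 elsewhere.
THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "CPUs available to this job: $THREADS"

# Stand-in command; swap in the anvi'o program you actually want to run.
job_cmd=(singularity exec "$SIF" anvi-self-test --suite mini)

if [[ -n "${SLURM_JOB_ID:-}" ]]; then
    "${job_cmd[@]}"
else
    # Not inside a SLURM job: show the command instead of running it.
    echo "dry run: ${job_cmd[*]}"
fi
```

Saved under a name of your choosing, for example anvio-test.sh, it would be submitted with sbatch anvio-test.sh. Other schedulers use different directives, so treat the #SBATCH lines as SLURM-specific.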
Anvi'o has awesome snakemake workflows built in! This is the "end-to-end" approach for all your HPC or cloud compute needs.
Example json input for Comparative Genomics Workflow: