Instructions: Using Singularity on the HPC
Singularity is container software designed specifically for clusters. Application containers allow us to package software into a portable, shareable image. The ability to create a static image with all of the dependencies necessary for a given package or workflow allows us to control the environment in which we test, debug, and execute our code. For scientists, this is extremely useful.
Consider an experiment acquiring neuroimaging data over a long period of time. Given the amount of time it takes to process these data (e.g., Freesurfer alone normally takes ~12 hours), it only makes sense to process new subjects as they are acquired. However, even on HPCs, software packages are likely to be updated more than once within the lifespan of the project. This is a problem: changes to the software will necessarily introduce time-related confounds into the processed data. Two common solutions are to either 1) process all of the data after it has been acquired or 2) use project-specific environments on the HPC to pin the versions of individual software packages used in processing workflows. The former approach is inefficient (although it does prevent data peeking) in that it may cause substantial delays in analyzing the data after acquisition is complete, while the latter is not exactly secure, as changes on the HPC or unsupervised changes to the environment by lab members can affect results without users' knowledge. Container software like Singularity addresses the weaknesses in both of these approaches.
BIDS Apps are processing and analysis pipelines for neuroimaging data specifically designed to work on datasets organized in BIDS format. These pipelines are able to run on any datasets organized according to this convention (assuming they contain the requisite data, of course). Combined with application container software like Docker or Singularity, this means that the same pipeline will return the same results on the same dataset, no matter where or when you run it!
Moreover, because the majority of these pipelines have been developed by methodologists and have been evaluated in associated publications (e.g., Esteban et al., 2017; Craddock et al., 2013), they are likely to be of higher quality and better validated than pipelines developed in-lab (typically based on some in-lab dataset). Using independently-developed pipelines also reduces the ability and incentive of researchers to leverage the analytic flexibility inherent to neuroimaging data in order to p-hack (whether intentionally or not) their pipelines to produce the most appealing results in their data.
-
Install Docker on a local machine. While Singularity images can be built without Docker, there are already several tools designed specifically to work with neuroimaging software in Docker images.
- Instructions for installing Docker.
- NOTE: Only one lab member needs to install Docker.
-
Build a Docker image for your workflow or environment of interest. Pre-existing workflows (e.g., BIDS Apps) are often released on Docker Hub. Alternatively, if all you have is a GitHub repository associated with the workflow that contains a Dockerfile, you can build the Docker image from that. Finally, if you are interested in designing a Docker image with specific versions of different neuroimaging software packages, Neurodocker makes that process very easy.
- To build a Docker image for a BIDS App:
- BIDS Apps are released on Docker Hub. You can locate the necessary repository by searching directly on Docker Hub, by following a link typically provided in the BIDS App's GitHub repository's README, or by following a link on the BIDS-Apps website.
- Docker images hosted on Docker Hub can be downloaded with a single command
docker pull poldracklab/mriqc:0.10.4
- This is just an example command, which pulls the 0.10.4 release of the poldracklab/mriqc BIDS App to the local machine.
- To build a Docker image for a Neurodocker-built environment:
- First, create a Dockerfile with Neurodocker
docker run --rm kaczmarj/neurodocker:v0.3.2 generate -b centos:7 -p yum \
--afni version=latest install_r=True install_python3=True > /Users/tsalo/afnir_docker/Dockerfile
- This is just an example command. It uses the v0.3.2 release of Neurodocker (kaczmarj/neurodocker) to generate a Dockerfile for an image that will use CentOS 7 as its OS, with yum as its package installer, along with AFNI's latest version including R (and AFNI's R dependencies) and Python 3. Neurodocker's GitHub repository's README has more complete documentation.
- Next, build a Docker image using the Dockerfile.
docker build /Users/tsalo/afnir_docker/ -t afnir:v0.0.1
- This builds a Docker image on the local machine with the tag afnir:v0.0.1. This step can be used for any given Dockerfile, not just one written with Neurodocker.
-
Build a Singularity image from the Docker image. You do not need to install Singularity locally to do this. In fact, there is a nice Docker image built specifically for this step: singularityware/docker2singularity.
- To build the Singularity image, you must call docker2singularity using the image name and tag
docker run -v /var/run/docker.sock:/var/run/docker.sock \
-v /output/directory:/output \
--privileged -t --rm singularityware/docker2singularity \
-m "/scratch" poldracklab/mriqc:0.10.4
- This command builds a Singularity image file (an actual .img file) saved to the output directory (in this case, /output/directory) for the Docker image poldracklab/mriqc:0.10.4. The most important thing to note here is that you need to specifically mount the volume /scratch in the Singularity image. This is a step necessary specifically for the FIU HPC. Singularity images only have access to specific directories, and without this step, your image will only be able to access your home directory on the server. The Singularity configuration on the FIU HPC explicitly allows users to bind scratch, but no other shared directories (e.g., /home/data).
- Copy the Singularity image over to the FIU HPC. The image will probably be ~10-15GB, so allow some time for the image to be copied over.
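As a rough sketch of the transfer, something like the following could be used. The image filename, login, host name, and destination directory are all hypothetical placeholders (docker2singularity chooses the actual filename, and the real host is whatever your HPC account uses). rsync is used here instead of scp only because it can resume an interrupted copy, which helps with files of this size.

```shell
# Hypothetical placeholders: substitute your own image path, login, and host.
IMAGE="/output/directory/poldracklab_mriqc_0.10.4.img"
REMOTE="username@hpc.example.edu"
DEST="/home/username/singularity_images/"

# RUN defaults to "echo", so the command is only printed (a dry run).
# Set RUN="" in the environment to perform the actual transfer.
RUN="${RUN-echo}"

# -a preserves file attributes, -v is verbose, --partial keeps partially
# transferred files so an interrupted copy can resume, --progress shows status.
$RUN rsync -av --partial --progress "$IMAGE" "$REMOTE:$DEST"
```

Running the snippet as-is only prints the rsync command, which makes it easy to check the paths before committing to a long copy.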
-
Change the Singularity image permissions. The image will be created with rw-r--r-- permissions by default, but in order to use it, you will need to give it at least -wx-wx--- permissions (x for you and fellow lab members to run the image, and w for you all to delete the image if necessary).
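One possible way to make this change is sketched below. It assumes the image starts with the default rw-r--r-- permissions and that fellow lab members share the file's group. IMG is a hypothetical variable: if it is not set, the snippet falls back to a temporary stand-in file so the commands can be tried safely first.

```shell
# IMG is a hypothetical path; a temporary stand-in file is used when it is
# not set, so the commands can be tried before pointing IMG at a real image.
IMG="${IMG:-$(mktemp)}"

chmod 644 "$IMG"      # rw-r--r--: the default permissions described above
chmod ug+wx "$IMG"    # add write and execute for the owner and the group
ls -l "$IMG"          # the mode column should now start with -rwxrwxr--
```

Adding permissions with ug+wx, rather than setting an absolute mode, leaves the existing read bits untouched, so the result is a superset of the minimum described above.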
-
Copy your data to /scratch. Your Singularity image can only access /scratch and your home directory.
-
Write a sub file for your job.
- An example sub file for processing data with a BIDS App.
- An example sub file for using a Singularity image as an environment. Not yet figured out, but more information available here.
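For orientation, here is a minimal sketch of what a sub file for a BIDS App might look like. It is not the lab's actual sub file. It assumes an LSF-style scheduler (the #BSUB directives and the bsub command), and the job name, log paths, core count, image path, dataset paths, and participant label are all hypothetical. The arguments after the image follow the usual BIDS App convention of input directory, output directory, and analysis level. The snippet writes the sub file to disk so it can be inspected before anything is submitted.

```shell
# Write a hypothetical sub file; every path and name in it is a placeholder.
cat > mriqc_job.sub <<'EOF'
#!/bin/bash
#BSUB -J mriqc_sub-01
#BSUB -o /scratch/username/logs/mriqc_sub-01.out
#BSUB -e /scratch/username/logs/mriqc_sub-01.err
#BSUB -n 4

# Run the BIDS App inside the image: <bids_dir> <output_dir> <analysis_level>.
# Both data directories live under /scratch so the image can access them.
singularity run /home/username/singularity_images/poldracklab_mriqc_0.10.4.img \
    /scratch/username/bids_dataset \
    /scratch/username/mriqc_output \
    participant --participant_label 01
EOF

# Submission is left to the user (assumes LSF): bsub < mriqc_job.sub
echo "Wrote mriqc_job.sub"
```

If your cluster uses a different scheduler, only the directive lines and the submission command should need to change; the singularity run line stays the same.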