Script to configure and populate an alien cache #37

Open
ocaisa opened this issue Sep 14, 2020 · 9 comments
Labels: documentation (Improvements or additions to documentation)

Comments

ocaisa commented Sep 14, 2020

This is a (basic) script to create and populate a CVMFS alien cache, for use on systems whose compute nodes do not have access to the internet (internet access is still required for the initial run of the script).

# Set group (as required, useful if you would like to share the cache with others)
MYGROUP=$GROUPS

# Set user
MYUSER=$USER

# Set path to shared space
SHAREDSPACE="/path/to/shared/space"

# Set path to (node) local space to store a local alien cache (e.g., /tmp or /dev/shm)
# WARNING: This directory needs to exist on the nodes where you will mount the repositories,
#          or you will get a binding error from Singularity!
LOCALSPACE="/tmp"

# Choose the Singularity image to use
STACK="2020.12"
SINGULARITY_REMOTE="client-pilot:centos7-$(uname -m)"

#########################################################################
# Variables below this point can be changed (but they don't need to be) #
#########################################################################

SINGULARITY_IMAGE="$SHAREDSPACE/$MYGROUP/$MYUSER/${SINGULARITY_REMOTE/:/_}.sif"

# Set text colours for info on commands being run
YELLOW='\033[0;33m'
NC='\033[0m' # No Color

# Make the directory structures
SINGULARITY_CVMFS_ALIEN="$SHAREDSPACE/$MYGROUP/alien_$STACK"
mkdir -p $SINGULARITY_CVMFS_ALIEN

SINGULARITY_HOMEDIR="$SHAREDSPACE/$MYGROUP/$MYUSER/home"
mkdir -p $SINGULARITY_HOMEDIR

##################################################
# No more variable definitions beyond this point #
##################################################

# Pull the container
if [ ! -f $SINGULARITY_IMAGE ]; then
    echo -e "${YELLOW}\nPulling singularity image\n${NC}"
    singularity pull $SINGULARITY_IMAGE docker://eessi/$SINGULARITY_REMOTE
fi

# Create a default.local file in the user's home
# We use a tiered cache, with a shared alien cache and a local alien cache.
# We populate the shared alien cache and that is used to fill the local
# alien cache (which is usually in a space that gets cleaned up like /tmp or /dev/shm)
if [ ! -f $SINGULARITY_HOMEDIR/default.local ]; then
    echo -e "${YELLOW}\nCreating CVMFS configuration for shared and local alien caches\n${NC}"
    echo "# Custom settings" > $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_WORKSPACE=/var/lib/cvmfs" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_PRIMARY=hpc" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_hpc_TYPE=tiered" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_hpc_UPPER=local" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_hpc_LOWER=alien" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_hpc_LOWER_READONLY=yes" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_local_TYPE=posix" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_local_SHARED=no" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_local_QUOTA_LIMIT=-1" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_local_ALIEN=\"/local_alien\"" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_alien_TYPE=posix" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_alien_SHARED=no" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_alien_QUOTA_LIMIT=-1" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_alien_ALIEN=\"/shared_alien\"" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_HTTP_PROXY=\"INVALID-PROXY\"" >> $SINGULARITY_HOMEDIR/default.local
fi


# Environment variables
export EESSI_CONFIG="container:cvmfs2 cvmfs-config.eessi-hpc.org /cvmfs/cvmfs-config.eessi-hpc.org"
export EESSI_PILOT="container:cvmfs2 pilot.eessi-hpc.org /cvmfs/pilot.eessi-hpc.org"
export SINGULARITY_HOME="$SINGULARITY_HOMEDIR:/home/$MYUSER"
export SINGULARITY_SCRATCH="/var/lib/cvmfs,/var/run/cvmfs"
export SINGULARITY_BIND="$SINGULARITY_CVMFS_ALIEN:/shared_alien,$LOCALSPACE:/local_alien"

# Create a dirTab file so we only cache the stack we are interested in using
if [ ! -f $SINGULARITY_HOMEDIR/dirTab.$STACK ]; then
    # We will only use this workspace until we have built our dirTab file
    # (this is required because the Singularity scratch dirs are just 16MB,
    #  i.e., not enough to cache what we need to run the python script below.
    #  Once we have an alien cache this is no longer a concern)
    export SINGULARITY_WORKDIR=$(mktemp -d)

    platform=$(uname -m)
    if [[ $(uname -s) == 'Linux' ]]; then
        os_type='linux'
    else
        os_type='macos'
    fi

    # Find out which software directory we should be using (grep used to filter warnings)
    arch_dir=$(singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" $SINGULARITY_IMAGE /cvmfs/pilot.eessi-hpc.org/${STACK}/compat/${os_type}/${platform}/usr/bin/python3 /cvmfs/pilot.eessi-hpc.org/${STACK}/init/eessi_software_subdir_for_host.py /cvmfs/pilot.eessi-hpc.org/${STACK} | grep ${platform})

    # Construct our dirTab so that the alien cache is only populated with the software we require
    echo -e "${YELLOW}\nCreating CVMFS dirTab for $STACK alien cache\n${NC}"
    echo "/$STACK/init" > $SINGULARITY_HOMEDIR/dirTab.$STACK
    echo "/$STACK/tests" >> $SINGULARITY_HOMEDIR/dirTab.$STACK
    echo "/$STACK/compat/${os_type}/${platform}" >> $SINGULARITY_HOMEDIR/dirTab.$STACK
    echo "/$STACK/software/${arch_dir}" >> $SINGULARITY_HOMEDIR/dirTab.$STACK

    # Now clean up the workspace
    rm -r $SINGULARITY_WORKDIR
    unset SINGULARITY_WORKDIR   
fi

# Download the script for populating the alien cache
if [ ! -f $SINGULARITY_HOMEDIR/cvmfs_preload ]; then
    echo -e "${YELLOW}\nGetting CVMFS preload script\n${NC}"
    singularity exec $SINGULARITY_IMAGE curl https://cvmrepo.web.cern.ch/cvmrepo/preload/cvmfs_preload -o /home/$MYUSER/cvmfs_preload
fi

# Get the public keys for our repos
if [ ! -f $SINGULARITY_HOMEDIR/pilot.eessi-hpc.org.pub ]; then
    echo -e "${YELLOW}\nGetting CVMFS repositories public keys\n${NC}"

    export SINGULARITY_WORKDIR=$(mktemp -d)
    singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" $SINGULARITY_IMAGE cp /cvmfs/cvmfs-config.eessi-hpc.org/etc/cvmfs/keys/eessi-hpc.org/pilot.eessi-hpc.org.pub /home/$MYUSER/
    singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" $SINGULARITY_IMAGE cp /etc/cvmfs/keys/eessi-hpc.org/cvmfs-config.eessi-hpc.org.pub /home/$MYUSER/

    # Now clean up the workspace
    rm -r $SINGULARITY_WORKDIR
    unset SINGULARITY_WORKDIR   
fi

# Populate the alien cache (the connections to these can fail and may need to be restarted)
#  (A note here: this is an expensive operation and puts a heavy load on the Stratum 0. From the developers:
#   "With the -u <url> preload parameter, you can switch between stratum 0 and stratum 1 as necessary. I'd not
#    necessarily use the stratum 1 for the initial snapshot though because the replication thrashes the stratum
#    1 cache. Instead, for preloading I'd recommend to establish a dedicated URL. This URL can initially be simply
#    an alias to the stratum 0. As long as there are only a handful of preload destinations, that should work fine. If
#    more sites preload, this URL can turn into a dedicated stratum 1 or a large cache in front of the stratum 0.
#   ")
echo -e "${YELLOW}\nPopulating CVMFS alien cache\n${NC}"
singularity exec $SINGULARITY_IMAGE sh /home/$MYUSER/cvmfs_preload -u http://cvmfs-s0.eessi-hpc.org/cvmfs/cvmfs-config.eessi-hpc.org  -r /shared_alien -k /home/$MYUSER/cvmfs-config.eessi-hpc.org.pub 
# We use the dirTab file for the software repo to limit what we pull in
singularity exec $SINGULARITY_IMAGE sh /home/$MYUSER/cvmfs_preload -u http://cvmfs-s0.eessi-hpc.org/cvmfs/pilot.eessi-hpc.org  -r /shared_alien -k /home/$MYUSER/pilot.eessi-hpc.org.pub -d /home/$MYUSER/dirTab.$STACK 

# Now that we have a populated alien cache we can use it
export SINGULARITY_BIND="$SINGULARITY_CVMFS_ALIEN:/shared_alien,$LOCALSPACE:/local_alien,$SINGULARITY_HOMEDIR/default.local:/etc/cvmfs/default.local"

# Get a shell
echo -e "${YELLOW}\nTo get a shell inside a singularity container (for example), use:\n${NC}"
echo -e "  export EESSI_CONFIG=\"$EESSI_CONFIG\""
echo -e "  export EESSI_PILOT=\"$EESSI_PILOT\""
echo -e "  export SINGULARITY_HOME=\"$SINGULARITY_HOME\""
echo -e "  export SINGULARITY_BIND=\"$SINGULARITY_BIND\""
echo -e "  export SINGULARITY_SCRATCH=\"/var/lib/cvmfs,/var/run/cvmfs\""
echo -e "  singularity shell --fusemount \"\$EESSI_CONFIG\" --fusemount \"\$EESSI_PILOT\" $SINGULARITY_IMAGE"
ocaisa commented Sep 14, 2020

It is a basic configuration but works for backends not connected to the internet. A more complex configuration is documented at https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#example

ocaisa commented Sep 14, 2020

Note that populating the cache is a bit flaky so you may need to run it a few times before things are fully populated.
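
If it helps, here is a minimal sketch of a retry wrapper around the preload calls from the script above (same variables assumed; the retry count and sleep interval are arbitrary):

# Retry a command a few times, since the cvmfs_preload connections can drop
preload_with_retry () {
    local attempt
    for attempt in 1 2 3; do
        "$@" && return 0
        echo "Preload attempt $attempt failed, retrying in 10s..."
        sleep 10
    done
    return 1
}

preload_with_retry singularity exec $SINGULARITY_IMAGE sh /home/$MYUSER/cvmfs_preload \
    -u http://cvmfs-s0.eessi-hpc.org/cvmfs/pilot.eessi-hpc.org -r /shared_alien \
    -k /home/$MYUSER/pilot.eessi-hpc.org.pub -d /home/$MYUSER/dirTab.$STACK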

ocaisa commented Sep 14, 2020

Also note that we can easily restrict what is cached using a dirtab file (see https://cvmfs.readthedocs.io/en/stable/cpt-hpc.html#preloading-the-cernvm-fs-cache), so we can limit ourselves to specific releases/architectures (or apply even more fine-grained control).
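
For illustration, the dirTab generated by the script above for the 2020.12 stack on a Linux x86_64 (haswell) host would look something like this (the compat and software lines depend on the host):

/2020.12/init
/2020.12/tests
/2020.12/compat/linux/x86_64
/2020.12/software/x86_64/intel/haswell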

rptaylor commented

Public keys should be served via HTTPS (for production use anyway).
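
For example, the two lines in the script that copy the keys out of the config repository could then be replaced with something along these lines (the URL here is purely hypothetical, just to illustrate the idea; substitute wherever the keys are actually published over HTTPS):

# Hypothetical HTTPS location for the public keys -- adjust to the real one
curl -fsSL https://example.org/eessi-keys/pilot.eessi-hpc.org.pub \
    -o $SINGULARITY_HOMEDIR/pilot.eessi-hpc.org.pub
curl -fsSL https://example.org/eessi-keys/cvmfs-config.eessi-hpc.org.pub \
    -o $SINGULARITY_HOMEDIR/cvmfs-config.eessi-hpc.org.pub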

ocaisa commented Sep 22, 2020

FYI, this script allows running MPI applications on workers that are not connected to the internet:

[ocais1@juwels01 ~]$ SLURM_MPI_TYPE=pspmix OMP_NUM_THREADS=2 srun --time=00:05:00 --nodes=1 --ntasks-per-node=24 --cpus-per-task=2 singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" ~/client-pilot_centos7-2020.08.sif /cvmfs/pilot.eessi-hpc.org/2020.12/software/x86_64/intel/haswell/software/GROMACS/2020.1-foss-2020a-Python-3.8.2/bin/gmx_mpi mdrun -s ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 10 -g logfile

ocaisa commented Oct 5, 2020

Script updated to allow us to query what arch directory we should be pre-populating our alien cache with (rather than pulling in every arch)
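
For reference, the query boils down to running eessi_software_subdir_for_host.py from the compat layer inside the container, as done in the script above (a sketch reusing the script's variables; the example output is only illustrative):

platform=$(uname -m)
arch_dir=$(singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" $SINGULARITY_IMAGE \
    /cvmfs/pilot.eessi-hpc.org/${STACK}/compat/linux/${platform}/usr/bin/python3 \
    /cvmfs/pilot.eessi-hpc.org/${STACK}/init/eessi_software_subdir_for_host.py \
    /cvmfs/pilot.eessi-hpc.org/${STACK} | grep ${platform})
echo $arch_dir    # e.g. x86_64/intel/haswell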

@bedroge added the documentation label Jan 8, 2021
casparvl commented Jan 8, 2021

Just to also log this somewhere: your script was aimed at using an alien cache on a system where the batch nodes don't have internet access. Another use case for the alien cache is when you want to use the EESSI container for multi-process (MPI) runs.

E.g., simply bind-mounting a local dir and then starting a parallel run:

...
export SINGULARITY_BIND="/tmp/$USER/var-run-cvmfs:/var/run/cvmfs,/tmp/$USER/var-lib-cvmfs:/var/lib/cvmfs"
srun singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" docker://eessi/client-pilot:centos7-$(uname -m)-2020.12 <some_mpi_program>

This typically fails with the error Failed to initialize loader socket (see e.g. EESSI/docs#40), because multiple instances of the container are launched on each node and they all try to use the same /tmp/$USER/var-run-cvmfs.

I've adapted the above script somewhat to remove the preloading, make the lower cache writable, and set the proxy to DIRECT:

# shared_alien_cache.sh

# Set group (as required, useful if you would like to share the cache with others)
MYGROUP=$GROUPS

# Set user
MYUSER=$USER

# Set path to shared space
SHAREDSPACE="/path/to/shared/space"

# Set path to (node) local space to store a local alien cache (e.g., /tmp or /dev/shm)
# WARNING: This directory needs to exist on the nodes where you will mount the repositories,
#          or you will get a binding error from Singularity!
LOCALSPACE="/tmp"

# Choose the Singularity image to use
STACK="2020.12"
SINGULARITY_REMOTE="client-pilot:centos7-$(uname -m)"

#########################################################################
# Variables below this point can be changed (but they don't need to be) #
#########################################################################

SINGULARITY_IMAGE="$SHAREDSPACE/$MYGROUP/$MYUSER/${SINGULARITY_REMOTE/:/_}.sif"

# Set text colours for info on commands being run
YELLOW='\033[0;33m'
NC='\033[0m' # No Color

# Make the directory structures
SINGULARITY_CVMFS_ALIEN="$SHAREDSPACE/$MYGROUP/alien_$STACK"
mkdir -p $SINGULARITY_CVMFS_ALIEN

SINGULARITY_HOMEDIR="$SHAREDSPACE/$MYGROUP/$MYUSER/home"
mkdir -p $SINGULARITY_HOMEDIR

##################################################
# No more variable definitions beyond this point #
##################################################


if [ ! -f $SINGULARITY_HOMEDIR/default.local ]; then
    echo -e "${YELLOW}\nCreating CVMFS configuration for shared and local alien caches\n${NC}"
    echo "# Custom settings" > $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_WORKSPACE=/var/lib/cvmfs" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_PRIMARY=hpc" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_hpc_TYPE=tiered" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_hpc_UPPER=local" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_hpc_LOWER=alien" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_hpc_LOWER_READONLY=no" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_local_TYPE=posix" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_local_SHARED=no" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_local_QUOTA_LIMIT=-1" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_local_ALIEN=\"/local_alien\"" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_alien_TYPE=posix" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_alien_SHARED=no" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_alien_QUOTA_LIMIT=-1" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_CACHE_alien_ALIEN=\"/shared_alien\"" >> $SINGULARITY_HOMEDIR/default.local
    echo "CVMFS_HTTP_PROXY=\"DIRECT\"" >> $SINGULARITY_HOMEDIR/default.local
fi


# Environment variables
export EESSI_CONFIG="container:cvmfs2 cvmfs-config.eessi-hpc.org /cvmfs/cvmfs-config.eessi-hpc.org"
export EESSI_PILOT="container:cvmfs2 pilot.eessi-hpc.org /cvmfs/pilot.eessi-hpc.org"
export SINGULARITY_HOME="$SINGULARITY_HOMEDIR:/home/$MYUSER"
export SINGULARITY_SCRATCH="/var/lib/cvmfs,/var/run/cvmfs"
# export SINGULARITY_BIND="$SINGULARITY_CVMFS_ALIEN:/shared_alien,$LOCALSPACE:/local_alien"
export SINGULARITY_BIND="$SINGULARITY_CVMFS_ALIEN:/shared_alien,$LOCALSPACE:/local_alien,$SINGULARITY_HOMEDIR/default.local:/etc/cvmfs/default.local"

# Get a shell
echo -e "${YELLOW}\nTo get a shell inside a singularity container (for example), use:\n${NC}"
echo -e "  export EESSI_CONFIG=\"$EESSI_CONFIG\""
echo -e "  export EESSI_PILOT=\"$EESSI_PILOT\""
echo -e "  export SINGULARITY_HOME=\"$SINGULARITY_HOME\""
echo -e "  export SINGULARITY_BIND=\"$SINGULARITY_BIND\""
echo -e "  export SINGULARITY_SCRATCH=\"/var/lib/cvmfs,/var/run/cvmfs\""
echo -e "  singularity shell --fusemount \"\$EESSI_CONFIG\" --fusemount \"\$EESSI_PILOT\" $SINGULARITY_IMAGE"

I then create a script for the commands I want to run in the container:

#!/bin/bash
# run_gromacs.sh
source /cvmfs/pilot.eessi-hpc.org/2020.12/init/bash
module load GROMACS
gmx_mpi mdrun -s ion_channel.tpr -maxh 0.50 -resethway -noconfout -nsteps 10000 -g logfile

And finally create a job script:

#!/bin/bash
# gromacs_job.sh

#SBATCH ...

source shared_alien_cache.sh

# Copy gromacs runscript plus input to the directory that singularity exec will start in:
cp run_gromacs.sh $SINGULARITY_HOMEDIR/
cp ion_channel.tpr $SINGULARITY_HOMEDIR/

srun --mpi=pmix singularity exec --fusemount "$EESSI_CONFIG" --fusemount "$EESSI_PILOT" docker://eessi/client-pilot:centos7-$(uname -m)-$STACK ./run_gromacs.sh

This works fine for me :)

Note that for 'laptop users' we could make it even simpler: we don't need the tiered caches, and probably just configuring a single alien cache is sufficient to make it shareable between multiple MPI processes.
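
A minimal sketch of what that simpler default.local could look like (untested; it keeps only the alien posix cache from the configuration above and drops the tiered/local layers):

# Custom settings: a single, shared alien cache (no tiering)
CVMFS_WORKSPACE=/var/lib/cvmfs
CVMFS_CACHE_PRIMARY=alien
CVMFS_CACHE_alien_TYPE=posix
CVMFS_CACHE_alien_SHARED=no
CVMFS_CACHE_alien_QUOTA_LIMIT=-1
CVMFS_CACHE_alien_ALIEN="/shared_alien"
CVMFS_HTTP_PROXY="DIRECT"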

ocaisa commented Jan 11, 2021

@casparvl Yeah, there are a few different configurations that should be documented here. I wonder if it's worth creating something that will generate the right setup for you based on a few settings? Is it complicated enough to consider moving from bash to Python?

casparvl commented

> I wonder if it's worth creating something that will generate the right setup for you based on a few settings?

Yeah, this would be an option. Just as an idea, I think something with arguments like

--local_tmpdir <some_local_dir>
--shared_tmpdir <some_shared_dir> [optional]
--dirtab [optional]

would be sufficient to cover most scenarios (a rough sketch of the argument handling follows the list below).

  • Someone on a laptop would only define --local_tmpdir, which would configure a single-level alien cache (it needs to be an alien cache to be shareable between MPI processes).
  • Someone on a cluster with batch nodes that have internet access would define --local_tmpdir and --shared_tmpdir, and would get a two-tiered alien cache, where the lower level is writable.
  • Someone on a cluster with batch nodes that don't have internet access would define all three arguments, and would get a two-tiered alien cache with a non-writable lower level. The cache would be prefilled based on the dirtab file passed on the command line. We could have the dirtab file manually created by the user, or create another script to generate (just) that, where it automatically determines os_type and architecture (the user then has to run it on a batch node in case the login node has a different architecture).
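
A rough sketch of what the argument handling for such a generator could look like, staying in bash for now (the script name and option parsing are hypothetical; the actual configuration generation would reuse the default.local snippets above):

#!/bin/bash
# generate_alien_cache_config.sh (hypothetical name)

LOCAL_TMPDIR=""
SHARED_TMPDIR=""
DIRTAB=""

while [ $# -gt 0 ]; do
    case "$1" in
        --local_tmpdir)  LOCAL_TMPDIR="$2";  shift 2 ;;
        --shared_tmpdir) SHARED_TMPDIR="$2"; shift 2 ;;
        --dirtab)        DIRTAB="$2";        shift 2 ;;
        *) echo "Unknown option: $1" >&2; exit 1 ;;
    esac
done

if [ -z "$SHARED_TMPDIR" ]; then
    echo "Laptop case: single-level alien cache in $LOCAL_TMPDIR"
elif [ -z "$DIRTAB" ]; then
    echo "Cluster, batch nodes with internet: tiered alien cache, writable lower level"
else
    echo "Cluster, batch nodes without internet: tiered alien cache, read-only lower level, preloaded via $DIRTAB"
fi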

peterstol added a commit to peterstol/filesystem-layer that referenced this issue Feb 18, 2021
Add Ansible task for making symlinks to host files/directories