
Running HDDM on CCV Oscar (can be adapted to any cluster that uses SLURM)


Happily, Brown CCV has made it really easy to use Anaconda on Oscar.
For details, read the CCV's documentation on Oscar.
Of particular note are their pages on using Anaconda and on transferring files.

  1. Open up a terminal and start a new Oscar session using SSH, as usual.

  2. If this is your first time loading the Anaconda 2020.02 module, you'll need to type the following. Ideally, you'll then close out of your Oscar session and start a new one, so that the conda initialization takes effect.

module load anaconda/2020.02
conda init bash
  3. Now, sequentially run each of the following:
module load anaconda/2020.02
conda create -n pyHDDM python=3.6
conda activate pyHDDM
conda install conda-build
conda install pymc=2.3.8 -c conda-forge
conda install pandas patsy
pip install cython
pip install hddm
  4. Ignore the scary-looking messages. Just say yes to everything.
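Once everything finishes installing, a quick (optional) sanity check is to activate the pyHDDM environment and import the packages from Python. This is just a hypothetical check, not part of the official instructions, and the file name check_install.py is made up:

# check_install.py -- optional sanity check; run inside the pyHDDM environment
import pymc
import hddm

# If both imports succeed, the environment is usable for HDDM
print("PyMC version:", pymc.__version__)   # should report 2.3.8
print("HDDM version:", hddm.__version__)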

  5. Congrats, you can now run HDDM on the cluster! Once you have a shell script on the server (more on this in a second), you can run your analysis simply by including the following lines in that shell script:

module load anaconda/2020.02
conda activate pyHDDM
python run_hddm_script.py
  6. Once you've bug-tested your code locally (e.g., in a Jupyter notebook or comparable), you need to collect your code into a single Python file (ending in .py); a minimal sketch of what such a file might look like appears at the end of this page. The philosophy of cluster computing revolves around "jobs" being "scheduled" to run on the cluster, so you also need something called a "shell script" to tell the cluster what to do with your Python code. Here's an example of a shell script. You can copy/paste the following into a text editor (I recommend Sublime, but any plaintext editor will suffice... do NOT use MS Word or Pages), then save it as something like run_hddm_script.sh.
#!/bin/bash

#SBATCH --account=POTENTIALLY_YOUR_ACCOUNT_HERE_ELSE_DELETE_THIS_LINE
#SBATCH --mail-type=END
#SBATCH [email protected]

#SBATCH -t 00:30:00
#SBATCH -n 1
#SBATCH -c 1
#SBATCH --mem=1gb

module load anaconda/2020.02
conda activate pyHDDM
python run_hddm_script.py

echo "Done!"
  7. You'll note that there are lots of #SBATCH lines. Each of these sets an adjustable parameter. Right now, I have the script set to run for at most 30 minutes (using the -t flag). You can modify that if you want to. The -n flag sets the number of tasks (roughly, separate processes, which SLURM may place on separate machines), and the -c flag sets the number of CPU cores available to each task. You do NOT want to spread the computation across multiple tasks or nodes. Honestly, I forget whether HDDM is threaded, but I don't think it is. (Let me know if I'm wrong!) If I'm right, it means that adding more cores will NOT speed up the computation. Similarly, I forget whether more RAM helps speed up computation, but this is easy for someone to verify (let me know and I'll update this!). In any case, you can adjust these parameters to your liking.

  8. Once you have your Python and shell scripts ready to go, upload them to Oscar. Then, you can run your job by navigating (in your terminal) to the directory where your files live, and typing:

sbatch run_hddm_script.sh
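
For reference, here's a minimal sketch of what run_hddm_script.py might contain. Everything here is a placeholder -- the data file name (my_data.csv), the column layout, and the sampler settings are assumptions, not part of this tutorial -- so adapt it to your own project (the HDDM documentation covers the modeling options in detail).

# run_hddm_script.py -- minimal, hypothetical example; adapt to your own data and model
import hddm

# Load a CSV with columns named 'rt', 'response', and 'subj_idx'
data = hddm.load_csv('my_data.csv')

# Set up a basic hierarchical drift-diffusion model
model = hddm.HDDM(data)

# Find a reasonable starting point, then draw MCMC samples
model.find_starting_values()
model.sample(2000, burn=500, dbname='traces.db', db='pickle')

# Print parameter estimates and also save them to a file
model.print_stats()
model.gen_stats().to_csv('hddm_stats.csv')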

You should really read the CCV documentation, as it's a very well-written resource and covers details that are beyond the scope of this short tutorial.