
Working with the smew servers


This page provides some basic documentation on how to use the group's smew servers, which are integrated into the server network of the Department of Statistics.

Overview and resources

We have two servers, called smew01 and smew02, each with the following specifications:

  • 36 CPUs;
  • 375GB of RAM;
  • 1TB of disk space.

Accessing the servers

To access these servers, you must first log in to the department's gateway server using the following command (see this page on the Statistics website for more info):

ssh -l username gate.stats.ox.ac.uk

Once in the gateway server, you must then log in to one of the Slurm head nodes (servers used to submit jobs to compute servers using the Slurm workload manager). To access the greytail head node, for example, simply run:

ssh greytail

Your prompt should now have changed from blackcap2{username}% to greytail{username}%.
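
If you do this often, you may wish to set up the two-hop login in your SSH client so that a single command from your local machine takes you straight to the head node. The following is a minimal sketch of a ~/.ssh/config entry, assuming a reasonably recent OpenSSH client (for the ProxyJump option); the host alias stats-gate is just an example:

Host stats-gate
    HostName gate.stats.ox.ac.uk
    User username

Host greytail
    HostName greytail.stats.ox.ac.uk
    User username
    ProxyJump stats-gate

With an entry like this in place, running ssh greytail on your local machine should log you in to the head node via the gateway in one step.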

Quick access

As of this writing in Feb 2023, there is no shared storage system for the departmental servers, although one is being prepared. This means that to access the storage of either of the smew servers, you must submit a job to run on that server.

If you want to quickly browse the storage on either smew01 or smew02, you can submit a 'debug' job (which has a maximum duration of 30min) as follows:

srun --pty -t 0:30:0 -M swan --partition=smew01-debug bash

(Replace smew01 with smew02 in the partition name to access the other server.) Once the job starts, you may find that your prompt has not changed and remains greytail{username}%. However, if you now run hostname you should see smew01.cpu.stats.ox.ac.uk, which means that you are now running an interactive Bash session on smew01 and can browse the system's storage as usual.
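
Once inside the debug session, a few standard commands can help confirm where you are and what you can see (a quick sketch; the exact output will depend on the server and on what the job was allocated):

hostname   # should print smew01.cpu.stats.ox.ac.uk
nproc      # number of CPUs visible to this session
df -h ~    # free disk space on the filesystem holding your home directory
exit       # end the debug session and return to greytail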

Interactive session

If you wish to do some computational work on one of the servers interactively (meaning that you have access to a command-line prompt and can run commands in real time without pre-specifying them in a script), you can submit an interactive job to the interactive-cpu partition with Slurm.

Note: this is available only for smew01. Stuart McRobert from IT set up this workflow in the summer of 2022 for Simon's MSc students. If an equivalent setup for smew02 would be useful, he is probably the person to contact in the IT office.

Interactive jobs have a maximum duration of 12 hours. You may request up to the full number of CPUs and amount of RAM available on the server. For example, to submit a 12-hour job using 4 CPUs and 64GB of RAM in total, run the following command:

srun --pty -t 12:00:00 -M swan --partition=interactive-cpu -n 4 --mem=64GB bash
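
Once the session starts, Slurm exports a number of environment variables that you can use to check what was actually allocated (a sketch; exactly which variables are set can vary with the Slurm configuration):

echo $SLURM_JOB_ID          # ID of the interactive job
echo $SLURM_CPUS_ON_NODE    # number of CPUs allocated on the node
echo $SLURM_MEM_PER_NODE    # memory allocated per node (in MB), if set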

Submitting a (non-interactive) batch job

For longer jobs, you can submit a 'batch' job with Slurm using the sbatch command (official documentation here) rather than srun. See this page on the internal (SSO login required) Stats IT pages for a detailed description of how to use Slurm on the department's servers.

As a basic example, here is a Bash script which submits a 5-minute job to smew01 requesting 2 CPUs and 24GB of RAM:

#!/bin/bash

#SBATCH --cluster=swan
#SBATCH --partition=smew01-cpu
#SBATCH --time=0:05:00
#SBATCH --cpus-per-task=2
#SBATCH --mem=24GB

#SBATCH --job-name="test"
#SBATCH --output=test.out

# Bash commands here
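
Assuming you save the script above as, say, test_job.sh (the filename is just an example), you can submit it from the head node and keep an eye on it with the standard Slurm commands:

sbatch test_job.sh        # submit the job; the cluster is already selected via --cluster=swan
squeue -M swan -u $USER   # list your jobs on the swan cluster
scancel -M swan JOBID     # cancel a job, replacing JOBID with the ID printed by sbatch

The job's standard output will go to test.out, as set by the --output directive.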

Moving files to and from the servers

The lack of a shared storage system and the fact that the compute nodes are not externally accessible via ssh make copying files to/from the smew servers non-trivial.

New! Using the 'bitbucket' shared filesystem

A shared filesystem accessible from all departmental Linux systems (including smew!) is now available at /vols/bitbucket/. This was set up in December 2022 and is at an 'early access' stage and still under development.

Your personal space in this file system is in /vols/bitbucket/username/. You can copy a file to this system from your local machine with a command like the following (rsync is also reported to work):

scp myfile.txt username@gate.stats.ox.ac.uk:/vols/bitbucket/username/
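
Since rsync is also reported to work, here is a sketch of an equivalent transfer with it (same assumptions about the remote host; -a preserves timestamps and permissions and -z compresses data in transit):

rsync -avz myfile.txt username@gate.stats.ox.ac.uk:/vols/bitbucket/username/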

Copying a file from the 'bitbucket' to your local machine is very similar:

scp username@gate.stats.ox.ac.uk:/vols/bitbucket/username/myfile.txt ./

Finally, you can access the file in this directory from either smew01 or smew02 simply at /vols/bitbucket/username/myfile.txt.

The procedures below are more cumbersome but were needed before the new 'bitbucket' became available. We keep these instructions here for future reference.

Copying a file from a local system to your Stats home directory

Suppose you have a file ~/myfile.txt in your home directory on your local machine (e.g. a laptop). You can copy this file to your Statistics home directory by sending it via scp to the gateway server:

scp ~/myfile.txt username@gate.stats.ox.ac.uk:~/

You should now be able to find your file in the /homes/username directory once you log in to the gateway server, or directly in the storage accessible from your Stats desktop machine.
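
One way to confirm that the copy succeeded without logging in interactively is to list the file over ssh (a sketch using the same gateway host):

ssh -l username gate.stats.ox.ac.uk ls -lh myfile.txt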

Copying a file from your Stats home directory to smew

Once you have the file in your home directory on the Statistics network, you can copy it to smew01 or smew02 by starting an interactive session on the server you wish to copy the file to and then using scp to copy it from your home directory.

First start a 'debug' session:

srun --pty -t 0:30:0 -M swan --partition=smew01-debug bash

Now copy the file:

scp username@greytail.stats.ox.ac.uk:~/myfile.txt ~/

This will place the file in your home directory on smew01 (/homes/username).

Copying a file from smew to your Stats home directory and then to a local machine

Let's now do the reverse operation and copy the file from smew onto our Stats home directory and finally to our local machine.

First, and while logged in to smew01 or smew02, let's copy the file to the Stats system:

scp myfile.txt greytail.stats.ox.ac.uk:~/

Now log off the smew and Stats servers and return to your local machine. You can copy the file from your Stats internal directory as follows (note that we must now specify the gateway server as the remote host rather than greytail):

scp username@gate.stats.ox.ac.uk:~/myfile.txt ./
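
If you need to move a whole directory rather than a single file, the same commands work with the recursive flag; for example (using mydir as a stand-in for the directory name):

scp -r username@gate.stats.ox.ac.uk:~/mydir ./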