Home

Thanks to FCT and other funding sources, our research group has access to a high-performance GPU/Deep Learning server for running complex computations. Researchers may use the server, but please note that it is shared between groups (MIR/MER and Health Informatics), so follow the netiquette rules defined below.

Netiquette Rules

Do not hoard resources for your tasks (GPUs / Memory / CPU cores)
- The general rule is 1 GPU per user and a reasonable number of CPU cores
- If you need more, at least warn your colleagues (Skype group) and try to give an estimate (how much and for how long?)
Design, plan and test your experiments with smaller subsets of data to confirm they work before going all in for weeks
- Not cool to hold resources for a long time without getting any results
Do not reboot the machine without asking, especially if there are tasks running
- Similarly, be careful when changing software or killing processes
If editing files remotely (e.g., using remote desktop), make sure you save your work before leaving.
For long experiments use checkpoints (save/load progress) and logs. This saves time in case your process goes down. Random TensorFlow/Keras example here.
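As an illustration of the checkpoint-and-log rule, here is a minimal, framework-agnostic sketch using only the Python standard library. The file names, the `run_experiment` function and its `stop_after` parameter are hypothetical, and the "training step" is a placeholder; in a real TensorFlow/Keras or PyTorch job you would save model weights with the framework's own tools instead of a JSON file.

```python
import json
import logging
import os
import tempfile

logger = logging.getLogger("experiment")


def save_checkpoint(path, state):
    """Write the state to a temp file, then rename it over the old checkpoint.

    The rename is atomic, so a crash mid-write cannot leave a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if not os.path.exists(path):
        return {"epoch": 0, "history": []}
    with open(path) as handle:
        return json.load(handle)


def run_experiment(total_epochs, checkpoint_path, stop_after=None):
    """Resume from the last checkpoint and save progress after every epoch.

    stop_after is a hypothetical knob that simulates an interruption.
    """
    state = load_checkpoint(checkpoint_path)
    logger.info("starting at epoch %d of %d", state["epoch"], total_epochs)
    for epoch in range(state["epoch"], total_epochs):
        if stop_after is not None and epoch >= stop_after:
            logger.warning("interrupted before epoch %d", epoch)
            break
        loss = 1.0 / (epoch + 1)  # placeholder for one real training epoch
        state["history"].append(loss)
        state["epoch"] = epoch + 1
        save_checkpoint(checkpoint_path, state)
        logger.info("epoch %d done, loss=%.4f", state["epoch"], loss)
    return state
```

To use it, configure logging once (for example `logging.basicConfig(filename="experiment.log", level=logging.INFO)`) and call `run_experiment(100, "checkpoint.json")`. If the process dies, calling it again with the same path continues from the last completed epoch instead of starting over.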

Hardware Description

SuperServer 4029GP-TRT2 - 4U Dual Processor (Intel), Single-Root GPU System with Up to 8 PCI-E GPUs. Currently with:

CPU: 2x Intel® Xeon® Silver 4214 Processor (lscpu)
- Frequency: 2.20 GHz (Turbo 3.20 GHz)
- Total Cores: 24 (12 per CPU)
- Total Threads: 48 (24 per CPU)
- Cache: 16.5 MB (per CPU)
RAM: 320 GB DDR4 (sudo lshw -short -C memory)
- 8 x 32GiB DIMM DDR4 Synchronous 2666 MHz (0.4 ns)
- 2 x 32GiB DIMM DDR4 Synchronous 3200 MHz (0.3 ns)
GPUs: 8 installed (lspci | grep -i --color 'vga\|3d\|2d' and nvidia-smi)
- 5 x NVIDIA RTX A5000 24GB GDDR6
  - 8192 CUDA Cores ; 24GB GDDR6 ; 27.8 TFLOPS
- 3 x NVIDIA Quadro P5000 16GB GDDR5X
  - 2560 CUDA Cores ; 16 GB GDDR5X ; 8.9 TFLOPS
HDD: 20+TB (lsblk -f, sudo hwinfo --disk --short, cat /proc/mdstat)
- 2x Intel SSD DC S4500 Series 480GB 2.5in SATA 6Gb/s
  - RAID 1, mounted at / (OS)
- 6x Seagate 4TB Barracuda 2.5in 5400rpm SATA III 128MB (ST4000LM024)
  - RAID 5, mounted at /home (user files)
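Since the server has eight GPUs of two different models shared by several users, it can help to pin each job to a single device. The sketch below is one possible way to do that, not an established procedure for this server: it asks `nvidia-smi` for per-GPU memory use and exposes only the least-loaded GPU to the current process through `CUDA_VISIBLE_DEVICES`. The function names are hypothetical, and treating "least memory in use" as "free" is an assumption; always check with your colleagues as well.

```python
import os
import subprocess

# Ask nvidia-smi for a machine-readable report, one CSV line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_report(report):
    """Turn the CSV text printed by nvidia-smi into a list of dicts."""
    gpus = []
    for line in report.strip().splitlines():
        index, name, used, total = [field.strip() for field in line.split(",")]
        gpus.append(
            {
                "index": int(index),
                "name": name,
                "used_mib": int(used),
                "total_mib": int(total),
            }
        )
    return gpus


def least_used_gpu(gpus):
    """Pick the GPU with the least memory in use (assumed proxy for 'idle')."""
    return min(gpus, key=lambda gpu: gpu["used_mib"])


def pin_to_least_used_gpu():
    """Expose only one GPU to this process. Call before importing TensorFlow/PyTorch."""
    report = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    gpu = least_used_gpu(parse_gpu_report(report))
    # nvidia-smi numbers GPUs in PCI bus order, while CUDA defaults to
    # fastest-first; on a mixed A5000/P5000 machine the two can differ.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu["index"])
    return gpu
```

Calling `pin_to_least_used_gpu()` at the very top of a script, before any deep learning framework initialises CUDA, makes the framework see a single device. The same effect is available from the shell by setting the two environment variables before launching the job.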

Available Software

Several packages are already installed. Open question: should we install software system-wide for all users, or should each user install what they need?

NVIDIA-SMI 520.61.05 / Driver Version: 520.61.05 / CUDA Version: 11.8
MATLAB R2021b Update 1 (9.11.0.1809720) 64-bit (glnxa64)
- It will probably ask you for a personal license; check the MATLAB activation instructions at the helpdesk
- Can also be used via SSH just to run scripts (explain how here?)
- MATLAB R2014b is also installed (ll /usr/local/bin/matlab_r2014b)
Python 3.8
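
For running MATLAB scripts over SSH, one possible approach is MATLAB's non-interactive `-batch` startup option, which exists since R2019a and should therefore apply to R2021b but not to R2014b. The Python wrapper below is only a sketch: it assumes the `matlab` launcher on your PATH is the R2021b one, and the helper names and the example script path are hypothetical.

```python
import subprocess


def matlab_batch_command(script_path, matlab_bin="matlab"):
    """Build the argument list for running one MATLAB script without a GUI."""
    # MATLAB escapes a single quote inside a char literal by doubling it.
    escaped = script_path.replace("'", "''")
    return [matlab_bin, "-batch", f"run('{escaped}')"]


def run_matlab_script(script_path, matlab_bin="matlab"):
    """Run the script and raise CalledProcessError if MATLAB exits with an error."""
    return subprocess.run(matlab_batch_command(script_path, matlab_bin), check=True)
```

For example, `run_matlab_script("experiments/train.m")` runs the equivalent of `matlab -batch "run('experiments/train.m')"`. Start it inside `tmux` or `screen` if the job should survive the end of your SSH session.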
