Course development: Topic 8 - Managing large datasets and I/O #320

Merged
merged 36 commits into from
May 8, 2024
2cda67b
Course development: Topic 8 - Managing large datasets and I/O
rkronberg Feb 26, 2024
09d521d
add new section
rkronberg Feb 26, 2024
b3697cb
Merge branch 'master' of github.com:csc-training/csc-env-eff into 08-…
rkronberg Mar 18, 2024
93d07ba
updated slides and exercises
Mar 25, 2024
3fc6460
Create fast_localdisks.md
yetulaxman Apr 20, 2024
24397d2
Update index.md
yetulaxman Apr 20, 2024
ac27bbc
Rename fast_localdisks.md to fast-local-disks.md
yetulaxman Apr 20, 2024
ae23647
Delete part-2/datamigration directory
yetulaxman Apr 20, 2024
8b8bce4
slides
May 7, 2024
4880fe0
slides
May 7, 2024
6888367
slides
May 8, 2024
a9dc4b8
reorder
rkronberg May 8, 2024
c1664b2
reorder
rkronberg May 8, 2024
d9fa7dc
small edits
rkronberg May 8, 2024
91e0ec8
fix nav
rkronberg May 8, 2024
ec9fb49
fix nav
rkronberg May 8, 2024
cdad782
Course development: Topic 8 - Managing large datasets and I/O
rkronberg Feb 26, 2024
21548d8
add new section
rkronberg Feb 26, 2024
867af96
updated slides and exercises
Mar 25, 2024
26671c1
Create fast_localdisks.md
yetulaxman Apr 20, 2024
2996e5c
Update index.md
yetulaxman Apr 20, 2024
0c32862
Rename fast_localdisks.md to fast-local-disks.md
yetulaxman Apr 20, 2024
d5a7b26
slides
May 7, 2024
e8e2ab0
slides
May 7, 2024
a7924fc
reorder
rkronberg May 8, 2024
eee99ef
small edits
rkronberg May 8, 2024
af2b2dc
fix nav
rkronberg May 8, 2024
3bba0b0
fix nav
rkronberg May 8, 2024
1ccce7b
Merge branch '08-io-dev' of github.com:csc-training/csc-env-eff into …
rkronberg May 8, 2024
feb2195
move files
rkronberg May 8, 2024
94e1545
ood
rkronberg May 8, 2024
2d1a84c
Merge branch 'master' into 08-io-dev
rkronberg May 8, 2024
911d5bb
formatting
rkronberg May 8, 2024
4f48af6
permalinks
rkronberg May 8, 2024
218ba7c
ood note
rkronberg May 8, 2024
d74cd8c
ood note
rkronberg May 8, 2024
205 changes: 205 additions & 0 deletions _slides/08_datamigration_io.md
@@ -0,0 +1,205 @@
---
theme: csc-eurocc-2019
lang: en
---

# Working efficiently with data {.title}

<div class="column">
![](https://mirrors.creativecommons.org/presskit/buttons/88x31/png/by-sa.png)
</div>
<div class="column">
<small>
All materials (c) 2020-2024 by CSC – IT Center for Science Ltd.
This work is licensed under a **Creative Commons Attribution-ShareAlike** 4.0
International License, [http://creativecommons.org/licenses/by-sa/4.0/](http://creativecommons.org/licenses/by-sa/4.0/)
</small>
</div>

# Outline

- Efficient file I/O in HPC systems
- Using Allas in batch scripts
- Moving data to/from Allas, IDA and LUMI-O
- Transferring data in sensitive data computing
- Cleaning and backing up data
- Working with remote mounts

# Parallel file systems

- A parallel file system (PFS) provides a common file system area that can be accessed from all nodes in a cluster
- Without a PFS, users would always have to copy all needed data to the compute nodes before runs (cf. local disk)
- Also the results would not be visible outside the compute node
- CSC uses the **Lustre** parallel file system on Puhti and Mahti

# Lustre

<div class=column style="width:43%">
![](img/lustre1.svg){width=100%}
</div>
<div class=column style="width:55%">
- One or more metadata servers (MDS) with metadata targets (MDT) that store the file system metadata
- One or more object storage servers (OSS) with object storage targets (OST) that store the actual file system contents
- Connection to nodes via the high-speed interconnect (InfiniBand)
</div>

# What happens when you access a file?

<div class=column style="width:43%">
![](img/lustre2.svg){width=100%}
</div>
<div class=column style="width:55%">
1. Send metadata request
2. Response with metadata
3. Request data
4. Data response
</div>

# Managing file I/O (1/3)

- Parallel file system (Lustre):
    - Shared across all nodes in the cluster (e.g., `/scratch`)
    - Optimized for parallel I/O of large files, slow if accessing lots of small files!
- [Temporary local storage (NVMe)](https://docs.csc.fi/computing/disk/#temporary-local-disk-areas):
    - Accessible on login nodes (`$TMPDIR`) and to jobs on some compute nodes (`$LOCAL_SCRATCH`)
    - Automatically purged after the job finishes
    - Availability varies depending on the supercomputer (Puhti/Mahti/LUMI)
        - For example, Mahti has NVMe only on login nodes and GPU nodes

# Managing file I/O (2/3)

- To avoid on Lustre:
    - Accessing lots of small files, or opening/closing a single file at a rapid pace
    - Having many files in a single directory
- Use [file striping](https://docs.csc.fi/computing/lustre/#file-striping-and-alignment) to distribute large files across many OSTs
- Use more efficient file formats when possible
    - Simply using `tar` and compression is a good start
    - High-level I/O libraries and portable file formats like HDF5 or NetCDF
        - Enable fast I/O through a single file format and parallel operations
    - [AI/ML example: TensorFlow's TFRecords](https://github.com/CSCfi/machine-learning-scripts/blob/master/notebooks/tf2-pets-create-tfrecords.ipynb) – a simple record-oriented binary format
- Docs CSC: [How to achieve better I/O performance on Lustre](https://docs.csc.fi/support/tutorials/lustre_performance/)
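
The `tar` advice above can be sketched as follows. This is only a minimal illustration: the directory and file names are hypothetical, and a temporary directory stands in for a project area under `/scratch`.

```bash
#!/bin/bash
# Sketch: pack a directory of many small files into one compressed archive.
# All names are hypothetical; a temporary directory stands in for /scratch/<project>.
set -eu

workdir=$(mktemp -d)
mkdir -p "$workdir/dataset"

# Create 100 small files to play the role of a dataset.
for i in $(seq 1 100); do
    echo "sample $i" > "$workdir/dataset/file_$i.txt"
done

# One archive means one large file on Lustre instead of 100 small ones.
tar -czf "$workdir/dataset.tar.gz" -C "$workdir" dataset

# List the archive contents without extracting, to check that nothing is missing.
n_packed=$(tar -tzf "$workdir/dataset.tar.gz" | grep -c 'file_.*\.txt$')
echo "Packed $n_packed files into dataset.tar.gz"
```

The archive can then be moved or stored as a single file, and unpacked only where the small files are actually needed.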

# Managing file I/O (3/3)

- Use fast local disk to handle file I/O with lots of small files
    - Requires staging and unstaging of data
    - `tar xf /scratch/<project>/big_dataset.tar.gz -C $LOCAL_SCRATCH`
- Processing data in memory allows better performance compared to writing to and reading from the disk
    - "Ramdisk" (`/dev/shm`) can be used on Mahti nodes without NVMe
    - `export TMPDIR=/dev/shm`
- Do not use databases on `/scratch`
    - Instead, consider hosting DBs on cloud resources (e.g., [Pukki DBaaS](https://docs.csc.fi/cloud/dbaas/))
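
The staging and unstaging steps above might look roughly like this inside a job script. It is a sketch under assumptions: the archive and result names are made up, and temporary directories stand in for the project scratch area and, when `$LOCAL_SCRATCH` is not set, for the local disk.

```bash
#!/bin/bash
# Sketch of the stage / process / unstage pattern.
# Assumption: outside a batch job $LOCAL_SCRATCH is unset, so temporary
# directories stand in for both the project scratch area and the local disk.
set -eu

project_scratch=$(mktemp -d)                 # stand-in for /scratch/<project>
local_disk=${LOCAL_SCRATCH:-$(mktemp -d)}    # fast node-local NVMe in a real job

# Prepare a small example archive (in practice it already exists on scratch).
mkdir -p "$project_scratch/big_dataset"
for i in 1 2 3; do
    echo "record $i" > "$project_scratch/big_dataset/part_$i.txt"
done
tar -czf "$project_scratch/big_dataset.tar.gz" -C "$project_scratch" big_dataset

# 1. Stage: unpack the archive onto the local disk.
tar -xf "$project_scratch/big_dataset.tar.gz" -C "$local_disk"

# 2. Process: do the small-file I/O on the local disk only.
mkdir -p "$local_disk/results"
for f in "$local_disk"/big_dataset/part_*.txt; do
    wc -c < "$f" > "$local_disk/results/$(basename "$f").size"
done

# 3. Unstage: pack the results and copy them back before the job ends,
#    because the local disk is purged afterwards.
tar -czf "$project_scratch/results.tar.gz" -C "$local_disk" results
n_results=$(tar -tzf "$project_scratch/results.tar.gz" | grep -c '\.size$')
echo "Copied back $n_results result files"
```

Only two large transfers touch the parallel file system here; all per-file operations stay on the local disk.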

# Using Allas in batch jobs

- Swift protocol (access to all projects, token valid for 8 hours) *vs.* S3 protocol (fixed to one project, persistent connection)
- `allas-conf` asks for your CSC password interactively
    - A job may start late and run for longer than 8 hours, so the token can expire before the job finishes
- Use `allas-conf -k`
    - Stores the password in the variable `$OS_PASSWORD` so that a new token can be generated automatically
    - a-tools regenerate a token automatically using `$OS_PASSWORD`
    - `rclone` requires setting the environment explicitly in batch jobs:
```bash
source /appl/opt/allas-cli-utils/allas_conf -f -k $OS_PROJECT_NAME
```
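
In a job script, the command above could be wrapped in a small guard so that the job reports a clear message when Allas has not been configured. This is a sketch, assuming `allas-conf -k` was run before submitting and that `$OS_PASSWORD` and `$OS_PROJECT_NAME` are passed on to the job environment; the `allas_ready` variable is purely illustrative.

```bash
#!/bin/bash
# Sketch: refresh the Allas token inside a batch job only when possible.
# Assumption: `allas-conf -k` was run before submitting, so $OS_PASSWORD and
# $OS_PROJECT_NAME are available in the job environment.
allas_conf=/appl/opt/allas-cli-utils/allas_conf

if [ -f "$allas_conf" ] && [ -n "${OS_PASSWORD:-}" ] && [ -n "${OS_PROJECT_NAME:-}" ]; then
    # Same command as above, run only when its inputs are present.
    source "$allas_conf" -f -k "$OS_PROJECT_NAME"
    allas_ready=yes
else
    echo "Allas is not configured; run 'allas-conf -k' before submitting." >&2
    allas_ready=no
fi
echo "allas_ready=$allas_ready"
```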

# Configuring Allas for S3 protocol

- Opening an Allas connection in S3 mode
    - `source allas_conf --mode s3cmd`
    - Connection is persistent
- Usage:
    - `s3cmd` with endpoint `s3:`
    - `rclone` with endpoint `s3allas:`
    - `a-put`/`a-get` with `-S` flag

# How to use LUMI-O from Puhti/Mahti

- LUMI-O is very similar to Allas, but it uses only the S3 protocol
- On Puhti and Mahti, a connection to LUMI-O can be opened with the command:
    - `allas-conf --lumi`
- Usage:
    - Using LUMI-O with `rclone` (endpoint is `lumi-o:`)
        - e.g., `rclone lsd lumi-o:`
    - One can use a-tools with option `--lumi`
        - e.g., `a-list --lumi`
- Docs CSC: [Using Allas and LUMI-O from LUMI](https://docs.csc.fi/data/Allas/allas_lumi/)


# Moving data between LUMI-O and Allas

- Requires activating connections to both LUMI-O and Allas at the same time:
    - `allas-conf --mode s3cmd`
    - `allas-conf --lumi`
- Use `rclone` with `s3allas:` as the endpoint for Allas and `lumi-o:` for LUMI-O
    - `rclone copy -P lumi-o:lumi-bucket/object s3allas:allas-bucket/`

# Moving data between IDA and Allas

- Data must be transferred *via* a supercomputer (e.g., Puhti)
- First, [configure IDA in CSC supercomputers](https://docs.csc.fi/data/ida/using_ida/). For example:

```bash
module load ida
ida_configure
ida upload /test123/data1 test_data
ida download /project1 project1_data.zip
```

- Then, move data between Puhti and Allas

# Transferring data for sensitive data computing

- CSC sensitive data services, SD Connect and SD Desktop, use service-specific encryption
- SD Desktop is able to read encrypted data from Allas
- If you want to make your data available in SD Desktop, you need to encrypt the data with the *CSC public key* before the data is uploaded to Allas
- Use `a-put` with option `--sdx` or command `a-encrypt` to make your Allas data compatible with SD Desktop
- An upcoming version of SD Connect will change the situation, but the new server will also be compatible with previously uploaded data

# Questions that users should consider

- Should I store each file as a separate object, or should I collect them into bigger chunks?
    - In general: consider how you use the data
- Should I use compression?
- Who can use the data: projects and access rights?
- What will happen to my data later on?
- How do I keep track of all the data I have in Allas?

# Cleaning and backing up data (1/3)

- **[Disk cleaning](https://docs.csc.fi/support/tutorials/clean-up-data/#automatic-removal-of-files)**
    - In force for project disk areas under `/scratch` **on Puhti**
    - Files older than 180 days will be removed periodically
    - Files to be removed are listed in a purge list, e.g. `/scratch/purge_lists/project_2001234/path_summary.txt`
    - The *[LCleaner](https://docs.csc.fi/support/tutorials/clean-up-data/#using-lcleaner-to-check-which-files-will-be-automatically-removed)* tool can help you discover which of your files have been targeted for automatic removal
- **Best practice tips**
    - Don't save everything automatically
    - Use the *[LUE](https://docs.csc.fi/support/tutorials/lue/)* tool to analyze your disk usage
    - Avoid `du` and `find -size`; these commands are heavy on the file system
    - Move important data not in current use to Allas

# Cleaning and backing up data (2/3)

- The [`allas-backup`](https://docs.csc.fi/data/Allas/using_allas/a_backup/) command provides an easy-to-use command-line interface for the `restic` backup tool
- Backing up differs from normal storing:
    - Incremental (efficient) and version-controlled (no overwriting)
    - Based on hashes and requires more computing
    - Efficient way to store different versions of a dataset

# Cleaning and backing up data (3/3)

- Please note that Allas is intended for storing *active data*
    - Project lifetime is usually 1-5 years
- Commands for backing up data:
    - `allas-backup --help`
    - `allas-backup [add] file-or-directory`
    - `allas-backup list`
    - `allas-backup restore snapshot-id`

# Working with remote disk mounts

- Using the `sshfs` command in Linux/macOS:
    - `mkdir csc_home`
    - `sshfs <username>@puhti.csc.fi:/users/<username> csc_home`
- To unmount the file system, give the command:
    - `fusermount -u csc_home` (Linux; on macOS, use `umount csc_home`)