diff --git a/docs/filesystem_layer/stratum1.md b/docs/filesystem_layer/stratum1.md index 7c7551f8d..e00662525 100644 --- a/docs/filesystem_layer/stratum1.md +++ b/docs/filesystem_layer/stratum1.md @@ -1,44 +1,66 @@ # Setting up a Stratum 1 -Setting up a Stratum 1 involves the following steps: - -- set up the Stratum 1, preferably by running the Ansible playbook that we provide; -- request a Stratum 0 firewall exception for your Stratum 1 server; -- request a `.stratum1.cvmfs.eessi-infra.org` DNS entry; -- open a pull request to include the URL to your Stratum 1 in the EESSI configuration. - -The last two steps can be skipped if you want to host a "private" Stratum 1 for your site. - +The EESSI project provides a number of geographically distributed public Stratum 1 servers that you can use to make EESSI available on your machine(s). +It is always recommended to have a local caching layer consisting of a few Squid proxies. +If you want to be even better protected against network outages and increase the bandwidth between your cluster nodes and the Stratum 1 servers, +you could also consider setting up a local (private) Stratum 1 server that replicates the EESSI CVMFS repository. +This guarantees that you always have a full and up-to-date copy of the entire stack available in your local network. ## Requirements for a Stratum 1 The main requirements for a Stratum 1 server are a good network connection to the clients it is going to serve, -and sufficient disk space. For the EESSI repository, a few hundred gigabytes should suffice, but for production -environments at least 1 TB would be recommended. +and sufficient disk space. As the EESSI repository is constantly growing, make sure that the disk space can easily be extended if necessary. +Currently, we recommend having at least 1 TB available. In terms of cores and memory, a machine with just a few (~4) cores and 4-8 GB of memory should suffice. 
-Various Linux distributions are supported, but we recommend one based on RHEL 7 or 8. +Various Linux distributions are supported, but we recommend one based on RHEL 8 or 9. -Finally, make sure that ports 80 (for the Apache web server) and 8000 are open. +Finally, make sure that ports 80 and 8000 are open to clients. -## Step 1: set up the Stratum 1 +## Configure the Stratum 1 -The recommended way for setting up an EESSI Stratum 1 is by running the Ansible playbook `stratum1.yml` -from the [filesystem-layer repository on GitHub](https://github.com/EESSI/filesystem-layer). +Stratum 1 servers have to synchronize the contents of their CVMFS repositories regularly, and usually they replicate from a CVMFS Stratum 0 server. +In order to ensure the stability and security of the EESSI Stratum 0 server, it has a strict firewall, and only the EESSI-maintained public Stratum 1 servers are allowed to replicate from it. +However, EESSI provides a synchronisation server that can be used for setting up private Stratum 1 replica servers, and this is available at `http://aws-eu-west-s1-sync.eessi.science`. + +!!! warn Potential issues with intrusion prevention systems + In the past we have seen a few occurrences of data transfer issues when files were being pulled in by or from a Stratum 1 server. + In such cases the `cvmfs_server snapshot` command, used for synchronizing the Stratum 1, may break with errors like `failed to download `. + Trying to manually download the mentioned file with `curl` will also not work, and result in errors like: + ``` + curl: (56) Recv failure: Connection reset by peer + ``` + In all cases this was due to an intrusion prevention system scanning the associated network, and hence scanning all files going in or out of the Stratum 1. + Though it was a false-positive in all cases, this breaks the synchronization procedure of your Stratum 1. 
+ If this is the case, you can try switching to HTTPS by using `https://aws-eu-west-s1-sync.eessi.science` for synchronizing your Stratum 1. + Even though there is no advantage for CVMFS itself in using HTTPS (it has built-in mechanisms for ensuring the integrity of the data), + this will prevent the described issues, as the intrusion prevention system will not be able to inspect the encrypted data. + However, not only does HTTPS introduce some overhead due to the encryption/decryption, it also makes caching in forward proxies impossible. + Therefore, it is strongly discouraged to use HTTPS as default. -Installing a Stratum 1 requires a GEO API license key, which will be used to find the (geographically) closest Stratum 1 server for your client and proxies. -More information on how to (freely) obtain this key is available in the CVMFS documentation: https://cvmfs.readthedocs.io/en/stable/cpt-replica.html#geo-api-setup. +### Manual configuration -You can put your license key in the local configuration file `inventory/local_site_specific_vars.yml`. +In order to set up a Stratum 1 manually, you can make use of the instructions in the [Private Stratum 1 replica server](https://multixscale.github.io/cvmfs-tutorial-hpc-best-practices/access/stratum1/) +section of the MultiXscale tutorial ["Best Practices for CernVM-FS in HPC"](https://multixscale.github.io/cvmfs-tutorial-hpc-best-practices/). -Furthermore, the Stratum 1 runs a Squid server. The template configuration file can be found at `templates/eessi_stratum1_squid.conf.j2`. -If you want to customize it, for instance for limiting the access to the Stratum 1, you can make your own version of this template file -and point to it by setting `local_stratum1_cvmfs_squid_conf_src` in `inventory/local_site_specific_vars.yml`. -See the comments in the example file for more details. 
+### Configuration using Ansible -Start by installing Ansible: +The recommended way for setting up an EESSI Stratum 1 is by running the Ansible playbook `stratum1.yml` +from the [filesystem-layer repository on GitHub](https://github.com/EESSI/filesystem-layer). +For the commands in this section, we are assuming that you cloned this repository, and your working directory is `filesystem-layer`. + +!!! note GEO API + Installing a Stratum 1 usually requires a GEO API license key, which will be used to find the (geographically) closest Stratum 1 server for your client and proxies. + However, for a private Stratum 1 this can be skipped, and you can disable the use of the GEO API in the configuration of your clients by setting `CVMFS_USE_GEOAPI=no`. + In this case, they will just connect to your local Stratum 1 by default. + + If you do want to set up the GEO API, you can find more information on how to (freely) obtain this key in the CVMFS documentation: https://cvmfs.readthedocs.io/en/stable/cpt-replica.html#geo-api-setup. + + You can put your license key in the local configuration file `inventory/local_site_specific_vars.yml`. + +Start by installing Ansible, e.g.: ```bash sudo yum install -y ansible @@ -47,128 +69,65 @@ sudo yum install -y ansible Then install Ansible roles for EESSI: ```bash -ansible-galaxy role install -r requirements.yml -p ./roles --force +ansible-galaxy role install -r ./requirements.yml --force ``` -Make sure you have enough space in `/srv` (on the Stratum 1) since the snapshot of the Stratum 0 -will end up there by default. To alter the directory where the snapshot gets copied to you can add -this variable in `inventory/host_vars/`: - +Make sure you have enough space in `/srv` on the Stratum 1, since the snapshots of the repositories +will end up there by default. 
To alter the directory where the snapshots get stored you can manually +create a symlink before running the playbook: ```bash -cvmfs_srv_mount: /srv +sudo ln -s /lots/of/space/cvmfs /srv/cvmfs ``` -Make sure that you have added the hostname or IP address of your server to the -`inventory/hosts` file. Finally, install the Stratum 1 using one of the two following options. +Also make sure that you have added the hostname or IP address of your server to the +`inventory/hosts` file, that you are able to log in to the server from the machine that is going to run the playbook +(preferably using an SSH key), and that you can use `sudo`. -Option 1: +Finally, install the Stratum 1 using: ``` bash -# -b to run as root, optionally use -K if a sudo password is required -ansible-playbook -b [-K] -e @inventory/local_site_specific_vars.yml stratum1.yml +# -b to run as root, optionally use -K if a sudo password is required, and optionally include your site-specific variables +ansible-playbook -b [-K] [-e @inventory/local_site_specific_vars.yml] stratum1.yml ``` - -Option2: - -Create a ssh key pair and make sure the `ansible-host-keys.pub` is in the -`$HOME/.ssh/authorized_keys` file on your Stratum 1 server. - -```bash -ssh-keygen -b 2048 -t rsa -f ~/.ssh/ansible-host-keys -q -N "" -``` - -Then run the playbook: - -```bash -ansible-playbook -b --private-key ~/.ssh/ansible-host-keys -e @inventory/local_site_specific_vars.yml stratum1.yml -``` - Running the playbook will automatically make replicas of all the repositories defined in `group_vars/all.yml`. -## Step 2: request a firewall exception - -(This step is not implemented yet and can be skipped) - -You can request a firewall exception rule to be added for your Stratum 1 server by -[opening an issue on the GitHub page of the filesystem layer repository](https://github.com/EESSI/filesystem-layer/issues/new). +### Verification of the Stratum 1 using `curl` -Make sure to include the IP address of your server. 
- -## Step 3: Verification of the Stratum 1 - -When the playbook has finished your Stratum 1 should be ready. In order to test your Stratum 1, even -without a client installed, you can use `curl`. +When the playbook has finished, your Stratum 1 should be ready. In order to test your Stratum 1, +even without a client installed, you can use `curl`: ```bash curl --head http:///cvmfs/software.eessi.io/.cvmfspublished ``` -This should return: +This should return something like: ```bash HTTP/1.1 200 OK ... -X-Cache: MISS from -``` - -The second time you run it, you should get a cache hit: - -```bash -X-Cache: HIT from - +Content-Type: application/x-cvmfs ``` -Example with the Norwegian Stratum 1: +Example with the EESSI Stratum 1 running in AWS: ```bash -curl --head http://bgo-no.stratum1.cvmfs.eessi-infra.org/cvmfs/software.eessi.io/.cvmfspublished +curl --head http://aws-eu-central-s1.eessi.science/cvmfs/software.eessi.io/.cvmfspublished ``` -You can also test access to your Stratum 1 from a client, for which you will have to install the CVMFS -[client](https://github.com/EESSI/filesystem-layer#clients). - -Then run the following command to add your newly created Stratum 1 to the existing list of EESSI Stratum 1 servers by creating a local CVMFS configuration file: +### Verification of the Stratum 1 using a CVMFS client -```bash -echo 'CVMFS_SERVER_URL="http:///cvmfs/@fqrn@;$CVMFS_SERVER_URL"' | sudo tee -a /etc/cvmfs/domain.d/eessi-hpc.org.local -``` +You can, of course, also test access to your Stratum 1 from a client. +This requires you to install a CernVM-FS client and add the Stratum 1 to the client configuration; +this is explained in more detail on the [native installation page](../getting_access/native_installation.md). 
-If this is the first time you set up the client you now run: - -```bash -sudo cvmfs_config setup -``` - -If you already had configured the client before, you can simply reload the config: - -```bash -sudo cvmfs_config reload -c software.eessi.io -``` - -Finally, verify that the client connects to your new Stratum 1 by running: +Then verify that the client connects to your new Stratum 1 by running: ```bash cvmfs_config stat -v software.eessi.io ``` -Assuming that your new Stratum 1 is the geographically closest one to your client, this should return: +Assuming that your new Stratum 1 is working properly, this should return something like: ```bash Connection: http:///cvmfs/software.eessi.io through proxy DIRECT (online) ``` - - -## Step 4: request an EESSI DNS name - -In order to keep the configuration clean and easy, all the EESSI Stratum 1 servers have a DNS name -`.stratum1.cvmfs.eessi-infra.org`, where `` is often a short name or -abbreviation followed by the country code (e.g. `rug-nl` or `bgo-no`). You can request this for -your Stratum 1 by mentioning this in the issue that you created in Step 2, or by opening another -issue. - -## Step 5: include your Stratum 1 in the EESSI configuration - -If you want to include your Stratum 1 in the EESSI configuration, i.e. allow any (nearby) client to be able to use it, -you can open a pull request with updated configuration files. You will only have to add the URL to your Stratum 1 to the -`urls` list of the `eessi_cvmfs_server_urls` variable in the -[`all.yml` file](https://github.com/EESSI/filesystem-layer/blob/main/inventory/group_vars/all.yml). 
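The `curl` check from the "Verification of the Stratum 1 using `curl`" section can be scripted, for example for use in a monitoring job. The following is a minimal sketch and not part of the EESSI tooling: the function name `check_cvmfs_headers` is hypothetical, and it assumes that a healthy Stratum 1 answers with a `200` status line and a `Content-Type: application/x-cvmfs` header, as in the example output of that section.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of EESSI or CernVM-FS):
# reads HTTP response headers on stdin, e.g. from `curl --head`, and
# succeeds only if the status is 200 and the content type is application/x-cvmfs.
check_cvmfs_headers() {
    local headers
    # HTTP headers use CRLF line endings; strip the carriage returns first.
    headers=$(tr -d '\r')

    # The first line is the status line, e.g. "HTTP/1.1 200 OK".
    printf '%s\n' "$headers" | head -n 1 \
        | grep -Eq '^HTTP/[0-9.]+ 200( |$)' || return 1

    # Header names are case-insensitive, so match without regard to case.
    printf '%s\n' "$headers" \
        | grep -Eiq '^content-type:[[:space:]]*application/x-cvmfs' || return 1
}
```

It could be used as `curl --silent --head "http://${STRATUM1_HOST}/cvmfs/software.eessi.io/.cvmfspublished" | check_cvmfs_headers && echo "Stratum 1 OK"`, where `STRATUM1_HOST` is a placeholder for the hostname or IP address of your own server.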
diff --git a/docs/getting_access/native_installation.md b/docs/getting_access/native_installation.md index b5cd197f0..e35b3df41 100644 --- a/docs/getting_access/native_installation.md +++ b/docs/getting_access/native_installation.md @@ -1,5 +1,7 @@ # Native installation +## Installation for single clients + Setting up native access to EESSI, that is a system-wide deployment that does not require workarounds like [using a container](eessi_container.md), requires the installation and configuration of [CernVM-FS](https://cernvm.cern.ch/fs). @@ -62,14 +64,58 @@ The good news is that all of this only requires a handful commands :astonished: sudo cvmfs_config setup ``` +## Installation for larger systems (e.g. clusters) + +When using CernVM-FS on a larger number of local clients, e.g. on an HPC cluster or set of workstations, +it is very strongly recommended to at least set up some Squid proxies close to your clients. +These Squid proxies will be used to cache content that was recently accessed by your clients, +which reduces the load on the Stratum 1 servers and reduces the latency for your clients. +As a rule of thumb, you should use about one proxy per 500 clients, and have a minimum of two. +Instructions for setting up a Squid proxy can be found in the [CernVM-FS documentation](https://cvmfs.readthedocs.io/en/stable/cpt-squid.html) and +in the [CernVM-FS tutorial](https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/03_stratum1_proxies/#32-setting-up-a-proxy). + +Additionally, setting up a private Stratum 1, which will make a full copy of the repository, + can be beneficial to improve the latency and bandwidth even further, and to be better protected against network outages. +Instructions for setting up your own EESSI Stratum 1 can be found in [setting up your own CernVM-FS Stratum 1 mirror server](../filesystem_layer/stratum1.md). 
+ +### Configuring your client to use a Squid proxy + +If you have set up one or more Squid proxies, you will have to add them to your CernVM-FS client configuration. +This can be done by removing `CVMFS_CLIENT_PROFILE="single"` from `/etc/cvmfs/default.local`, and adding the following line: + +``` +CVMFS_HTTP_PROXY="http://ip-of-your-1st-proxy:port|http://ip-of-your-2nd-proxy:port" +``` + +In this case, both proxies are equally preferable. +More advanced use cases can be found in [the CernVM-FS documentation](https://cvmfs.readthedocs.io/en/stable/cpt-configure.html#proxy-list-examples). + +### Configuring your client to use a private Stratum 1 mirror server + +If you have set up your own Stratum 1 mirror server that replicates the EESSI CernVM-FS repositories, +you can instruct your CernVM-FS client(s) to use it by creating a local CVMFS configuration file for the EESSI domain that prepends your newly created Stratum 1 to the existing list of EESSI Stratum 1 servers: + +```bash +echo 'CVMFS_SERVER_URL="http:///cvmfs/@fqrn@;$CVMFS_SERVER_URL"' | sudo tee -a /etc/cvmfs/domain.d/eessi.io.local +``` + !!! note + By prepending your new Stratum 1 to the list of existing Stratum 1 servers, your clients should by default use the private Stratum 1. + In case of downtime of your private Stratum 1, they will also still be able to make use of the public EESSI Stratum 1 servers. + + +### Applying changes in the CernVM-FS client configuration files + +After you have made any changes to the CernVM-FS client configuration, you will have to apply them. +If this is the first time you set up the client, you can simply run: - :point_up: The commands above only cover the basic installation of EESSI. +```bash +sudo cvmfs_config setup +``` - This is good enough for an individual client, or for testing purposes, - but for a production-quality setup you should also set up a Squid proxy cache. 
+If you had already configured the client before, you can reload the configuration for the EESSI repository (or, similarly, for any other repository) using: - For large-scale systems, like an HPC cluster, you should also consider setting up your own CernVM-FS Stratum-1 mirror server. +```bash +sudo cvmfs_config reload -c software.eessi.io +``` - For more details on this, please refer to the - [*Stratum 1 and proxies section* of the CernVM-FS tutorial](https://cvmfs-contrib.github.io/cvmfs-tutorial-2021/03_stratum1_proxies/).
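The `CVMFS_HTTP_PROXY` line from the "Configuring your client to use a Squid proxy" section can be generated rather than typed by hand when the list of proxies comes from an inventory or a script. The following is a minimal sketch under stated assumptions: the function name `make_proxy_line` is hypothetical, and it only builds the simple case shown in that section, where all proxies are separated by `|` and are therefore equally preferable.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of EESSI or CernVM-FS):
# joins all proxy URLs given as arguments with '|' and prints the
# resulting CVMFS_HTTP_PROXY line for /etc/cvmfs/default.local.
make_proxy_line() {
    # "$*" joins the arguments using the first character of IFS;
    # IFS is changed only inside this function.
    local IFS='|'
    printf 'CVMFS_HTTP_PROXY="%s"\n' "$*"
}
```

For example, `make_proxy_line http://10.0.0.1:3128 http://10.0.0.2:3128` prints `CVMFS_HTTP_PROXY="http://10.0.0.1:3128|http://10.0.0.2:3128"`. The addresses are placeholders, and `3128` is only Squid's default port. After adding the line to the client configuration, apply the change with `cvmfs_config setup` or `cvmfs_config reload` as described in the "Applying changes" section.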