Update page about setting up a private Stratum 1 #157

# Setting up a Stratum 1

Setting up a Stratum 1 involves the following steps:

- set up the Stratum 1, preferably by running the Ansible playbook that we provide;
- request a Stratum 0 firewall exception for your Stratum 1 server;
- request a `<your site>.stratum1.cvmfs.eessi-infra.org` DNS entry;
- open a pull request to include the URL to your Stratum 1 in the EESSI configuration.

The last two steps can be skipped if you want to host a "private" Stratum 1 for your site.

The EESSI project provides a number of geographically distributed public Stratum 1 servers that you can use to make EESSI available on your machine(s).
If you want to be better protected against network outages and increase the bandwidth between your cluster nodes and the Stratum 1 servers,
you could consider setting up a local (private) Stratum 1 server that replicates the EESSI CVMFS repository.
This guarantees that you always have a full and up-to-date copy of the entire stack available in your local network.

## Requirements for a Stratum 1

The main requirements for a Stratum 1 server are a good network connection to the clients it is going to serve,
and sufficient disk space. As the EESSI repository is constantly growing, make sure that the disk space can easily be extended if necessary.
Currently, we recommend having at least 1 TB available.

In terms of cores and memory, a machine with just a few (~4) cores and 4-8 GB of memory should suffice.

Various Linux distributions are supported, but we recommend one based on RHEL 8 or 9.

## Configure the Stratum 1

Stratum 1 servers usually replicate from the Stratum 0 server.

In order to ensure the stability and security of the EESSI Stratum 0 server, it has a strict firewall, and only the EESSI-maintained public Stratum 1 servers are allowed to replicate from it.
However, EESSI provides a synchronisation server that can be used for setting up private Stratum 1 replica servers, and this is available at `http://aws-eu-west-s1-sync.eessi.science`.

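Before you set up the replica, you can do a quick sanity check that this synchronisation server is reachable from the machine that will become your Stratum 1. This is just a connectivity check, using the same kind of `curl` command as the verification steps further down this page:

```bash
# Fetch the manifest of the software.eessi.io repository from the EESSI synchronisation server;
# an "HTTP/1.1 200 OK" response indicates that the server can be reached.
curl --head http://aws-eu-west-s1-sync.eessi.science/cvmfs/software.eessi.io/.cvmfspublished
```
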
!!! warn Potential issues with intrusion prevention systems
    In the past we have seen a few occurrences of data transfer issues when files were being pulled in by or from a Stratum 1 server.
    In such cases the `cvmfs_server snapshot` command, used for synchronizing the Stratum 1, may break with errors like `failed to download <URL to file>`.
    Trying to manually download the mentioned file with `curl` will also not work, and result in errors like:
    ```
    curl: (56) Recv failure: Connection reset by peer
    ```
    In all cases this was due to an intrusion prevention system scanning the associated network, and hence scanning all files going in or out of the Stratum 1.
    Though it was a false positive in all cases, this breaks the synchronization procedure of your Stratum 1.
    If this is the case, you can try switching to HTTPS by using `https://aws-eu-west-s1-sync.eessi.science` for synchronizing your Stratum 1.
    Even though there is no advantage for CVMFS itself in using HTTPS (it has built-in mechanisms for ensuring the integrity of the data),
    this will prevent the described issues, as the intrusion prevention system will not be able to inspect the encrypted data.
    As HTTPS does introduce some overhead due to the encryption/decryption, plain HTTP is still the recommended default.

*Review discussion on this note:*

- Why aren't we always recommending running with HTTPS? Is there a downside? I'd say there might be a small speed penalty due to the encryption/decryption. Maybe mention that this is why plain HTTP is the default.
- That's indeed the main downside, as far as I know. Added a sentence about it in f830adb.
- Not only is there no advantage; it should be noted that this is a disadvantage, because it makes caching in forward proxies impossible (unless, hypothetically, you distribute the private TLS keys of the stratum servers to all the squids so they can do the TLS termination). I would not recommend it.
- There is also a typo: mechasnims

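If you do decide to synchronise over HTTPS, you can first verify that the HTTPS endpoint of the synchronisation server is reachable from your Stratum 1 machine, for example:

```bash
# Same check as before, but over HTTPS; an "HTTP/1.1 200 OK" response indicates that
# encrypted transfers from the synchronisation server work at your site.
curl --head https://aws-eu-west-s1-sync.eessi.science/cvmfs/software.eessi.io/.cvmfspublished
```

For an existing replica, the upstream URL used by `cvmfs_server snapshot` should be stored as `CVMFS_STRATUM0` in `/etc/cvmfs/repositories.d/<repository>/server.conf`; adjusting the scheme there to `https` is one way to make the switch (check the CernVM-FS documentation for your version to be sure).
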
Finally, make sure that ports 80 (for the Apache web server) and 8000 are open.

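How you open these ports depends on your site setup; a minimal sketch, assuming a RHEL-based system that uses `firewalld`:

```bash
# Open ports 80 and 8000 for the Stratum 1 services, then reload the firewall configuration.
sudo firewall-cmd --permanent --add-port=80/tcp
sudo firewall-cmd --permanent --add-port=8000/tcp
sudo firewall-cmd --reload
```
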
### Manual configuration

In order to set up a Stratum 1 manually, you can make use of the instructions in the [Private Stratum 1 replica server](https://multixscale.github.io/cvmfs-tutorial-hpc-best-practices/access/stratum1/)
section of the MultiXscale tutorial ["Best Practices for CernVM-FS in HPC"](https://multixscale.github.io/cvmfs-tutorial-hpc-best-practices/).

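For reference, the core of such a manual setup boils down to creating a replica that points at the EESSI synchronisation server and pulling in an initial snapshot. A very condensed sketch (the public key path below is an assumption; the tutorial linked above covers the required packages, the Apache configuration, and the remaining details):

```bash
# Create a replica of the EESSI software repository, using the EESSI synchronisation server
# as upstream; the public key path is an assumption, adjust it to where the EESSI key is installed.
sudo cvmfs_server add-replica -o root \
    http://aws-eu-west-s1-sync.eessi.science/cvmfs/software.eessi.io \
    /etc/cvmfs/keys/eessi.io/eessi.io.pub

# Pull in an initial snapshot of the repository (this can take a long time for a full copy).
sudo cvmfs_server snapshot software.eessi.io
```
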
### Configuration using Ansible

The recommended way for setting up an EESSI Stratum 1 is by running the Ansible playbook `stratum1.yml`
from the [filesystem-layer repository on GitHub](https://github.com/EESSI/filesystem-layer).
For the commands in this section, we are assuming that you cloned this repository, and your working directory is `filesystem-layer`.

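For example:

```bash
# Clone the EESSI filesystem-layer repository and switch to it.
git clone https://github.com/EESSI/filesystem-layer.git
cd filesystem-layer
```
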
!!! note GEO API
    Installing a Stratum 1 usually requires a GEO API license key, which will be used to find the (geographically) closest Stratum 1 server for your clients and proxies.
    However, for a private Stratum 1 this can be skipped, as clients should just connect to your local Stratum 1 by default.

If you do want to set up the GEO API, you can find more information on how to (freely) obtain this key in the CVMFS documentation: https://cvmfs.readthedocs.io/en/stable/cpt-replica.html#geo-api-setup.

You can put your license key in the local configuration file `inventory/local_site_specific_vars.yml`.

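For instance (the variable name below is hypothetical; check the example configuration files in the repository for the exact name expected by the roles that the playbook uses):

```bash
# Append a (hypothetical) GEO API license key setting to the site-specific configuration file.
echo 'cvmfs_geo_license_key: <your license key>' >> inventory/local_site_specific_vars.yml
```
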
!!! note Squid reverse proxy
    The Stratum 1 playbook also installs and configures a Squid reverse proxy on the server. The template configuration file for Squid can be found at `templates/eessi_stratum1_squid.conf.j2`.
    If you want to customize it, for instance for limiting the access to the Stratum 1, you can make your own version of this template file
    and point to it by setting `local_stratum1_cvmfs_squid_conf_src` in `inventory/local_site_specific_vars.yml`.
    See the comments in the example file for more details.

*Review discussion on this note:*

- Wait, is that needed? What's the point of running a Squid next to a Stratum 1 on the same machine? The clients might as well directly connect to the Stratum 1 then, no? (Probably my limited knowledge, but might be something to explain here as well.)
- Good point. Caching of data is actually not needed, since the data is already on the same disk. I think the main use case of this Squid is then to cache GEO API lookups, but since we recommend to disable that on private Stratum 1s, this doesn't make sense. Let me check if I can easily introduce an option for this in the playbook, so that it won't set up Squid by default, unless specifically requested (which we can then do in our playbooks for the public servers).
- Actually, it does do some in-memory caching as well; that could perhaps still be somewhat beneficial...
- I agree with @casparvl. You could just leave the memory to the OS to help cache data for httpd. IIRC Dave recommends using a reverse proxy only for some monitoring capability that OSG uses. A priori I wouldn't expect much performance benefit; if anything it could introduce a small latency.
- With the new version of our playbooks the Squid installation has been made optional, and it is disabled by default. So, to avoid any confusion, I've removed this part in 7c6742c, as I don't think it should be part of this documentation.

Start by installing Ansible, e.g.:

```bash
sudo yum install -y ansible
```

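On more recent RHEL-based distributions the package may be named `ansible-core` and be installed with `dnf` instead, e.g.:

```bash
sudo dnf install -y ansible-core
```
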
Then install Ansible roles for EESSI:

```bash
ansible-galaxy role install -r ./requirements.yml -p ./roles --force
```

Make sure you have enough space in `/srv` on the Stratum 1, since the snapshots of the repositories
will end up there by default. To alter the directory where the snapshots get stored, you can add
the following variable in `inventory/host_vars/<url-or-ip-to-your-stratum1>`:

```bash
cvmfs_srv_mount: /lots/of/space
```

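To check how much space is currently available in that location, you can simply run:

```bash
df -h /srv
```
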
Also make sure that you have added the hostname or IP address of your server to the
`inventory/hosts` file, that you are able to log in to the server from the machine that is going to run the playbook
(preferably using an SSH key), and that you can use `sudo`.

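A quick way to confirm that Ansible can reach the server and become root is an ad-hoc ping, for example (assuming `inventory/hosts` is the inventory file you edited above):

```bash
# Contact all hosts in the inventory and become root on the remote side (-b);
# add -K if a sudo password is required.
ansible -i inventory/hosts -b -m ping all
```
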
Finally, install the Stratum 1 using:

```bash
# -b to run as root, optionally use -K if a sudo password is required, and optionally include your site-specific variables
ansible-playbook -b [-K] [-e @inventory/local_site_specific_vars.yml] stratum1.yml
```

Running the playbook will automatically make replicas of all the repositories defined in `group_vars/all.yml`.

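Once the playbook has finished, you can manually pull in updates for all replicated repositories at any time (the playbook is expected to also set up periodic synchronisation, but this is a useful check):

```bash
# Create a new snapshot of all replicated repositories on this Stratum 1.
sudo cvmfs_server snapshot -a
```
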
### Verification of the Stratum 1 using `curl`

When the playbook has finished, your Stratum 1 should be ready. In order to test your Stratum 1,
even without a client installed, you can use `curl`:

```bash
curl --head http://<url-or-ip-to-your-stratum1>/cvmfs/software.eessi.io/.cvmfspublished
```

The first time you run it, the `X-Cache` header in the response will typically report a cache miss.
The second time you run it, you should get a cache hit:

```bash
X-Cache: HIT from <url-or-ip-to-your-stratum1>
```

Example with the EESSI Stratum 1 running in AWS:

```bash
curl --head http://aws-eu-central-s1.eessi.science/cvmfs/software.eessi.io/.cvmfspublished
```

### Verification of the Stratum 1 using a CVMFS client

You can also test access to your Stratum 1 from a client, for which you will have to install the CVMFS
[client](https://github.com/EESSI/filesystem-layer#clients).

Then run the following command to prepend your newly created Stratum 1 to the existing list of EESSI Stratum 1 servers by creating a local CVMFS configuration file:

```bash
echo 'CVMFS_SERVER_URL="http://<url-or-ip-to-your-stratum1>/cvmfs/@fqrn@;$CVMFS_SERVER_URL"' | sudo tee -a /etc/cvmfs/domain.d/eessi.io.local
```

!!! note
    By prepending your new Stratum 1 to the list of existing Stratum 1 servers, your clients should by default use the private Stratum 1.
    In case of downtime of your private Stratum 1, they will also still be able to make use of the public EESSI Stratum 1 servers.

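You can verify that your own Stratum 1 ended up at the front of the server list by inspecting the client configuration, for example:

```bash
# Show the effective client configuration for the EESSI software repository and filter out the server list.
cvmfs_config showconfig software.eessi.io | grep CVMFS_SERVER_URL
```
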
If this is the first time you set up the client, you now run:

```bash
sudo cvmfs_config setup
```

Finally, verify that the client connects to your new Stratum 1 by running:

```bash
cvmfs_config stat -v software.eessi.io
```

Assuming that your new Stratum 1 is working properly, this should return something like:

```bash
Connection: http://<url-or-ip-to-your-stratum1>/cvmfs/software.eessi.io through proxy DIRECT (online)
```

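As an additional check, you can also probe the repository, which verifies that it can actually be mounted and accessed:

```bash
cvmfs_config probe software.eessi.io
```
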
*Review discussion:*

- Should this say one or multiple (private) Stratum 1(s)? I don't remember what our recommendation was in the best practice training, I think it was about 1 per 500 clients or so. Or was the recommendation: one Stratum 1 + one proxy per 500 clients?
- Actually, maybe add that after this sentence, or even at the end of the paragraph: "For large systems, consider setting up multiple Stratum 1 servers. Approximately one Stratum 1 per 500 clients is recommended."
- That was about proxies, indeed. I could still say multiple here, but the advantage of having multiple is really minimal, I think.
- You could have a call-out at the end to mention these kinds of points, it doesn't need to be here already.
- You may even want to know how to "upgrade" your own S1 to an S0, so you can sync within your network.
- I don't think there is much benefit to having multiple private Stratum 1 servers, but multiple proxies is a good idea.
- a4851b0 adds a sentence with a recommendation to at least have local proxies (a bit outside the scope of this page, but probably good to mention that here as well). Left the recommendation for a Stratum 1 unchanged, as I don't see much value in having more than one either.
- Agreed, good point by @rptaylor that multiple proxies is sufficient; the only reason to have a Stratum 1 is to be resilient against network outage. For that purpose, one is enough.