Parallelize CVMFS monitoring to bring back the CVMFS Grafana dashboard
The script `/usr/bin/check_cvmfs_repos` installed by the CVMFS monitoring role `hxr.monitor-cvmfs` takes longer than 2 minutes to run (the Telegraf timeout for this script) due to misbehaving CVMFS servers and serial execution. This results in no measurements being registered.

```
Jul 25 13:52:00 cvmfs1-ufr0.internal.galaxyproject.eu telegraf[2616631]: 2024-07-25T11:52:00Z E! [inputs.exec] Error in plugin: exec: command timed out for command "/usr/bin/check_cvmfs_repos": /usr/bin/check_cvmfs_repos: line 9: [: : integer expression expected...
```
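
The `integer expression expected` message in the log above is what `[` prints when `-eq` is given an empty operand, which would be consistent with `http_code` ending up empty when `curl` returns no status line. This commit leaves that comparison unchanged. A minimal sketch of one way to tolerate an empty value, assuming a string comparison is acceptable here; `is_http_ok` is a hypothetical helper and not part of the role:

```shell
#!/bin/sh
# Hypothetical helper (not in hxr.monitor-cvmfs): succeed only for status "200".
is_http_ok() {
    code="$1"
    # `=` compares strings, so an empty $code is simply "not 200".
    # `-eq` would instead complain that "" is not an integer.
    if [ "$code" = "200" ]; then
        return 0
    fi
    return 1
}

# Empty status, as when curl produced no output before giving up.
if is_http_ok ""; then echo "ok"; else echo "not ok"; fi
# Normal successful response.
if is_http_ok "200"; then echo "ok"; else echo "not ok"; fi
```

With this shape, a server that never answers is reported as a failed check rather than producing a shell error.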

Add a timeout to the `curl` calls in the `check_cvmfs_repos` script from the CVMFS monitoring role `hxr.monitor-cvmfs`, and run all `check_repo` calls in parallel, so that the script is guaranteed to exit before the Telegraf timeout.
kysrpex committed Jul 25, 2024
1 parent 9e1bbfb commit d50c83d
Showing 1 changed file with 5 additions and 3 deletions.
8 changes: 5 additions & 3 deletions roles/hxr.monitor-cvmfs/templates/main.sh
@@ -3,8 +3,8 @@ check_repo() {
host="$1"
repo="$2"

-http_code="$(curl http://$host/cvmfs/$repo/.cvmfspublished -I --silent | head -n 1 | cut -f2 -d' ')"
-header="$(curl http://$host/cvmfs/$repo/.cvmfspublished --silent | head -n 12)"
+http_code="$(curl --max-time 20 http://$host/cvmfs/$repo/.cvmfspublished -I --silent | head -n 1 | cut -f2 -d' ')"
+header="$(curl --max-time 20 http://$host/cvmfs/$repo/.cvmfspublished --silent | head -n 12)"

if [ "$http_code" -eq "200" ]; then
# https://cvmfs.readthedocs.io/en/stable/cpt-details.html#repository-manifest-cvmfspublished
@@ -21,6 +21,8 @@ check_repo() {

{% for host in cvmfs_check_servers.hosts %}
{% for repo in cvmfs_check_servers.repos %}
-check_repo {{ host }} {{ repo }}
+check_repo {{ host }} {{ repo }} &
{% endfor %}
{% endfor %}

+wait
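
The effect of `&` followed by `wait` can be seen in isolation with a small sketch. It assumes a stand-in function, `fake_check`, that sleeps instead of calling `curl`; the repository names and the 2-second delay are illustrative only.

```shell
#!/bin/sh
# Stand-in for check_repo (assumption for this demo): sleep, then report.
fake_check() {
    sleep "$2"
    echo "checked $1"
}

run_parallel() {
    # Start every check in the background, then block until all have finished.
    fake_check repo-a 2 &
    fake_check repo-b 2 &
    fake_check repo-c 2 &
    wait
}

start="$(date +%s)"
run_parallel
end="$(date +%s)"
# Run serially, the three checks would need about 6 seconds in total.
echo "elapsed: $((end - start))s"
```

Because the jobs overlap, the total runtime tracks the slowest single check rather than the sum of all checks, which is why a per-call `--max-time` also bounds the runtime of the whole script.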
