Commit

Merge pull request #1 from EESSI/develop
Merge bot changes since we have software.eessi.io
Neves-Bot authored Dec 18, 2023
2 parents e7c6d65 + e883264 commit b661632
Showing 9 changed files with 233 additions and 50 deletions.
85 changes: 71 additions & 14 deletions README.md
@@ -18,21 +18,39 @@ The bot consists of two main components provided in this repository:

## <a name="prerequisites"></a>Prerequisites

- GitHub account(s) (two needed for a development scenario), referring to them as `YOU_1` and `YOU_2` below
- A fork, say `YOU_1/software-layer`, of [EESSI/software-layer](https://github.com/EESSI/software-layer) and a fork, say `YOU_2/software-layer` of your first fork if you want to emulate the bot's behaviour but not change EESSI's repository. The EESSI bot will act on events triggered for the target repository (in this context, either `EESSI/software-layer` or `YOU_1/software-layer`).
- Access to a frontend/login node/service node of a Slurm cluster where the EESSI bot components will run. For the sake of brevity, we call this node simply `bot machine`.
- `singularity` with version 3.6 or newer _OR_ `apptainer` with version 1.0 or newer on the compute nodes of the Slurm cluster.
- GitHub account(s) (two needed for a development scenario), referring to them
as `YOU_1` and `YOU_2` below
- A fork, say `YOU_1/software-layer`, of
[EESSI/software-layer](https://github.com/EESSI/software-layer) and a fork,
say `YOU_2/software-layer` of your first fork if you want to emulate the
bot's behaviour but not change EESSI's repository. The EESSI bot will act on
events triggered for the target repository (in this context, either
`EESSI/software-layer` or `YOU_1/software-layer`).
- Access to a frontend/login node/service node of a Slurm cluster where the
EESSI bot components will run. For the sake of brevity, we call this node
simply `bot machine`.
- `singularity` with version 3.6 or newer _OR_ `apptainer` with version 1.0 or
newer on the compute nodes of the Slurm cluster.
- On the cluster frontend (or where the bot components run), different tools
may be needed to run the Smee client. For `x86_64`, `singularity` or
`apptainer` are sufficient. For `aarch64`, the package manager `npm` is
needed.
- The EESSI bot components and the (build) jobs will frequently access the
Internet. Hence, worker nodes and the `bot machine` of the Slurm cluster need
access to the Internet (either directly or via an HTTP proxy).
Internet. Hence, worker nodes and the `bot machine` of the Slurm cluster
need access to the Internet (either directly or via an HTTP proxy).

## <a name="step1"></a>Step 1: Smee.io channel and smee client

We use [smee.io](https://smee.io) as a service to relay events from GitHub to the EESSI bot. To do so, create a new channel via https://smee.io and note the URL, e.g., `https://smee.io/CHANNEL-ID`.
We use [smee.io](https://smee.io) as a service to relay events from GitHub
to the EESSI bot. To do so, create a new channel via https://smee.io and note
the URL, e.g., `https://smee.io/CHANNEL-ID`.

On the `bot machine` we need a tool which receives events relayed from
`https://smee.io/CHANNEL-ID` and forwards them to the EESSI bot. We use the Smee
client for this. The Smee client can be run via a container as follows
client for this.

On machines with `x86_64` architecture, the Smee client can be run via a
container as follows

```
singularity pull docker://deltaprojects/smee-client
@@ -48,6 +66,25 @@ singularity run smee-client_latest.sif --port 3030 --url https://smee.io/CHANNEL

for specifying a different port than the default (3000).

On machines with `aarch64` architecture, we can install the Smee client via
the `npm` package manager as follows

```
npm install smee-client
```

and then run it with the default port (3000)

```
node_modules/smee-client/bin/smee.js --url https://smee.io/CHANNEL-ID
```

Another port can be used by adding the `--port PORT` argument, for example,

```
node_modules/smee-client/bin/smee.js --port 3030 --url https://smee.io/CHANNEL-ID
```

## <a name="step2"></a>Step 2: Registering GitHub App

We need to:
@@ -402,10 +439,24 @@ endpoint_url = URL_TO_S3_SERVER
```
`endpoint_url` provides an endpoint (URL) to a server hosting an S3 bucket. The server could be hosted by a commercial cloud provider like AWS or Azure, or running in a private environment, for example, using Minio. The bot uploads tarballs to the bucket which will be periodically scanned by the ingestion procedure at the Stratum 0 server.


```ini
# example: same bucket for all target repos
bucket_name = "eessi-staging"
```
bucket_name = eessi-staging
```ini
# example: bucket to use depends on target repo
bucket_name = {
"eessi-pilot-2023.06": "eessi-staging-2023.06",
"eessi.io-2023.06": "software.eessi.io-2023.06",
}
```
`bucket_name` is the name of the bucket used for uploading of tarballs. The bucket must be available on the default server (`https://${bucket_name}.s3.amazonaws.com`), or the one provided via `endpoint_url`.

`bucket_name` is the name of the bucket used for uploading of tarballs.
The bucket must be available on the default server (`https://${bucket_name}.s3.amazonaws.com`), or the one provided via `endpoint_url`.

`bucket_name` can be specified as a string value to use the same bucket for all target repos, or it can be a mapping from target repo id to bucket name.
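
The two forms of `bucket_name` could be resolved along the following lines. This is only a minimal sketch, assuming the value arrives as a raw string from the configuration file and that the mapping variant is written as JSON; the helper name `resolve_bucket_name` is hypothetical and not taken from the bot's code.

```python
import json


def resolve_bucket_name(raw_value, repo_id):
    """Return the bucket to use for repo_id (hypothetical helper).

    raw_value is the string read from the config file: either a plain
    bucket name or a JSON object mapping target repo ids to bucket names.
    """
    value = raw_value.strip()
    if value.startswith("{"):
        # mapping variant: pick the bucket configured for this repo id
        mapping = json.loads(value)
        if repo_id not in mapping:
            raise KeyError(f"no bucket configured for repo id '{repo_id}'")
        return mapping[repo_id]
    # string variant: same bucket for every target repo (drop optional quotes)
    return value.strip('"')
```

Under this assumption the mapping has to be valid JSON; `json.loads` rejects a trailing comma after the last entry.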


```
upload_policy = once
@@ -473,10 +524,10 @@ repos_cfg_dir = PATH_TO_SHARED_DIRECTORY/cfg_bundles
The `repos.cfg` file also uses the `ini` format as follows
```ini
[eessi-2023.06]
repo_name = pilot.eessi-hpc.org
repo_name = software.eessi.io
repo_version = 2023.06
config_bundle = eessi-hpc.org-cfg_files.tgz
config_map = { "eessi-hpc.org/cvmfs-config.eessi-hpc.org.pub":"/etc/cvmfs/keys/eessi-hpc.org/cvmfs-config.eessi-hpc.org.pub", "eessi-hpc.org/ci.eessi-hpc.org.pub":"/etc/cvmfs/keys/eessi-hpc.org/ci.eessi-hpc.org.pub", "eessi-hpc.org/pilot.eessi-hpc.org.pub":"/etc/cvmfs/keys/eessi-hpc.org/pilot.eessi-hpc.org.pub", "default.local":"/etc/cvmfs/default.local", "eessi-hpc.org.conf":"/etc/cvmfs/domain.d/eessi-hpc.org.conf"}
config_bundle = eessi.io-cfg_files.tgz
config_map = {"eessi.io/eessi.io.pub":"/etc/cvmfs/keys/eessi.io/eessi.io.pub", "default.local":"/etc/cvmfs/default.local", "eessi.io.conf":"/etc/cvmfs/domain.d/eessi.io.conf"}
container = docker://ghcr.io/eessi/build-node:debian11
```
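
To illustrate how such a section can be consumed, here is a minimal sketch that parses the updated `software.eessi.io` values with Python's standard `configparser` and decodes `config_map` as JSON. The function name `load_repos` is hypothetical, and treating `config_map` as JSON is an assumption based on how the value is written in the example.

```python
import configparser
import json

# the updated values from the example, embedded as a string for illustration
REPOS_CFG = """
[eessi-2023.06]
repo_name = software.eessi.io
repo_version = 2023.06
config_bundle = eessi.io-cfg_files.tgz
config_map = {"eessi.io/eessi.io.pub":"/etc/cvmfs/keys/eessi.io/eessi.io.pub", "default.local":"/etc/cvmfs/default.local", "eessi.io.conf":"/etc/cvmfs/domain.d/eessi.io.conf"}
container = docker://ghcr.io/eessi/build-node:debian11
"""


def load_repos(text):
    """Return {repo_id: settings} with config_map decoded into a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    repos = {}
    for repo_id in parser.sections():
        settings = dict(parser[repo_id])
        # assumption: config_map maps files in the bundle to target paths
        settings["config_map"] = json.loads(settings["config_map"])
        repos[repo_id] = settings
    return repos


repos = load_repos(REPOS_CFG)
```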
The repository id is given in brackets (`[eessi-2023.06]`). Then the name of the repository (`repo_name`) and the
@@ -595,11 +646,17 @@ multiple_tarballs = Found {num_tarballs} tarballs in job dir - only 1 matching `
`multiple_tarballs` is used to report that multiple tarballs have been found.

```
job_result_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for details)_</summary><ul><li>Job results file `{filename}` does not exist in job directory or reading it failed.</li><li>No artefacts were found/reported.</li></ul></details>
job_result_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for details)_</summary><ul><li>Job results file `{filename}` does not exist in job directory, or parsing it failed.</li><li>No artefacts were found/reported.</li></ul></details>
```
`job_result_unknown_fmt` is used in case no result file (produced by `bot/check-build.sh`
provided by target repository) was found.

```
job_test_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for details)_</summary><ul><li>Job test file `{filename}` does not exist in job directory, or parsing it failed.</li></ul></details>
```
`job_test_unknown_fmt` is used in case no test file (produced by `bot/check-test.sh`
provided by target repository) was found.

# Instructions to run the bot components

The bot consists of three components:
16 changes: 16 additions & 0 deletions RELEASE_NOTES
@@ -1,6 +1,22 @@
This file contains a description of the major changes to the EESSI
build-and-deploy bot. For more detailed information, please see the git log.

v0.2.0 (26 November 2023)
--------------------------

This is a minor release of the EESSI build-and-deploy bot.

Bug fixes:
* adds information on installing and using the smee client on `aarch64` (#233)

Improvements:
* support for running tests inside the same job but after the build step (#222)
  * runs `bot/test.sh` and `bot/check-test.sh` if these are provided in the GitHub repository
  * adds a new setting (`job_test_unknown_fmt`) in the bot's configuration file
* ensure the bot can build for both the EESSI pilot repository (`pilot.eessi-hpc.org`) and `software.eessi.io` (#229)
* support specifying repository-specific buckets via `bucket_name` in configuration file (#230)


v0.1.1 (14 November 2023)
--------------------------

10 changes: 7 additions & 3 deletions app.cfg.example
@@ -123,7 +123,10 @@ tarball_upload_script = PATH_TO_EESSI_BOT/scripts/eessi-upload-to-staging
# - The latter variant is used for AWS S3 services.
endpoint_url = URL_TO_S3_SERVER

# bucket name
# bucket name:
# can be a string value, to always use same bucket regardless of target repo,
# or can be a mapping of target repo id (see also repo_target_map) to bucket name
# like: bucket_name = {"eessi-pilot-2023.06": "eessi-staging-pilot-2023.06", "eessi.io-2023.06": "software.eessi.io-2023.06"}
bucket_name = eessi-staging

# upload policy: defines what policy is used for uploading built artefacts
@@ -157,7 +160,7 @@ arch_target_map = { "linux/x86_64/generic" : "--constraint shape=c4.2xlarge", "l
# EESSI/2021.12 and NESSI/2022.11
repo_target_map = { "linux/x86_64/amd/zen2" : ["eessi-2021.12","nessi.no-2022.11"] }

# points to definition of repositories (default EESSI-pilot defined by build container)
# points to definition of repositories (default repository defined by build container)
repos_cfg_dir = PATH_TO_SHARED_DIRECTORY/cfg_bundles


@@ -214,4 +217,5 @@ missing_modules = Slurm output lacks message "No missing modules!".
no_tarball_message = Slurm output lacks message about created tarball.
no_matching_tarball = No tarball matching `{tarball_pattern}` found in job dir.
multiple_tarballs = Found {num_tarballs} tarballs in job dir - only 1 matching `{tarball_pattern}` expected.
job_result_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for detailed information)_<summary/><ul><li>Job results file `{filename}` does not exist in job directory or reading it failed.</li><li>No artefacts were found/reported.</li></ul></details>
job_result_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for detailed information)_</summary><ul><li>Job results file `{filename}` does not exist in job directory, or parsing it failed.</li><li>No artefacts were found/reported.</li></ul></details>
job_test_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for detailed information)_</summary><ul><li>Job test file `{filename}` does not exist in job directory, or parsing it failed.</li></ul></details>
73 changes: 64 additions & 9 deletions eessi_bot_job_manager.py
@@ -51,6 +51,8 @@
FINISHED_JOB_COMMENTS = "finished_job_comments"
JOB_RESULT_COMMENT_DESCRIPTION = "comment_description"
JOB_RESULT_UNKNOWN_FMT = "job_result_unknown_fmt"
JOB_TEST_COMMENT_DESCRIPTION = "comment_description"
JOB_TEST_UNKNOWN_FMT = "job_test_unknown_fmt"
MISSING_MODULES = "missing_modules"
MULTIPLE_TARBALLS = "multiple_tarballs"
NEW_JOB_COMMENTS = "new_job_comments"
@@ -285,6 +287,24 @@ def read_job_result(self, job_result_file_path):
        else:
            return None

    def read_job_test(self, job_test_file_path):
        """
        Read job test file and return the contents of the 'TEST' section.

        Args:
            job_test_file_path (string): path to job test file

        Returns:
            (ConfigParser): instance of ConfigParser corresponding to the
                'TEST' section or None
        """
        # reuse function from module tools.job_metadata to read metadata file
        test = read_metadata_file(job_test_file_path, self.logfile)
        if test and "TEST" in test:
            return test["TEST"]
        else:
            return None
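
The behaviour of `read_job_test` can be sketched in isolation as follows. The diff does not show `read_metadata_file`, so this stand-in reads the file directly with the standard `configparser` module; the function name `read_test_section` is hypothetical and the error handling is an assumption, not the bot's implementation.

```python
import configparser


def read_test_section(path):
    """Return the [TEST] section of an ini-style file, or None."""
    parser = configparser.ConfigParser()
    try:
        # parser.read() returns the list of files it managed to read;
        # a missing file yields an empty list rather than an exception
        if not parser.read(path):
            return None
    except configparser.Error:
        # file exists but is not valid ini syntax
        return None
    return parser["TEST"] if "TEST" in parser else None
```

A file that only contains another section, for example `[RESULT]`, yields `None` here as well, so the caller falls back to its "unknown" message.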

def process_new_job(self, new_job):
"""
Process a new job by verifying that it is a bot job and if so
@@ -470,7 +490,8 @@ def process_finished_job(self, finished_job):
"""
Process a finished job by
- moving the symlink to the directory storing finished jobs,
- updating the PR comment with information from '*.result' file
- updating the PR comment with information from '*.result' and '*.test'
files
Args:
finished_job (dict): dictionary with information about the job
@@ -497,14 +518,21 @@ def process_finished_job(self, finished_job):
os.rename(old_symlink, new_symlink)

# REPORT status (to logfile in any case, to PR comment if accessible)
# rely fully on what bot/check-build.sh has returned
# check if file _bot_jobJOBID.result exists --> if so read it and
# update PR comment
# contents of *.result file (here we only use section [RESULT])
# [RESULT]
# comment_description = _FULLY_DEFINED_UPDATE_TO_PR_COMMENT_
# status = {SUCCESS,FAILURE,UNKNOWN}
# artefacts = _LIST_OF_ARTEFACTS_TO_BE_DEPLOYED_
# - rely fully on what bot/check-build.sh and bot/check-test.sh have
# returned
# - check if file _bot_jobJOBID.result exists --> if so read it and
# update PR comment
# . contents of *.result file (here we only use section [RESULT])
# [RESULT]
# comment_description = _FULLY_DEFINED_UPDATE_TO_PR_COMMENT_
# status = {SUCCESS,FAILURE,UNKNOWN}
# artefacts = _LIST_OF_ARTEFACTS_TO_BE_DEPLOYED_
# - check if file _bot_jobJOBID.test exists --> if so read it and
# update PR comment
# . contents of *.test file (here we only use section [TEST])
# [TEST]
# comment_description = _FULLY_DEFINED_UPDATE_TO_PR_COMMENT_
# status = {SUCCESS,FAILURE,UNKNOWN}

# obtain format templates from app.cfg
finished_job_comments_cfg = config.read_config()[FINISHED_JOB_COMMENTS]
@@ -533,6 +561,33 @@ def process_finished_job(self, finished_job):
        comment_update = f"\n|{dt.strftime('%b %d %X %Z %Y')}|finished|"
        comment_update += f"{comment_description}|"

        # check if _bot_jobJOBID.test exists
        # TODO if not found, assume test was not run (or failed, or ...) and add
        #      a message noting that ('not tested' + 'test suite not run or failed')
        # --> bot/test.sh and bot/check-test.sh scripts are run in job script used by bot for 'build' action
        job_test_file = f"_bot_job{job_id}.test"
        job_test_file_path = os.path.join(new_symlink, job_test_file)
        job_tests = self.read_job_test(job_test_file_path)

        job_test_unknown_fmt = finished_job_comments_cfg[JOB_TEST_UNKNOWN_FMT]
        # set fallback comment_description in case no test file was found
        # (self.read_job_test returned None)
        comment_description = job_test_unknown_fmt.format(filename=job_test_file)
        if job_tests:
            # get preformatted comment_description or use previously set default for unknown
            comment_description = job_tests.get(JOB_TEST_COMMENT_DESCRIPTION, comment_description)

        # report to log
        log(f"{fn}(): finished job {job_id}, test suite result\n"
            f"########\n"
            f"comment_description: {comment_description}\n"
            f"########\n", self.logfile)

        dt = datetime.now(timezone.utc)

        comment_update += f"\n|{dt.strftime('%b %d %X %Z %Y')}|test result|"
        comment_update += f"{comment_description}|"
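
The fallback logic for the test row can be illustrated on its own. The sketch below is a simplified stand-in, not the bot's code: `format_test_row` is a hypothetical name, `job_tests` is any mapping (or `None`), and the optional `now` parameter is added only to make the function easy to exercise.

```python
from datetime import datetime, timezone


def format_test_row(job_tests, unknown_fmt, filename, now=None):
    """Build the 'test result' table row appended to the PR comment."""
    # fallback text used when no test file (or no [TEST] section) was found
    description = unknown_fmt.format(filename=filename)
    if job_tests:
        # prefer the preformatted description written by bot/check-test.sh
        description = job_tests.get("comment_description", description)
    now = now or datetime.now(timezone.utc)
    return f"\n|{now.strftime('%b %d %X %Z %Y')}|test result|{description}|"


# no test file found: the row carries the formatted fallback message
row = format_test_row(None, "Job test file `{filename}` not found.", "_bot_job42.test")
```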

        # obtain id of PR comment to be updated (from file '_bot_jobID.metadata')
        metadata_file = f"_bot_job{job_id}.metadata"
        job_metadata_path = os.path.join(new_symlink, metadata_file)
25 changes: 24 additions & 1 deletion scripts/bot-build.slurm
@@ -46,4 +46,27 @@ status = UNKNOWN
artefacts =
EOF
fi
echo "check result step finished"
echo "check build step finished"
TEST_SCRIPT=bot/test.sh
if [ -f ${TEST_SCRIPT} ]; then
echo "${TEST_SCRIPT} script found in '${PWD}', so running it!"
${TEST_SCRIPT}
echo "${TEST_SCRIPT} finished"
else
echo "could not find ${TEST_SCRIPT} script in '${PWD}'" >&2
fi
CHECK_TEST_SCRIPT=bot/check-test.sh
if [ -f ${CHECK_TEST_SCRIPT} ]; then
echo "${CHECK_TEST_SCRIPT} script found in '${PWD}', so running it!"
${CHECK_TEST_SCRIPT}
else
echo "could not find ${CHECK_TEST_SCRIPT} script in '${PWD}' ..."
echo "... depositing default _bot_job${SLURM_JOB_ID}.test file in '${PWD}'"
cat << 'EOF' > _bot_job${SLURM_JOB_ID}.test
[TEST]
comment_description = <details><summary>:shrug: UNKNOWN _(click triangle for detailed information)_</summary><ul><li>Did not find `bot/check-test.sh` script in job's work directory.</li><li>*Check job manually or ask an admin of the bot instance to assist you.*</li></ul></details>
status = UNKNOWN
artefacts =
EOF
fi
echo "check test step finished"
19 changes: 10 additions & 9 deletions tasks/build.py
@@ -38,6 +38,7 @@
BUILD_JOB_SCRIPT = "build_job_script"
BUILD_LOGS_DIR = "build_logs_dir"
BUILD_PERMISSION = "build_permission"
CFG_DIRNAME = "cfg"
CONTAINER_CACHEDIR = "container_cachedir"
CVMFS_CUSTOMIZATIONS = "cvmfs_customizations"
DEFAULT_JOB_TIME_LIMIT = "24:00:00"
@@ -47,6 +48,7 @@
INITIAL_COMMENT = "initial_comment"
JOBS_BASE_DIR = "jobs_base_dir"
JOB_ARCHITECTURE = "architecture"
JOB_CFG_FILENAME = "job.cfg"
JOB_CONTAINER = "container"
JOB_LOCAL_TMP = "local_tmp"
JOB_HTTPS_PROXY = "https_proxy"
@@ -64,7 +66,6 @@
LOCAL_TMP = "local_tmp"
NO_BUILD_PERMISSION_COMMENT = "no_build_permission_comment"
REPOS_CFG_DIR = "repos_cfg_dir"
REPOS_ID = "repo_id"
REPOS_REPO_NAME = "repo_name"
REPOS_REPO_VERSION = "repo_version"
REPOS_CONFIG_BUNDLE = "config_bundle"
@@ -198,8 +199,8 @@ def get_repo_cfg(cfg):
(dict): dictionary containing repository settings as follows
- {REPOS_CFG_DIR: path to repository config directory as defined in 'app.cfg'}
- {REPO_TARGET_MAP: json of REPO_TARGET_MAP value as defined in 'app.cfg'}
- for all sections [REPO_ID] defined in REPOS_CFG_DIR/repos.cfg add a
mapping {REPO_ID: dictionary containing settings of that section}
- for all sections [JOB_REPO_ID] defined in REPOS_CFG_DIR/repos.cfg add a
mapping {JOB_REPO_ID: dictionary containing settings of that section}
"""
fn = sys._getframe().f_code.co_name

@@ -469,9 +470,9 @@ def prepare_jobs(pr, cfg, event_info, action_filter):
log(f"{fn}(): skipping arch {arch} because repo target map does not define repositories to build for")
continue
for repo_id in repocfg[REPO_TARGET_MAP][arch]:
# ensure repocfg contains information about the repository repo_id if repo_id != EESSI-pilot
# Note, EESSI-pilot is a bad/misleading name, it should be more like AS_IN_CONTAINER
if repo_id != "EESSI-pilot" and repo_id not in repocfg:
# ensure repocfg contains information about the repository repo_id if repo_id != EESSI
# Note, EESSI is a bad/misleading name, it should be more like AS_IN_CONTAINER
if (repo_id != "EESSI" and repo_id != "EESSI-pilot") and repo_id not in repocfg:
log(f"{fn}(): skipping repo {repo_id}, it is not defined in repo config {repocfg[REPOS_CFG_DIR]}")
continue
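
The updated condition can be read as a small predicate. The sketch below restates it under the assumption, taken from the comment in `app.cfg.example`, that the ids `EESSI` and `EESSI-pilot` refer to the default repository defined by the build container; the names `CONTAINER_DEFAULT_REPO_IDS` and `should_skip_repo` are hypothetical.

```python
# repo ids assumed to refer to the repository preconfigured in the build
# container; these need no entry in repos.cfg
CONTAINER_DEFAULT_REPO_IDS = ("EESSI", "EESSI-pilot")


def should_skip_repo(repo_id, repocfg):
    """Return True if no job can be prepared for repo_id."""
    return repo_id not in CONTAINER_DEFAULT_REPO_IDS and repo_id not in repocfg


# a repo id that is neither a container default nor defined in repos.cfg
skip = should_skip_repo("nessi.no-2022.11", {"eessi-2023.06": {}})
```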

@@ -529,7 +530,7 @@ def prepare_job_cfg(job_dir, build_env_cfg, repos_cfg, repo_id, software_subdir,
"""
fn = sys._getframe().f_code.co_name

jobcfg_dir = os.path.join(job_dir, 'cfg')
jobcfg_dir = os.path.join(job_dir, CFG_DIRNAME)
# create ini file job.cfg with entries:
# [site_config]
# local_tmp = LOCAL_TMP_VALUE
@@ -538,7 +539,7 @@ def prepare_job_cfg(job_dir, build_env_cfg, repos_cfg, repo_id, software_subdir,
#
# [repository]
# repos_cfg_dir = JOB_CFG_DIR
# repo_id = REPO_ID
# repo_id = JOB_REPO_ID
# container = CONTAINER
# repo_name = REPO_NAME
# repo_version = REPO_VERSION
@@ -595,7 +596,7 @@ def prepare_job_cfg(job_dir, build_env_cfg, repos_cfg, repo_id, software_subdir,
# make sure that <jobcfg_dir> exists
os.makedirs(jobcfg_dir, exist_ok=True)

jobcfg_file = os.path.join(jobcfg_dir, 'job.cfg')
jobcfg_file = os.path.join(jobcfg_dir, JOB_CFG_FILENAME)
with open(jobcfg_file, "w") as jcf:
job_cfg.write(jcf)
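
For reference, writing `job.cfg` via the two new constants can be sketched as a self-contained function. Only the `[repository]` section is filled here, and `write_job_cfg` with its parameters is a hypothetical simplification of `prepare_job_cfg`, not the function from the diff.

```python
import configparser
import os

CFG_DIRNAME = "cfg"
JOB_CFG_FILENAME = "job.cfg"


def write_job_cfg(job_dir, repo_id, repo_name, repo_version):
    """Write <job_dir>/cfg/job.cfg and return its path."""
    job_cfg = configparser.ConfigParser()
    job_cfg["repository"] = {
        "repo_id": repo_id,
        "repo_name": repo_name,
        "repo_version": repo_version,
    }
    jobcfg_dir = os.path.join(job_dir, CFG_DIRNAME)
    # make sure that the cfg directory exists
    os.makedirs(jobcfg_dir, exist_ok=True)
    jobcfg_file = os.path.join(jobcfg_dir, JOB_CFG_FILENAME)
    with open(jobcfg_file, "w") as jcf:
        job_cfg.write(jcf)
    return jobcfg_file
```

Keeping the directory and file names in module-level constants lets other components that read the job configuration refer to the same path without repeating string literals.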
