Merge pull request #1 from EESSI/develop

Merge bot changes since we have software.eessi.io
EESSI · Dec 18, 2023 · b661632
2 parents e7c6d65 + e883264
Showing 9 changed files with 233 additions and 50 deletions.
diff --git a/README.md b/README.md
@@ -18,21 +18,39 @@ The bot consists of two main components provided in this repository:
 
 ## <a name="prerequisites"></a>Prerequisites
 
-- GitHub account(s) (two needed for a development scenario), referring to them as `YOU_1` and `YOU_2` below
-- A fork, say `YOU_1/software-layer`, of [EESSI/software-layer](https://github.com/EESSI/software-layer) and a fork, say `YOU_2/software-layer` of your first fork if you want to emulate the bot's behaviour but not change EESSI's repository. The EESSI bot will act on events triggered for the target repository (in this context, either `EESSI/software-layer` or `YOU_1/software-layer`).
-- Access to a frontend/login node/service node of a Slurm cluster where the EESSI bot components will run. For the sake of brevity, we call this node simply `bot machine`.
-- `singularity` with version 3.6 or newer _OR_ `apptainer` with version 1.0 or newer on the compute nodes of the Slurm cluster.
+- GitHub account(s) (two needed for a development scenario), referring to them
+  as `YOU_1` and `YOU_2` below
+- A fork, say `YOU_1/software-layer`, of
+  [EESSI/software-layer](https://github.com/EESSI/software-layer) and a fork,
+  say `YOU_2/software-layer` of your first fork if you want to emulate the
+  bot's behaviour but not change EESSI's repository. The EESSI bot will act on
+  events triggered for the target repository (in this context, either
+  `EESSI/software-layer` or `YOU_1/software-layer`).
+- Access to a frontend/login node/service node of a Slurm cluster where the
+  EESSI bot components will run. For the sake of brevity, we call this node
+  simply `bot machine`.
+- `singularity` with version 3.6 or newer _OR_ `apptainer` with version 1.0 or
+  newer on the compute nodes of the Slurm cluster.
+- On the cluster frontend (or where the bot components run), different tools
+  may be needed to run the Smee client. For `x86_64`, `singularity` or
+  `apptainer` are sufficient. For `aarch64`, the package manager `npm` is
+  needed.
 - The EESSI bot components and the (build) jobs will frequently access the
-  Internet. Hence, worker nodes and the `bot machine` of the Slurm cluster need
-access to the Internet (either directly or via an HTTP proxy).
+  Internet. Hence, worker nodes and the `bot machine` of the Slurm cluster
+  need access to the Internet (either directly or via an HTTP proxy).
 
 ## <a name="step1"></a>Step 1: Smee.io channel and smee client
 
-We use [smee.io](https://smee.io) as a service to relay events from GitHub to the EESSI bot. To do so, create a new channel via https://smee.io and note the URL, e.g., `https://smee.io/CHANNEL-ID`.
+We use [smee.io](https://smee.io) as a service to relay events from GitHub
+to the EESSI bot. To do so, create a new channel via https://smee.io and note
+the URL, e.g., `https://smee.io/CHANNEL-ID`.
 
 On the `bot machine` we need a tool which receives events relayed from
 `https://smee.io/CHANNEL-ID` and forwards them to the EESSI bot. We use the Smee
-client for this. The Smee client can be run via a container as follows
+client for this.
+
+On machines with `x86_64` architecture, the Smee client can be run via a
+container as follows
 
 ```
 singularity pull docker://deltaprojects/smee-client
@@ -48,6 +66,25 @@ singularity run smee-client_latest.sif --port 3030 --url https://smee.io/CHANNEL
 
 for specifying a different port than the default (3000).
 
+On machines with `aarch64` architecture, we can install the Smee client via
+the `npm` package manager as follows
+
+```
+npm install smee-client
+```
+
+and then run it with the default port (3000)
+
+```
+node_modules/smee-client/bin/smee.js --url https://smee.io/CHANNEL-ID
+```
+
+Another port can be used by adding the `--port PORT` argument, for example,
+
+```
+node_modules/smee-client/bin/smee.js --port 3030 --url https://smee.io/CHANNEL-ID
+```
+
 ## <a name="step2"></a>Step 2: Registering GitHub App
 
 We need to:
@@ -402,10 +439,24 @@ endpoint_url = URL_TO_S3_SERVER
 ```
 `endpoint_url` provides an endpoint (URL) to a server hosting an S3 bucket. The server could be hosted by a commercial cloud provider like AWS or Azure, or running in a private environment, for example, using Minio. The bot uploads tarballs to the bucket which will be periodically scanned by the ingestion procedure at the Stratum 0 server.
 
+
+```ini
+# example: same bucket for all target repos
+bucket_name = "eessi-staging"
 ```
-bucket_name = eessi-staging
+```ini
+# example: bucket to use depends on target repo
+bucket_name = {
+    "eessi-pilot-2023.06": "eessi-staging-2023.06",
+    "eessi.io-2023.06": "software.eessi.io-2023.06",
+}
 ```
-`bucket_name` is the name of the bucket used for uploading of tarballs. The bucket must be available on the default server (`https://${bucket_name}.s3.amazonaws.com`), or the one provided via `endpoint_url`.
+
+`bucket_name` is the name of the bucket used for uploading of tarballs.
+The bucket must be available on the default server (`https://${bucket_name}.s3.amazonaws.com`), or the one provided via `endpoint_url`.
+
+`bucket_name` can be specified as a string value to use the same bucket for all target repos, or it can be a mapping from target repo id to bucket name.
+
 
 ```
 upload_policy = once
@@ -473,10 +524,10 @@ repos_cfg_dir = PATH_TO_SHARED_DIRECTORY/cfg_bundles
 The `repos.cfg` file also uses the `ini` format as follows
 ```ini
 [eessi-2023.06]
-repo_name = pilot.eessi-hpc.org
+repo_name = software.eessi.io
 repo_version = 2023.06
-config_bundle = eessi-hpc.org-cfg_files.tgz
-config_map = { "eessi-hpc.org/cvmfs-config.eessi-hpc.org.pub":"/etc/cvmfs/keys/eessi-hpc.org/cvmfs-config.eessi-hpc.org.pub", "eessi-hpc.org/ci.eessi-hpc.org.pub":"/etc/cvmfs/keys/eessi-hpc.org/ci.eessi-hpc.org.pub", "eessi-hpc.org/pilot.eessi-hpc.org.pub":"/etc/cvmfs/keys/eessi-hpc.org/pilot.eessi-hpc.org.pub", "default.local":"/etc/cvmfs/default.local", "eessi-hpc.org.conf":"/etc/cvmfs/domain.d/eessi-hpc.org.conf"}
+config_bundle = eessi.io-cfg_files.tgz
+config_map = {"eessi.io/eessi.io.pub":"/etc/cvmfs/keys/eessi.io/eessi.io.pub", "default.local":"/etc/cvmfs/default.local", "eessi.io.conf":"/etc/cvmfs/domain.d/eessi.io.conf"}
 container = docker://ghcr.io/eessi/build-node:debian11
 ```
 The repository id is given in brackets (`[eessi-2023.06]`). Then the name of the repository (`repo_name`) and the
@@ -595,11 +646,17 @@ multiple_tarballs = Found {num_tarballs} tarballs in job dir - only 1 matching `
 `multiple_tarballs` is used to report that multiple tarballs have been found.
 
 ```
-job_result_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for details)_</summary><ul><li>Job results file `{filename}` does not exist in job directory or reading it failed.</li><li>No artefacts were found/reported.</li></ul></details>
+job_result_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for details)_</summary><ul><li>Job results file `{filename}` does not exist in job directory, or parsing it failed.</li><li>No artefacts were found/reported.</li></ul></details>
 ```
 `job_result_unknown_fmt` is used in case no result file (produced by `bot/check-build.sh`
 provided by target repository) was found.
 
+```
+job_test_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for details)_</summary><ul><li>Job test file `{filename}` does not exist in job directory, or parsing it failed.</li></ul></details>
+```
+`job_test_unknown_fmt` is used in case no test file (produced by `bot/check-test.sh`
+provided by target repository) was found.
+
 # Instructions to run the bot components
 
 The bot consists of three components:

diff --git a/RELEASE_NOTES b/RELEASE_NOTES
@@ -1,6 +1,22 @@
 This file contains a description of the major changes to the EESSI
 build-and-deploy bot. For more detailed information, please see the git log.
 
+v0.2.0 (26 November 2023)
+--------------------------
+
+This is a minor release of the EESSI build-and-deploy bot.
+
+Bug fixes:
+* adds information on installing and using the smee client on `aarch64` (#233)
+
+Improvements:
+* support for running tests inside the same job but after the build step (#222)
+  * runs `bot/test.sh` and `bot/check-test.sh` if these are provided in the GitHub repository
+  * adds a new setting (`job_test_unknown_fmt`) in the bot's configuration file
+* ensure the bot can build for both the EESSI pilot repository (`pilot.eessi-hpc.org`) and `software.eessi.io` (#229)
+* support specifying repository-specific buckets via `bucket_name` in configuration file (#230)
+
+
 v0.1.1 (14 November 2023)
 --------------------------
 

diff --git a/app.cfg.example b/app.cfg.example
@@ -123,7 +123,10 @@ tarball_upload_script = PATH_TO_EESSI_BOT/scripts/eessi-upload-to-staging
 # - The latter variant is used for AWS S3 services.
 endpoint_url = URL_TO_S3_SERVER
 
-# bucket name
+# bucket name:
+# can be a string value, to always use same bucket regardless of target repo,
+# or can be a mapping of target repo id (see also repo_target_map) to bucket name
+# like: bucket_name = {"eessi-pilot-2023.06": "eessi-staging-pilot-2023.06", "eessi.io-2023.06": "software.eessi.io-2023.06"}
 bucket_name = eessi-staging
 
 # upload policy: defines what policy is used for uploading built artefacts
@@ -157,7 +160,7 @@ arch_target_map = { "linux/x86_64/generic" : "--constraint shape=c4.2xlarge", "l
 # EESSI/2021.12 and NESSI/2022.11
 repo_target_map = { "linux/x86_64/amd/zen2" : ["eessi-2021.12","nessi.no-2022.11"] }
 
-# points to definition of repositories (default EESSI-pilot defined by build container)
+# points to definition of repositories (default repository defined by build container)
 repos_cfg_dir = PATH_TO_SHARED_DIRECTORY/cfg_bundles
 
 
@@ -214,4 +217,5 @@ missing_modules = Slurm output lacks message "No missing modules!".
 no_tarball_message = Slurm output lacks message about created tarball.
 no_matching_tarball = No tarball matching `{tarball_pattern}` found in job dir.
 multiple_tarballs = Found {num_tarballs} tarballs in job dir - only 1 matching `{tarball_pattern}` expected.
-job_result_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for detailed information)_<summary/><ul><li>Job results file `{filename}` does not exist in job directory or reading it failed.</li><li>No artefacts were found/reported.</li></ul></details>
+job_result_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for detailed information)_</summary><ul><li>Job results file `{filename}` does not exist in job directory, or parsing it failed.</li><li>No artefacts were found/reported.</li></ul></details>
+job_test_unknown_fmt = <details><summary>:shrug: UNKNOWN _(click triangle for detailed information)_</summary><ul><li>Job test file `{filename}` does not exist in job directory, or parsing it failed.</li></ul></details>
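
The `bucket_name` comment in `app.cfg.example` above says the value is either a plain string or a mapping from target repo id to bucket name. A minimal sketch of how such a value might be resolved follows. The helper `resolve_bucket_name` is hypothetical and not part of the bot, and treating the mapping as a JSON object is an assumption, since the diff shown here does not include the parsing code.

```python
import json


def resolve_bucket_name(bucket_spec, repo_id):
    """Return the bucket to use for repo_id.

    Hypothetical helper for illustration only; not taken from the bot.
    """
    spec = bucket_spec.strip()
    if spec.startswith("{"):
        # assumption: a mapping is written as a JSON object in app.cfg
        mapping = json.loads(spec)
        if repo_id not in mapping:
            raise KeyError(f"no bucket configured for target repo '{repo_id}'")
        return mapping[repo_id]
    # plain string: the same bucket is used for every target repo
    return spec


if __name__ == "__main__":
    mapping_spec = (
        '{"eessi-pilot-2023.06": "eessi-staging-pilot-2023.06", '
        '"eessi.io-2023.06": "software.eessi.io-2023.06"}'
    )
    print(resolve_bucket_name("eessi-staging", "eessi.io-2023.06"))
    print(resolve_bucket_name(mapping_spec, "eessi.io-2023.06"))
```

Raising an error for an unknown repo id is a design choice of this sketch; falling back to a default bucket would be an equally plausible policy.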
diff --git a/eessi_bot_job_manager.py b/eessi_bot_job_manager.py
@@ -51,6 +51,8 @@
 FINISHED_JOB_COMMENTS = "finished_job_comments"
 JOB_RESULT_COMMENT_DESCRIPTION = "comment_description"
 JOB_RESULT_UNKNOWN_FMT = "job_result_unknown_fmt"
+JOB_TEST_COMMENT_DESCRIPTION = "comment_description"
+JOB_TEST_UNKNOWN_FMT = "job_test_unknown_fmt"
 MISSING_MODULES = "missing_modules"
 MULTIPLE_TARBALLS = "multiple_tarballs"
 NEW_JOB_COMMENTS = "new_job_comments"
@@ -285,6 +287,24 @@ def read_job_result(self, job_result_file_path):
         else:
             return None
 
+    def read_job_test(self, job_test_file_path):
+        """
+        Read job test file and return the contents of the 'TEST' section.
+
+        Args:
+            job_test_file_path (string): path to job test file
+
+        Returns:
+            (ConfigParser): instance of ConfigParser corresponding to the
+                'TEST' section or None
+        """
+        # reuse function from module tools.job_metadata to read metadata file
+        test = read_metadata_file(job_test_file_path, self.logfile)
+        if test and "TEST" in test:
+            return test["TEST"]
+        else:
+            return None
+
     def process_new_job(self, new_job):
         """
         Process a new job by verifying that it is a bot job and if so
@@ -470,7 +490,8 @@ def process_finished_job(self, finished_job):
         """
         Process a finished job by
         - moving the symlink to the directory storing finished jobs,
-        - updating the PR comment with information from '*.result' file
+        - updating the PR comment with information from '*.result' and '*.test'
+          files
 
         Args:
             finished_job (dict): dictionary with information about the job
@@ -497,14 +518,21 @@ def process_finished_job(self, finished_job):
         os.rename(old_symlink, new_symlink)
 
         # REPORT status (to logfile in any case, to PR comment if accessible)
-        #   rely fully on what bot/check-build.sh has returned
-        #   check if file _bot_jobJOBID.result exists --> if so read it and
-        #   update PR comment
-        # contents of *.result file (here we only use section [RESULT])
-        #   [RESULT]
-        #   comment_description = _FULLY_DEFINED_UPDATE_TO_PR_COMMENT_
-        #   status = {SUCCESS,FAILURE,UNKNOWN}
-        #   artefacts = _LIST_OF_ARTEFACTS_TO_BE_DEPLOYED_
+        #  - rely fully on what bot/check-build.sh and bot/check-test.sh have
+        #    returned
+        #  - check if file _bot_jobJOBID.result exists --> if so read it and
+        #    update PR comment
+        #    . contents of *.result file (here we only use section [RESULT])
+        #      [RESULT]
+        #      comment_description = _FULLY_DEFINED_UPDATE_TO_PR_COMMENT_
+        #      status = {SUCCESS,FAILURE,UNKNOWN}
+        #      artefacts = _LIST_OF_ARTEFACTS_TO_BE_DEPLOYED_
+        #  - check if file _bot_jobJOBID.test exists --> if so read it and
+        #    update PR comment
+        #    . contents of *.test file (here we only use section [TEST])
+        #      [TEST]
+        #      comment_description = _FULLY_DEFINED_UPDATE_TO_PR_COMMENT_
+        #      status = {SUCCESS,FAILURE,UNKNOWN}
 
         # obtain format templates from app.cfg
         finished_job_comments_cfg = config.read_config()[FINISHED_JOB_COMMENTS]
@@ -533,6 +561,33 @@ def process_finished_job(self, finished_job):
         comment_update = f"\n|{dt.strftime('%b %d %X %Z %Y')}|finished|"
         comment_update += f"{comment_description}|"
 
+        # check if _bot_jobJOBID.test exists
+        # TODO if not found, assume test was not run (or failed, or ...) and add
+        # a message noting that ('not tested' + 'test suite not run or failed')
+        # --> bot/test.sh and bot/check-test.sh scripts are run in job script used by bot for 'build' action
+        job_test_file = f"_bot_job{job_id}.test"
+        job_test_file_path = os.path.join(new_symlink, job_test_file)
+        job_tests = self.read_job_test(job_test_file_path)
+
+        job_test_unknown_fmt = finished_job_comments_cfg[JOB_TEST_UNKNOWN_FMT]
+        # set fallback comment_description in case no test file was found
+        # (self.read_job_test returned None)
+        comment_description = job_test_unknown_fmt.format(filename=job_test_file)
+        if job_tests:
+            # get preformatted comment_description or use previously set default for unknown
+            comment_description = job_tests.get(JOB_TEST_COMMENT_DESCRIPTION, comment_description)
+
+        # report to log
+        log(f"{fn}(): finished job {job_id}, test suite result\n"
+            f"########\n"
+            f"comment_description: {comment_description}\n"
+            f"########\n", self.logfile)
+
+        dt = datetime.now(timezone.utc)
+
+        comment_update += f"\n|{dt.strftime('%b %d %X %Z %Y')}|test result|"
+        comment_update += f"{comment_description}|"
+
         # obtain id of PR comment to be updated (from file '_bot_jobID.metadata')
         metadata_file = f"_bot_job{job_id}.metadata"
         job_metadata_path = os.path.join(new_symlink, metadata_file)
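
The test-result handling in the hunk above reads `_bot_jobJOBID.test`, looks for a `[TEST]` section, and falls back to the `job_test_unknown_fmt` template when the file or section is missing. The sketch below is a standalone illustration of that flow. It uses `configparser` directly instead of the bot's `read_metadata_file`, and the function name `describe_test_result` and the shortened `UNKNOWN_FMT` template are assumptions made for this example.

```python
import configparser
import os
import tempfile

# shortened stand-in for the job_test_unknown_fmt setting (assumption)
UNKNOWN_FMT = "UNKNOWN: job test file `{filename}` does not exist in job directory, or parsing it failed."


def describe_test_result(job_dir, job_id):
    """Return comment_description from the [TEST] section, or a fallback."""
    test_file = f"_bot_job{job_id}.test"
    fallback = UNKNOWN_FMT.format(filename=test_file)
    # disable interpolation so '%' in a description is taken literally
    parser = configparser.ConfigParser(interpolation=None)
    try:
        # read() silently skips files that do not exist
        parser.read(os.path.join(job_dir, test_file))
    except configparser.Error:
        return fallback
    if "TEST" not in parser:
        return fallback
    return parser["TEST"].get("comment_description", fallback)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as job_dir:
        # no test file yet: the fallback text is used
        print(describe_test_result(job_dir, 42))
        with open(os.path.join(job_dir, "_bot_job42.test"), "w") as handle:
            handle.write("[TEST]\ncomment_description = all tests passed\nstatus = SUCCESS\n")
        print(describe_test_result(job_dir, 42))
```

A file that exists but only contains a `[RESULT]` section also yields the fallback text, because only `[TEST]` is consulted.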

diff --git a/scripts/bot-build.slurm b/scripts/bot-build.slurm
@@ -46,4 +46,27 @@ status = UNKNOWN
 artefacts =
 EOF
 fi
-echo "check result step finished"
+echo "check build step finished"
+TEST_SCRIPT=bot/test.sh
+if [ -f ${TEST_SCRIPT} ]; then
+    echo "${TEST_SCRIPT} script found in '${PWD}', so running it!"
+    ${TEST_SCRIPT}
+    echo "${TEST_SCRIPT} finished"
+else
+    echo "could not find ${TEST_SCRIPT} script in '${PWD}'" >&2
+fi
+CHECK_TEST_SCRIPT=bot/check-test.sh
+if [ -f ${CHECK_TEST_SCRIPT} ]; then
+    echo "${CHECK_TEST_SCRIPT} script found in '${PWD}', so running it!"
+    ${CHECK_TEST_SCRIPT}
+else
+    echo "could not find ${CHECK_TEST_SCRIPT} script in '${PWD}' ..."
+    echo "... depositing default _bot_job${SLURM_JOB_ID}.test file in '${PWD}'"
+    cat << 'EOF' > _bot_job${SLURM_JOB_ID}.test
+[TEST]
+comment_description = <details><summary>:shrug: UNKNOWN _(click triangle for detailed information)_</summary><ul><li>Did not find `bot/check-test.sh` script in job's work directory.</li><li>*Check job manually or ask an admin of the bot instance to assist you.*</li></ul></details>
+status = UNKNOWN
+artefacts =
+EOF
+fi
+echo "check test step finished"
diff --git a/tasks/build.py b/tasks/build.py
@@ -38,6 +38,7 @@
 BUILD_JOB_SCRIPT = "build_job_script"
 BUILD_LOGS_DIR = "build_logs_dir"
 BUILD_PERMISSION = "build_permission"
+CFG_DIRNAME = "cfg"
 CONTAINER_CACHEDIR = "container_cachedir"
 CVMFS_CUSTOMIZATIONS = "cvmfs_customizations"
 DEFAULT_JOB_TIME_LIMIT = "24:00:00"
@@ -47,6 +48,7 @@
 INITIAL_COMMENT = "initial_comment"
 JOBS_BASE_DIR = "jobs_base_dir"
 JOB_ARCHITECTURE = "architecture"
+JOB_CFG_FILENAME = "job.cfg"
 JOB_CONTAINER = "container"
 JOB_LOCAL_TMP = "local_tmp"
 JOB_HTTPS_PROXY = "https_proxy"
@@ -64,7 +66,6 @@
 LOCAL_TMP = "local_tmp"
 NO_BUILD_PERMISSION_COMMENT = "no_build_permission_comment"
 REPOS_CFG_DIR = "repos_cfg_dir"
-REPOS_ID = "repo_id"
 REPOS_REPO_NAME = "repo_name"
 REPOS_REPO_VERSION = "repo_version"
 REPOS_CONFIG_BUNDLE = "config_bundle"
@@ -198,8 +199,8 @@ def get_repo_cfg(cfg):
         (dict): dictionary containing repository settings as follows
            - {REPOS_CFG_DIR: path to repository config directory as defined in 'app.cfg'}
            - {REPO_TARGET_MAP: json of REPO_TARGET_MAP value as defined in 'app.cfg'}
-           - for all sections [REPO_ID] defined in REPOS_CFG_DIR/repos.cfg add a
-             mapping {REPO_ID: dictionary containing settings of that section}
+           - for all sections [JOB_REPO_ID] defined in REPOS_CFG_DIR/repos.cfg add a
+             mapping {JOB_REPO_ID: dictionary containing settings of that section}
     """
     fn = sys._getframe().f_code.co_name
 
@@ -469,9 +470,9 @@ def prepare_jobs(pr, cfg, event_info, action_filter):
             log(f"{fn}(): skipping arch {arch} because repo target map does not define repositories to build for")
             continue
         for repo_id in repocfg[REPO_TARGET_MAP][arch]:
-            # ensure repocfg contains information about the repository repo_id if repo_id != EESSI-pilot
-            # Note, EESSI-pilot is a bad/misleading name, it should be more like AS_IN_CONTAINER
-            if repo_id != "EESSI-pilot" and repo_id not in repocfg:
+            # ensure repocfg contains information about the repository repo_id
+            # unless repo_id is EESSI or EESSI-pilot (both defined by the build container)
+            # Note, EESSI is a bad/misleading name, it should be more like AS_IN_CONTAINER
+            if (repo_id != "EESSI" and repo_id != "EESSI-pilot") and repo_id not in repocfg:
                 log(f"{fn}(): skipping repo {repo_id}, it is not defined in repo config {repocfg[REPOS_CFG_DIR]}")
                 continue
 
@@ -529,7 +530,7 @@ def prepare_job_cfg(job_dir, build_env_cfg, repos_cfg, repo_id, software_subdir,
     """
     fn = sys._getframe().f_code.co_name
 
-    jobcfg_dir = os.path.join(job_dir, 'cfg')
+    jobcfg_dir = os.path.join(job_dir, CFG_DIRNAME)
     # create ini file job.cfg with entries:
     # [site_config]
     # local_tmp = LOCAL_TMP_VALUE
@@ -538,7 +539,7 @@ def prepare_job_cfg(job_dir, build_env_cfg, repos_cfg, repo_id, software_subdir,
     #
     # [repository]
     # repos_cfg_dir = JOB_CFG_DIR
-    # repo_id = REPO_ID
+    # repo_id = JOB_REPO_ID
     # container = CONTAINER
     # repo_name = REPO_NAME
     # repo_version = REPO_VERSION
@@ -595,7 +596,7 @@ def prepare_job_cfg(job_dir, build_env_cfg, repos_cfg, repo_id, software_subdir,
     # make sure that <jobcfg_dir> exists
     os.makedirs(jobcfg_dir, exist_ok=True)
 
-    jobcfg_file = os.path.join(jobcfg_dir, 'job.cfg')
+    jobcfg_file = os.path.join(jobcfg_dir, JOB_CFG_FILENAME)
     with open(jobcfg_file, "w") as jcf:
         job_cfg.write(jcf)
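
The `prepare_job_cfg` hunks above replace the literal `'cfg'` and `'job.cfg'` with the constants `CFG_DIRNAME` and `JOB_CFG_FILENAME`. As a rough, self-contained sketch of that pattern, the following writes a small `job.cfg` with a `[repository]` section and reads it back. The helper `write_job_cfg` and its reduced set of keys are assumptions for illustration; the real function writes more sections and settings.

```python
import configparser
import os
import tempfile

CFG_DIRNAME = "cfg"
JOB_CFG_FILENAME = "job.cfg"


def write_job_cfg(job_dir, repo_id, repo_name, repo_version):
    """Write a minimal job.cfg below job_dir/cfg and return its path.

    Hypothetical, reduced version of what prepare_job_cfg does.
    """
    jobcfg_dir = os.path.join(job_dir, CFG_DIRNAME)
    # make sure the cfg directory exists before writing into it
    os.makedirs(jobcfg_dir, exist_ok=True)

    job_cfg = configparser.ConfigParser()
    job_cfg["repository"] = {
        "repo_id": repo_id,
        "repo_name": repo_name,
        "repo_version": repo_version,
    }

    jobcfg_file = os.path.join(jobcfg_dir, JOB_CFG_FILENAME)
    with open(jobcfg_file, "w") as jcf:
        job_cfg.write(jcf)
    return jobcfg_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as job_dir:
        path = write_job_cfg(job_dir, "eessi-2023.06", "software.eessi.io", "2023.06")
        check = configparser.ConfigParser()
        check.read(path)
        print(path)
        print(dict(check["repository"]))
```

Keeping the directory and file names in module-level constants lets other code, for example a job script locating `cfg/job.cfg`, share one definition instead of repeating string literals.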