Merge pull request #285 from EESSI/develop
merge develop into main for release v0.6.0
boegel authored Sep 18, 2024
2 parents a5b40e9 + 1be8f58 commit 357f9df
Showing 13 changed files with 655 additions and 86 deletions.
69 changes: 64 additions & 5 deletions README.md
Expand Up @@ -375,6 +375,13 @@ package repositories. Typically these settings are set in the prologue of a
Slurm job. However, when entering the [EESSI compatibility layer](https://www.eessi.io/docs/compatibility_layer),
most environment settings are cleared. Hence, they need to be set again at a later stage.

```
job_name = JOB_NAME
```
Replace `JOB_NAME` with a string of at least 3 characters. It is used as the job
name when a job is submitted, and the job manager uses it to filter jobs so that
multiple bot instances can run in the same Slurm environment.
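
As a minimal sketch of how such a name can act as a filter, the snippet below
builds an `squeue` command line restricted to the jobs of one bot instance. The
helper `build_squeue_cmd`, its arguments, and the path `/usr/bin/squeue` are
hypothetical; only the length check and the `--name` filter mirror what this
commit adds to the job manager.

```python
def build_squeue_cmd(poll_command, username, job_name=None):
    """Return an squeue command that only lists jobs of one bot instance.

    Hypothetical helper for illustration; not part of the bot's code base.
    """
    if job_name and len(job_name) < 3:
        raise ValueError(f"job name ({job_name}) is shorter than 3 characters")

    cmd = f"{poll_command} --long --noheader --user={username}"
    if job_name:
        # squeue's --name option limits the listing to jobs with this name
        cmd += f" --name='{job_name}'"
    return cmd


print(build_squeue_cmd("/usr/bin/squeue", "bot", job_name="prod"))
```

Two bot instances using different values for `job_name` would then only see
their own jobs, even if they submit as the same user.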

```
jobs_base_dir = PATH_TO_JOBS_BASE_DIR
```
Expand Down Expand Up @@ -419,6 +426,13 @@ no_build_permission_comment = The `bot: build ...` command has been used by user
`no_build_permission_comment` defines a comment (template) that is used when
the account trying to trigger build jobs has no permission to do so.

```
allow_update_submit_opts = false
```
`allow_update_submit_opts` determines whether or not to allow updating the submit
options via the custom module `det_submit_opts` provided by the pull request being
processed.
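
Since the release notes list this setting as optional, code reading it would
typically fall back to `false` when it is absent. A minimal sketch with Python's
standard `configparser`, assuming the setting lives in the `[buildenv]` section;
the function name `submit_opts_update_allowed` is hypothetical and this is not
the bot's own parsing code:

```python
import configparser


def submit_opts_update_allowed(cfg_text):
    """Return True only if allow_update_submit_opts is explicitly enabled."""
    # interpolation=None keeps '%' characters in other settings literal
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # fallback=False covers both a missing section and a missing option
    return parser.getboolean("buildenv", "allow_update_submit_opts", fallback=False)


print(submit_opts_update_allowed("[buildenv]\nallow_update_submit_opts = true\n"))
print(submit_opts_update_allowed("[buildenv]\njob_name = prod\n"))
```

With this approach, an existing `app.cfg` that does not mention the setting
keeps the conservative behaviour.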


#### `[bot_control]` section

Expand Down Expand Up @@ -631,17 +645,24 @@ scontrol_command = /usr/bin/scontrol
#### `[submitted_job_comments]` section

The `[submitted_job_comments]` section specifies templates for messages about newly submitted jobs.
```
initial_comment = New job on instance `{app_name}` for architecture `{arch_name}` for repository `{repo_id}` in job dir `{symlink}`
```
`initial_comment` is used to create a comment to a PR when a new job has been created.

```
awaits_release = job id `{job_id}` awaits release by job manager
```
`awaits_release` is used to provide a status update of a job (shown as a row in the job's status
table).

```
initial_comment = New job on instance `{app_name}` for architecture `{arch_name}`{accelerator_spec} for repository `{repo_id}` in job dir `{symlink}`
```
`initial_comment` is used to create a comment to a PR when a new job has been
created. Note that the part `{accelerator_spec}` is only filled in by the bot if the
argument `accelerator` has been passed to the `bot: build` command.
```
with_accelerator = &nbsp;and accelerator `{accelerator}`
```
`with_accelerator` is used to provide information about the accelerator the job
should build for if and only if the argument `accelerator:X/Y` has been provided.
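
To illustrate how the two templates combine, here is a minimal sketch using
plain `str.format`. The function `render_initial_comment` and all sample values
are made up for illustration, and the `with_accelerator` value is taken from
`app.cfg.example`; the bot's actual code may assemble the comment differently.

```python
INITIAL_COMMENT = ("New job on instance `{app_name}` for architecture `{arch_name}`"
                   "{accelerator_spec} for repository `{repo_id}` in job dir `{symlink}`")
WITH_ACCELERATOR = "&nbsp;and accelerator `{accelerator}`"


def render_initial_comment(app_name, arch_name, repo_id, symlink, accelerator=None):
    """Fill in the comment; the accelerator part stays empty if none was requested."""
    accelerator_spec = ""
    if accelerator:
        accelerator_spec = WITH_ACCELERATOR.format(accelerator=accelerator)
    return INITIAL_COMMENT.format(app_name=app_name, arch_name=arch_name,
                                  accelerator_spec=accelerator_spec,
                                  repo_id=repo_id, symlink=symlink)


# sample values are placeholders, not taken from a real job
print(render_initial_comment("bot-prod", "x86_64/generic", "example-repo",
                             "/jobs/pr_1/run_000", accelerator="nvidia/cc80"))
```

Without the `accelerator` argument the placeholder expands to an empty string,
so the comment reads exactly as it did before this setting was introduced.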

#### `[new_job_comments]` section

The `[new_job_comments]` section sets templates for messages about jobs whose `hold` flag was released.
Expand Down Expand Up @@ -720,6 +741,21 @@ git_apply_tip = _Tip: This can usually be resolved by syncing your branch and re
`git_apply_tip` should guide the contributor/maintainer about resolving the cause
of `git apply` failing.

#### `[clean_up]` section

The `[clean_up]` section includes settings related to cleaning up disk space used by merged (and closed) PRs.
```
trash_bin_dir = PATH/TO/TRASH_BIN_DIRECTORY
```
Ideally, `trash_bin_dir` is on the same filesystem as `jobs_base_dir` and `job_ids_dir`, so that data can be
moved into the trash bin efficiently. If it resides on a different filesystem, the data will be copied.

```
moved_job_dirs_comment = PR merged! Moved `{job_dirs}` to `{trash_bin_dir}`
```
`moved_job_dirs_comment` is the template the bot uses to add a comment to a PR stating which directories
have been moved and where.
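
The move itself can be sketched with `shutil.move`, which renames in place when
source and destination share a filesystem and otherwise copies the data and
removes the source. The snippet below is a simplified sketch: the function
`move_job_dirs` is hypothetical, and the dated `ROOT/REPO/YYYY.MM.DD` layout
follows the event handler in this commit.

```python
import os
import shutil
from datetime import datetime, timezone


def move_job_dirs(trash_bin_root_dir, repo_name, job_dirs):
    """Move job directories into a dated trash bin directory and return its path."""
    date_str = datetime.now(timezone.utc).strftime("%Y.%m.%d")
    trash_bin_dir = os.path.join(trash_bin_root_dir, repo_name, date_str)
    os.makedirs(trash_bin_dir, exist_ok=True)
    for job_dir in job_dirs:
        # moving into an existing directory places job_dir inside it
        shutil.move(job_dir, trash_bin_dir)
    return trash_bin_dir
```

Grouping moved directories by repository and date makes it straightforward for
a separate clean-up step (for example a cron job) to delete old trash later.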

# Instructions to run the bot components

The bot consists of three components:
Expand Down Expand Up @@ -784,3 +820,26 @@ The job manager can run on a different machine than the event handler, as long a

For information on how to make pull requests and let the bot build software, see
[the bot section of the EESSI documentation](https://www.eessi.io/docs/bot/).

# Private target repos

Both Git and curl need to have access to the target repo. A convenient way to
access a private repo via a GitHub token is to add the following lines to
your `~/.netrc` and `~/.curlrc` files:

```
# ~/.netrc
machine github.com
login oauth
password <Github token>
machine api.github.com
login oauth
password <Github token>
```

```
# ~/.curlrc
--netrc
```
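
To check that such a `~/.netrc` contains entries for both hosts, one option is
Python's standard `netrc` module. The snippet below is a small sketch; the
function name `missing_netrc_hosts` is made up, and it only verifies that the
entries exist, not that the token is valid.

```python
import netrc
import os

REQUIRED_HOSTS = ("github.com", "api.github.com")


def missing_netrc_hosts(netrc_path=None):
    """Return the required hosts that have no entry in the given netrc file."""
    path = netrc_path or os.path.expanduser("~/.netrc")
    entries = netrc.netrc(path)
    # entries.hosts maps each 'machine' name to (login, account, password)
    return [host for host in REQUIRED_HOSTS if host not in entries.hosts]
```

Calling `missing_netrc_hosts()` without arguments reads `~/.netrc`; an empty
list means both hosts are covered, and a missing file raises `FileNotFoundError`.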

28 changes: 28 additions & 0 deletions RELEASE_NOTES
@@ -1,6 +1,34 @@
This file contains a description of the major changes to the EESSI
build-and-deploy bot. For more detailed information, please see the git log.

v0.6.0 (18 September 2024)
--------------------------

This is a minor release of the EESSI build-and-deploy bot.

Improvements:
* move merged PR job directories to 'trash_bin_dir' (#271)
  * the target directory can be defined with the 'app.cfg' setting 'trash_bin_dir'
  * it uses 'shutil.move', which renames the directories in place (like 'mv')
    if source and target are on the same filesystem, and copies them otherwise
* add setting to give all jobs a unique name (#273)
* move closed PR job directories to 'trash_bin_dir' (#275)
* add filter for accelerators (#276)
* add support for updating Slurm options through user-defined Python module in
  target PR (#277)
* use GitHub API for downloading the diff of a PR (#278)
* add documentation about private repos (#279)
* pass accelerator value to job scripts (via job.cfg) and extend PR comment if
  the 'accelerator' argument is used (#280, #282)

New 'app.cfg' settings (see README.md and app.cfg.example for details):
* (optional) 'allow_update_submit_opts' in section '[buildenv]'
* (required) 'job_name' in section '[buildenv]'
* (required) 'moved_job_dirs_comment' in section '[clean_up]'
* (required) 'trash_bin_dir' in section '[clean_up]'
* (required) 'with_accelerator' in section '[submitted_job_comments]'


v0.5.0 (16 May 2024)
--------------------------

Expand Down
15 changes: 14 additions & 1 deletion app.cfg.example
Expand Up @@ -10,6 +10,7 @@
# author: Jonas Qvigstad (@jonas-lq)
# author: Pedro Santos Neves (@Neves-P)
# author: Thomas Roeblitz (@trz42)
# author: Sam Moors (@smoors)
#
# license: GPLv2
#
Expand Down Expand Up @@ -87,6 +88,10 @@ container_cachedir = PATH_TO_SHARED_DIRECTORY
# http_proxy = http://PROXY_DNS:3128/
# https_proxy = http://PROXY_DNS:3128/

# Used to give all jobs of a bot instance the same name. Can be used to allow
# multiple bot instances to run on the same Slurm cluster.
job_name = prod

# directory under which the bot prepares directories per job
# structure created is as follows: YYYY.MM/pr_PR_NUMBER/event_EVENT_ID/run_RUN_NUMBER/OS+SUBDIR
jobs_base_dir = $HOME/jobs
Expand Down Expand Up @@ -124,6 +129,9 @@ build_permission =
# template for comment when user who set a label has no permission to trigger build jobs
no_build_permission_comment = Label `bot:build` has been set by user `{build_labeler}`, but this person does not have permission to trigger builds

# whether or not to allow updating the submit options via custom module det_submit_opts
allow_update_submit_opts = false


[deploycfg]
# script for uploading built software packages
Expand Down Expand Up @@ -235,8 +243,9 @@ scontrol_command = /usr/bin/scontrol
# are removed, the output (in PR comments) will lack important
# information.
[submitted_job_comments]
initial_comment = New job on instance `{app_name}` for architecture `{arch_name}` for repository `{repo_id}` in job dir `{symlink}`
awaits_release = job id `{job_id}` awaits release by job manager
initial_comment = New job on instance `{app_name}` for CPU micro-architecture `{arch_name}`{accelerator_spec} for repository `{repo_id}` in job dir `{symlink}`
with_accelerator = &nbsp;and accelerator `{accelerator}`


[new_job_comments]
Expand All @@ -259,3 +268,7 @@ curl_failure = Unable to download the `.diff` file.
curl_tip = _Tip: This could be a connection failure. Try again and if the issue remains check if the address is correct_
git_apply_failure = Unable to download or merge changes between the source branch and the destination branch.
git_apply_tip = _Tip: This can usually be resolved by syncing your branch and resolving any merge conflicts._

[clean_up]
trash_bin_dir = $HOME/trash_bin
moved_job_dirs_comment = PR merged! Moved `{job_dirs}` to `{trash_bin_dir}`
69 changes: 67 additions & 2 deletions eessi_bot_event_handler.py
Expand Up @@ -12,12 +12,15 @@
# author: Jonas Qvigstad (@jonas-lq)
# author: Lara Ramona Peeters (@laraPPr)
# author: Thomas Roeblitz (@trz42)
# author: Pedro Santos Neves (@Neves-P)
# author: Sam Moors (@smoors)
#
# license: GPLv2
#

# Standard library imports
import sys
from datetime import datetime, timezone

# Third party imports (anything installed into the local Python environment)
from pyghee.lib import create_app, get_event_info, PyGHee, read_event_from_json
Expand All @@ -28,7 +31,8 @@
from connections import github
from tasks.build import check_build_permission, get_architecture_targets, get_repo_cfg, \
    request_bot_build_issue_comments, submit_build_jobs
from tasks.deploy import deploy_built_artefacts
from tasks.deploy import deploy_built_artefacts, determine_job_dirs
from tasks.clean_up import move_to_trash_bin
from tools import config
from tools.args import event_handler_parse
from tools.commands import EESSIBotCommand, EESSIBotCommandError, \
Expand All @@ -44,20 +48,25 @@
        config.BOT_CONTROL_SETTING_COMMAND_PERMISSION,  # required
        config.BOT_CONTROL_SETTING_COMMAND_RESPONSE_FMT],  # required
    config.SECTION_BUILDENV: [
        # config.BUILDENV_SETTING_ALLOW_UPDATE_SUBMIT_OPTS  # optional
        config.BUILDENV_SETTING_BUILD_JOB_SCRIPT,  # required
        config.BUILDENV_SETTING_BUILD_LOGS_DIR,  # optional+recommended
        config.BUILDENV_SETTING_BUILD_PERMISSION,  # optional+recommended
        config.BUILDENV_SETTING_CONTAINER_CACHEDIR,  # optional+recommended
        # config.BUILDENV_SETTING_CVMFS_CUSTOMIZATIONS,  # optional
        # config.BUILDENV_SETTING_HTTPS_PROXY,  # optional
        # config.BUILDENV_SETTING_HTTP_PROXY,  # optional
        config.BUILDENV_SETTING_JOB_NAME,  # required
        config.BUILDENV_SETTING_JOBS_BASE_DIR,  # required
        # config.BUILDENV_SETTING_LOAD_MODULES,  # optional
        config.BUILDENV_SETTING_LOCAL_TMP,  # required
        config.BUILDENV_SETTING_NO_BUILD_PERMISSION_COMMENT,  # required
        config.BUILDENV_SETTING_SHARED_FS_PATH,  # optional+recommended
        # config.BUILDENV_SETTING_SLURM_PARAMS,  # optional
        config.BUILDENV_SETTING_SUBMIT_COMMAND],  # required
    config.SECTION_CLEAN_UP: [
        config.CLEAN_UP_SETTING_TRASH_BIN_ROOT_DIR,  # required
        config.CLEAN_UP_SETTING_MOVED_JOB_DIRS_COMMENT],  # required
    config.SECTION_DEPLOYCFG: [
        config.DEPLOYCFG_SETTING_ARTEFACT_PREFIX,  # (required)
        config.DEPLOYCFG_SETTING_ARTEFACT_UPLOAD_SCRIPT,  # required
Expand Down Expand Up @@ -88,7 +97,8 @@
        config.REPO_TARGETS_SETTING_REPOS_CFG_DIR],  # required
    config.SECTION_SUBMITTED_JOB_COMMENTS: [
        config.SUBMITTED_JOB_COMMENTS_SETTING_INITIAL_COMMENT,  # required
        config.SUBMITTED_JOB_COMMENTS_SETTING_AWAITS_RELEASE]  # required
        config.SUBMITTED_JOB_COMMENTS_SETTING_AWAITS_RELEASE,  # required
        config.SUBMITTED_JOB_COMMENTS_SETTING_WITH_ACCELERATOR],  # required
}


Expand Down Expand Up @@ -599,6 +609,61 @@ def start(self, app, port=3000):
        self.log(log_file_info)
        waitress.serve(app, listen='*:%s' % port)

    def handle_pull_request_closed_event(self, event_info, pr):
        """
        Handle events of type pull_request with the action 'closed'. It
        determines the job directories used by the PR and moves them to the
        trash bin. It also adds information to the logs and a comment to the PR.
        Args:
            event_info (dict): event received by event_handler
            pr (github.PullRequest.PullRequest): instance representing the pull request
        Returns:
            github.IssueComment.IssueComment instance or None (note, github refers to
                PyGithub, not the github from the internal connections module)
        """

        # Detect event and report if PR was merged or closed
        request_body = event_info['raw_request_body']
        # next value: True -> PR merged, False -> PR closed
        mergedOrClosed = request_body['pull_request']['merged']
        status = "merged" if mergedOrClosed else "closed"

        self.log(f"PR {pr.number}: PR got {status} (json value: {mergedOrClosed})")

        # 1) determine the jobs that have been run for the PR
        self.log(f"PR {pr.number}: determining directories to be moved to trash bin")
        job_dirs = determine_job_dirs(pr.number)

        # 2) Get trash_bin_dir from configs
        trash_bin_root_dir = self.cfg[config.SECTION_CLEAN_UP][config.CLEAN_UP_SETTING_TRASH_BIN_ROOT_DIR]

        repo_name = request_body['repository']['full_name']
        dt_start = datetime.now(timezone.utc)
        trash_bin_dir = "/".join([trash_bin_root_dir, repo_name, dt_start.strftime('%Y.%m.%d')])

        # Subdirectory with date of move. Also with repository name. Handle symbolic links (later?)
        # cron job deletes symlinks?

        # 3) move the directories to the trash_bin
        self.log(f"PR {pr.number}: moving directories to trash bin {trash_bin_dir}")
        move_to_trash_bin(trash_bin_dir, job_dirs)
        dt_end = datetime.now(timezone.utc)
        dt_delta = dt_end - dt_start
        seconds_elapsed = dt_delta.days * 24 * 3600 + dt_delta.seconds
        self.log(f"PR {pr.number}: moved directories to trash bin {trash_bin_dir} (took {seconds_elapsed} seconds)")

        # 4) report move to pull request
        repo_name = pr.base.repo.full_name
        gh = github.get_instance()
        repo = gh.get_repo(repo_name)
        pull_request = repo.get_pull(pr.number)
        clean_up_comment = self.cfg[config.SECTION_CLEAN_UP][config.CLEAN_UP_SETTING_MOVED_JOB_DIRS_COMMENT]
        moved_comment = clean_up_comment.format(job_dirs=job_dirs, trash_bin_dir=trash_bin_dir)
        issue_comment = pull_request.create_issue_comment(moved_comment)
        return issue_comment


def main():
"""
Expand Down
8 changes: 8 additions & 0 deletions eessi_bot_job_manager.py
Expand Up @@ -50,6 +50,8 @@

# settings that are required in 'app.cfg'
REQUIRED_CONFIG = {
    config.SECTION_BUILDENV: [
        config.BUILDENV_SETTING_JOB_NAME],  # required
    config.SECTION_FINISHED_JOB_COMMENTS: [
        config.FINISHED_JOB_COMMENTS_SETTING_JOB_RESULT_UNKNOWN_FMT,  # required
        config.FINISHED_JOB_COMMENTS_SETTING_JOB_TEST_UNKNOWN_FMT],  # required
Expand Down Expand Up @@ -85,6 +87,10 @@ def __init__(self):
        cfg = config.read_config()
        job_manager_cfg = cfg[config.SECTION_JOB_MANAGER]
        self.logfile = job_manager_cfg.get(config.JOB_MANAGER_SETTING_LOG_PATH)
        buildenv_cfg = cfg[config.SECTION_BUILDENV]
        self.job_name = buildenv_cfg.get(config.BUILDENV_SETTING_JOB_NAME)
        if self.job_name and len(self.job_name) < 3:
            raise Exception(f"job name ({self.job_name}) is shorter than 3 characters")

    def get_current_jobs(self):
        """
Expand All @@ -106,6 +112,8 @@ def get_current_jobs(self):
raise Exception("Unable to find username")

        squeue_cmd = "%s --long --noheader --user=%s" % (self.poll_command, username)
        if self.job_name:
            squeue_cmd += " --name='%s'" % self.job_name
        squeue_output, squeue_err, squeue_exitcode = run_cmd(
            squeue_cmd,
            "get_current_jobs(): squeue command",
Expand Down