-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added 24h default time limit #156
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general idea is good. However, it does not consider all cases yet. See comments added.
Small addition, instead of writing the limit somewhere in the code, maybe add a variable at the start of the file, e.g.,
DEFAULT_JOB_TIME_LIMIT="24:00:00"
and later in the code write
time_limit = f"--time={DEFAULT_JOB_TIME_LIMIT}"
tasks/build.py
Outdated
# Add a default time limit of 24h to the command if nothing else is specified by the user | ||
if "--time=" not in build_env_cfg[SLURM_PARAMS]: | ||
time_limit = "--time=24:00:00" | ||
else: | ||
time_limit = "" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would work only if a limit is specified via the setting slurm_params
in app.cfg
and if the notation with =
is used. The sbatch
command also allows to provide values with a space, i.e., --time SOME_LIMIT
and the short form -t SOME_LIMIT
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test should include job.slurm_opts
(which are read from arch_target_map
in app.cfg
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much better. Some not so unusual cases we should catch. See comment.
tasks/build.py
Outdated
# determine CPU target to use: job.arch_target, but without the linux/ part | ||
cpu_target = '/'.join(job.arch_target.split('/')[1:]) | ||
|
||
# Add a default time limit of 24h to the command if nothing else is specified by the user | ||
all_opts = " ".join([build_env_cfg[SLURM_PARAMS], job.slurm_opts, '--export=ALL,CPU_TARGET=%s' % cpu_target]) | ||
if ("--time" in all_opts) or ("-t" in all_opts): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a big improvement. However, there are a few cases this way of checking is not catching. See test cases 8 & 9 in trz42/software-layer#58 (comment)
So, we may need to iterate over all options and check if they begin with --time
or -t
, e.g.,
opt.startswith("--time")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. All test cases that failed before succeed now. Repeated some tests for the other cases as well, see trz42/software-layer#58 (comment)
One final suggestion to polish a bit the code. Then it can be merged.
tasks/build.py
Outdated
@@ -269,8 +269,9 @@ def submit_job(job, submitted_jobs, build_env_cfg, ym, pr_id): | |||
cpu_target = '/'.join(job.arch_target.split('/')[1:]) | |||
|
|||
# Add a default time limit of 24h to the command if nothing else is specified by the user | |||
all_opts = " ".join([build_env_cfg[SLURM_PARAMS], job.slurm_opts, '--export=ALL,CPU_TARGET=%s' % cpu_target]) | |||
if ("--time" in all_opts) or ("-t" in all_opts): | |||
all_opts_str = " ".join([build_env_cfg[SLURM_PARAMS], job.slurm_opts, '--export=ALL,CPU_TARGET=%s' % cpu_target]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove the third argument , '--export=ALL,CPU_TARGET=%' % cpu_target]
. This is not a configuration parameter that we expect to include a time limit. In a future version of the bot this code is probably going away anyway.
Fix for issue #146