PR for bot development tests #58

Open: wants to merge 1 commit into main

poksumdo
Collaborator

Target stack is EESSI/2021.12

@poksumdo added the bot:build label (Instruct bot to build software stack) on Feb 23, 2023
@eessi-bot-devel-trz42

This comment was marked as outdated.

@eessi-bot-devel-trz42

This comment was marked as outdated.

@poksumdo

This comment was marked as outdated.

@eessi-bot-devel-trz42

This comment was marked as outdated.

@eessi-bot-devel-trz42

This comment was marked as outdated.

@trz42
Owner

trz42 commented Mar 6, 2023

Testing eessi bot PR#154:

  • rebuild first (the jobs above were run in a different directory)
  • ran into an issue because the setting upload_to_s3_script had not been renamed to tarball_upload_script
  • the tarball was uploaded to nessi-2022.11 after renaming the setting
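
A renamed setting like this is easy to miss in an existing configuration. Below is a minimal sketch of a check, assuming the bot's configuration is an INI-style file that Python's `configparser` can read. The section name `deploycfg`, the script path, and the helper `find_deprecated_settings` are hypothetical and only serve the illustration; this is not the bot's own code.

```python
import configparser

# Old setting name -> new setting name (taken from the note above).
RENAMED_SETTINGS = {"upload_to_s3_script": "tarball_upload_script"}


def find_deprecated_settings(cfg_text):
    """Return (section, old_name, new_name) for every outdated setting found."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    hits = []
    for section in parser.sections():
        for old_name, new_name in RENAMED_SETTINGS.items():
            if parser.has_option(section, old_name):
                hits.append((section, old_name, new_name))
    return hits


# Hypothetical configuration fragment; section name and path are made up.
example_cfg = """
[deploycfg]
upload_to_s3_script = /path/to/upload-script.sh
"""

for section, old_name, new_name in find_deprecated_settings(example_cfg):
    print(f"[{section}] rename '{old_name}' to '{new_name}'")
```

Running such a check before starting the bot would surface the outdated name up front instead of during a deploy attempt.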

@trz42 added and removed the bot:deploy (Instruct bot to deploy built artefacts to Stratum 0) and bot:build (Instruct bot to build software stack) labels on Mar 6, 2023
@eessi-bot-devel-trz42

This comment was marked as outdated.

@trz42 added the bot:deploy label and removed the bot:deploy and bot:build labels on Mar 6, 2023
@trz42
Owner

trz42 commented Mar 7, 2023

Testing eessi bot PR#156: default time limit

  • only run event_handler.sh to prevent job manager from releasing jobs (releasing is not necessary)

Test cases

  • 1. run without any time limit set in app.cfg --> scontrol show job JOBID should report 24:00:00 hours as time limit

    [trz42@mgmt PR156]$ scontrol show job 3952 | grep -E '(JobId|TimeLimit)'
    JobId=3952 JobName=eessi-bot-build.slurm
       RunTime=00:00:00 TimeLimit=1-00:00:00 TimeMin=N/A
    

    Note: scontrol reports the time limit in the format days-hours:minutes:seconds, so the result TimeLimit=1-00:00:00 equals 24 hours.

  • 2. run with time limit specified via --time=12:00:00 set via slurm_params --> scontrol show job JOBID should report 12:00:00 hours as time limit

    [trz42@mgmt PR156]$ scontrol show job 3953 | grep -E '(JobId|TimeLimit)'
    JobId=3953 JobName=eessi-bot-build.slurm
       RunTime=00:00:00 TimeLimit=12:00:00 TimeMin=N/A
    
  • 3. run with time limit specified via --time 09:00:00 set via slurm_params --> scontrol show job JOBID should report 09:00:00 hours as time limit

    [trz42@mgmt PR156]$ scontrol show job 3954 | grep -E '(JobId|TimeLimit)'
    JobId=3954 JobName=eessi-bot-build.slurm
       RunTime=00:00:00 TimeLimit=09:00:00 TimeMin=N/A
    
  • 4. run with time limit specified via -t 06:00:00 set via slurm_params --> scontrol show job JOBID should report 06:00:00 hours as time limit

    [trz42@mgmt PR156]$ scontrol show job 3955 | grep -E '(JobId|TimeLimit)'
    JobId=3955 JobName=eessi-bot-build.slurm
       RunTime=00:00:00 TimeLimit=06:00:00 TimeMin=N/A
    
  • 5. run with time limit specified via --time=03:00:00 set via arch_target_map --> scontrol show job JOBID should report 03:00:00 hours as time limit

    [trz42@mgmt PR156]$ scontrol show job 3956 | grep -E '(JobId|TimeLimit)'
    JobId=3956 JobName=eessi-bot-build.slurm
       RunTime=00:00:00 TimeLimit=03:00:00 TimeMin=N/A
    
  • 6. run with malformed time limit spec, e.g., --TimeLimit=01:00:00 set via slurm_params --> scontrol show job JOBID should report 1-00:00:00 as time limit
    That actually crashed the event handler because --TimeLimit is not a known argument for sbatch.

  • 7. run with malformed time limit spec, e.g., --Time=01:00:00 set via slurm_params --> scontrol show job JOBID should report 1-00:00:00 as time limit
    Same result as with the above case.

  • 8. one more test to check if the algorithm can be misled: no time limit specified, however, a job name is provided via --job-name=real-test --> scontrol show job JOBID should report 1-00:00:00 as time limit
    FAILED

    [trz42@mgmt PR156]$ scontrol show job 3957 | grep -E '(JobId|TimeLimit)'
    JobId=3957 JobName=real-test
       RunTime=00:00:00 TimeLimit=UNLIMITED TimeMin=N/A
    
  • 9. yet another test to check if the algorithm works: no time limit specified, however, CPUs per task provided via --cpus-per-task=2 --> scontrol show job JOBID should report 1-00:00:00 as time limit
    FAILED

    [trz42@mgmt PR156]$ scontrol show job 3959 | grep -E '(JobId|TimeLimit)'
    JobId=3959 JobName=eessi-bot-build.slurm
       RunTime=00:00:00 TimeLimit=UNLIMITED TimeMin=N/A
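
The comparison in test case 1 between TimeLimit=1-00:00:00 and 24 hours can be made mechanical. The following is a minimal sketch that handles only the forms appearing in the outputs above (days-hours:minutes:seconds, hours:minutes:seconds, and UNLIMITED). The helper name `slurm_time_to_seconds` is hypothetical, and Slurm accepts further time formats that are not covered here.

```python
def slurm_time_to_seconds(value):
    """Convert a TimeLimit value as printed by scontrol into seconds.

    Handles 'days-hours:minutes:seconds' and 'hours:minutes:seconds';
    returns None for UNLIMITED.
    """
    if value == "UNLIMITED":
        return None
    days = 0
    if "-" in value:
        # Split off the leading day count, e.g. "1-00:00:00" -> "1", "00:00:00"
        day_part, value = value.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in value.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# The default reported in test case 1 is exactly 24 hours.
print(slurm_time_to_seconds("1-00:00:00") == 24 * 3600)
print(slurm_time_to_seconds("12:00:00"))
print(slurm_time_to_seconds("UNLIMITED"))
```

With such a helper, the failed cases 8 and 9 stand out directly because UNLIMITED maps to None rather than to 86400 seconds.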
    

@trz42 added the bot:build label on Mar 7, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3952

| date | job status | comment |
| --- | --- | --- |
| Mar 07 06:45:25 PM UTC 2023 | submitted | job id 3952 awaits release by job manager |

@trz42 added and removed the bot:build label on Mar 7, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3953

| date | job status | comment |
| --- | --- | --- |
| Mar 07 06:51:49 PM UTC 2023 | submitted | job id 3953 awaits release by job manager |

@trz42 added and removed the bot:build label on Mar 7, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3954

| date | job status | comment |
| --- | --- | --- |
| Mar 07 06:56:27 PM UTC 2023 | submitted | job id 3954 awaits release by job manager |

@trz42 added and removed the bot:build label on Mar 7, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3955

| date | job status | comment |
| --- | --- | --- |
| Mar 07 06:59:07 PM UTC 2023 | submitted | job id 3955 awaits release by job manager |

@trz42 added and removed the bot:build label on Mar 7, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3957

| date | job status | comment |
| --- | --- | --- |
| Mar 07 07:20:10 PM UTC 2023 | submitted | job id 3957 awaits release by job manager |

@trz42 added and removed the bot:build label on Mar 7, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3958

| date | job status | comment |
| --- | --- | --- |
| Mar 07 07:26:26 PM UTC 2023 | submitted | job id 3958 awaits release by job manager |

@trz42 added and removed the bot:build label on Mar 7, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3959

| date | job status | comment |
| --- | --- | --- |
| Mar 07 07:27:45 PM UTC 2023 | submitted | job id 3959 awaits release by job manager |

@trz42
Owner

trz42 commented Mar 10, 2023

Repeating the same test protocol after code was changed to also cover cases 8 & 9.

Testing eessi bot PR#156: default time limit

  • only run event_handler.sh to prevent job manager from releasing jobs (releasing is not necessary)

Test cases

  • 1. run without any time limit set in app.cfg --> scontrol show job JOBID should report 1-00:00:00 hours as time limit

    [trz42@mgmt bot-side-4-bot-build]$ scontrol show job 3974 | grep -E '(JobId|TimeLimit)'
    JobId=3974 JobName=eessi-bot-build.slurm
       RunTime=00:00:00 TimeLimit=1-00:00:00 TimeMin=N/A
    
  • 2. run with time limit specified via --time=12:00:00 set via slurm_params --> scontrol show job JOBID should report 12:00:00 hours as time limit

    [trz42@mgmt bot-side-4-bot-build]$ scontrol show job 3975 | grep -E '(JobId|TimeLimit)'
    JobId=3975 JobName=eessi-bot-build.slurm
       RunTime=00:00:00 TimeLimit=12:00:00 TimeMin=N/A
    
  • 3. run with time limit specified via --time 09:00:00 set via slurm_params --> scontrol show job JOBID should report 09:00:00 hours as time limit
    skipped

  • 4. run with time limit specified via -t 06:00:00 set via slurm_params --> scontrol show job JOBID should report 06:00:00 hours as time limit

    [trz42@mgmt bot-side-4-bot-build]$ scontrol show job 3976 | grep -E '(JobId|TimeLimit)'
    JobId=3976 JobName=eessi-bot-build.slurm
       RunTime=00:00:00 TimeLimit=06:00:00 TimeMin=N/A
    
  • 5. run with time limit specified via --time 03:00:00 set via arch_target_map --> scontrol show job JOBID should report 03:00:00 hours as time limit

    [trz42@mgmt bot-side-4-bot-build]$ scontrol show job 3977 | grep -E '(JobId|TimeLimit)'
    JobId=3977 JobName=eessi-bot-build.slurm
       RunTime=00:00:00 TimeLimit=03:00:00 TimeMin=N/A
    
  • 6. run with malformed time limit spec, e.g., --TimeLimit=01:00:00 set via slurm_params --> scontrol show job JOBID should report 1-00:00:00 as time limit
    skipped

  • 7. run with malformed time limit spec, e.g., --Time=01:00:00 set via slurm_params --> scontrol show job JOBID should report 1-00:00:00 as time limit
    skipped

  • 8. one more test to check if the algorithm can be misled: no time limit specified, however, a job name is provided via --job-name=real-test --> scontrol show job JOBID should report 1-00:00:00 as time limit
    now, it works as expected

    [trz42@mgmt bot-side-4-bot-build]$ scontrol show job 3973 | grep -E '(JobId|TimeLimit)'
    JobId=3973 JobName=eessi-bot-build.slurm
       RunTime=00:00:00 TimeLimit=1-00:00:00 TimeMin=N/A
    
  • 9. yet another test to check if the algorithm works: no time limit specified, however, CPUs per task provided via --cpus-per-task=2 --> scontrol show job JOBID should report 1-00:00:00 as time limit
    now, it works as expected

    [trz42@mgmt bot-side-4-bot-build]$ scontrol show job 3972 | grep -E '(JobId|TimeLimit)'
    JobId=3972 JobName=real-test
       RunTime=00:00:00 TimeLimit=1-00:00:00 TimeMin=N/A
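
Both `--job-name=real-test` and `--cpus-per-task=2` contain the substring `-t`, so a plain substring search for a time option would be misled by them. Whether that was the actual cause of the earlier failures in cases 8 and 9 is an assumption; the thread does not say. Below is a minimal sketch of a token-based check that covers the forms tested in cases 2 to 4 (`--time=...`, `--time ...`, `-t ...`). The names `has_time_limit`, `with_default_time_limit`, and `DEFAULT_TIME_LIMIT` are hypothetical, and this is not the bot's implementation.

```python
import shlex

DEFAULT_TIME_LIMIT = "24:00:00"  # assumed default, matching test case 1


def has_time_limit(slurm_params):
    """Return True if the option string sets a time limit via -t or --time."""
    for token in shlex.split(slurm_params):
        # long form: "--time=12:00:00" or "--time 09:00:00"
        if token == "--time" or token.startswith("--time="):
            return True
        # short form: "-t 06:00:00" or "-t06:00:00"; long options start
        # with "--" and therefore never match this prefix
        if token.startswith("-t"):
            return True
    return False


def with_default_time_limit(slurm_params):
    """Append the default time limit unless one is already present."""
    if has_time_limit(slurm_params):
        return slurm_params
    return f"{slurm_params} --time={DEFAULT_TIME_LIMIT}".strip()


for params in ["--time=12:00:00", "-t 06:00:00",
               "--job-name=real-test", "--cpus-per-task=2"]:
    print(params, "->", with_default_time_limit(params))
```

The check is case-sensitive, so a malformed option such as `--TimeLimit=01:00:00` is not treated as a time limit; the default would be appended, but sbatch would still reject the unknown option, as observed in cases 6 and 7.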
    

@trz42 added and removed the bot:build label on Mar 10, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3972

| date | job status | comment |
| --- | --- | --- |
| Mar 10 07:57:26 AM UTC 2023 | submitted | job id 3972 awaits release by job manager |

@trz42 added and removed the bot:build label on Mar 10, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3973

| date | job status | comment |
| --- | --- | --- |
| Mar 10 08:01:27 AM UTC 2023 | submitted | job id 3973 awaits release by job manager |

@trz42 added and removed the bot:build label on Mar 10, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3974

| date | job status | comment |
| --- | --- | --- |
| Mar 10 08:04:23 AM UTC 2023 | submitted | job id 3974 awaits release by job manager |

@trz42 added and removed the bot:build label on Mar 10, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3975

| date | job status | comment |
| --- | --- | --- |
| Mar 10 08:06:39 AM UTC 2023 | submitted | job id 3975 awaits release by job manager |

@trz42 added and removed the bot:build label on Mar 10, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3976

| date | job status | comment |
| --- | --- | --- |
| Mar 10 08:17:42 AM UTC 2023 | submitted | job id 3976 awaits release by job manager |

@trz42 added and removed the bot:build label on Mar 10, 2023
@eessi-bot-devel-trz42

New job on instance CitC-PR156 for architecture x86_64-amd-zen2 in job dir /mnt/shared/home/trz42/pilot.nessi/PR156/jobs/2023.03/pr_58/3977

| date | job status | comment |
| --- | --- | --- |
| Mar 10 08:20:42 AM UTC 2023 | submitted | job id 3977 awaits release by job manager |

trz42 pushed a commit that referenced this pull request Mar 18, 2023
pull in fixes from EESSI/software-layer PR238 and PR239
Labels
bot:build Instruct bot to build software stack

2 participants