stick to x86_64/amd/zen3 when AMD Genoa (Zen4) is detected, until optimized software installations are available for Zen4 #569

boegel
Contributor

@boegel boegel commented May 7, 2024

Tested, works like a charm (extra text is in yellow):

Found EESSI repo @ /cvmfs/software.eessi.io/versions/2023.06!
archdetect says x86_64/amd/zen4
Sticking to x86_64/amd/zen3 for now, since optimized installations for AMD Genoa (Zen4) are a work in progress, see https://gitlab.com/eessi/support/-/issues/37 for more information
Using x86_64/amd/zen3 as software subdirectory.
...
$ echo $MODULEPATH
/cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/modules/all
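
For readers without the diff at hand, the fallback shown in this output can be sketched roughly as follows. This is only an illustration, not the code from the PR: the function name `pick_software_subdir` is hypothetical, and the real logic in the EESSI init scripts may be structured differently.

```bash
# Hypothetical helper (not taken from the PR): map the CPU target reported by
# archdetect to the software subdirectory that is actually used.
pick_software_subdir() {
    local detected="$1"
    case "$detected" in
        x86_64/amd/zen4)
            # No optimized Zen4 installations yet: explain on stderr, use Zen3.
            echo "Sticking to x86_64/amd/zen3 for now, since optimized installations for AMD Genoa (Zen4) are a work in progress, see https://gitlab.com/eessi/support/-/issues/37 for more information" >&2
            echo "x86_64/amd/zen3"
            ;;
        *)
            # Every other target is used as detected.
            echo "$detected"
            ;;
    esac
}

subdir="$(pick_software_subdir "x86_64/amd/zen4")"
echo "Using ${subdir} as software subdirectory."
```

Keeping the special case in one place should make it easy to drop again once Zen4 installations are available.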

…imized software installations are available for Zen4
@boegel boegel added the 2023.06-software.eessi.io 2023.06 version of software.eessi.io label May 7, 2024

eessi-bot bot commented May 7, 2024

Instance eessi-bot-mc-aws is configured to build:

  • arch x86_64/generic for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/generic for repo eessi-hpc.org-2023.06-software
  • arch x86_64/generic for repo eessi.io-2023.06-compat
  • arch x86_64/generic for repo eessi.io-2023.06-software
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/haswell for repo eessi.io-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi.io-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi.io-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi.io-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi.io-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi.io-2023.06-software
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-software
  • arch aarch64/generic for repo eessi.io-2023.06-compat
  • arch aarch64/generic for repo eessi.io-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi.io-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi.io-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi.io-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi.io-2023.06-software

eessi-bot bot commented May 7, 2024

Instance eessi-bot-mc-azure is configured to build:

  • arch x86_64/amd/zen4 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen4 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen4 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen4 for repo eessi.io-2023.06-software

@boegel boegel requested a review from casparvl May 7, 2024 16:16
@boegel boegel changed the title stick to x86_64/amd/zen3 when AMD Genoa (Zen4) is detected, until optimized software installations are available for Zen4 stick to x86_64/amd/zen3 when AMD Genoa (Zen4) is detected, until optimized software installations are available for Zen4 May 7, 2024
casparvl
casparvl previously approved these changes May 7, 2024
Collaborator

@casparvl casparvl left a comment

Lgtm:

[casparl@tcn892 software-layer]$ git checkout 2023.06-software.eessi.io_zen4-zen3
...
[casparl@tcn892 software-layer]$ source init/bash
Sticking to x86_64/amd/zen3 for now, since optimized installations for AMD Genoa (Zen4) are a work in progress, see https://gitlab.com/eessi/support/-/issues/37 for more information

Due to MODULEPATH changes, the following have been reloaded:
  1) EasyBuild/4.9.1

Environment set up to use EESSI (2023.06), have fun!
{EESSI 2023.06} [casparl@tcn892 software-layer]$ module av

------------------------------------------------------------------------------------------------ /cvmfs/software.eessi.io/versions/2023.06/software/linux/x86_64/amd/zen3/modules/all -------------------------------------------------------------------------------------------------
   Abseil/20230125.2-GCCcore-12.2.0                 flit/3.9.0-GCCcore-13.2.0                           LERC/4.0.0-GCCcore-12.3.0               NASM/2.16.01-GCCcore-13.2.0                             Qhull/2020.2-GCCcore-12.3.0
   Abseil/20230125.3-GCCcore-12.3.0                 fontconfig/2.14.1-GCCcore-12.2.0                    LHAPDF/6.5.4-GCC-12.3.0                 NCCL/2.18.3-GCCcore-12.3.0-CUDA-12.1.1           (g)    Qhull/2020.2-GCCcore-13.2.0
   ALL/0.9.2-foss-2023a                             fontconfig/2.14.2-GCCcore-12.3.0                    libaec/1.0.6-GCCcore-12.3.0             ncdu/1.18-GCC-12.3.0                                    Qt5/5.15.7-GCCcore-12.2.0
...

@casparvl
Collaborator

casparvl commented May 7, 2024

bot: build repo:eessi.io-2023.06-software arch:aarch64/generic

eessi-bot bot commented May 7, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)

eessi-bot bot commented May 7, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • account casparvl has NO permission to send commands to the bot

eessi-bot bot commented May 7, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.05/pr_569/10314

date | job status | comment
May 07 16:32:23 UTC 2024 | submitted | job id 10314 awaits release by job manager
May 07 16:33:21 UTC 2024 | released | job awaits launch by Slurm scheduler
May 07 16:38:23 UTC 2024 | running | job 10314 is running
May 07 16:50:35 UTC 2024 | finished | 😢 FAILURE (click triangle for details)
  Details
  ✅ job output file slurm-10314.out
  ✅ no message matching ERROR:
  ✅ no message matching FAILED:
  ✅ no message matching required modules missing:
  ✅ found message(s) matching No missing installations
  ✅ found message matching .tar.gz created!
  Artefacts
  No artefacts were created or found.
May 07 16:50:35 UTC 2024 | test result | 😁 SUCCESS (click triangle for details)
  ReFrame Summary
  [ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
  Details
  ✅ job output file slurm-10314.out
  ✅ no message matching ERROR:
  ✅ no message matching [\s*FAILED\s*].*Ran .* test case
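
The pattern checks listed under "Details" can be approximated with plain `grep`. The sketch below is an assumption about how such checks might be written, not the bot's implementation, and `check_job_output` is a hypothetical name. It also models only the message patterns: the job above was reported as a failure even though every pattern check passed, apparently because no artefact was found, and that condition is not covered here.

```bash
# Hypothetical re-creation of the message checks from the bot report.
# Returns 0 when all checks pass, 1 otherwise.
check_job_output() {
    local out="$1" pattern status=0
    # Messages that must NOT appear in the job output.
    for pattern in "ERROR:" "FAILED:" "required modules missing:"; do
        if grep -qF -- "$pattern" "$out"; then
            echo "found unexpected message matching $pattern"
            status=1
        fi
    done
    # Messages that MUST appear in the job output.
    for pattern in "No missing installations" ".tar.gz created!"; do
        if ! grep -qF -- "$pattern" "$out"; then
            echo "no message matching $pattern"
            status=1
        fi
    done
    return "$status"
}
```

It would be called on a job output file, for example `check_job_output slurm-10314.out`. `grep -F` is used so that the dot in `.tar.gz` is matched literally.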

@boegel
Contributor Author

boegel commented May 7, 2024

@bedroge Deploying this via the bot should work, right?

@casparvl
Collaborator

casparvl commented May 7, 2024

It should. I think we only need to 'build' for one architecture, since it isn't architecture specific anyway, right? (Or do we have checks in place that prevent deploying if not all architectures have a tarball?)

Anyway, I started one 'build'. If more are needed, feel free. I'm afraid I have to go now, maybe someone else can check & deploy later tonight...

@boegel
Contributor Author

boegel commented May 7, 2024

bot: build repo:eessi.io-2023.06-software arch:aarch64/generic

eessi-bot bot commented May 7, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)

eessi-bot bot commented May 7, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/generic from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/generic resulted in:

    • no jobs were submitted
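
The "expanded format" step in this report rewrites the short keys `repo:` and `arch:` to `repository:` and `architecture:`. A minimal sketch of that rewrite is given below; `expand_bot_command` is a hypothetical name, and how the bot parses commands internally is not shown in this thread.

```bash
# Hypothetical sketch: expand the short filter keys of a bot command into
# the long form shown in the bot report.
expand_bot_command() {
    local word expanded=""
    for word in "$@"; do
        case "$word" in
            repo:*) word="repository:${word#repo:}" ;;
            arch:*) word="architecture:${word#arch:}" ;;
        esac
        # Join the words with single spaces, without a leading space.
        expanded="${expanded:+$expanded }$word"
    done
    printf '%s\n' "$expanded"
}

expand_bot_command build repo:eessi.io-2023.06-software arch:aarch64/generic
```

Words without a known short key, such as `build`, are passed through unchanged.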

eessi-bot bot commented May 7, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.05/pr_569/10315

date | job status | comment
May 07 17:28:52 UTC 2024 | submitted | job id 10315 awaits release by job manager
May 07 17:29:41 UTC 2024 | released | job awaits launch by Slurm scheduler
May 07 17:30:43 UTC 2024 | running | job 10315 is running
May 07 17:43:48 UTC 2024 | finished | 😁 SUCCESS (click triangle for details)
  Details
  ✅ job output file slurm-10315.out
  ✅ no message matching ERROR:
  ✅ no message matching FAILED:
  ✅ no message matching required modules missing:
  ✅ found message(s) matching No missing installations
  ✅ found message matching .tar.gz created!
  Artefacts
  eessi-2023.06-software-linux-aarch64-generic-1715102988.tar.gz
  size: 0 MiB (2679 bytes)
  entries: 3
  modules under 2023.06/software/linux/aarch64/generic/modules/all
    no module files in tarball
  software under 2023.06/software/linux/aarch64/generic/software
    no software packages in tarball
  other under 2023.06/software/linux/aarch64/generic
    2023.06/init/bash
    2023.06/init/eessi_environment_variables
    2023.06/init/Magic_Castle/bash
May 07 17:43:48 UTC 2024 | test result | 😁 SUCCESS (click triangle for details)
  ReFrame Summary
  [ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
  Details
  ✅ job output file slurm-10315.out
  ✅ no message matching ERROR:
  ✅ no message matching [\s*FAILED\s*].*Ran .* test case
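
To double-check what such a tarball contains before deploying, its entries can be listed with `tar -tzf` and grouped in the same way as the report above (modules, software, other). The following is a rough sketch under the assumption that module files live under `.../linux/<arch>/modules/all/` and packages under `.../linux/<arch>/software/`; `summarize_tarball` is a hypothetical helper, not part of the bot.

```bash
# Hypothetical helper: count tarball entries per category.
summarize_tarball() {
    tar -tzf "$1" | awk '
        /\/$/                         { next }              # skip directory entries
        /\/linux\/.*\/modules\/all\// { modules++; next }   # module files
        /\/linux\/.*\/software\//     { software++; next }  # software packages
                                      { other++ }           # everything else, e.g. init scripts
        END { printf "modules=%d software=%d other=%d\n", modules, software, other }
    '
}
```

For a tarball that only carries init scripts, as in this PR, the module and software counts should both be zero.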

@boegel
Contributor Author

boegel commented May 7, 2024

We should first complete deploy & merge of #371

…-layer into 2023.06-software.eessi.io_zen4-zen3
@trz42
Collaborator

trz42 commented May 7, 2024

Rebuilding after #371 has been merged and this PR has been updated

bot: build repo:eessi.io-2023.06-software arch:aarch64/generic

eessi-bot bot commented May 7, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)

eessi-bot bot commented May 7, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/generic from trz42

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/generic resulted in:

    • no jobs were submitted

eessi-bot bot commented May 7, 2024

New job on instance eessi-bot-mc-aws for architecture aarch64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.05/pr_569/10332

date | job status | comment
May 07 19:40:01 UTC 2024 | submitted | job id 10332 awaits release by job manager
May 07 19:40:33 UTC 2024 | released | job awaits launch by Slurm scheduler
May 07 19:45:36 UTC 2024 | running | job 10332 is running
May 07 19:57:48 UTC 2024 | finished | 😁 SUCCESS (click triangle for details)
  Details
  ✅ job output file slurm-10332.out
  ✅ no message matching ERROR:
  ✅ no message matching FAILED:
  ✅ no message matching required modules missing:
  ✅ found message(s) matching No missing installations
  ✅ found message matching .tar.gz created!
  Artefacts
  eessi-2023.06-software-linux-aarch64-generic-1715111089.tar.gz
  size: 0 MiB (1877 bytes)
  entries: 1
  modules under 2023.06/software/linux/aarch64/generic/modules/all
    no module files in tarball
  software under 2023.06/software/linux/aarch64/generic/software
    no software packages in tarball
  other under 2023.06/software/linux/aarch64/generic
    2023.06/init/eessi_environment_variables
May 07 19:57:48 UTC 2024 | test result | 😁 SUCCESS (click triangle for details)
  ReFrame Summary
  [ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
  Details
  ✅ job output file slurm-10332.out
  ✅ no message matching ERROR:
  ✅ no message matching [\s*FAILED\s*].*Ran .* test case
May 07 20:36:45 UTC 2024 | uploaded | transfer of eessi-2023.06-software-linux-aarch64-generic-1715111089.tar.gz to S3 bucket succeeded

@boegel
Contributor Author

boegel commented May 7, 2024

@trz42 Looks good now, ready to deploy?

@ocaisa ocaisa added the bot:deploy Ask bot to deploy missing software installations to EESSI label May 7, 2024
@trz42 trz42 merged commit 4afaecb into EESSI:2023.06-software.eessi.io May 7, 2024
35 checks passed
@boegel boegel deleted the 2023.06-software.eessi.io_zen4-zen3 branch May 7, 2024 21:16
@bedroge bedroge mentioned this pull request May 8, 2024
@ocaisa ocaisa mentioned this pull request May 15, 2024
@bedroge bedroge added the zen4 label Jun 10, 2024
Labels
2023.06-software.eessi.io 2023.06 version of software.eessi.io bot:deploy Ask bot to deploy missing software installations to EESSI zen4
6 participants