{2023.06,zen4} foss/2022b #567

Open: boegel wants to merge 1 commit into base 2023.06-software.eessi.io

boegel
Contributor

@boegel boegel commented May 7, 2024

No description provided.

@boegel boegel added the 2023.06-software.eessi.io 2023.06 version of software.eessi.io label May 7, 2024

eessi-bot bot commented May 7, 2024

Instance eessi-bot-mc-aws is configured to build:

  • arch x86_64/generic for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/generic for repo eessi-hpc.org-2023.06-software
  • arch x86_64/generic for repo eessi.io-2023.06-compat
  • arch x86_64/generic for repo eessi.io-2023.06-software
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/haswell for repo eessi.io-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi.io-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi.io-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi.io-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi.io-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi.io-2023.06-software
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/generic for repo eessi-hpc.org-2023.06-software
  • arch aarch64/generic for repo eessi.io-2023.06-compat
  • arch aarch64/generic for repo eessi.io-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi.io-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi.io-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi-hpc.org-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi.io-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi.io-2023.06-software

eessi-bot bot commented May 7, 2024

Instance eessi-bot-mc-azure is configured to build:

  • arch x86_64/amd/zen4 for repo eessi-hpc.org-2023.06-compat
  • arch x86_64/amd/zen4 for repo eessi-hpc.org-2023.06-software
  • arch x86_64/amd/zen4 for repo eessi.io-2023.06-compat
  • arch x86_64/amd/zen4 for repo eessi.io-2023.06-software

@boegel
Contributor Author

boegel commented May 7, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot bot commented May 7, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 resulted in:

    • no jobs were submitted
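
The short-to-long key expansion reported above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alias table and the function name `expand_bot_command` are hypothetical and are not taken from the eessi-bot source code.

```python
# Hypothetical sketch: ALIASES and expand_bot_command are assumptions made
# for illustration, not the actual eessi-bot implementation.
ALIASES = {"repo": "repository", "arch": "architecture"}


def expand_bot_command(line: str) -> str:
    """Expand 'bot: build repo:X arch:Y' into the long form shown in the bot reply."""
    text = line.strip()
    # Drop the leading 'bot:' marker if it is present.
    if text.startswith("bot:"):
        text = text[len("bot:"):].strip()
    command, *args = text.split()
    expanded = [command]
    for arg in args:
        key, sep, value = arg.partition(":")
        # Only rewrite known short keys; anything else is passed through unchanged.
        expanded.append(f"{ALIASES.get(key, key)}{sep}{value}")
    return " ".join(expanded)


print(expand_bot_command("bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4"))
```

The eessi-bot-mc-aws instance lists no zen4 target in its configuration comment, which is consistent with it submitting no jobs for this command.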

eessi-bot bot commented May 7, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 resulted in:

eessi-bot bot commented May 7, 2024

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.05/pr_567/73

date job status comment
May 07 13:31:58 UTC 2024 submitted job id 73 awaits release by job manager
May 07 13:32:10 UTC 2024 released job awaits launch by Slurm scheduler
May 07 13:49:49 UTC 2024 running job 73 is running
May 07 13:57:02 UTC 2024 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-73.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
May 07 13:57:02 UTC 2024 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-73.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
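
The build checks in the report above amount to searching the job output for a handful of patterns. A minimal sketch follows, assuming the patterns are the literal strings shown in the report; the function name, the data layout, and the rule that every check must pass are assumptions for illustration, not the bot's actual logic.

```python
import re

# Patterns copied from the check list in the bot report. The pass/fail rule
# and all names below are assumptions made for illustration only.
MUST_NOT_MATCH = [r"ERROR:", r"FAILED:", r"required modules missing:"]
MUST_MATCH = [r"No missing installations", r"\.tar\.gz created!"]


def check_build_log(log_text):
    """Return (overall_ok, per_pattern_results) for a job output string."""
    results = {}
    for pattern in MUST_NOT_MATCH:
        # These checks pass only when the pattern is absent from the log.
        results[pattern] = re.search(pattern, log_text) is None
    for pattern in MUST_MATCH:
        # These checks pass only when the pattern is present in the log.
        results[pattern] = re.search(pattern, log_text) is not None
    return all(results.values()), results


sample = "ERROR: build failed\n/tmp/example.tar.gz created!\n"
ok, details = check_build_log(sample)
print("SUCCESS" if ok else "FAILURE")
for pattern, passed in details.items():
    print("pass" if passed else "fail", pattern)
```

Under this sketch a single failed check is enough to mark the job as a failure, which matches the reports in this thread where a tarball was created but the job was still flagged.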

@boegel
Contributor Author

boegel commented May 7, 2024

I expect the OpenBLAS tests to fail here, see also notes available in https://gitlab.com/eessi/support/-/issues/37

@boegel
Contributor Author

boegel commented May 7, 2024

GCC build failed with "g++: fatal error: Killed signal terminated program cc1plus" because not enough memory is available; the bot configuration needs to be tweaked on the build cluster in Azure.

@boegel
Contributor Author

boegel commented May 7, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot bot commented May 7, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 resulted in:

    • no jobs were submitted

eessi-bot bot commented May 7, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 resulted in:

eessi-bot bot commented May 7, 2024

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.05/pr_567/84

date job status comment
May 07 20:34:50 UTC 2024 submitted job id 84 awaits release by job manager
May 07 20:35:50 UTC 2024 released job awaits launch by Slurm scheduler
May 07 20:36:55 UTC 2024 running job 84 is running
May 07 22:53:48 UTC 2024 finished
😁 SUCCESS (click triangle for details)
Details
✅ job output file slurm-84.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen4-1715122155.tar.gz
size: 1348 MiB (1414095513 bytes)
entries: 24582
modules under 2023.06/software/linux/x86_64/amd/zen4/modules/all
BLIS/0.9.0-GCC-12.3.0.lua
CMake/3.26.3-GCCcore-12.3.0.lua
FFTW.MPI/3.3.10-gompi-2023a.lua
FFTW/3.3.10-GCC-12.3.0.lua
FlexiBLAS/3.3.1-GCC-12.3.0.lua
GCC/12.3.0.lua
GCCcore/12.3.0.lua
OpenBLAS/0.3.23-GCC-12.3.0.lua
OpenMPI/4.1.5-GCC-12.3.0.lua
OpenSSL/1.1.lua
PMIx/4.2.4-GCCcore-12.3.0.lua
Perl/5.36.1-GCCcore-12.3.0.lua
Python/3.11.3-GCCcore-12.3.0.lua
SQLite/3.42.0-GCCcore-12.3.0.lua
ScaLAPACK/2.2.0-gompi-2023a-fb.lua
Tcl/8.6.13-GCCcore-12.3.0.lua
UCC/1.2.0-GCCcore-12.3.0.lua
UCX/1.14.1-GCCcore-12.3.0.lua
UnZip/6.0-GCCcore-12.3.0.lua
cURL/8.0.1-GCCcore-12.3.0.lua
foss/2023a.lua
gompi/2023a.lua
hwloc/2.9.1-GCCcore-12.3.0.lua
libarchive/3.6.2-GCCcore-12.3.0.lua
libevent/2.1.12-GCCcore-12.3.0.lua
libfabric/1.18.0-GCCcore-12.3.0.lua
libffi/3.4.4-GCCcore-12.3.0.lua
libpciaccess/0.17-GCCcore-12.3.0.lua
libxml2/2.11.4-GCCcore-12.3.0.lua
make/4.4.1-GCCcore-12.3.0.lua
numactl/2.0.16-GCCcore-12.3.0.lua
pkgconf/1.8.0.lua
pkgconf/1.9.5-GCCcore-12.3.0.lua
xorg-macros/1.20.0-GCCcore-12.3.0.lua
software under 2023.06/software/linux/x86_64/amd/zen4/software
BLIS/0.9.0-GCC-12.3.0
CMake/3.26.3-GCCcore-12.3.0
FFTW.MPI/3.3.10-gompi-2023a
FFTW/3.3.10-GCC-12.3.0
FlexiBLAS/3.3.1-GCC-12.3.0
GCC/12.3.0
GCCcore/12.3.0
OpenBLAS/0.3.23-GCC-12.3.0
OpenMPI/4.1.5-GCC-12.3.0
OpenSSL/1.1
PMIx/4.2.4-GCCcore-12.3.0
Perl/5.36.1-GCCcore-12.3.0
Python/3.11.3-GCCcore-12.3.0
SQLite/3.42.0-GCCcore-12.3.0
ScaLAPACK/2.2.0-gompi-2023a-fb
Tcl/8.6.13-GCCcore-12.3.0
UCC/1.2.0-GCCcore-12.3.0
UCX/1.14.1-GCCcore-12.3.0
UnZip/6.0-GCCcore-12.3.0
cURL/8.0.1-GCCcore-12.3.0
foss/2023a
gompi/2023a
hwloc/2.9.1-GCCcore-12.3.0
libarchive/3.6.2-GCCcore-12.3.0
libevent/2.1.12-GCCcore-12.3.0
libfabric/1.18.0-GCCcore-12.3.0
libffi/3.4.4-GCCcore-12.3.0
libpciaccess/0.17-GCCcore-12.3.0
libxml2/2.11.4-GCCcore-12.3.0
make/4.4.1-GCCcore-12.3.0
numactl/2.0.16-GCCcore-12.3.0
pkgconf/1.8.0
pkgconf/1.9.5-GCCcore-12.3.0
xorg-macros/1.20.0-GCCcore-12.3.0
other under 2023.06/software/linux/x86_64/amd/zen4
no other files in tarball
May 07 22:53:48 UTC 2024 test result
😢 FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
✅ job output file slurm-84.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case
May 08 05:31:48 UTC 2024 uploaded transfer of eessi-2023.06-software-linux-x86_64-amd-zen4-1715122155.tar.gz to S3 bucket succeeded
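
The tarball name in the report above appears to encode the EESSI version, the OS, the CPU target, and a Unix timestamp. The sketch below parses it under that assumption; the naming scheme is inferred from the names shown in this thread only, and `parse_artefact` and `bytes_to_mib` are hypothetical helpers.

```python
import re
from datetime import datetime, timezone

# Assumed naming scheme, inferred from the tarball names in this thread:
# eessi-<version>-software-<os>-<cpu target with dashes>-<unix timestamp>.tar.gz
NAME_RE = re.compile(
    r"^eessi-(?P<version>[\d.]+)-software-(?P<os>[a-z]+)-(?P<arch>.+)-(?P<ts>\d+)\.tar\.gz$"
)


def parse_artefact(name):
    """Split an artefact file name into its assumed components."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected artefact name: {name}")
    created = datetime.fromtimestamp(int(match["ts"]), tz=timezone.utc)
    return {
        "version": match["version"],
        "os": match["os"],
        # The bot writes the CPU target with dashes; convert back to slashes.
        "arch": match["arch"].replace("-", "/"),
        "created": created.isoformat(),
    }


def bytes_to_mib(size_in_bytes):
    """Whole MiB, rounded down (this matches the sizes quoted in the reports)."""
    return size_in_bytes // 2**20


info = parse_artefact("eessi-2023.06-software-linux-x86_64-amd-zen4-1715122155.tar.gz")
print(info["arch"], info["created"], bytes_to_mib(1414095513))
```

If the timestamp interpretation is right, the tarball was created a few minutes before the job was reported as finished.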

@bedroge bedroge added the bot:deploy Ask bot to deploy missing software installations to EESSI label May 8, 2024
@boegel boegel force-pushed the 2023.06-software.eessi.io_zen4-foss-2022b branch from 5167efa to 09cacc2 Compare May 12, 2024 09:59
@boegel
Contributor Author

boegel commented May 12, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot bot commented May 12, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 resulted in:

    • no jobs were submitted

@boegel
Contributor Author

boegel commented May 12, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot bot commented May 12, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 resulted in:

    • no jobs were submitted

eessi-bot bot commented May 12, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 resulted in:

eessi-bot bot commented May 12, 2024

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.05/pr_567/87

date job status comment
May 12 10:08:39 UTC 2024 submitted job id 87 awaits release by job manager
May 12 10:09:05 UTC 2024 released job awaits launch by Slurm scheduler
May 12 10:22:24 UTC 2024 running job 87 is running
May 12 10:30:39 UTC 2024 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-87.out
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
May 12 10:30:39 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-87.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@bedroge bedroge removed the bot:deploy Ask bot to deploy missing software installations to EESSI label May 12, 2024
@boegel
Contributor Author

boegel commented May 16, 2024

Problem fixed by #573, so time to try again...

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4

eessi-bot bot commented May 16, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 resulted in:

    • no jobs were submitted

eessi-bot bot commented May 16, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4 from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 resulted in:

eessi-bot bot commented May 16, 2024

New job on instance eessi-bot-mc-azure for architecture x86_64-amd-zen4 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.05/pr_567/99

date job status comment
May 16 10:51:07 UTC 2024 submitted job id 99 awaits release by job manager
May 16 10:51:18 UTC 2024 released job awaits launch by Slurm scheduler
May 16 10:55:27 UTC 2024 running job 99 is running
May 16 13:01:15 UTC 2024 finished
😢 FAILURE (click triangle for details)
Details
✅ job output file slurm-99.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen4-1715863989.tar.gz
size: 1327 MiB (1392195174 bytes)
entries: 37336
modules under 2023.06/software/linux/x86_64/amd/zen4/modules/all
BLIS/0.9.0-GCC-12.2.0.lua
CMake/3.24.3-GCCcore-12.2.0.lua
DB/18.1.40-GCCcore-12.2.0.lua
FFTW/3.3.10-GCC-12.2.0.lua
GCC/12.2.0.lua
GCCcore/12.2.0.lua
OpenSSL/1.1.lua
Perl/5.36.0-GCCcore-12.2.0.lua
Python/3.10.8-GCCcore-12.2.0-bare.lua
SQLite/3.39.4-GCCcore-12.2.0.lua
Tcl/8.6.12-GCCcore-12.2.0.lua
UCX/1.13.1-GCCcore-12.2.0.lua
UnZip/6.0-GCCcore-12.2.0.lua
cURL/7.86.0-GCCcore-12.2.0.lua
expat/2.4.9-GCCcore-12.2.0.lua
groff/1.22.4-GCCcore-12.2.0.lua
libarchive/3.6.1-GCCcore-12.2.0.lua
libevent/2.1.12-GCCcore-12.2.0.lua
libfabric/1.16.1-GCCcore-12.2.0.lua
libffi/3.4.4-GCCcore-12.2.0.lua
libxml2/2.10.3-GCCcore-12.2.0.lua
make/4.3-GCCcore-12.2.0.lua
numactl/2.0.16-GCCcore-12.2.0.lua
pkgconf/1.8.0.lua
pkgconf/1.9.3-GCCcore-12.2.0.lua
software under 2023.06/software/linux/x86_64/amd/zen4/software
BLIS/0.9.0-GCC-12.2.0
CMake/3.24.3-GCCcore-12.2.0
DB/18.1.40-GCCcore-12.2.0
FFTW/3.3.10-GCC-12.2.0
GCC/12.2.0
GCCcore/12.2.0
OpenSSL/1.1
Perl/5.36.0-GCCcore-12.2.0
Python/3.10.8-GCCcore-12.2.0-bare
SQLite/3.39.4-GCCcore-12.2.0
Tcl/8.6.12-GCCcore-12.2.0
UCX/1.13.1-GCCcore-12.2.0
UnZip/6.0-GCCcore-12.2.0
cURL/7.86.0-GCCcore-12.2.0
expat/2.4.9-GCCcore-12.2.0
groff/1.22.4-GCCcore-12.2.0
libarchive/3.6.1-GCCcore-12.2.0
libevent/2.1.12-GCCcore-12.2.0
libfabric/1.16.1-GCCcore-12.2.0
libffi/3.4.4-GCCcore-12.2.0
libxml2/2.10.3-GCCcore-12.2.0
make/4.3-GCCcore-12.2.0
numactl/2.0.16-GCCcore-12.2.0
pkgconf/1.8.0
pkgconf/1.9.3-GCCcore-12.2.0
other under 2023.06/software/linux/x86_64/amd/zen4
2023.06/init/eessi_environment_variables
May 16 13:01:15 UTC 2024 test result
😁 SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 10/10 test case(s) from 10 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
✅ job output file slurm-99.out
❌ found message matching ERROR:
✅ no message matching [\s*FAILED\s*].*Ran .* test case

@ocaisa
Member

ocaisa commented Aug 7, 2024

@boegel Should we close this? I thought we'd decided not to support 2022b on zen4

@boegel
Contributor Author

boegel commented Aug 8, 2024

I guess we could (unless we want to figure out how to fix the broken tests for older OpenBLAS versions, which is the main issue here), but shouldn't we then also come up with a way to generate fake module files that clearly mention that the 2022b toolchain is not supported for zen4?
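
One possible shape for such a placeholder module file is sketched below. It assumes Lmod is the module tool and uses Lmod's `LmodError` and `mode()` functions in the generated Lua; the message wording, the file layout, and the helper names `render_stub_module` and `write_stub_module` are hypothetical and not something proposed in this PR.

```python
from pathlib import Path

# Hypothetical template: the wording and layout are assumptions. The guard on
# mode() is meant to raise the error only when a user tries to load the module.
TEMPLATE = '''help([[{name}/{version} is not available for {cpu} in EESSI.]])
whatis("Description: placeholder for unsupported {name}/{version}")
if (mode() == "load") then
    LmodError("{name}/{version} is not supported on {cpu}.")
end
'''


def render_stub_module(name, version, cpu):
    """Return the Lua text of a placeholder module file."""
    return TEMPLATE.format(name=name, version=version, cpu=cpu)


def write_stub_module(modules_root, name, version, cpu):
    """Write <modules_root>/<name>/<version>.lua and return its path."""
    path = Path(modules_root) / name / f"{version}.lua"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_stub_module(name, version, cpu))
    return path


print(render_stub_module("foss", "2022b", "x86_64/amd/zen4"))
```

A stub like this would make `module load foss/2022b` fail with an explicit message instead of the module silently being absent.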

@ocaisa
Member

ocaisa commented Aug 8, 2024

Is OpenBLAS the only problem? We could sidestep the issue by forcing a Zen3 build (with a modloadmsg warning of potentially poor performance?), or by accepting the failing tests.

@ocaisa
Member

ocaisa commented Aug 8, 2024

Another option is to symlink in the entire set of Zen3 modules for 2022b and add a hook for GCCcore that warns of unoptimised performance for this toolchain.
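
A hook along those lines could look roughly like the sketch below. It relies on EasyBuild's `parse_hook` entry point and the `modloadmsg` easyconfig parameter; the warning text, the use of the `EESSI_SOFTWARE_SUBDIR` environment variable to detect the CPU target, and the `FakeEasyConfig` stand-in (used only so the example runs without EasyBuild) are assumptions.

```python
import os

# Assumed wording; the real message would be up to the maintainers.
UNOPTIMISED_MSG = (
    "This toolchain was not built for the host CPU; "
    "performance may be lower than expected."
)


def parse_hook(ec, *args, **kwargs):
    """Add a load-time warning to GCCcore 12.2.0, the compiler underneath 2022b."""
    # Assumption: the CPU target is exposed via EESSI_SOFTWARE_SUBDIR.
    cpu_target = os.environ.get("EESSI_SOFTWARE_SUBDIR", "")
    if cpu_target.endswith("zen4") and ec.name == "GCCcore" and ec.version == "12.2.0":
        ec["modloadmsg"] = UNOPTIMISED_MSG


class FakeEasyConfig(dict):
    """Minimal stand-in for an EasyBuild EasyConfig, for demonstration only."""

    def __init__(self, name, version):
        super().__init__(modloadmsg="")
        self.name = name
        self.version = version


os.environ["EESSI_SOFTWARE_SUBDIR"] = "x86_64/amd/zen4"
ec = FakeEasyConfig("GCCcore", "12.2.0")
parse_hook(ec)
print(ec["modloadmsg"])
```

Attaching the message to GCCcore means every module in the toolchain would show the warning once, since they all load GCCcore as a dependency.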

Labels
2023.06-software.eessi.io 2023.06 version of software.eessi.io