{2023.06}[foss/2021b] TensorFlow v2.7.1 #321

Closed

Conversation

Collaborator

@laraPPr laraPPr commented Aug 30, 2023

edit (by @boegel, 2023-09-07): missing installations for TensorFlow-2.7.1-foss-2021b.eb in EESSI pilot 2023.06:


20 out of 69 required modules missing:

* Zip/3.0-GCCcore-11.2.0 (Zip-3.0-GCCcore-11.2.0.eb)
* double-conversion/3.1.5-GCCcore-11.2.0 (double-conversion-3.1.5-GCCcore-11.2.0.eb)
* protobuf/3.17.3-GCCcore-11.2.0 (protobuf-3.17.3-GCCcore-11.2.0.eb)
* giflib/5.2.1-GCCcore-11.2.0 (giflib-5.2.1-GCCcore-11.2.0.eb)
* pkgconfig/1.5.5-GCCcore-11.2.0-python (pkgconfig-1.5.5-GCCcore-11.2.0-python.eb)
* Bazel/3.7.2-GCCcore-11.2.0 (Bazel-3.7.2-GCCcore-11.2.0.eb)
* Ninja/1.10.2-GCCcore-11.2.0 (Ninja-1.10.2-GCCcore-11.2.0.eb)
* flatbuffers/2.0.0-GCCcore-11.2.0 (flatbuffers-2.0.0-GCCcore-11.2.0.eb)
* ICU/69.1-GCCcore-11.2.0 (ICU-69.1-GCCcore-11.2.0.eb)
* JsonCpp/1.9.4-GCCcore-11.2.0 (JsonCpp-1.9.4-GCCcore-11.2.0.eb)
* LMDB/0.9.29-GCCcore-11.2.0 (LMDB-0.9.29-GCCcore-11.2.0.eb)
* NASM/2.15.05-GCCcore-11.2.0 (NASM-2.15.05-GCCcore-11.2.0.eb)
* libjpeg-turbo/2.0.6-GCCcore-11.2.0 (libjpeg-turbo-2.0.6-GCCcore-11.2.0.eb)
* nsync/1.24.0-GCCcore-11.2.0 (nsync-1.24.0-GCCcore-11.2.0.eb)
* protobuf-python/3.17.3-GCCcore-11.2.0 (protobuf-python-3.17.3-GCCcore-11.2.0.eb)
* h5py/3.6.0-foss-2021b (h5py-3.6.0-foss-2021b.eb)
* flatbuffers-python/2.0-GCCcore-11.2.0 (flatbuffers-python-2.0-GCCcore-11.2.0.eb)
* libpng/1.6.37-GCCcore-11.2.0 (libpng-1.6.37-GCCcore-11.2.0.eb)
* snappy/1.1.9-GCCcore-11.2.0 (snappy-1.1.9-GCCcore-11.2.0.eb)
* TensorFlow/2.7.1-foss-2021b (TensorFlow-2.7.1-foss-2021b.eb)
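
The report above has a regular shape: a summary line followed by one `* module (easyconfig)` bullet per missing module. A minimal Python sketch of reading it mechanically follows, assuming the text has been saved from a dry run; the function name `parse_missing_report` is hypothetical and not part of EasyBuild or the bot.

```python
import re

# Summary line, e.g. "20 out of 69 required modules missing:"
SUMMARY = re.compile(r"^(\d+) out of (\d+) required modules missing:")
# Bullet line, e.g. "* Zip/3.0-GCCcore-11.2.0 (Zip-3.0-GCCcore-11.2.0.eb)"
ENTRY = re.compile(r"^\* (\S+) \((\S+\.eb)\)$")


def parse_missing_report(text):
    """Return (missing_count, total_count, [(module, easyconfig), ...])."""
    missing = total = None
    entries = []
    for line in text.splitlines():
        line = line.strip()
        m = SUMMARY.match(line)
        if m:
            missing, total = int(m.group(1)), int(m.group(2))
            continue
        m = ENTRY.match(line)
        if m:
            entries.append((m.group(1), m.group(2)))
    return missing, total, entries


# Two bullets copied from the report above, for illustration only.
report = """\
20 out of 69 required modules missing:

* Zip/3.0-GCCcore-11.2.0 (Zip-3.0-GCCcore-11.2.0.eb)
* TensorFlow/2.7.1-foss-2021b (TensorFlow-2.7.1-foss-2021b.eb)
"""
missing, total, entries = parse_missing_report(report)
print(missing, total, [module for module, _ in entries])
```

Comparing `missing` against `len(entries)` on the full report is a cheap way to notice a truncated paste.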

eessi-bot bot commented Aug 30, 2023

Instance eessi-bot-citc-aws is configured to build:

  • arch x86_64/generic for repo eessi-2021.12
  • arch x86_64/generic for repo eessi-2023.06-compat
  • arch x86_64/generic for repo eessi-2023.06-software
  • arch x86_64/intel/haswell for repo eessi-2021.12
  • arch x86_64/intel/haswell for repo eessi-2023.06-compat
  • arch x86_64/intel/haswell for repo eessi-2023.06-software
  • arch x86_64/intel/skylake_avx512 for repo eessi-2021.12
  • arch x86_64/intel/skylake_avx512 for repo eessi-2023.06-compat
  • arch x86_64/intel/skylake_avx512 for repo eessi-2023.06-software
  • arch x86_64/amd/zen2 for repo eessi-2021.12
  • arch x86_64/amd/zen2 for repo eessi-2023.06-compat
  • arch x86_64/amd/zen2 for repo eessi-2023.06-software
  • arch x86_64/amd/zen3 for repo eessi-2021.12
  • arch x86_64/amd/zen3 for repo eessi-2023.06-compat
  • arch x86_64/amd/zen3 for repo eessi-2023.06-software
  • arch aarch64/generic for repo eessi-2021.12
  • arch aarch64/generic for repo eessi-2023.06-compat
  • arch aarch64/generic for repo eessi-2023.06-software
  • arch aarch64/neoverse_n1 for repo eessi-2021.12
  • arch aarch64/neoverse_n1 for repo eessi-2023.06-compat
  • arch aarch64/neoverse_n1 for repo eessi-2023.06-software
  • arch aarch64/neoverse_v1 for repo eessi-2021.12
  • arch aarch64/neoverse_v1 for repo eessi-2023.06-compat
  • arch aarch64/neoverse_v1 for repo eessi-2023.06-software

Collaborator Author

laraPPr commented Aug 30, 2023

bot: build repo:eessi-2023.06-software arch:x86_64/generic

eessi-bot bot commented Aug 30, 2023

Updates by the bot instance eessi-bot-citc-aws (click for details)
  • received bot command build repo:eessi-2023.06-software arch:x86_64/generic from laraPPr

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi-2023.06-software architecture:x86_64/generic resulted in:

eessi-bot bot commented Aug 30, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-generic for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.08/pr_321/7005

| date | job status | comment |
| --- | --- | --- |
| Aug 30 09:54:40 UTC 2023 | submitted | job id 7005 awaits release by job manager |
| Aug 30 09:55:02 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Aug 30 09:59:05 UTC 2023 | running | job 7005 is running |
| Aug 30 10:03:10 UTC 2023 | finished | 😁 SUCCESS (click triangle for details) |
Details
✅ job output file slurm-7005.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-generic-1693389775.tar.gz
size: 0 MiB (91395 bytes)
entries: 3
modules under 2023.06/software/linux/x86_64/generic/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/generic/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/generic
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp
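
The check list in the comment above amounts to a handful of substring searches over the job output file. A minimal Python sketch of that idea follows; the pattern strings are copied from the comment, while the pass/fail rule (the first three must be absent, the last two present) is an assumption inferred from the ✅/❌ markers, and `check_job_output` is a hypothetical name rather than the bot's actual code.

```python
# Patterns copied from the bot's check list; the semantics are assumed.
MUST_BE_ABSENT = ["ERROR:", "FAILED:", "required modules missing:"]
MUST_BE_PRESENT = ["No missing installations", ".tar.gz created!"]


def check_job_output(text):
    """Return (success, {pattern: passed}) for the text of one job output."""
    results = {}
    for pattern in MUST_BE_ABSENT:
        results[pattern] = pattern not in text
    for pattern in MUST_BE_PRESENT:
        results[pattern] = pattern in text
    # The job counts as successful only when every single check passes.
    return all(results.values()), results


# Illustrative inputs, not real job output.
ok, _ = check_job_output("No missing installations\nfoo.tar.gz created!\n")
bad, details = check_job_output("ERROR: download failed\nfoo.tar.gz created!\n")
print(ok, bad, details["ERROR:"])
```

Under this rule a job can produce a tarball and still be reported as a failure, which matches the later jobs in this thread where `.tar.gz created!` is found alongside `ERROR:`.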

Collaborator Author

laraPPr commented Aug 30, 2023

bot: build repo:eessi-2023.06-software arch:x86_64/intel/haswell
bot: build repo:eessi-2023.06-software arch:x86_64/intel/skylake_avx512
bot: build repo:eessi-2023.06-software arch:x86_64/amd/zen2
bot: build repo:eessi-2023.06-software arch:x86_64/amd/zen3
bot: build repo:eessi-2023.06-software arch:aarch64/generic
bot: build repo:eessi-2023.06-software arch:aarch64/neoverse_n1
bot: build repo:eessi-2023.06-software arch:aarch64/neoverse_v1

eessi-bot bot commented Aug 30, 2023

Updates by the bot instance eessi-bot-citc-aws (click for details)
  • received bot command build repo:eessi-2023.06-software arch:x86_64/intel/haswell from laraPPr

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/intel/haswell
  • received bot command build repo:eessi-2023.06-software arch:x86_64/intel/skylake_avx512 from laraPPr

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/intel/skylake_avx512
  • received bot command build repo:eessi-2023.06-software arch:x86_64/amd/zen2 from laraPPr

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/amd/zen2
  • received bot command build repo:eessi-2023.06-software arch:x86_64/amd/zen3 from laraPPr

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/amd/zen3
  • received bot command build repo:eessi-2023.06-software arch:aarch64/generic from laraPPr

    • expanded format: build repository:eessi-2023.06-software architecture:aarch64/generic
  • received bot command build repo:eessi-2023.06-software arch:aarch64/neoverse_n1 from laraPPr

    • expanded format: build repository:eessi-2023.06-software architecture:aarch64/neoverse_n1
  • received bot command build repo:eessi-2023.06-software arch:aarch64/neoverse_v1 from laraPPr

    • expanded format: build repository:eessi-2023.06-software architecture:aarch64/neoverse_v1
  • handling command build repository:eessi-2023.06-software architecture:x86_64/intel/haswell resulted in:

  • handling command build repository:eessi-2023.06-software architecture:x86_64/intel/skylake_avx512 resulted in:

  • handling command build repository:eessi-2023.06-software architecture:x86_64/amd/zen2 resulted in:

  • handling command build repository:eessi-2023.06-software architecture:x86_64/amd/zen3 resulted in:

  • handling command build repository:eessi-2023.06-software architecture:aarch64/generic resulted in:

  • handling command build repository:eessi-2023.06-software architecture:aarch64/neoverse_n1 resulted in:

  • handling command build repository:eessi-2023.06-software architecture:aarch64/neoverse_v1 resulted in:
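
The "expanded format" lines above show the short keys `repo` and `arch` being rewritten to `repository` and `architecture`. A minimal Python sketch of that expansion follows; `expand_bot_command` and the `SHORT_TO_LONG` table are hypothetical names, and only the two abbreviations visible in this thread are covered.

```python
# Only the two abbreviations that appear in this thread are listed.
SHORT_TO_LONG = {"repo": "repository", "arch": "architecture"}


def expand_bot_command(line):
    """Expand 'bot: build repo:X arch:Y' into the long form.

    Returns None when the line does not start with the 'bot:' prefix.
    """
    prefix = "bot:"
    if not line.startswith(prefix):
        return None
    expanded = []
    for word in line[len(prefix):].split():
        key, sep, value = word.partition(":")
        if sep:
            # key:value argument, replace a known short key with its long form
            expanded.append(f"{SHORT_TO_LONG.get(key, key)}:{value}")
        else:
            # bare word such as the 'build' verb
            expanded.append(word)
    return " ".join(expanded)


command = "bot: build repo:eessi-2023.06-software arch:x86_64/intel/haswell"
print(expand_bot_command(command))
```

Since each command sits on its own line of the comment, a multi-command comment like the one above can be handled by calling the function once per line.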

eessi-bot bot commented Aug 30, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-intel-haswell for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.08/pr_321/7006

| date | job status | comment |
| --- | --- | --- |
| Aug 30 10:06:44 UTC 2023 | submitted | job id 7006 awaits release by job manager |
| Aug 30 10:07:23 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Aug 30 10:08:29 UTC 2023 | running | job 7006 is running |
| Aug 30 10:12:56 UTC 2023 | finished | 😁 SUCCESS (click triangle for details) |
Details
✅ job output file slurm-7006.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-haswell-1693390327.tar.gz
size: 0 MiB (91460 bytes)
entries: 3
modules under 2023.06/software/linux/x86_64/intel/haswell/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/intel/haswell/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/intel/haswell
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp
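
The artefact file names in these comments follow a visible pattern: EESSI version, OS, architecture (with `/` replaced by `-`), and a trailing number. A minimal Python sketch that splits a name into those parts follows. Reading the trailing number as a Unix timestamp is an assumption (it lands just before the job's reported finish time), and `parse_artefact_name` is a hypothetical helper.

```python
import re
from datetime import datetime, timezone

# Pattern inferred from the file names in this thread, not from a spec.
ARTEFACT = re.compile(
    r"^eessi-(?P<version>[\d.]+)-software-(?P<os>[a-z]+)-"
    r"(?P<arch>.+)-(?P<stamp>\d+)\.tar\.gz$"
)


def parse_artefact_name(name):
    """Split an artefact file name into version, os, arch, stamp, created."""
    m = ARTEFACT.match(name)
    if m is None:
        raise ValueError(f"unexpected artefact name: {name}")
    info = m.groupdict()
    # Assumption: the trailing number is seconds since the Unix epoch (UTC).
    info["created"] = datetime.fromtimestamp(int(info["stamp"]), tz=timezone.utc)
    return info


info = parse_artefact_name(
    "eessi-2023.06-software-linux-x86_64-intel-haswell-1693390327.tar.gz"
)
print(info["arch"], info["created"].isoformat())
```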

eessi-bot bot commented Aug 30, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-intel-skylake_avx512 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.08/pr_321/7007

| date | job status | comment |
| --- | --- | --- |
| Aug 30 10:06:52 UTC 2023 | submitted | job id 7007 awaits release by job manager |
| Aug 30 10:07:21 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Aug 30 10:10:39 UTC 2023 | running | job 7007 is running |
| Aug 30 10:15:12 UTC 2023 | finished | 😁 SUCCESS (click triangle for details) |
Details
✅ job output file slurm-7007.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-skylake_avx512-1693390474.tar.gz
size: 0 MiB (91584 bytes)
entries: 3
modules under 2023.06/software/linux/x86_64/intel/skylake_avx512/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/intel/skylake_avx512/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/intel/skylake_avx512
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

eessi-bot bot commented Aug 30, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-amd-zen2 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.08/pr_321/7008

| date | job status | comment |
| --- | --- | --- |
| Aug 30 10:06:58 UTC 2023 | submitted | job id 7008 awaits release by job manager |
| Aug 30 10:07:18 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Aug 30 10:10:37 UTC 2023 | running | job 7008 is running |
| Aug 30 10:15:10 UTC 2023 | finished | 😁 SUCCESS (click triangle for details) |
Details
✅ job output file slurm-7008.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen2-1693390484.tar.gz
size: 0 MiB (91435 bytes)
entries: 3
modules under 2023.06/software/linux/x86_64/amd/zen2/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/amd/zen2/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/amd/zen2
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

eessi-bot bot commented Aug 30, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-amd-zen3 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.08/pr_321/7009

| date | job status | comment |
| --- | --- | --- |
| Aug 30 10:07:05 UTC 2023 | submitted | job id 7009 awaits release by job manager |
| Aug 30 10:07:16 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Aug 30 10:10:35 UTC 2023 | running | job 7009 is running |
| Aug 30 10:15:08 UTC 2023 | finished | 😁 SUCCESS (click triangle for details) |
Details
✅ job output file slurm-7009.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-amd-zen3-1693390479.tar.gz
size: 0 MiB (91437 bytes)
entries: 3
modules under 2023.06/software/linux/x86_64/amd/zen3/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/amd/zen3/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/amd/zen3
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

eessi-bot bot commented Aug 30, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-generic for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.08/pr_321/7010

| date | job status | comment |
| --- | --- | --- |
| Aug 30 10:07:11 UTC 2023 | submitted | job id 7010 awaits release by job manager |
| Aug 30 10:07:14 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Aug 30 10:10:33 UTC 2023 | running | job 7010 is running |
| Aug 30 10:21:24 UTC 2023 | finished | 😁 SUCCESS (click triangle for details) |
Details
✅ job output file slurm-7010.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-generic-1693390860.tar.gz
size: 0 MiB (91321 bytes)
entries: 3
modules under 2023.06/software/linux/aarch64/generic/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/generic/software
no software packages in tarball
other under 2023.06/software/linux/aarch64/generic
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

eessi-bot bot commented Aug 30, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-neoverse_n1 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.08/pr_321/7011

| date | job status | comment |
| --- | --- | --- |
| Aug 30 10:07:22 UTC 2023 | submitted | job id 7011 awaits release by job manager |
| Aug 30 10:08:28 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Aug 30 10:11:49 UTC 2023 | running | job 7011 is running |
| Aug 30 10:17:19 UTC 2023 | finished | 😁 SUCCESS (click triangle for details) |
Details
✅ job output file slurm-7011.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_n1-1693390569.tar.gz
size: 0 MiB (91371 bytes)
entries: 3
modules under 2023.06/software/linux/aarch64/neoverse_n1/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/neoverse_n1/software
no software packages in tarball
other under 2023.06/software/linux/aarch64/neoverse_n1
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

eessi-bot bot commented Aug 30, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-neoverse_v1 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.08/pr_321/7012

| date | job status | comment |
| --- | --- | --- |
| Aug 30 10:07:28 UTC 2023 | submitted | job id 7012 awaits release by job manager |
| Aug 30 10:08:25 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Aug 30 10:11:47 UTC 2023 | running | job 7012 is running |
| Aug 30 10:16:16 UTC 2023 | finished | 😁 SUCCESS (click triangle for details) |
Details
✅ job output file slurm-7012.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_v1-1693390526.tar.gz
size: 0 MiB (91377 bytes)
entries: 3
modules under 2023.06/software/linux/aarch64/neoverse_v1/modules/all
no module files in tarball
software under 2023.06/software/linux/aarch64/neoverse_v1/software
no software packages in tarball
other under 2023.06/software/linux/aarch64/neoverse_v1
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

laraPPr added the pilot-2023.06 and ready-to-deploy labels and removed the ready-to-deploy label on Aug 30, 2023

Collaborator Author

laraPPr commented Aug 30, 2023

Forgot to add the new file to the list of easystack files.

Collaborator Author

laraPPr commented Aug 30, 2023

bot: build repo:eessi-2023.06-software arch:x86_64/generic

eessi-bot bot commented Aug 30, 2023

Updates by the bot instance eessi-bot-citc-aws (click for details)
  • received bot command build repo:eessi-2023.06-software arch:x86_64/generic from laraPPr

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi-2023.06-software architecture:x86_64/generic resulted in:

eessi-bot bot commented Aug 30, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-generic for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.08/pr_321/7013

| date | job status | comment |
| --- | --- | --- |
| Aug 30 11:00:21 UTC 2023 | submitted | job id 7013 awaits release by job manager |
| Aug 30 11:00:37 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Aug 30 11:04:40 UTC 2023 | running | job 7013 is running |
| Aug 30 11:08:45 UTC 2023 | finished | 😁 SUCCESS (click triangle for details) |
Details
✅ job output file slurm-7013.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-generic-1693393716.tar.gz
size: 0 MiB (91401 bytes)
entries: 3
modules under 2023.06/software/linux/x86_64/generic/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/generic/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/generic
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

Contributor

boegel commented Aug 30, 2023

@laraPPr You will need to make the install script aware of EasyBuild 4.8.0 too, see https://github.com/EESSI/software-layer/blob/2023.06/EESSI-pilot-install-software.sh#L175

Collaborator Author

laraPPr commented Aug 30, 2023

bot: build repo:eessi-2023.06-software arch:x86_64/generic

eessi-bot bot commented Aug 30, 2023

Updates by the bot instance eessi-bot-citc-aws (click for details)
  • received bot command build repo:eessi-2023.06-software arch:x86_64/generic from laraPPr

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi-2023.06-software architecture:x86_64/generic resulted in:

eessi-bot bot commented Aug 30, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-generic for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.08/pr_321/7014

| date | job status | comment |
| --- | --- | --- |
| Aug 30 12:03:02 UTC 2023 | submitted | job id 7014 awaits release by job manager |
| Aug 30 12:03:59 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Aug 30 12:08:01 UTC 2023 | running | job 7014 is running |
| Aug 30 12:11:05 UTC 2023 | finished | 😢 FAILURE (click triangle for details) |
Details
✅ job output file slurm-7014.out
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
❌ no message matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
No artefacts were created or found.

Collaborator Author

laraPPr commented Aug 30, 2023

bot: build repo:eessi-2023.06-software arch:x86_64/generic

eessi-bot bot commented Aug 30, 2023

Updates by the bot instance eessi-bot-citc-aws (click for details)
  • received bot command build repo:eessi-2023.06-software arch:x86_64/generic from laraPPr

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi-2023.06-software architecture:x86_64/generic resulted in:

eessi-bot bot commented Sep 7, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-neoverse_n1 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_321/7199

| date | job status | comment |
| --- | --- | --- |
| Sep 07 11:15:28 UTC 2023 | submitted | job id 7199 awaits release by job manager |
| Sep 07 11:16:11 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Sep 07 11:19:35 UTC 2023 | running | job 7199 is running |
| Sep 07 16:58:24 UTC 2023 | finished | 😢 FAILURE (click triangle for details) |
Details
✅ job output file slurm-7199.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_n1-1694091361.tar.gz
size: 66 MiB (69308687 bytes)
entries: 1321
modules under 2023.06/software/linux/aarch64/neoverse_n1/modules/all
Bazel/3.7.2-GCCcore-11.2.0.lua
double-conversion/3.1.5-GCCcore-11.2.0.lua
flatbuffers/2.0.0-GCCcore-11.2.0.lua
flatbuffers-python/2.0-GCCcore-11.2.0.lua
giflib/5.2.1-GCCcore-11.2.0.lua
h5py/3.6.0-foss-2021b.lua
ICU/69.1-GCCcore-11.2.0.lua
JsonCpp/1.9.4-GCCcore-11.2.0.lua
libjpeg-turbo/2.0.6-GCCcore-11.2.0.lua
libpng/1.6.37-GCCcore-11.2.0.lua
LMDB/0.9.29-GCCcore-11.2.0.lua
NASM/2.15.05-GCCcore-11.2.0.lua
Ninja/1.10.2-GCCcore-11.2.0.lua
nsync/1.24.0-GCCcore-11.2.0.lua
pkgconfig/1.5.5-GCCcore-11.2.0-python.lua
protobuf/3.17.3-GCCcore-11.2.0.lua
protobuf-python/3.17.3-GCCcore-11.2.0.lua
snappy/1.1.9-GCCcore-11.2.0.lua
Zip/3.0-GCCcore-11.2.0.lua
software under 2023.06/software/linux/aarch64/neoverse_n1/software
Bazel/3.7.2-GCCcore-11.2.0
double-conversion/3.1.5-GCCcore-11.2.0
flatbuffers/2.0.0-GCCcore-11.2.0
flatbuffers-python/2.0-GCCcore-11.2.0
giflib/5.2.1-GCCcore-11.2.0
h5py/3.6.0-foss-2021b
ICU/69.1-GCCcore-11.2.0
JsonCpp/1.9.4-GCCcore-11.2.0
libjpeg-turbo/2.0.6-GCCcore-11.2.0
libpng/1.6.37-GCCcore-11.2.0
LMDB/0.9.29-GCCcore-11.2.0
NASM/2.15.05-GCCcore-11.2.0
Ninja/1.10.2-GCCcore-11.2.0
nsync/1.24.0-GCCcore-11.2.0
pkgconfig/1.5.5-GCCcore-11.2.0-python
protobuf/3.17.3-GCCcore-11.2.0
protobuf-python/3.17.3-GCCcore-11.2.0
snappy/1.1.9-GCCcore-11.2.0
Zip/3.0-GCCcore-11.2.0
other under 2023.06/software/linux/aarch64/neoverse_n1
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

eessi-bot bot commented Sep 7, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-neoverse_v1 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_321/7200

| date | job status | comment |
| --- | --- | --- |
| Sep 07 11:15:34 UTC 2023 | submitted | job id 7200 awaits release by job manager |
| Sep 07 11:16:08 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Sep 07 11:19:32 UTC 2023 | running | job 7200 is running |
| Sep 07 16:58:27 UTC 2023 | finished | 😢 FAILURE (click triangle for details) |
Details
✅ job output file slurm-7200.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_v1-1694090394.tar.gz
size: 66 MiB (69363405 bytes)
entries: 1321
modules under 2023.06/software/linux/aarch64/neoverse_v1/modules/all
Bazel/3.7.2-GCCcore-11.2.0.lua
double-conversion/3.1.5-GCCcore-11.2.0.lua
flatbuffers/2.0.0-GCCcore-11.2.0.lua
flatbuffers-python/2.0-GCCcore-11.2.0.lua
giflib/5.2.1-GCCcore-11.2.0.lua
h5py/3.6.0-foss-2021b.lua
ICU/69.1-GCCcore-11.2.0.lua
JsonCpp/1.9.4-GCCcore-11.2.0.lua
libjpeg-turbo/2.0.6-GCCcore-11.2.0.lua
libpng/1.6.37-GCCcore-11.2.0.lua
LMDB/0.9.29-GCCcore-11.2.0.lua
NASM/2.15.05-GCCcore-11.2.0.lua
Ninja/1.10.2-GCCcore-11.2.0.lua
nsync/1.24.0-GCCcore-11.2.0.lua
pkgconfig/1.5.5-GCCcore-11.2.0-python.lua
protobuf/3.17.3-GCCcore-11.2.0.lua
protobuf-python/3.17.3-GCCcore-11.2.0.lua
snappy/1.1.9-GCCcore-11.2.0.lua
Zip/3.0-GCCcore-11.2.0.lua
software under 2023.06/software/linux/aarch64/neoverse_v1/software
Bazel/3.7.2-GCCcore-11.2.0
double-conversion/3.1.5-GCCcore-11.2.0
flatbuffers/2.0.0-GCCcore-11.2.0
flatbuffers-python/2.0-GCCcore-11.2.0
giflib/5.2.1-GCCcore-11.2.0
h5py/3.6.0-foss-2021b
ICU/69.1-GCCcore-11.2.0
JsonCpp/1.9.4-GCCcore-11.2.0
libjpeg-turbo/2.0.6-GCCcore-11.2.0
libpng/1.6.37-GCCcore-11.2.0
LMDB/0.9.29-GCCcore-11.2.0
NASM/2.15.05-GCCcore-11.2.0
Ninja/1.10.2-GCCcore-11.2.0
nsync/1.24.0-GCCcore-11.2.0
pkgconfig/1.5.5-GCCcore-11.2.0-python
protobuf/3.17.3-GCCcore-11.2.0
protobuf-python/3.17.3-GCCcore-11.2.0
snappy/1.1.9-GCCcore-11.2.0
Zip/3.0-GCCcore-11.2.0
other under 2023.06/software/linux/aarch64/neoverse_v1
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

Collaborator Author

laraPPr commented Sep 7, 2023

== 2023-09-07 11:40:46,776 build_log.py:171 ERROR EasyBuild crashed with an error (at easybuild/tools/build_log.py:111 in caller_info): Couldn't find file libpng-1.6.37.tar.gz anywhere, and downloading it didn't work either...

Contributor

boegel commented Sep 7, 2023

@laraPPr That's probably a fluke download failure, I would just re-trigger the build again?

It's another example of why we need a better way to deal with (pre-)downloading of sources before submitting build jobs.

Collaborator Author

laraPPr commented Sep 7, 2023

bot: build repo:eessi-2023.06-software arch:x86_64/intel/haswell

eessi-bot bot commented Sep 7, 2023

Updates by the bot instance eessi-bot-citc-aws (click for details)
  • received bot command build repo:eessi-2023.06-software arch:x86_64/intel/haswell from laraPPr

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/intel/haswell
  • handling command build repository:eessi-2023.06-software architecture:x86_64/intel/haswell resulted in:

eessi-bot bot commented Sep 7, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-intel-haswell for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_321/7203

| date | job status | comment |
| --- | --- | --- |
| Sep 07 13:36:39 UTC 2023 | submitted | job id 7203 awaits release by job manager |
| Sep 07 16:58:21 UTC 2023 | released | job awaits launch by Slurm scheduler |
| Sep 07 17:39:43 UTC 2023 | running | job 7203 is running |
| Sep 07 17:44:53 UTC 2023 | finished | 😢 FAILURE (click triangle for details) |
Details
✅ job output file slurm-7203.out
❌ found message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-haswell-1694108654.tar.gz
size: 0 MiB (96243 bytes)
entries: 3
modules under 2023.06/software/linux/x86_64/intel/haswell/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/intel/haswell/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/intel/haswell
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

Collaborator Author

laraPPr commented Sep 7, 2023

> @laraPPr That's probably a fluke download failure, I would just re-trigger the build again?
>
> It's another example of why we need a better way to deal with (pre-)downloading of sources before submitting build jobs.

Should I make an issue for that with the different related errors that I have gotten the last few days?

Contributor

boegel commented Sep 7, 2023

> > @laraPPr That's probably a fluke download failure, I would just re-trigger the build again?
> > It's another example of why we need a better way to deal with (pre-)downloading of sources before submitting build jobs.
>
> Should I make an issue for that with the different related errors that I have gotten the last few days?

Yes, that should probably become a feature in the bot, we can look into adding support for a pre-build fetch phase for example, so we can instruct the bot to first try and fetch all sources before letting it submit build jobs.

Ideally the bot would then automatically first submit a single fetch job when it gets the instruction to build, but that may be a bit harder (since the fetch part could take a while, and we don't want to block the bot until the fetch is done).
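
The "fetch first, then build" idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fetch_then_submit`, `fetch`, and `submit` are hypothetical names, `fetch` stands in for whatever download step would be used, and nothing here reflects how the bot is actually implemented.

```python
import time


def fetch_then_submit(fetch, submit, architectures, attempts=3, delay=0.0):
    """Retry fetch() until it succeeds, then submit one job per architecture.

    fetch:  callable returning True once all sources are available.
    submit: callable taking an architecture string and returning a job id.
    Returns the list of job ids, or an empty list if fetching keeps failing.
    """
    for attempt in range(1, attempts + 1):
        if fetch():
            return [submit(arch) for arch in architectures]
        if attempt < attempts:
            time.sleep(delay)  # wait before retrying a flaky download
    return []


# Simulated run: the first download attempt fails, the second succeeds.
outcomes = iter([False, True])
jobs = fetch_then_submit(
    fetch=lambda: next(outcomes),
    submit=lambda arch: f"job-for-{arch}",
    architectures=["x86_64/generic", "aarch64/generic"],
)
print(jobs)
```

The sketch is synchronous, so it does not address the concern about blocking the bot while the fetch runs; it only shows that a single shared fetch step keeps one fluke download failure from failing every per-architecture job.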

Contributor

boegel commented Sep 7, 2023

Had to restart the bot job manager, it had crashed. Some builds completed, but not all; not fully out of the woods yet, it seems...

Contributor

boegel commented Sep 7, 2023

haswell (job 7203)

ERROR: Failed to get data for PR #18320 from easybuilders/easybuild-easyconfigs (HTTP Error 403: rate limit exceeded)

neoverse_v1 (job 7200) and neoverse_n1 (job 7199)

== FAILED: Installation ended unsuccessfully (build directory: /tmp/bot/easybuild/build/TensorFlow/2.7.1/foss-2021b): build failed (first 300 chars): At least 1 cpu tests failed:
//tensorflow/core/kernels:quantized_conv_ops_test (took 40 mins 0 secs)

Looks like the actual problem here is that the test failed to build, rather than that it failed when run (same for both 7199 and 7200):

        WARNING: Download from https://storage.googleapis.com/mirror.tensorflow.org/github.com/llvm/llvm-project/archive/43d6991c2a4cc2ac374e68c029634f2b59ffdfdf.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
        WARNING: Download from http://mirror.tensorflow.org/github.com/tensorflow/runtime/archive/64c92c8013b557087351c91b5423b6046d10f206.tar.gz failed: class com.google.devtools.build.lib.bazel.repository.downloader.UnrecoverableHttpException GET returned 404 Not Found
        SUBCOMMAND: # //tensorflow/core/platform:error [action 'Linking tensorflow/core/platform/liberror.so', configuration: b656def731278652410a8870886794eb63a8b63cba4638ea88c6fb140daad619, execution platform: @local_execution_config_platform//:platform]
        ERROR: /tmp/bot/easybuild/build/TensorFlow/2.7.1/foss-2021b/TensorFlow/tensorflow-2.7.1/tensorflow/core/kernels/BUILD:3535:18: C++ compilation of rule '//tensorflow/core/kernels:cwise_op' failed (Exit 1): gcc failed: error executing command
        FAILED: Build did NOT complete successfully
        //tensorflow/core/kernels:quantized_conv_ops_test               FAILED TO BUILD
        FAILED: Build did NOT complete successfully

could be related to https://www.githubstatus.com/incidents/2gy2gddtv23d, so worth a re-try.

Not sure why we only saw this for aarch64/neoverse_*, could be dumb luck.


aarch64/generic (job 7198)

== FAILED: Installation ended unsuccessfully (build directory: /tmp/bot/easybuild/build/TensorFlow/2.7.1/foss-2021b): build failed (first 300 chars): At least 3 cpu tests failed:
//tensorflow/core/kernels:quantized_bias_add_op_test, //tensorflow/core/kernels:requantize_op_test, //tensorflow/core/kernels:sparse_matmul_op_test (took 1 hour 26 mins 22 secs)
==================== Test output for //tensorflow/core/kernels:requantize_op_test:
[==========] Running 3 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 3 tests from RequantizeTest
[ RUN      ] RequantizeTest.HandCraftedRequantize
tensorflow/core/framework/tensor_testutil.cc:127: Failure
Value of: IsEqual(Tx[i], Ty[i], t)
  Actual: false (128 not equal to 127)
Expected: true
i = 1
[  FAILED  ] RequantizeTest.HandCraftedRequantize (11 ms)
FAILED: //tensorflow/core/kernels:quantized_bias_add_op_test (Summary)
INFO: From Testing //tensorflow/core/kernels:quantized_bias_add_op_test:
==================== Test output for //tensorflow/core/kernels:quantized_bias_add_op_test:
[==========] Running 2 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 2 tests from QuantizedBiasAddTest
[ RUN      ] QuantizedBiasAddTest.Small
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (11 not close to 11.2518310546875)
Expected: true
i = 0 Tx[i] = 11 Ty[i] = 11.2518310546875
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (22 not close to 22.5018310546875)
Expected: true
i = 1 Tx[i] = 22 Ty[i] = 22.5018310546875
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (33 not close to 33.7518310546875)
Expected: true
i = 2 Tx[i] = 33 Ty[i] = 33.7518310546875
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (41 not close to 41.2518310546875)
Expected: true
i = 3 Tx[i] = 41 Ty[i] = 41.2518310546875
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (52 not close to 52.5018310546875)
Expected: true
i = 4 Tx[i] = 52 Ty[i] = 52.5018310546875
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (63 not close to 63.7518310546875)
Expected: true
i = 5 Tx[i] = 63 Ty[i] = 63.7518310546875
tensorflow/core/framework/tensor_testutil.cc:184: Failure
Expected equality of these values:
  num_failures
    Which is: 6
  0
Mismatches detected (atol = 0.20000000000000001 rtol = 0).
[  FAILED  ] QuantizedBiasAddTest.Small (12 ms)
[ RUN      ] QuantizedBiasAddTest.RealData
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (2006.72998046875 not close to 2029.0504150390625)
Expected: true
i = 8 Tx[i] = 2006.72998046875 Ty[i] = 2029.0504150390625
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (22.674400329589844 not close to 0.06604766845703125)
Expected: true
i = 17 Tx[i] = 22.674400329589844 Ty[i] = 0.06604766845703125
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (247.48899841308594 not close to 270.5972900390625)
Expected: true
i = 19 Tx[i] = 247.48899841308594 Ty[i] = 270.5972900390625
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (305.45098876953125 not close to 338.2301025390625)
Expected: true
i = 25 Tx[i] = 305.45098876953125 Ty[i] = 338.2301025390625
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (-486.60800170898438 not close to -507.1800537109375)
Expected: true
i = 37 Tx[i] = -486.60800170898438 Ty[i] = -507.1800537109375
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (241.60499572753906 not close to 270.5972900390625)
Expected: true
i = 50 Tx[i] = 241.60499572753906 Ty[i] = 270.5972900390625
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (-358.87399291992188 not close to -338.0980224609375)
Expected: true
i = 58 Tx[i] = -358.87399291992188 Ty[i] = -338.0980224609375
tensorflow/core/framework/tensor_testutil.cc:177: Failure
Value of: IsClose(Tx[i], Ty[i], typed_atol, typed_rtol)
  Actual: false (54.890399932861328 not close to 33.882453918457031)
Expected: true
i = 62 Tx[i] = 54.890399932861328 Ty[i] = 33.882453918457031
tensorflow/core/framework/tensor_testutil.cc:184: Failure
Expected equality of these values:
  num_failures
    Which is: 8
  0
Mismatches detected (atol = 20 rtol = 0).
[  FAILED  ] QuantizedBiasAddTest.RealData (0 ms)
[----------] 2 tests from QuantizedBiasAddTest (12 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 1 test suite ran. (12 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 2 tests, listed below:
[  FAILED  ] QuantizedBiasAddTest.Small
[  FAILED  ] QuantizedBiasAddTest.RealData

 2 FAILED TESTS
FAILED: //tensorflow/core/kernels:sparse_matmul_op_test (Summary)
      /tmp/bot/easybuild/build/TensorFlow/2.7.1/foss-2021b/TensorFlow/bazel-root/c9c772a95da5d3d6edbd40ed737cfab2/execroot/org_tensorflow/bazel-out/aarch64-opt/testlogs/tensorflow/core/kernels/sparse_matmul_op_test/test.log
      /tmp/bot/easybuild/build/TensorFlow/2.7.1/foss-2021b/TensorFlow/bazel-root/c9c772a95da5d3d6edbd40ed737cfab2/execroot/org_tensorflow/bazel-out/aarch64-opt/testlogs/tensorflow/core/kernels/sparse_matmul_op_test/test_attempts/attempt_1.log
      /tmp/bot/easybuild/build/TensorFlow/2.7.1/foss-2021b/TensorFlow/bazel-root/c9c772a95da5d3d6edbd40ed737cfab2/execroot/org_tensorflow/bazel-out/aarch64-opt/testlogs/tensorflow/core/kernels/sparse_matmul_op_test/test_attempts/attempt_2.log
INFO: From Testing //tensorflow/core/kernels:sparse_matmul_op_test:
==================== Test output for //tensorflow/core/kernels:sparse_matmul_op_test:
[==========] Running 4 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 4 tests from SparseMatmulOpTest
[ RUN      ] SparseMatmulOpTest.BroadcastPacketTest
[0.170094 0.170094 0.170094 0.170094] != [  0.170094    0.14922 -0.0823886   0.026985], differences: [         0 -0.0208738  -0.252482  -0.143109]
tensorflow/core/kernels/sparse_matmul_op_test.cc:329: Failure
Value of: areApprox(ref, data2, PacketSize)
  Actual: false
Expected: true
[  FAILED  ] SparseMatmulOpTest.BroadcastPacketTest (0 ms)
[ RUN      ] SparseMatmulOpTest.InterleavePacketTest
[       OK ] SparseMatmulOpTest.InterleavePacketTest (0 ms)
[ RUN      ] SparseMatmulOpTest.Bfloat16ExpandTest
[       OK ] SparseMatmulOpTest.Bfloat16ExpandTest (0 ms)
[ RUN      ] SparseMatmulOpTest.Bfloat16LoadTest
[       OK ] SparseMatmulOpTest.Bfloat16LoadTest (0 ms)
[----------] 4 tests from SparseMatmulOpTest (0 ms total)

[----------] Global test environment tear-down
[==========] 4 tests from 1 test suite ran. (0 ms total)
[  PASSED  ] 3 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] SparseMatmulOpTest.BroadcastPacketTest

 1 FAILED TEST

The common pattern here is that the commits that may fix these issues all seem to relate to Arm NEON instructions, which may explain why these issues only pop up for aarch64/generic.
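For a sense of how far off the quantized results are, the tolerance check reported in the QuantizedBiasAddTest.Small output above can be replayed on the logged values. This is a sketch that assumes the check has the common form |x - y| <= atol + rtol * |x|; since the log reports rtol = 0, only the absolute tolerance matters here. The `is_close` helper is written for this example and is not TensorFlow's code.

```python
def is_close(x, y, atol, rtol):
    """Tolerance check, assumed to be of the form |x - y| <= atol + rtol * |x|."""
    return abs(x - y) <= atol + rtol * abs(x)


# (Tx[i], Ty[i]) pairs copied from the QuantizedBiasAddTest.Small output above.
small = [
    (11, 11.2518310546875),
    (22, 22.5018310546875),
    (33, 33.7518310546875),
    (41, 41.2518310546875),
    (52, 52.5018310546875),
    (63, 63.7518310546875),
]

# Tolerances as reported in the log: atol = 0.2, rtol = 0.
failures = [(x, y) for x, y in small if not is_close(x, y, atol=0.2, rtol=0.0)]
print(len(failures))  # 6, matching num_failures in the log

# Largest absolute difference among the logged pairs.
print(max(abs(x - y) for x, y in small))
```

In the Small test every logged difference is below 1, while the RealData mismatches are off by roughly 20 to 33 against atol = 20, so in both cases the results are only slightly outside the tolerance rather than wildly wrong.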

@boegel
Contributor

boegel commented Sep 7, 2023

bot: build repo:eessi-2023.06-software arch:x86_64/intel/haswell
bot: build repo:eessi-2023.06-software arch:aarch64/generic
bot: build repo:eessi-2023.06-software arch:aarch64/neoverse_n1
bot: build repo:eessi-2023.06-software arch:aarch64/neoverse_v1

@eessi-bot

eessi-bot bot commented Sep 7, 2023

Updates by the bot instance eessi-bot-citc-aws
  • received bot command build repo:eessi-2023.06-software arch:x86_64/intel/haswell from boegel

    • expanded format: build repository:eessi-2023.06-software architecture:x86_64/intel/haswell
  • received bot command build repo:eessi-2023.06-software arch:aarch64/generic from boegel

    • expanded format: build repository:eessi-2023.06-software architecture:aarch64/generic
  • received bot command build repo:eessi-2023.06-software arch:aarch64/neoverse_n1 from boegel

    • expanded format: build repository:eessi-2023.06-software architecture:aarch64/neoverse_n1
  • received bot command build repo:eessi-2023.06-software arch:aarch64/neoverse_v1 from boegel

    • expanded format: build repository:eessi-2023.06-software architecture:aarch64/neoverse_v1
  • handling command build repository:eessi-2023.06-software architecture:x86_64/intel/haswell resulted in:

  • handling command build repository:eessi-2023.06-software architecture:aarch64/generic resulted in:

  • handling command build repository:eessi-2023.06-software architecture:aarch64/neoverse_n1 resulted in:

  • handling command build repository:eessi-2023.06-software architecture:aarch64/neoverse_v1 resulted in:
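The "expanded format" lines above suggest that short argument names are mapped to long ones before a command is handled. A minimal sketch of such an expansion, assuming only the two abbreviations visible in this thread (`repo` and `arch`); the `expand` function is hypothetical and not taken from the bot's code:

```python
# Abbreviations seen in this thread; the bot may accept others.
ABBREVIATIONS = {"repo": "repository", "arch": "architecture"}


def expand(command):
    """Expand 'key:value' arguments of a bot command to their long names."""
    action, *args = command.split()
    expanded = []
    for arg in args:
        # Split on the first colon only, so values may contain ':' or '/'.
        key, sep, value = arg.partition(":")
        expanded.append(f"{ABBREVIATIONS.get(key, key)}{sep}{value}")
    return " ".join([action, *expanded])


print(expand("build repo:eessi-2023.06-software arch:aarch64/generic"))
```

Commands without arguments, such as `status`, pass through unchanged.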

@eessi-bot

eessi-bot bot commented Sep 7, 2023

New job on instance eessi-bot-citc-aws for architecture x86_64-intel-haswell for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_321/7212

date | job status | comment
Sep 07 20:16:41 UTC 2023 | submitted | job id 7212 awaits release by job manager
Sep 07 20:16:45 UTC 2023 | released | job awaits launch by Slurm scheduler
Sep 07 20:20:59 UTC 2023 | running | job 7212 is running
Sep 07 23:34:18 UTC 2023 | finished |
😁 SUCCESS
Details
✅ job output file slurm-7212.out
✅ no message matching ERROR:
✅ no message matching FAILED:
✅ no message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-intel-haswell-1694129531.tar.gz
size: 307 MiB (322513735 bytes)
entries: 21363
modules under 2023.06/software/linux/x86_64/intel/haswell/modules/all
Bazel/3.7.2-GCCcore-11.2.0.lua
double-conversion/3.1.5-GCCcore-11.2.0.lua
flatbuffers/2.0.0-GCCcore-11.2.0.lua
flatbuffers-python/2.0-GCCcore-11.2.0.lua
giflib/5.2.1-GCCcore-11.2.0.lua
h5py/3.6.0-foss-2021b.lua
ICU/69.1-GCCcore-11.2.0.lua
JsonCpp/1.9.4-GCCcore-11.2.0.lua
libjpeg-turbo/2.0.6-GCCcore-11.2.0.lua
libpng/1.6.37-GCCcore-11.2.0.lua
LMDB/0.9.29-GCCcore-11.2.0.lua
NASM/2.15.05-GCCcore-11.2.0.lua
Ninja/1.10.2-GCCcore-11.2.0.lua
nsync/1.24.0-GCCcore-11.2.0.lua
pkgconfig/1.5.5-GCCcore-11.2.0-python.lua
protobuf/3.17.3-GCCcore-11.2.0.lua
protobuf-python/3.17.3-GCCcore-11.2.0.lua
snappy/1.1.9-GCCcore-11.2.0.lua
TensorFlow/2.7.1-foss-2021b.lua
Zip/3.0-GCCcore-11.2.0.lua
software under 2023.06/software/linux/x86_64/intel/haswell/software
Bazel/3.7.2-GCCcore-11.2.0
double-conversion/3.1.5-GCCcore-11.2.0
flatbuffers/2.0.0-GCCcore-11.2.0
flatbuffers-python/2.0-GCCcore-11.2.0
giflib/5.2.1-GCCcore-11.2.0
h5py/3.6.0-foss-2021b
ICU/69.1-GCCcore-11.2.0
JsonCpp/1.9.4-GCCcore-11.2.0
libjpeg-turbo/2.0.6-GCCcore-11.2.0
libpng/1.6.37-GCCcore-11.2.0
LMDB/0.9.29-GCCcore-11.2.0
NASM/2.15.05-GCCcore-11.2.0
Ninja/1.10.2-GCCcore-11.2.0
nsync/1.24.0-GCCcore-11.2.0
pkgconfig/1.5.5-GCCcore-11.2.0-python
protobuf/3.17.3-GCCcore-11.2.0
protobuf-python/3.17.3-GCCcore-11.2.0
snappy/1.1.9-GCCcore-11.2.0
TensorFlow/2.7.1-foss-2021b
Zip/3.0-GCCcore-11.2.0
other under 2023.06/software/linux/x86_64/intel/haswell
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp
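The checks listed under "Details" in the bot report above amount to looking for a few fixed strings in the job output file. A rough approximation, using only the strings shown in the report; the function and variable names are hypothetical, and this is not the bot's actual implementation:

```python
# Strings copied from the "Details" list in the bot report; everything else
# in this sketch is made up for illustration.
MUST_BE_ABSENT = ["ERROR:", "FAILED:", "required modules missing:"]
MUST_BE_PRESENT = ["No missing installations", ".tar.gz created!"]


def check_job_output(text):
    """Return (success, details) for the text of a job output file."""
    details = []
    for pattern in MUST_BE_ABSENT:
        details.append((pattern, pattern not in text))
    for pattern in MUST_BE_PRESENT:
        details.append((pattern, pattern in text))
    # The job counts as successful only if every individual check passed.
    return all(ok for _, ok in details), details


ok, details = check_job_output("No missing installations\nfoo.tar.gz created!\n")
print(ok)
```

Note that a job can contain both "required modules missing:" and "No missing installations" in its output, so the presence checks alone are not enough to call a build successful.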

@eessi-bot

eessi-bot bot commented Sep 7, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-generic for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_321/7213

date | job status | comment
Sep 07 20:16:48 UTC 2023 | submitted | job id 7213 awaits release by job manager
Sep 07 20:17:56 UTC 2023 | released | job awaits launch by Slurm scheduler
Sep 07 20:21:08 UTC 2023 | running | job 7213 is running
Sep 07 21:46:37 UTC 2023 | finished |
😢 FAILURE
Details
✅ job output file slurm-7213.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-generic-1694123047.tar.gz
size: 66 MiB (69331435 bytes)
entries: 1321
modules under 2023.06/software/linux/aarch64/generic/modules/all
Bazel/3.7.2-GCCcore-11.2.0.lua
double-conversion/3.1.5-GCCcore-11.2.0.lua
flatbuffers/2.0.0-GCCcore-11.2.0.lua
flatbuffers-python/2.0-GCCcore-11.2.0.lua
giflib/5.2.1-GCCcore-11.2.0.lua
h5py/3.6.0-foss-2021b.lua
ICU/69.1-GCCcore-11.2.0.lua
JsonCpp/1.9.4-GCCcore-11.2.0.lua
libjpeg-turbo/2.0.6-GCCcore-11.2.0.lua
libpng/1.6.37-GCCcore-11.2.0.lua
LMDB/0.9.29-GCCcore-11.2.0.lua
NASM/2.15.05-GCCcore-11.2.0.lua
Ninja/1.10.2-GCCcore-11.2.0.lua
nsync/1.24.0-GCCcore-11.2.0.lua
pkgconfig/1.5.5-GCCcore-11.2.0-python.lua
protobuf/3.17.3-GCCcore-11.2.0.lua
protobuf-python/3.17.3-GCCcore-11.2.0.lua
snappy/1.1.9-GCCcore-11.2.0.lua
Zip/3.0-GCCcore-11.2.0.lua
software under 2023.06/software/linux/aarch64/generic/software
Bazel/3.7.2-GCCcore-11.2.0
double-conversion/3.1.5-GCCcore-11.2.0
flatbuffers/2.0.0-GCCcore-11.2.0
flatbuffers-python/2.0-GCCcore-11.2.0
giflib/5.2.1-GCCcore-11.2.0
h5py/3.6.0-foss-2021b
ICU/69.1-GCCcore-11.2.0
JsonCpp/1.9.4-GCCcore-11.2.0
libjpeg-turbo/2.0.6-GCCcore-11.2.0
libpng/1.6.37-GCCcore-11.2.0
LMDB/0.9.29-GCCcore-11.2.0
NASM/2.15.05-GCCcore-11.2.0
Ninja/1.10.2-GCCcore-11.2.0
nsync/1.24.0-GCCcore-11.2.0
pkgconfig/1.5.5-GCCcore-11.2.0-python
protobuf/3.17.3-GCCcore-11.2.0
protobuf-python/3.17.3-GCCcore-11.2.0
snappy/1.1.9-GCCcore-11.2.0
Zip/3.0-GCCcore-11.2.0
other under 2023.06/software/linux/aarch64/generic
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

@eessi-bot

eessi-bot bot commented Sep 7, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-neoverse_n1 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_321/7214

date | job status | comment
Sep 07 20:16:54 UTC 2023 | submitted | job id 7214 awaits release by job manager
Sep 07 20:17:51 UTC 2023 | released | job awaits launch by Slurm scheduler
Sep 07 20:21:05 UTC 2023 | running | job 7214 is running
Sep 07 22:40:33 UTC 2023 | finished |
😢 FAILURE
Details
✅ job output file slurm-7214.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_n1-1694126028.tar.gz
size: 66 MiB (69310167 bytes)
entries: 1321
modules under 2023.06/software/linux/aarch64/neoverse_n1/modules/all
Bazel/3.7.2-GCCcore-11.2.0.lua
double-conversion/3.1.5-GCCcore-11.2.0.lua
flatbuffers/2.0.0-GCCcore-11.2.0.lua
flatbuffers-python/2.0-GCCcore-11.2.0.lua
giflib/5.2.1-GCCcore-11.2.0.lua
h5py/3.6.0-foss-2021b.lua
ICU/69.1-GCCcore-11.2.0.lua
JsonCpp/1.9.4-GCCcore-11.2.0.lua
libjpeg-turbo/2.0.6-GCCcore-11.2.0.lua
libpng/1.6.37-GCCcore-11.2.0.lua
LMDB/0.9.29-GCCcore-11.2.0.lua
NASM/2.15.05-GCCcore-11.2.0.lua
Ninja/1.10.2-GCCcore-11.2.0.lua
nsync/1.24.0-GCCcore-11.2.0.lua
pkgconfig/1.5.5-GCCcore-11.2.0-python.lua
protobuf/3.17.3-GCCcore-11.2.0.lua
protobuf-python/3.17.3-GCCcore-11.2.0.lua
snappy/1.1.9-GCCcore-11.2.0.lua
Zip/3.0-GCCcore-11.2.0.lua
software under 2023.06/software/linux/aarch64/neoverse_n1/software
Bazel/3.7.2-GCCcore-11.2.0
double-conversion/3.1.5-GCCcore-11.2.0
flatbuffers/2.0.0-GCCcore-11.2.0
flatbuffers-python/2.0-GCCcore-11.2.0
giflib/5.2.1-GCCcore-11.2.0
h5py/3.6.0-foss-2021b
ICU/69.1-GCCcore-11.2.0
JsonCpp/1.9.4-GCCcore-11.2.0
libjpeg-turbo/2.0.6-GCCcore-11.2.0
libpng/1.6.37-GCCcore-11.2.0
LMDB/0.9.29-GCCcore-11.2.0
NASM/2.15.05-GCCcore-11.2.0
Ninja/1.10.2-GCCcore-11.2.0
nsync/1.24.0-GCCcore-11.2.0
pkgconfig/1.5.5-GCCcore-11.2.0-python
protobuf/3.17.3-GCCcore-11.2.0
protobuf-python/3.17.3-GCCcore-11.2.0
snappy/1.1.9-GCCcore-11.2.0
Zip/3.0-GCCcore-11.2.0
other under 2023.06/software/linux/aarch64/neoverse_n1
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

@eessi-bot

eessi-bot bot commented Sep 7, 2023

New job on instance eessi-bot-citc-aws for architecture aarch64-neoverse_v1 for repository eessi-2023.06-software in job dir /mnt/shared/home/bot/eessi-bot-software-layer/jobs/2023.09/pr_321/7215

date | job status | comment
Sep 07 20:17:01 UTC 2023 | submitted | job id 7215 awaits release by job manager
Sep 07 20:17:48 UTC 2023 | released | job awaits launch by Slurm scheduler
Sep 07 20:21:02 UTC 2023 | running | job 7215 is running
Sep 07 22:05:16 UTC 2023 | finished |
😢 FAILURE
Details
✅ job output file slurm-7215.out
❌ found message matching ERROR:
❌ found message matching FAILED:
❌ found message matching required modules missing:
✅ found message(s) matching No missing installations
✅ found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-neoverse_v1-1694123922.tar.gz
size: 66 MiB (69356183 bytes)
entries: 1321
modules under 2023.06/software/linux/aarch64/neoverse_v1/modules/all
Bazel/3.7.2-GCCcore-11.2.0.lua
double-conversion/3.1.5-GCCcore-11.2.0.lua
flatbuffers/2.0.0-GCCcore-11.2.0.lua
flatbuffers-python/2.0-GCCcore-11.2.0.lua
giflib/5.2.1-GCCcore-11.2.0.lua
h5py/3.6.0-foss-2021b.lua
ICU/69.1-GCCcore-11.2.0.lua
JsonCpp/1.9.4-GCCcore-11.2.0.lua
libjpeg-turbo/2.0.6-GCCcore-11.2.0.lua
libpng/1.6.37-GCCcore-11.2.0.lua
LMDB/0.9.29-GCCcore-11.2.0.lua
NASM/2.15.05-GCCcore-11.2.0.lua
Ninja/1.10.2-GCCcore-11.2.0.lua
nsync/1.24.0-GCCcore-11.2.0.lua
pkgconfig/1.5.5-GCCcore-11.2.0-python.lua
protobuf/3.17.3-GCCcore-11.2.0.lua
protobuf-python/3.17.3-GCCcore-11.2.0.lua
snappy/1.1.9-GCCcore-11.2.0.lua
Zip/3.0-GCCcore-11.2.0.lua
software under 2023.06/software/linux/aarch64/neoverse_v1/software
Bazel/3.7.2-GCCcore-11.2.0
double-conversion/3.1.5-GCCcore-11.2.0
flatbuffers/2.0.0-GCCcore-11.2.0
flatbuffers-python/2.0-GCCcore-11.2.0
giflib/5.2.1-GCCcore-11.2.0
h5py/3.6.0-foss-2021b
ICU/69.1-GCCcore-11.2.0
JsonCpp/1.9.4-GCCcore-11.2.0
libjpeg-turbo/2.0.6-GCCcore-11.2.0
libpng/1.6.37-GCCcore-11.2.0
LMDB/0.9.29-GCCcore-11.2.0
NASM/2.15.05-GCCcore-11.2.0
Ninja/1.10.2-GCCcore-11.2.0
nsync/1.24.0-GCCcore-11.2.0
pkgconfig/1.5.5-GCCcore-11.2.0-python
protobuf/3.17.3-GCCcore-11.2.0
protobuf-python/3.17.3-GCCcore-11.2.0
snappy/1.1.9-GCCcore-11.2.0
Zip/3.0-GCCcore-11.2.0
other under 2023.06/software/linux/aarch64/neoverse_v1
.lmod/cache/spiderT.lua
.lmod/cache/spiderT.luac_5.1
.lmod/cache/timestamp

@boegel
Contributor

boegel commented Sep 8, 2023

aarch64/generic (job 7213) now failed due to "virtual memory exhausted: Cannot allocate memory". Whut.

@boegel
Contributor

boegel commented Sep 8, 2023

aarch64/neoverse_n1 (job 7214) now failed with:

== 2023-09-07 22:27:46,149 build_log.py:171 ERROR EasyBuild crashed with an error (at easybuild/tools/build_log.py:111 in caller_info): At least 12 cpu tests failed:
//tensorflow/core/kernels:quantized_bias_add_op_test, //tensorflow/core/kernels:requantize_op_test, //tensorflow/core/kernels:sparse_matmul_op_test, //tensorflow/python/kernel_tests:conditional_accumulator_test, //tensorflow/python/kernel_tests:constant_op_test, //tensorflow/python/kernel_tests:fifo_queue_test, //tensorflow/python/kernel_tests:logging_ops_test, //tensorflow/python/kernel_tests:sparse_conditional_accumulator_test, //tensorflow/python/kernel_tests:variable_ops_test, //tensorflow/python/kernel_tests:xent_op_test, //tensorflow/python:collective_ops_test, //tensorflow/python:math_grad_test (at easybuild/framework/easyblock.py:2265 in report_test_failure)

Likewise for aarch64/neoverse_v1 (job 7215).
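To see how this list relates to the three tests that failed on aarch64/generic (job 7198, earlier in this thread), the two sets can be compared directly. A small sketch using the test names copied from the two error messages:

```python
# Failing tests reported for aarch64/generic (job 7198).
generic = {
    "//tensorflow/core/kernels:quantized_bias_add_op_test",
    "//tensorflow/core/kernels:requantize_op_test",
    "//tensorflow/core/kernels:sparse_matmul_op_test",
}

# Failing tests reported for aarch64/neoverse_n1 (job 7214).
neoverse_n1 = {
    "//tensorflow/core/kernels:quantized_bias_add_op_test",
    "//tensorflow/core/kernels:requantize_op_test",
    "//tensorflow/core/kernels:sparse_matmul_op_test",
    "//tensorflow/python/kernel_tests:conditional_accumulator_test",
    "//tensorflow/python/kernel_tests:constant_op_test",
    "//tensorflow/python/kernel_tests:fifo_queue_test",
    "//tensorflow/python/kernel_tests:logging_ops_test",
    "//tensorflow/python/kernel_tests:sparse_conditional_accumulator_test",
    "//tensorflow/python/kernel_tests:variable_ops_test",
    "//tensorflow/python/kernel_tests:xent_op_test",
    "//tensorflow/python:collective_ops_test",
    "//tensorflow/python:math_grad_test",
}

# Is every failure on generic also a failure on neoverse_n1?
print(generic <= neoverse_n1)

# Tests that fail only on neoverse_n1.
for test in sorted(neoverse_n1 - generic):
    print(test)
```

All three generic failures also show up on neoverse_n1, which leaves nine tests that fail only on neoverse_n1.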

These issues may already be fixed upstream; see for example tensorflow/tensorflow#53260 for the math_grad_test problem on aarch64 (which should be fixed by tensorflow/tensorflow@d4ea582).

We should probably first try to fix the problems on aarch64/generic (since there are fewer failing tests there, and they overlap with the failures on the other targets), and also work around the memory problem by letting Bazel use only half of the available cores (effectively doubling the memory available per build job).
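The arithmetic behind the "half the cores" workaround, as a small sketch. The 16 cores and 32 GiB below are made-up numbers for illustration, not the specs of the build nodes, and how the resulting job count reaches Bazel (for example via EasyBuild's `--parallel` option or Bazel's `--jobs` flag) is left open here.

```python
def bazel_jobs(cores, fraction=0.5):
    """Number of parallel build jobs when using only a fraction of the cores."""
    return max(1, int(cores * fraction))


def memory_per_job_gib(total_gib, jobs):
    """Average memory available to each parallel job, in GiB."""
    return total_gib / jobs


# Hypothetical node: 16 cores, 32 GiB of memory.
cores, total_gib = 16, 32

all_cores = memory_per_job_gib(total_gib, cores)
half_cores = memory_per_job_gib(total_gib, bazel_jobs(cores))

print(f"--jobs={bazel_jobs(cores)}")
print(all_cores, half_cores)  # 2.0 GiB per job vs. 4.0 GiB per job
```

Halving the job count doubles the average memory per job, at the cost of a longer build.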

lara and others added 2 commits September 20, 2023 09:37
…software that requires a lot of memory during the build/test, like TensorFlow
@trz42
Collaborator

trz42 commented Feb 8, 2024

Just trying out the new bot command.

bot: status


eessi-bot bot commented Feb 8, 2024

Updates by the bot instance eessi-bot-mc-aws
  • received bot command status from trz42
    • expanded format: status

@laraPPr laraPPr closed this Apr 2, 2024
trz42 pushed a commit to trz42/software-layer that referenced this pull request Apr 14, 2024
{2023.06}[foss/2023a] BWA v0.7.17-20220923
5 participants