Add new init scripts for new initialization module #668

Merged

MaKaNu
Contributor

@MaKaNu MaKaNu commented Aug 12, 2024

This is a followup PR for #667

After the successful merge of #667, I have prepared some test cases to test against the different shells.

For the moment, I am struggling with the implementation for csh. The problem only seems to occur when I try to load the module inside csh, and the error message csh returns is not the most helpful I've seen.


eessi-bot bot commented Aug 12, 2024

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software

Instance boegel-bot-deucalion is configured to build for:

  • architectures: aarch64/a64fx
  • repositories: eessi.io-2023.06-software


eessi-bot bot commented Aug 12, 2024

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-compat, eessi-hpc.org-2023.06-software, eessi.io-2023.06-software

init/bash Outdated
export MODULEPATH=/cvmfs/software.eessi.io/versions/"$EESSI_VERSION"/init/modules
. /cvmfs/software.eessi.io/versions/"$EESSI_VERSION"/compat/linux/$(uname -m)/usr/share/Lmod/init/bash

module load "$LMOD_SYSTEM_DEFAULT_MODULES"
Member

The module load should not be necessary; the default modules should be loaded when Lmod is initialised.

Contributor Author

> The module load should not be necessary, they should be loaded when Lmod is initialised

In my test cases the EESSI/2023.06 module was not loaded automatically, which is why I put it in.

Just setting StdEnv.lua doesn't do the trick either.

Member

Was there already an Lmod loaded when you sourced it?

module load $LMOD_SYSTEM_DEFAULT_MODULES is a little brittle, as this won't work if there is more than one module in the variable (I think). It's probably best to take the official route as per the Lmod docs (shell dependent):

if [ -z "$__Init_Default_Modules" ]; then
   export __Init_Default_Modules=1;

   ## ability to predefine elsewhere the default list
   LMOD_SYSTEM_DEFAULT_MODULES=${LMOD_SYSTEM_DEFAULT_MODULES:-"StdEnv"}
   export LMOD_SYSTEM_DEFAULT_MODULES
   module --initial_load --no_redirect restore
else
   module refresh
fi
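
To illustrate the brittleness mentioned above, here is a minimal sketch. It assumes the variable holds a colon-separated list of module names, which is how I understand the Lmod docs to describe it. `fake_module` is a hypothetical stand-in that only counts its arguments, and `Other/1.0` is a made-up second module.

```shell
# Hypothetical stand-in for Lmod's `module`: it only reports how many
# arguments it received, so the sketch runs without Lmod installed.
fake_module() {
    echo "argc=$#"
}

# Illustrative value; "Other/1.0" is a made-up second module.
LMOD_SYSTEM_DEFAULT_MODULES="EESSI/2023.06:Other/1.0"

# Passed as-is: the whole list arrives as ONE module name.
quoted_result=$(fake_module load "$LMOD_SYSTEM_DEFAULT_MODULES")

# Split on ':' so every module name becomes its own argument.
split_result=$(
    IFS=:
    set -f                      # no glob expansion while splitting
    set -- $LMOD_SYSTEM_DEFAULT_MODULES
    fake_module load "$@"
)

echo "quoted: $quoted_result"   # quoted: argc=2
echo "split:  $split_result"    # split:  argc=3
```

The restore-based snippet from the Lmod docs avoids this manual splitting altogether, which is part of why it seems the safer route.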

Contributor Author

> Was there already an Lmod loaded when you sourced it?

No, Lmod was not loaded at that point. It was only loaded by the script.

@MaKaNu
Contributor Author

MaKaNu commented Aug 12, 2024

Ok it expanded the PS1 inside the git commit message 🥲
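
A minimal sketch of how this presumably happened, assuming the commit message was passed to git in double quotes. The message text and the PS1 value below are made up for illustration.

```shell
# Made-up prompt value so the effect is visible.
PS1='prompt>'

# Double quotes: the shell substitutes $PS1 before git ever sees the text.
expanded="add $PS1 check to tests"

# Single quotes: the text stays literal.
literal='add $PS1 check to tests'

echo "$expanded"   # add prompt> check to tests
echo "$literal"    # add $PS1 check to tests
```

Using single quotes (or escaping the dollar sign) in the commit message keeps the variable name intact.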

@MaKaNu
Contributor Author

MaKaNu commented Aug 13, 2024

> module load $LMOD_SYSTEM_DEFAULT_MODULES is a little brittle as this won't work if there is more than one module in the variable (I think). It's probably best to take the official route as per the Lmod docs (shell dependent):

Implemented in the last commit. It works as expected, except for csh.

MaKaNu added 7 commits August 16, 2024 13:58
- add test script
- add github actions
not sure what the output for generic would be.
But different arches should be already tested on module
- Fix Format
@MaKaNu
Contributor Author

MaKaNu commented Aug 16, 2024

The pushed tests are expected to fail for now (they become interesting after #667 is merged).

init/bash Outdated
export __Init_Default_Modules=1;

## ability to predefine elsewhere the default list
LMOD_SYSTEM_DEFAULT_MODULES=${LMOD_SYSTEM_DEFAULT_MODULES:-"StdEnv"}
Member

Suggested change
- LMOD_SYSTEM_DEFAULT_MODULES=${LMOD_SYSTEM_DEFAULT_MODULES:-"StdEnv"}
+ LMOD_SYSTEM_DEFAULT_MODULES=${LMOD_SYSTEM_DEFAULT_MODULES:-"EESSI"}

You'll need this everywhere

Member

Some sites also force full name matching so you probably can't leave it versionless

Contributor Author

This is now implemented with the version number. It doesn't matter to me, but I would like to understand why StdEnv.lua is not the correct solution.

Member

There is no existing StdEnv.lua when Lmod is configured (we don't ship one)
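
Putting this thread together, the default could be chosen roughly as in the sketch below. This is not the merged code: `set_default_modules` is a hypothetical helper, and falling back to 2023.06 when EESSI_VERSION is unset is an assumption made only so the sketch runs on its own.

```shell
# Hypothetical helper: pick the default module list for Lmod.
# A value that is already set wins; otherwise fall back to the fully
# versioned EESSI module, since no StdEnv module is shipped and some
# sites require full name matching.
set_default_modules() {
    # Assumed fallback version, for illustration only.
    eessi_version=${EESSI_VERSION:-2023.06}
    LMOD_SYSTEM_DEFAULT_MODULES=${LMOD_SYSTEM_DEFAULT_MODULES:-"EESSI/$eessi_version"}
    export LMOD_SYSTEM_DEFAULT_MODULES
}

set_default_modules
echo "$LMOD_SYSTEM_DEFAULT_MODULES"
```

Keeping the `:-` fallback means a site that predefines its own default list elsewhere is not overridden.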

@ocaisa
Member

ocaisa commented Sep 5, 2024

We never included PS1 in #667 so your tests will fail for that currently. If you are keen to have that, it could be part of this PR or a follow-up after this is merged (this PR has higher priority in my opinion)

- move new scripts
- copy bash
restore original content of bash init
@boegel boegel mentioned this pull request Sep 12, 2024
csh is not as feature rich as other shells I have implemented.
Maybe someday I learn how to approach it!
MaKaNu and others added 3 commits September 12, 2024 15:47
Before we run any tests, we check whether the provided shell is actually testable.
If a shell is not tested yet, it only means that we don't know how to test it or that it has not been added to the TEST_SHELLS array.
We don't know how to test csh for now.
But once we figure it out, we won't need to update anything here.
remove leftover comment

Co-authored-by: ocaisa <[email protected]>
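
The check described in the commit messages above could look roughly like the following sketch. It is not the test script from this PR: `is_testable_shell` is a hypothetical helper, the list contents are an assumption, and TEST_SHELLS is modelled as a space-separated string instead of a bash array so the sketch runs in any POSIX shell.

```shell
# Assumed list of shells the test script knows how to test; csh is left
# out on purpose. Modelled as a space-separated string for portability.
TEST_SHELLS="bash zsh fish ksh"

# Hypothetical helper: succeeds if the given shell is in TEST_SHELLS.
is_testable_shell() {
    for known_shell in $TEST_SHELLS; do
        if [ "$known_shell" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# Shells that are not listed are skipped instead of failing the run.
for candidate in bash csh; do
    if is_testable_shell "$candidate"; then
        echo "$candidate: running tests"
    else
        echo "$candidate: skipped, not in TEST_SHELLS"
    fi
done
```

With this shape, supporting csh later only requires adding it to the list; the check itself stays unchanged.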
ocaisa previously approved these changes Sep 12, 2024
Member

@ocaisa ocaisa left a comment

LGTM

@ocaisa
Member

ocaisa commented Sep 12, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic


eessi-bot bot commented Sep 12, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from ocaisa

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:


eessi-bot bot commented Sep 12, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from ocaisa

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

    • no jobs were submitted


eessi-bot bot commented Sep 12, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.09/pr_668/18438

date | job status | comment
Sep 12 16:50:38 UTC 2024 | submitted | job id 18438 awaits release by job manager
Sep 12 16:51:12 UTC 2024 | released | job awaits launch by Slurm scheduler
Sep 12 16:57:14 UTC 2024 | running | job 18438 is running
Sep 12 17:16:35 UTC 2024 | finished | 😢 FAILURE
    Details
    ✅ job output file slurm-18438.out
    ✅ no message matching ERROR:
    ✅ no message matching FAILED:
    ✅ no message matching required modules missing:
    ✅ found message(s) matching No missing installations
    ✅ found message matching .tar.gz created!
    Artefacts
    No artefacts were created or found.
Sep 12 17:16:35 UTC 2024 | test result | 😁 SUCCESS
    ReFrame Summary
    [ PASSED ] Ran 18/18 test case(s) from 18 check(s) (0 failure(s), 0 skipped, 0 aborted)
    Details
    ✅ job output file slurm-18438.out
    ✅ no message matching ERROR:
    ✅ no message matching [\s*FAILED\s*].*Ran .* test case

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account ocaisa has NO permission to send commands to the bot

@ocaisa
Member

ocaisa commented Sep 12, 2024

You need something similar to https://github.com/EESSI/software-layer/blob/2023.06-software.eessi.io/install_scripts.sh#L94-L97 so that the new scripts are actually deployed
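
As a rough illustration of what such a deployment step could look like, here is a sketch. It does not reproduce the linked lines of install_scripts.sh: `copy_lmod_init_scripts` is a hypothetical helper, the five file names are taken from the init/lmod scripts in this PR, and the demo paths are throwaway directories.

```shell
# Hypothetical helper: copy the per-shell Lmod init scripts from the
# repository checkout into the installation prefix.
copy_lmod_init_scripts() {
    src_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    for shell_name in bash csh fish ksh zsh; do
        if [ -f "$src_dir/$shell_name" ]; then
            cp "$src_dir/$shell_name" "$dest_dir/" || return 1
        else
            echo "warning: $src_dir/$shell_name not found" >&2
        fi
    done
}

# Demo with throwaway directories; real paths would differ.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/repo/init/lmod"
for shell_name in bash csh fish ksh zsh; do
    echo "# init script for $shell_name" > "$demo_root/repo/init/lmod/$shell_name"
done
copy_lmod_init_scripts "$demo_root/repo/init/lmod" "$demo_root/prefix/init/lmod"
ls "$demo_root/prefix/init/lmod"
```

Without a step like this, the build job has nothing new to package, so no scripts end up in the tarball.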

Add copy of init lmod scripts
@ocaisa
Member

ocaisa commented Sep 12, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic


eessi-bot bot commented Sep 12, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from ocaisa

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:


eessi-bot bot commented Sep 12, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from ocaisa

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

    • no jobs were submitted

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account ocaisa has NO permission to send commands to the bot


eessi-bot bot commented Sep 12, 2024

New job on instance eessi-bot-mc-aws for architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.09/pr_668/18439

date | job status | comment
Sep 12 20:48:12 UTC 2024 | submitted | job id 18439 awaits release by job manager
Sep 12 20:48:54 UTC 2024 | released | job awaits launch by Slurm scheduler
Sep 12 20:54:57 UTC 2024 | running | job 18439 is running
Sep 12 21:14:17 UTC 2024 | finished | 😁 SUCCESS
    Details
    ✅ job output file slurm-18439.out
    ✅ no message matching ERROR:
    ✅ no message matching FAILED:
    ✅ no message matching required modules missing:
    ✅ found message(s) matching No missing installations
    ✅ found message matching .tar.gz created!
    Artefacts
    eessi-2023.06-software-linux-x86_64-generic-1726174458.tar.gz
    size: 0 MiB (756 bytes)
    entries: 5
    modules under 2023.06/software/linux/x86_64/generic/modules/all
        no module files in tarball
    software under 2023.06/software/linux/x86_64/generic/software
        no software packages in tarball
    other under 2023.06/software/linux/x86_64/generic
        2023.06/init/lmod/bash
        2023.06/init/lmod/csh
        2023.06/init/lmod/fish
        2023.06/init/lmod/ksh
        2023.06/init/lmod/zsh
Sep 12 21:14:17 UTC 2024 | test result | 😁 SUCCESS
    ReFrame Summary
    [ PASSED ] Ran 18/18 test case(s) from 18 check(s) (0 failure(s), 0 skipped, 0 aborted)
    Details
    ✅ job output file slurm-18439.out
    ✅ no message matching ERROR:
    ✅ no message matching [\s*FAILED\s*].*Ran .* test case
Sep 12 22:31:36 UTC 2024 | uploaded | transfer of eessi-2023.06-software-linux-x86_64-generic-1726174458.tar.gz to S3 bucket succeeded

@MaKaNu
Contributor Author

MaKaNu commented Sep 12, 2024

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic


eessi-bot bot commented Sep 12, 2024

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • account MaKaNu has NO permission to send commands to the bot


eessi-bot bot commented Sep 12, 2024

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • account MaKaNu has NO permission to send commands to the bot

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account MaKaNu has NO permission to send commands to the bot

Member

@ocaisa ocaisa left a comment

LGTM, thanks for steering this over the line!

@ocaisa ocaisa added the bot:deploy label (Ask bot to deploy missing software installations to EESSI) Sep 12, 2024

Label bot:deploy has been set by user ocaisa, but this person does not have permission to trigger deployments

@ocaisa
Member

ocaisa commented Sep 12, 2024

Staging PR merged, this is now in the wild!

@ocaisa ocaisa merged commit 4b5bd66 into EESSI:2023.06-software.eessi.io Sep 12, 2024
35 checks passed