
updating modules within Bright Cluster with Lmod #227

Open
SomePersonSomeWhereInTheWorld opened this issue May 6, 2024 · 5 comments

@SomePersonSomeWhereInTheWorld

We have a few Bright Computing clusters (Bright is now owned by NVIDIA and renamed Base Command Manager); in one case Lmod version 8.3 is installed on the head node. We'd like to make Brainiak available as an shpc module for all users. What would the correct command be to replace the following?

And then you can tell lmod about your modules folder:

$ module use ./modules

Also based on the getting started instructions

if you install to a system python, meaning either of these commands:

> python setup.py install
> pip install .

Since we have several versions of Anaconda Python loadable as modules, should we just pick the latest and install shpc into it as root, so that the shpc command becomes available to everyone who loads that module?

@vsoch
Member

vsoch commented May 6, 2024

The module command might depend on your module software, but generally speaking, shpc is going to create modules (lmod or environment modules) that you need to add to the equivalent of your module path. You should only need shpc to generate those original modules and pull the container, and then the subsequent commands depend on the module software you are using.
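The generate-then-register workflow described above can be sketched roughly like this (the registry name and paths are illustrative, not a site-specific recipe):

```shell
# Install a container as a module: shpc pulls the container and writes
# the lmod / environment-modules files under the configured module_base.
shpc install brainiak/brainiak

# Tell the module software where those generated modulefiles live.
# For Lmod this is "module use"; the path shown here is an example.
module use /path/to/shpc/modules

# The module should now be discoverable and loadable.
module avail brainiak
module load brainiak/latest
```

After this point, the remaining commands (load, alias use, etc.) are handled by the module software itself, as described above.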

@SomePersonSomeWhereInTheWorld
Author

> The module command might depend on your module software, but generally speaking, shpc is going to create modules (lmod or environment modules) that you need to add to the equivalent of your module path. You should only need shpc to generate those original modules and pull the container, and then the subsequent commands depend on the module software you are using.

OK, I see how shpc config edit can be used to edit module_base.

Now module use ./modules works. Perhaps add a footnote or comment to make sure the path to the modulefiles is set before running this? Just a suggestion.
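For anyone hitting the same snag, the sequence looks roughly like this (the module_base value is an example; check your own configuration first):

```shell
# Inspect the current setting before changing anything.
shpc config get module_base

# Point shpc at the directory where modulefiles should be written
# (the /opt/shpc/modules path here is purely illustrative).
shpc config set module_base:/opt/shpc/modules

# Use an absolute path with "module use" so it works from any directory.
module use /opt/shpc/modules
module avail
```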

The following example:

singularity exec brainiak/brainiak_latest.sif "$@"

produces:

Error for command "exec": requires at least 2 arg(s), only received 1

Usage:
  singularity [global options...] exec [exec options...] <container> <command>

Run 'singularity --help' for more detailed usage information.

Should probably be updated.
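For reference, exec requires an explicit command after the container path; a minimal working form (the choice of python3 as the command is an assumption about what exists in the container) would be:

```shell
# "exec" needs both a container and a command to run inside it;
# "$@" only expands to something if the surrounding script was
# invoked with arguments.
singularity exec brainiak/brainiak_latest.sif python3 --version
```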

I also get this:

$ singularity exec brainiak/brainiak_latest.sif  /mnt/brainiak/tutorials/run_jupyter_docker.sh 
/usr/bin/python3: No module named notebook
$ singularity run brainiak/brainiak_latest.sif  /mnt/brainiak/tutorials/run_jupyter_docker.sh 
/usr/bin/python3: No module named notebook

Are these issues related? Is the Docker container broken?
brainiak/brainiak#517
brainiak/brainiak#539

@vsoch
Member

vsoch commented May 6, 2024

You'd want to test the specific commands that are associated with those containers, and open a PR if one isn't working. The registry is community maintained (and updated automatically), so it's entirely possible (and easy for it to happen) that a particular entrypoint does not work.

I'm not sure you are showing me all the commands you are running, but running singularity exec directly against a module-owned image isn't really the use case shpc is designed for; you should be using the wrapper scripts or aliases that shpc provides in the module file.

@SomePersonSomeWhereInTheWorld
Author

> You'd want to test the specific commands that are associated with those containers, and open a PR if one isn't working. The registry is community maintained (and updated automatically) so it's possible (and easy) to see that a particular entrypoint would not work.
>
> I'm not sure you are showing me all the commands you are running, but doing Singularity exec to a module owned image isn't exactly the use case for shpc, you should be using the wrapper scripts or aliases that shpc provides in the module file.

Yes I see these:

       - brainiak-srm-run:
             singularity run -B <wrapperDir>/99-shpc.sh:/.singularity.d/env/99-shpc.sh <container> "$@"
       - brainiak-srm-shell:
             singularity shell -s /bin/sh -B <wrapperDir>/99-shpc.sh:/.singularity.d/env/99-shpc.sh <container>
       - brainiak-srm-exec:
             singularity exec -B <wrapperDir>/99-shpc.sh:/.singularity.d/env/99-shpc.sh <container> "$@"

My understanding is that, for the -B option, the left side of the colon is the path on the actual server (the host) and the right side is where it appears inside the container.
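That reading of -B is correct: host path on the left, container path on the right. A generic illustration (the paths here are made up):

```shell
# Bind /data/project on the host to /mnt/data inside the container,
# then list its contents from inside the container.
singularity exec -B /data/project:/mnt/data \
    brainiak/brainiak_latest.sif ls /mnt/data
```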

Where is the '99-shpc.sh' coming from? I see it's in the container but is the idea to copy the file from the container to your local directory?

@vsoch
Member

vsoch commented May 6, 2024

-B is a bind request (the other form of that flag is --bind), and those commands show binding the environment file from the wrapper directory into the shpc environment. That isn't the actual command, because the actual command has a full path to that file. If you do an exec, you need to provide a command; if you use shell, you will shell in, and run will hit the container entrypoint.

Try looking at the script directory in the wrapper directory (that is an shpc setting) to see what is actually being run. Then you can debug directly with the container. shpc can't take ownership of the scope of issues that can happen with containers, but we definitely accept fixes to any of the container.yaml files that generate the modules.
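One way to find and read those wrapper scripts (the exact setting name and alias are assumptions; check shpc config --help and your own module for the real names):

```shell
# After loading the module, see what an alias actually resolves to.
module load brainiak/latest
type brainiak-srm-exec

# If it is a wrapper script on your PATH, print it to see the full
# singularity command line that shpc generated.
cat "$(command -v brainiak-srm-exec)"
```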
