prefer using Singularity's support for scratch directories over bind mounting for /var/lib/cvmfs and /var/run/cvmfs #40
Conversation
The downside of this is that you are removing your cache every time you shut down the container. That was OK for me because I was using a pre-populated alien cache.
@ocaisa So should we have separate sections? One for bind mounting (persistent cache), one using scratch directories? That's going to complicate the pilot instructions quite a bit...
I think we don't really have much choice but to cover various scenarios. Every option is going to have to be able to run MPI workloads, and I believe bind mounting is not going to tick that box. After that, there is the question of whether we have an internet connection or not. This is actually not that complicated if we advise the use of an alien cache (but an alien cache is unmanaged, which means it can grow arbitrarily large). Whether we need to pre-populate it is determined by whether or not we have internet access from where the alien cache is being used (and if we don't pre-populate, anyone who uses the cache will need write permissions there). That's my "current" understanding of things.
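To make that concrete, here is a minimal sketch of what an alien cache setup could look like in the CVMFS client configuration; the cache path is a placeholder and would need to point at the site's shared filesystem:

```bash
# /etc/cvmfs/default.local (sketch; the cache path below is a placeholder)
# Point CVMFS at an unmanaged cache directory on a shared filesystem.
CVMFS_ALIEN_CACHE=/shared/fs/eessi-alien-cache
# An alien cache is unmanaged, so cache quota management must be disabled.
CVMFS_QUOTA_LIMIT=-1
# The alien cache replaces the usual shared local cache.
CVMFS_SHARED_CACHE=no
```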
I've been continuously updating the script in EESSI/filesystem-layer#37 and that is now using 2 layers of alien cache, one shared (read-only) and one local. The shared layer is pre-populated only with the requested stack (not the whole repo), and this could be restricted even further if I use …
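A rough sketch of how such a two-layer setup might be expressed with CVMFS's tiered cache support (the instance names and all paths below are placeholders, and the exact parameters should be checked against the CVMFS cache configuration documentation):

```bash
# /etc/cvmfs/default.local (sketch): a local, writable cache layer on top of
# a shared, read-only, pre-populated alien cache.
CVMFS_CACHE_PRIMARY=hpc
CVMFS_CACHE_hpc_TYPE=tiered
CVMFS_CACHE_hpc_UPPER=local
CVMFS_CACHE_hpc_LOWER=shared
CVMFS_CACHE_hpc_LOWER_READONLY=yes

CVMFS_CACHE_local_TYPE=posix
CVMFS_CACHE_local_ALIEN=/local/scratch/cvmfs-cache
CVMFS_CACHE_local_SHARED=no
CVMFS_CACHE_local_QUOTA_LIMIT=-1

CVMFS_CACHE_shared_TYPE=posix
CVMFS_CACHE_shared_ALIEN=/shared/fs/eessi-alien-cache
CVMFS_CACHE_shared_SHARED=no
CVMFS_CACHE_shared_QUOTA_LIMIT=-1
```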
In general, I think it is really important that we get this advice right. Another option is to create a Squid proxy on the login node, if the login node is connected to the outside world. As an unprivileged user you'd have to be able to keep that running for the entire job queuing and execution time (I imagine). @rptaylor made some suggestions about how to do this as well. I think it's worth having a serious discussion about (and integrating some testing for).
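On the client side, pointing CVMFS at such a proxy is a single setting (the hostname and port below are assumptions for wherever the Squid proxy would actually run):

```bash
# /etc/cvmfs/default.local on the compute nodes (sketch);
# "login01:3128" is a placeholder for the real proxy host and port.
CVMFS_HTTP_PROXY="http://login01:3128"
```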
This will always work if the place where you are trying things out has an internet connection (but hey, you need that to pull the image anyway). We should warn people, though, that the cache is getting blown away, and if you want to do more than just try things out you will need additional (cache) configuration.
I had some issues with that.
Unfortunately I'm pretty sure there is a "but" here, and that is that using …
Our docs will contain a lot of …
Ah, yeah, you're right. It doesn't seem to create a unique dir inside the specified directory. So maybe we should then instruct people to do something like …
@bedroge That won't help? You basically need a unique scratch dir per MPI process, if I understand correctly...
Whoops, right... then it has to be in the singularity command that gets invoked by srun.
Or let srun call some wrapper script that creates a unique dir for that process/task?
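A minimal sketch of what such a wrapper could look like (the script name, image URI, and paths are hypothetical, and the actual pilot command would include more options, e.g. for mounting the repositories):

```bash
#!/bin/bash
# cvmfs_singularity_wrapper.sh (hypothetical): give every task launched by
# srun its own work/scratch directory, so the CVMFS state directories do not
# clash between MPI ranks.
WORKDIR=$(mktemp -d -p /tmp cvmfs-scratch-XXXXXX)
trap 'rm -rf "$WORKDIR"' EXIT

singularity exec \
    --workdir "$WORKDIR" \
    --scratch /var/lib/cvmfs --scratch /var/run/cvmfs \
    docker://eessi/client-pilot:centos7 "$@"
```

which would then be launched as something like `srun ./cvmfs_singularity_wrapper.sh <mpi_program>`.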
The solution with export …
This was fixed by defining a singularity workdir: …
The result would be: …
This works well with a workdir on a local disk, but when redirecting to a shared FS (BeeGFS) it is still not a solution.
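For orientation, a sketch of a workdir-on-local-disk invocation of the kind described here (image URI, paths, and command are placeholders):

```bash
# Hypothetical sketch: put the Singularity workdir on node-local disk so the
# scratch directories for /var/lib/cvmfs and /var/run/cvmfs are created on a
# local filesystem rather than on a shared one.
mkdir -p /tmp/$USER/singularity-workdir
singularity exec \
    --workdir /tmp/$USER/singularity-workdir \
    --scratch /var/lib/cvmfs --scratch /var/run/cvmfs \
    docker://eessi/client-pilot:centos7 <command>
```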
It turns out that bind mounting `/var/lib/cvmfs` and `/var/run/cvmfs` to temporary directories can lead to problems in some situations, leading to "Failed to initialize loader socket" errors when mounting the `/cvmfs` repositories.

This has proven to be an issue in multiple different situations:

- `/tmp` (encountered by @trz42)

Using the support that Singularity has for scratch directories (`singularity --scratch` or equivalently `$SINGULARITY_SCRATCH`) circumvents these issues.
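As an illustration, a sketch of the two approaches (the image URI is an assumption, and the actual pilot instructions include further options, e.g. for mounting the repositories):

```bash
# Bind-mount approach (can trigger the loader socket errors described above):
mkdir -p /tmp/$USER/var-lib-cvmfs /tmp/$USER/var-run-cvmfs
singularity shell \
    -B /tmp/$USER/var-lib-cvmfs:/var/lib/cvmfs \
    -B /tmp/$USER/var-run-cvmfs:/var/run/cvmfs \
    docker://eessi/client-pilot:centos7

# Scratch-directory approach: let Singularity create the directories itself.
singularity shell \
    --scratch /var/lib/cvmfs --scratch /var/run/cvmfs \
    docker://eessi/client-pilot:centos7

# Or via the environment variable (assuming a comma-separated list of paths):
export SINGULARITY_SCRATCH="/var/lib/cvmfs,/var/run/cvmfs"
singularity shell docker://eessi/client-pilot:centos7
```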