In this section of the tutorial you will learn how to share Spack built binaries across machines and users using build caches.
We will explore a few concepts that apply to all types of build caches, but the focus is primarily on OCI container registries like Docker Hub or GitHub Packages as a storage backend for binary caches. Spack supports a range of storage backends, such as an ordinary filesystem, S3, and Google Cloud Storage, but OCI build caches have a few interesting properties that make them worth exploring in more depth.
Before we configure a build cache, let's install the julia package, which is an interesting example because it has some non-trivial dependencies like llvm, and features an interactive REPL that we can use to verify that the installation works.
$ mkdir ~/myenv && cd ~/myenv
$ spack env create --with-view view .
$ spack -e . add julia
$ spack -e . install
Let's run the julia REPL:
$ ./view/bin/julia
julia> 1 + 1
2
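If you want to see what ended up in the environment (julia together with its full dependency tree), you can list the installed specs:
$ spack -e . find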
Now we'd like to share these executables with other users. First we will focus on sharing the binaries with other Spack users, and later we will see how users completely unfamiliar with Spack can easily use the applications too.
For this tutorial we will be using GitHub Packages as an OCI registry, since most people have a GitHub account and it's easy to use.
First go to https://github.com/settings/tokens to generate a personal access token with write:packages permissions. Copy this token.
Next, we will add this token to the mirror config section of the Spack environment:
$ export MY_OCI_TOKEN=<token>
$ spack -e . mirror add \
--oci-username <user> \
--oci-password-variable MY_OCI_TOKEN \
--unsigned \
my-mirror \
oci://ghcr.io/<github_user>/buildcache-${USER}-${HOSTNAME}
Note
We talk about mirrors and build caches almost interchangeably, because every build cache is a binary mirror. Source mirrors exist too, which we will not cover in this tutorial.
Your spack.yaml file should now contain the following:
spack:
  specs:
  - julia
  mirrors:
    my-mirror:
      url: oci://ghcr.io/<github_user>/buildcache-<user>-<host>
      access_pair:
        id: <user>
        secret_variable: MY_OCI_TOKEN
      signed: false
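To double-check that the mirror is picked up by the environment, you can list the configured mirrors (the builtin mirror configured for the tutorial will show up here as well):
$ spack -e . mirror list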
Let's push julia and its dependencies to the build cache:
$ spack -e . buildcache push my-mirror
which outputs
==> Selected 66 specs to push to oci://ghcr.io/<github_user>/buildcache-<user>-<host>
==> Checking for existing specs in the buildcache
==> 66 specs need to be pushed to ghcr.io/<github_user>/buildcache-<user>-<host>
==> Uploaded sha256:d8d9a5f1fa443e27deea66e0994c7c53e2a4a618372b01a43499008ff6b5badb (0.83s, 0.11 MB/s)
...
==> Uploading manifests
==> Uploaded sha256:cdd443ede8f2ae2a8025f5c46a4da85c4ff003b82e68cbfc4536492fc01de053 (0.64s, 0.02 MB/s)
...
==> Pushed zstd@1.5.6/ew3aaos to ghcr.io/<user>/buildcache-<user>-<host>:zstd-1.5.6-ew3aaosbmf3ts2ylqgi4c6enfmf3m5dr.spack
...
==> Pushed julia@1.9.3/dfzhutf to ghcr.io/<user>/buildcache-<user>-<host>:julia-1.9.3-dfzhutfh3s2ekaltdmujjn575eip5uhl.spack
The location of the pushed package
ghcr.io/<github_user>/buildcache-<user>-<host>:julia-1.9.3-dfzhutfh3s2ekaltdmujjn575eip5uhl.spack
looks very similar to a container image --- we will get to that in a bit.
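Since every pushed spec gets its own tag, you can also browse the build cache with standard OCI tooling. For example, if you have skopeo installed (not part of this tutorial), you can list all tags; note that the package is private by default (see the note below), so you may first need to authenticate with skopeo login ghcr.io:
$ skopeo list-tags docker://ghcr.io/<github_user>/buildcache-<user>-<host>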
Note
Binaries pushed to GitHub Packages are private by default, which means you need a token to download them. You can change the visibility to public by going to GitHub Packages from your GitHub account, selecting the buildcache package, going to package settings, and changing the visibility to public in the Danger Zone section. This page can also be accessed directly at https://github.com/users/<user>/packages/container/buildcache/settings
We will now verify that the build cache works by reinstalling julia. Let's make sure that we only use the build cache that we just created, and not the builtin one that is configured for the tutorial. The easiest way to do this is to override the mirrors config section in the environment by using a double colon in the spack.yaml file:
spack:
  specs:
  - julia
  mirrors:: # <- note the double colon
    my-mirror:
      url: oci://ghcr.io/<github_user>/buildcache-<user>-<host>
      access_pair:
        id: <user>
        secret_variable: MY_OCI_TOKEN
      signed: false
An "overwrite install" should be enough to show that the build cache is used:
$ spack -e . install --overwrite julia
==> Fetching https://ghcr.io/v2/<user>/buildcache-<user>-<host>/blobs/sha256:34f4aa98d0a2c370c30fbea169a92dd36978fc124ef76b0a6575d190330fda51
==> Fetching https://ghcr.io/v2/<user>/buildcache-<user>-<host>/blobs/sha256:3c6809073fcea76083838f603509f10bd006c4d20f49f9644c66e3e9e730da7a
==> Extracting julia-1.9.3-dfzhutfh3s2ekaltdmujjn575eip5uhl from binary cache
[+] /home/spack/spack/opt/spack/linux-ubuntu22.04-x86_64_v3/gcc-11.4.0/julia-1.9.3-dfzhutfh3s2ekaltdmujjn575eip5uhl
Two blobs are fetched for each spec: a metadata file and the actual binary package. If you've used docker pull or other container runtimes before, these types of hashes may look familiar. OCI registries are content-addressed, which means that we see hashes like these instead of human-readable file names.
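If you are curious how the tags relate to these blobs, you can inspect the manifest behind the julia tag with a container tool of your choice; a quick sketch using docker (assuming you have made the package public or have run docker login ghcr.io):
$ docker manifest inspect ghcr.io/<github_user>/buildcache-<user>-<host>:julia-1.9.3-dfzhutfh3s2ekaltdmujjn575eip5uhl.spack
The manifest references these same sha256 digests.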
Spack's concretizer optimizes for reuse. This means that it will avoid source builds if it can use specs for which binaries are readily available.
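Reuse is controlled through the concretizer config section. A minimal sketch of what this looks like in spack.yaml (reuse is already enabled by default in recent Spack versions, so this is only needed if you want to change the behavior):
spack:
  concretizer:
    reuse: true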
In the previous example we managed to install packages from our build cache, but we did not concretize our environment again. Users on other machines with different distributions will have to concretize, and therefore we should make sure that the build cache is indexed so that the concretizer can take it into account. This can be done by running
$ spack -e . buildcache update-index my-mirror
This operation can take a while for large build caches, since it fetches the metadata of every available package. For convenience you can also run spack buildcache push --update-index ... to avoid a separate step.
Note
As of Spack 0.22, build caches can be used across different Linux distros. The concretizer will reuse specs that have a host-compatible libc dependency (e.g. glibc or musl). For packages compiled with gcc (and a few others), users do not have to install compilers first, as the build cache contains the compiler runtime libraries as a separate package.
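You can see this runtime package in the environment itself; assuming your binaries were built with gcc as in this tutorial (and Spack 0.22 or newer), something like the following should list it:
$ spack -e . find gcc-runtime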
After an index is created, it's possible to list the available packages in the build cache:
$ spack -e . buildcache list --allarch
The build cache we have created uses an OCI registry, which is the same technology that is used to store container images. So far we have used this build cache like any other build cache: the concretizer can use it to avoid source builds, and spack install will fetch binaries from it. However, we can also use this build cache to share binaries directly as runnable container images.
We can already attempt to run the image associated with the julia package that we pushed earlier:
$ docker run ghcr.io/<user>/buildcache-<user>-<host>:julia-1.9.3-dfzhutfh3s2ekaltdmujjn575eip5uhl.spack julia
exec /home/spack/spack/opt/spack/linux-ubuntu22.04-x86_64_v3/gcc-11.4.0/julia-1.9.3-dfzhutfh3s2ekaltdmujjn575eip5uhl/bin/julia: no such file or directory
but it immediately fails. The reason is that one crucial piece is missing: glibc, which Spack always treats as an external package.
To fix this, we force push to the registry again, but this time we specify a base image with a recent version of glibc, for example ubuntu:24.04:
$ spack -e . buildcache push --force --base-image ubuntu:24.04 my-mirror
...
==> Pushed julia@1.9.3/dfzhutf to ghcr.io/<user>/buildcache-<user>-<host>:julia-1.9.3-dfzhutfh3s2ekaltdmujjn575eip5uhl.spack
Now let's pull this image again and run it:
$ docker pull ghcr.io/<github_user>/buildcache-<user>-<host>:julia-1.9.3-dfzhutfh3s2ekaltdmujjn575eip5uhl.spack
$ docker run -it --rm ghcr.io/<github_user>/buildcache-<user>-<host>:julia-1.9.3-dfzhutfh3s2ekaltdmujjn575eip5uhl.spack
root@f53920f8695a:/# julia
julia> 1 + 1
2
This time it works! The minimal ubuntu:24.04 image provides us not only with glibc, but also with other utilities like a shell.
Notice that you can use any base image of your choice, like fedora or rockylinux. The only constraint is that it provides a libc compatible with the external libc that Spack used when building the binaries. Spack does not validate this.
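If you want to check compatibility up front, you can compare the glibc Spack detected on the build machine with the one in your candidate base image. A quick sketch (assuming Spack detected glibc as an external, i.e. Spack 0.22 or newer, and a glibc-based base image):
$ spack -e . find glibc                        # the external libc Spack built against
$ docker run --rm ubuntu:24.04 ldd --version   # the libc shipped in the base image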
The previous container image is a good start, but it would be nice to add some more utilities to the image. If you've paid attention to the output of some of the commands we have run so far, you may have noticed that Spack generates exactly one image tag for each package it pushes to the registry. Every Spack package corresponds to a single layer in each image, and the layers are shared across the different image tags.
Because Spack installs every package into a unique prefix, it is incredibly easy to compose
multiple packages into a container image. In contrast to Docker images built from commands
in a Dockerfile
where each command is run in order, Spack package layers are independent,
and can in principle be combined in any order.
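You can see this layering on the julia image pulled earlier: each Spack package contributes one layer on top of the base image layers. A quick way to list them (any OCI tool works; docker is shown here):
$ docker image inspect --format '{{json .RootFS.Layers}}' ghcr.io/<github_user>/buildcache-<user>-<host>:julia-1.9.3-dfzhutfh3s2ekaltdmujjn575eip5uhl.spack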
Let's add a simple text editor like vim to our previous environment next to julia, so that we can both edit and run Julia code.
Note
You may want to change mirrors:: to mirrors: in the spack.yaml file to avoid a source build of vim --- but a source build should be quick.
$ spack -e . install --add vim
This time we push to the OCI registry, but also pass --tag julia-and-vim to instruct Spack to create an image for the environment as a whole, with a human-readable tag:
$ spack -e . buildcache push --base-image ubuntu:24.04 --tag julia-and-vim my-mirror
==> Tagged ghcr.io/<user>/buildcache-<user>-<host>:julia-and-vim
Now let's run a container from this image:
$ docker run -it --rm ghcr.io/<github_user>/buildcache-<user>-<host>:julia-and-vim
root@f53920f8695a:/# vim ~/example.jl # create a new file with some Julia code
root@f53920f8695a:/# julia ~/example.jl # and run it
In older versions of Spack it was common practice to generate a Dockerfile
from a
Spack environment using the spack containerize
command, and then use docker build
or other runtimes to create a container image.
This would trigger a multi-stage build, where the first stage would install Spack itself,
compilers and the environment, and the second stage would copy the installed environment
into a smaller image. For those familiar with Dockerfile
syntax, it would structurally look
like this:
FROM <base image> AS build
COPY spack.yaml /root/env/spack.yaml
RUN spack -e /root/env install
FROM <base image>
COPY --from=build /opt/spack/opt /opt/spack/opt
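If you want to try this older workflow yourself, a minimal sketch (run from the environment directory so that spack containerize finds your spack.yaml; the image name my-julia-env is just an example):
$ cd ~/myenv
$ spack containerize > Dockerfile
$ docker build -t my-julia-env .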
This approach is still valid, and the spack containerize command continues to exist, but it has a few downsides:
- When RUN spack -e /root/env install fails, docker will not cache the layer, meaning that all dependencies that did install successfully are lost. Troubleshooting the build typically means starting from scratch in docker run or on the host system.
- In certain CI environments, it is not possible to use docker build. For example, the CI script itself may already run in a Docker container, and running docker build safely inside a container is tricky.
The takeaway is that Spack decouples the steps that docker build combines: build isolation, running the build, and creating an image. You can run spack install on your host machine or in a container, and run spack buildcache push separately to create an image.
Spack is different from many package managers in that it lets users choose where to install packages. This makes Spack very flexible, as users can install packages in their home directory and do not need root privileges. The downside is that sharing binaries is more complicated, as binaries may contain hard-coded, absolute paths to machine specific locations, which have to be adjusted when binaries are installed on a different machine.
Fortunately, Spack handles this automatically when installing from a binary cache. But when you build binaries that are intended to be shared, there is one thing to keep in mind: Spack can relocate hard-coded paths in binaries provided that the target prefix is shorter than the prefix used during the build.
The reason is that binaries typically embed these absolute paths in string tables: lists of null-terminated strings into which the program stores offsets. That means strings can only be modified in place, and if the new path were longer than the old one, it would overwrite the next string in the table.
To maximize the chances of successful relocation, you should build your binaries in a relatively long path. Fortunately, Spack can automatically pad paths to make them longer, using the following command:
$ spack -e . config add config:install_tree:padded_length:256
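This command simply records the padding in the environment's config; your spack.yaml ends up with a section along these lines:
spack:
  # ... specs and mirrors as before ...
  config:
    install_tree:
      padded_length: 256
Note that padding only affects installations performed after the setting is added, so it is best to configure it before building anything you intend to push.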
Build caches are a great way to speed up CI pipelines. Both GitHub Actions and GitLab CI support container registries, and this tutorial should give you a good starting point to leverage them.
Spack also provides a basic GitHub Action that sets you up with a shared binary cache out of the box:
jobs:
  build:
    runs-on: ubuntu-22.04
    steps:
    - name: Set up Spack
      uses: spack/setup-spack@v2
    - run: spack install python # uses a shared build cache
and the setup-spack README shows you how to cache further binaries that are not in the shared build cache.
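If you also want your CI jobs to push to a GitHub Packages build cache like the one created in this tutorial, here is a rough sketch of a workflow. The job layout and mirror name are illustrative and assume your spack.yaml lives at the repository root; the mirror flags are the same ones used earlier, and GITHUB_TOKEN needs the packages: write permission:
jobs:
  build:
    runs-on: ubuntu-22.04
    permissions:
      packages: write
    steps:
    - uses: actions/checkout@v4
    - name: Set up Spack
      uses: spack/setup-spack@v2
    - name: Install the environment
      run: spack -e . install
    - name: Push binaries to the build cache
      run: |
        spack -e . mirror add --oci-username ${{ github.actor }} \
          --oci-password-variable GITHUB_TOKEN --unsigned \
          ci-mirror oci://ghcr.io/${{ github.repository_owner }}/buildcache
        spack -e . buildcache push --update-index ci-mirror
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}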
In this tutorial we have created a build cache on top of an OCI registry, which can be used
- to spack install julia vim on other machines without source builds
- to automatically create container images for individual packages while pushing to the cache
- to create container images for multiple packages at once