
Use alma9 images on linux, while staying on 2.17 sysroot #6283

Closed
Tracked by #1941
h-vetinari opened this issue Aug 12, 2024 · 19 comments · Fixed by #6626

Comments

@h-vetinari (Member) commented Aug 12, 2024

Since it has taken us so long to upgrade from CentOS 6 to 7, we'll very quickly find ourselves in a situation we already struggled with some time ago (see discussion in this issue): a growing number of feedstocks will require a newer glibc. Our images should provide a new enough baseline version, so that the only thing feedstocks need to actively override is c_stdlib_version (and thus the sysroot), and nothing else.

For example, Google's "foundational" support matrix defines a lower bound of glibc 2.27, meaning that things like abseil/bazel/grpc/protobuf/re2 etc. will begin to rely on glibc features newer than 2.17 in the near future (and even though it's not a Google project, one of the baseline dependencies of that stack is starting to require it).

We can handle the c_stdlib_version in the migrators (like we did for macOS 10.13, before the conda-forge-wide baseline was lifted), but changing more than one key of the mega-zip involving the docker images is really painful, especially if CUDA is involved (example), so having the images be alma8 by default will be very helpful there.
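
For reference, overriding just the sysroot on an individual feedstock is a single key in recipe/conda_build_config.yaml; a minimal sketch (the 2.28 value is purely illustrative):

```yaml
# recipe/conda_build_config.yaml -- raise only the sysroot/glibc baseline
c_stdlib_version:  # [linux]
  - "2.28"         # [linux]
```

Bumping the docker image on top of that is where the zipped keys mentioned above come into play.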

To a lesser extent, it will also save us from run-time bugs in older glibc versions, like we had with some broken trigonometry functions in 2.12 (before the image moved to cos7 while the sysroot still stayed at 2.12).

There are other scenarios still where this is necessary, see the discussion here for example.

While it's already possible to use alma8 images, the main thing blocking us is the lack of CDTs for alma8 (pending conda-forge/cdt-builds#66), cf. conda-forge/conda-forge.github.io#1941.

This issue is for tracking this effort, and any other tasks that eventually need to be resolved before we can make the switch.

@hmaarrfk (Contributor) commented Oct 6, 2024

It would be great if we could do this for aarch + cuda12 to start. But I think we should generally move the base image forward.

xref: https://github.com/conda-forge/pytorch-cpu-feedstock/blob/main/recipe/conda_build_config.yaml#L17

@h-vetinari (Member Author)

AFAIU the only remaining issue is the reduced set of CDTs when moving from cos7 to alma8; we should try to do a special "remove/replace CDTs" migration, because breaking 100+ feedstocks is not really a good option, even if we provide a way to opt into the old images again.

@hmaarrfk (Contributor) commented Oct 6, 2024

Ah, I see. The tk feedstock is quite a thorn in the side of the "rip out all the CDTs too" plan.

@isuruf (Member) commented Oct 6, 2024

Using alma8 images and using an up-to-date sysroot are two different problems. In order to change to alma8 images, we just need to check whether all the requirements in recipe/yum_requirements.txt are available in the alma8 images.
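
For reference, recipe/yum_requirements.txt is just a plain list of RPM names, one per line, so the check amounts to confirming that each listed name still resolves in the alma8 repositories. A hypothetical example file:

```
libXt
mesa-libGL
```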

@isuruf (Member) commented Oct 6, 2024

It doesn't have to be an exhaustive check; just try a few feedstocks that use yum requirements to see if it really works.

@jakirkham (Member)

Wonder if a smaller step would be simply making it a bit easier for users to opt in to using the newer images.

Did a little exploration of this in PR: #6548

Likely needs more work, but maybe it is already useful for discussion/iteration at this stage

@h-vetinari (Member Author) commented Oct 28, 2024

> In order to change to alma8 images, we just need to check whether all the requirements in recipe/yum_requirements.txt are available in the alma8 images.

So I took the content of all the yum_requirements.txt in conda-forge¹. Many of them can be replaced (x11, mesa, alsa, various tools that we have packaged).

| entries from yum_requirements.txt | is it still necessary? | available on alma8? | comment |
|---|---|---|---|
| alsa-lib | | | use our alsa-lib |
| alsa-lib-devel | | | use our alsa-lib |
| alsa-tools | | | |
| binutils | | | used to bootstrap binutils |
| csh | | | there's tcsh in centos7/alma8, but I cannot find csh in either |
| chrpath | | | used only on obsolete qt feedstock |
| dbus-devel | | | |
| dbus-libs | | | |
| dejavu-sans-mono-fonts | | | used only for pymor |
| ed | | | used only for bc |
| file | | | used to bootstrap crosstool-ng |
| findutils | | | used to bootstrap linux compilers and crosstool-ng |
| gcc-c++ | | | used to bootstrap linux compilers and binutils |
| gcc-gfortran | | | used to bootstrap linux compilers |
| gtk2 | | | use our gtk2 |
| gtk2-devel | | | use our gtk2 |
| gtk3 | | | use our gtk3 |
| gtk3-devel | | | use our gtk3 |
| gtkmm24 | | | use our gtkmm? |
| hatchling | | | cannot find it in either centos7/alma8; use our hatchling; only used for biosiglive |
| help2man | | | used to bootstrap linux compilers |
| httpd-devel | | | only used for mod_wsgi |
| kernel-headers | | | use our {{ cdt("kernel-headers") }}; only used for gdal |
| libglu1-mesa | | | this is the Debian-style naming; RHEL-style has mesa-libGLU; use our libglvnd{,-devel} |
| libglvnd-egl | | | use our libglvnd{,-devel} |
| libglvnd-glx | | | use our libglvnd{,-devel} |
| libglvnd-opengl | | | use our libglvnd{,-devel} |
| libice | | | use our xorg-libice |
| libice-devel | | | use our xorg-libice |
| libselinux | | | |
| libsm | | | use our xorg-libsm |
| libsm-devel | | | use our xorg-libsm |
| libudev | | | this is the Debian-style naming; we also have our own libudev |
| libudev-devel | | | this is the Debian-style naming; we also have our own libudev |
| libX11 | | | use our xorg-libx11, but needed on python |
| libX11-devel | | | use our xorg-libx11 |
| libXau / libxau | | | use our xorg-libxau, but needed on python |
| libXau-devel | | | use our xorg-libxau |
| libxcb | | | use our libxcb, but needed on python |
| libXcomposite | | | use our xorg-libxcomposite |
| libXcomposite-devel | | | use our xorg-libxcomposite |
| libXcursor | | | use our xorg-libxcursor |
| libXcursor-devel | | | use our xorg-libxcursor |
| libXdamage | | | use our xorg-libxdamage |
| libXdmcp | | | use our xorg-libxdmcp |
| libXdmcp-devel | | | use our xorg-libxdmcp |
| libXext | | | use our xorg-libxext |
| libXext-devel | | | use our xorg-libxext |
| libXfixes | | | use our xorg-libxfixes |
| libXi | | | use our xorg-libxi |
| libXi-devel | | | use our xorg-libxi |
| libXinerama | | | use our xorg-libxinerama |
| libxkbcommon-x11 | | | use our libxkbcommon |
| libXrandr | | | use our xorg-libxrandr |
| libXrandr-devel | | | use our xorg-libxrandr |
| libXrender | | | use our xorg-libxrender |
| libXrender-devel | | | use our xorg-libxrender |
| libXScrnSaver | | | use our xorg-libxscrnsaver |
| libXScrnSaver-devel | | | use our xorg-libxscrnsaver |
| libXt | | | use our xorg-libxt |
| libXt-devel | | | use our xorg-libxt |
| libXtst | | | use our xorg-libxst |
| libXtst-devel | | | use our xorg-libxst |
| libXxf86vm | | | use our xorg-libxxf86vm |
| m4 | | | used to bootstrap linux compilers |
| make | | | there is make-latest / make43; used to bootstrap our binutils and other infra |
| mesa-dri-drivers | | | use our libglvnd{,-devel} |
| mesa-libEGL | | | use our libglvnd{,-devel} |
| mesa-libEGL-devel | | | use our libglvnd{,-devel} |
| mesa-libGL / mesa-libgl | | | use our libglvnd{,-devel} |
| mesa-libGL-devel | | | use our libglvnd{,-devel} |
| mesa-libGLU | | | use our libglvnd{,-devel} |
| mesa-libGLU-devel | | | use our libglvnd{,-devel} |
| numactl | | | use our numactl |
| numactl-devel | | | use our numactl |
| patch | | | used to bootstrap linux compilers |
| pciutils | | | |
| pciutils-devel | | | |
| pciutils-libs | | | |
| perl | | | use our perl |
| pulseaudio-libs-devel | | | use our pulseaudio; only used for pocketsphinx-python |
| python-docutils | | | there's python3-docutils; but use our docutils; only used for rdma-core |
| rsh | | | no idea what this is for |
| rsync | | | used to bootstrap linux compilers |
| sed | | | used to bootstrap linux compilers and crosstool-ng |
| systemd-devel | | | we have our own libsystemd |
| texinfo | | | used to bootstrap our binutils |
| util-linux | | | use our util-linux |
| wget | | | used to bootstrap linux compilers |
| xauth | | | doesn't exist on either centos7/alma8, but xorg-x11-xauth exists in both; but use our xorg-xauth |
| xorg-x11-server-Xorg | | | |
| xorg-x11-server-Xvfb | | | |

Footnotes

  ¹ Probably; depends on whether the GitHub search for that is truly exhaustive.

@jakirkham (Member)

Thanks for pulling together this list, Axel, and for going over it in today's call! 🙏

Cleaned up the rdma-core case. It also uses systemd.

What do we think about making a migrator? At least with X11, this seems essential, given that it gets pulled in all over the place. Though this may extend to other places.

@h-vetinari (Member Author)

> What do we think about making a migrator?

My understanding was that a migrator is not necessary. The yum_requirements will continue to work, and if CDTs end up missing they can be replaced with our own deps; alternatively, users can set os_version: cos7 in their conda-forge.yml.
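
For feedstocks that do need to stay on the older image, that opt-out is a one-key change in conda-forge.yml (a sketch using the existing os_version option), followed by a rerender:

```yaml
# conda-forge.yml -- keep building in the CentOS 7 based image
os_version:
  linux_64: cos7
```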

@h-vetinari (Member Author) commented Nov 2, 2024

In the discussion in #6548, @carterbox made the point (IIUC) that we may want to simply use alma9(!) images by default, even if we keep the sysroot at 2.17 (== cos7), the reason being (paraphrasing my interpretation) that it removes one mostly unnecessary dimension from the whole pinning exercise.

In principle this sounds like a good idea to me: the actual glibc in the image doesn't matter from the POV of building packages or any of our metadata; it only needs to be new enough to run binaries for building or testing that need newer symbols (resp. to resolve the test environment if any dependency, including the package being built, requires __glibc >=2.x).

So if someone decides to use the 2.34 sysroot in the near future, the question of how to change the image version simply becomes obsolete, as long as the containers are always the newest ones (matching the rest of our infrastructure, of course).

The question then becomes what failures, if any, are possible if the container image is too new. I suspect these would be very rare (otherwise we would have been hitting such cases all the time when we used cos7 images to build cos6 stuff).
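
To spell out how the two knobs would combine: the container image is chosen in conda-forge.yml, while the glibc baseline the binaries target comes from c_stdlib_version, and the two can move independently. A hedged sketch, assuming alma9 is accepted as an os_version value once the image switch lands:

```yaml
# conda-forge.yml -- newer image, only used to run the build and tests
os_version:
  linux_64: alma9
```

```yaml
# recipe/conda_build_config.yaml -- binaries still target glibc 2.17 (the default)
c_stdlib_version:  # [linux]
  - "2.17"         # [linux]
```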

@carterbox (Member) commented Nov 2, 2024

This summarizes my point correctly. I am also curious if anyone can think of a case in which having a container that is too new would cause problems. Maybe this is something for the next core meeting.

edit: There aren't any meeting notes for the next meeting available yet.

@isuruf (Member) commented Nov 2, 2024

> I am also curious if anyone can think of a case in which having a container that is too new would cause problems.

Usually not. There are some rare cases. For example:

  1. binary re-packaging, as done by the cuda-* feedstocks.
  2. a build system might inspect the running system for known bugs in older glibc and add workarounds.

@h-vetinari (Member Author)

> I am also curious if anyone can think of a case in which having a container that is too new would cause problems.

> Usually not. There are some rare cases.

That matches my understanding too. And obviously those cases could still choose alma8 (or even cos7) images. We can also leave the DEFAULT_LINUX_VERSION infrastructure in place to facilitate that. But I think we're inching towards agreement that we could be using alma9 images by default. :)

> There aren't any meeting notes for the next meeting available yet.

Here you go: conda-forge/conda-forge.github.io#2350

@traversaro (Contributor)

> So I took the content of all the yum_requirements.txt in conda-forge. Many of them can be replaced (x11, mesa, alsa, various tools that we have packaged).

Minor comment: I do not think mesa-dri-drivers can be substituted with conda-forge's libglvnd-devel, as it contains the actual EGL/GLX drivers, while libglvnd-devel only contains the loader.

@h-vetinari (Member Author)

> I do not think mesa-dri-drivers can be substituted with conda-forge's libglvnd-devel, as it contains the actual EGL/GLX drivers, while libglvnd-devel only contains the loader.

Yes, that's part of the compile-time/runtime split between CDTs and yum_requirements that I hadn't fully appreciated when writing that list. If you have suggestions for improving the list or the table, please share them, and I'll happily link to your comment (or modify mine, if you prefer).
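
To make that split concrete: compile-time pieces come from conda packages (or CDTs) listed in meta.yaml, while yum_requirements.txt only installs RPMs into the CI image so they are present when the test suite runs. A minimal sketch using the mesa example from this thread (package choices are illustrative, not prescriptive):

```yaml
# meta.yaml -- link against the GL loader at build time
requirements:
  host:
    - libglvnd-devel  # [linux]
```

while recipe/yum_requirements.txt would list the actual DRI drivers that are only needed to run GL code during tests in CI:

```
mesa-dri-drivers
```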

@h-vetinari (Member Author)

I'm encountering a strange issue in conda-forge/pyarrow-feedstock#139, where a dependency has been compiled against c_stdlib_version 2.28, but then the library fails to load:

/lib64/libc.so.6: version `GLIBC_2.25' not found (required by $PREFIX/lib/python3.10/site-packages/pyarrow/../../../././libcares.so.2)

This happens only in the cross-compilation builds, but it happens in all of them. And that despite

quay.io/condaforge/linux-anvil-x86_64:alma9
quay.io/condaforge/linux-anvil-x86_64-cuda11.8:ubi8

clearly having a new enough glibc, as confirmed by conda info at the beginning of the respective logs:

       virtual packages : __archspec=1=x86_64_v4
                          __conda=24.9.2=0
                          __cuda=11.8=0
                          __glibc=2.28=0
                          __linux=6.5.0=0
                          __unix=0=0

So I have no idea what's going wrong there, though it could be something related to QEMU not providing full emulation for glibc 2.28 yet? I don't actually know how that bit works....

@jakirkham (Member)

Happy to chat if you are still seeing issues. Follow up in this thread: conda-forge/pyarrow-feedstock#139 (comment)

@h-vetinari (Member Author)

> So I have no idea what's going wrong there, though it could be something related to QEMU not providing full emulation for glibc 2.28 yet? I don't actually know how that bit works....

Pretty sure the solution there was conda-forge/conda-forge-ci-setup-feedstock#368

@h-vetinari (Member Author)

Here's another follow-up for the image switch: conda-forge/docker-images#299
