This repository has been archived by the owner on Nov 7, 2024. It is now read-only.

Support opinionated flow for injecting containers into /usr/lib/containers #246

Closed
cgwalters opened this issue Feb 15, 2022 · 12 comments

Comments

@cgwalters (Member)

For e.g. CoreOS layering we've so far been focusing on use cases like running rpm-ostree install usbguard inside the container.

However, there's no reason we can't support e.g.:

RUN ostree container install quay.io/example/somecontainer
ADD somecontainer.service /etc/systemd/system

And somecontainer.service would be a systemd unit that uses systemd-native features (e.g. RootDirectory=) to run the container, or invokes crun/runc directly.
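
As an illustrative sketch (the unit name, unpack path, and binary here are all hypothetical), such a unit might look like:

    # /etc/systemd/system/somecontainer.service - hypothetical sketch
    [Unit]
    Description=Run somecontainer from content shipped in the host image

    [Service]
    # RootDirectory= chroots the service into the unpacked container rootfs;
    # this assumes the image content was unpacked there at build time.
    RootDirectory=/usr/lib/containers/somecontainer
    # The executable path is resolved inside the new root.
    ExecStart=/usr/bin/somecontainer-daemon

    [Install]
    WantedBy=multi-user.target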

This crate is in a great position to implement this, because we're building up lots of tooling to bind ostree and containers.

@cgwalters (Member, Author)

That said, this heavily intersects with #69 - to do this nicely we really want the installed container layers to be "flattened" into the outer container.

@mangelajo

This is something that would provide what I'm trying to do here.

We have use cases where we need an ostree image to contain a set of containers in the form of an additional image store (additionalimagestores in storage.conf), which would allow the containers to run without CRI-O having to download them from the network.
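
A minimal sketch of the relevant storage.conf stanza, assuming /usr/lib/containers is the read-only store shipped in the ostree image:

    # /etc/containers/storage.conf (sketch)
    [storage]
    driver = "overlay"

    [storage.options]
    # Read-only image stores consulted in addition to the main store.
    additionalimagestores = [
      "/usr/lib/containers",
    ]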

@cgwalters (Member, Author)

It's important to understand that in the general case, it won't work to embed data written by containers/storage inside ostree - or really, inside a container either. This is because overlayfs whiteouts don't nest: a deleted file is represented by a special whiteout node, and an upper directory containing such nodes can't itself be faithfully captured as a layer of another overlay. See e.g. https://stackoverflow.com/questions/67198603/overlayfs-inside-docker-container

There are a few solutions:

  • It will likely work to flatten the container outside of the build (i.e. squash all layers; see the sketch after this list)
  • Don't use containers/storage, but directly invoke e.g. crun or use podman --root
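
For the first option, one known way to produce a squashed image is podman's --squash-all flag (the image name here is hypothetical):

    # Squash the base image and all new layers into a single layer, so the
    # result contains no whiteout entries and can nest cleanly.
    podman build --squash-all -t quay.io/example/somecontainer-flat .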

Another approach is to keep these containers in /var/lib/containers, but ensure that a single upgrade command atomically updates them all to a target version.

(This is a bit like what we do in OCP with the "release image" - that single container image has references to a big pile of other container images. Having lots of little floating containers really demands a means to tie them together in a coherent upgrade story)

@mangelajo commented Apr 29, 2022

Ok, I'm confused now.

What we are trying to achieve is something like adding a layer to ostree with:

   podman pull --root /usr/lib/containers quay.io/example/somecontainer
   podman pull --root /usr/lib/containers quay.io/example/some-other-container

What we are doing so far is:

    for container in $containers; do
        podman pull --root /usr/lib/containers --arch "$architecture" "$container"
    done

Then we put that into tarballs in an SRPM, which builds into per-architecture RPMs that, at install time, run:

    sed -i '/^additionalimagestores =*/a "/usr/lib/containers",' /etc/containers/storage.conf

I'm still in doubt whether that's the purpose of this RFE, or if it is something different. We don't want the container layers to be squashed; we want a pristine copy (I guess that would help with container validation/signatures, though I'm not sure it's 100% necessary for keeping the integrity/authenticity of the layers).

@cgwalters (Member, Author)

> We don't want the container layers to be squashed; we want a pristine copy (I guess that would help with container validation/signatures, though I'm not sure it's 100% necessary for keeping the integrity/authenticity of the layers).

When the "sub-container" content is rolled into a a container image itself, we are inherently relying on the outer container image for signatures and such to pull.

@cgwalters (Member, Author) commented May 6, 2022

There are actually two cases we could support. In the first, the child image is "lifecycle bound" with the host ostree (container): the container images involved are always rolled into the toplevel container. In theory we could support live updates to these containers the same way we can support live updates to components of the host userspace, but we'd normally expect system restarts.

The second case is "preload optimization". Here, even though there are distinct storage systems involved, assuming that e.g. /var/lib/containers is on the same physical filesystem, we can teach the two systems how to use reflinks (if available) to share physical storage. In this second case, it works to podman pull quay.io/othercontainer etc. On further upgrades of the host system, new images for the "child containers" might be pulled, but these are not automatically changed in the separate/distinct /var/lib/containers. Here, podman storage supports updates and changes as it normally does, distinct from the host.
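
For illustration, reflinks are the same mechanism cp can already use on XFS/btrfs; the paths here are hypothetical:

    # Share physical extents between two files; copy-on-write on later modification.
    # Fails if the filesystem does not support reflinks.
    cp --reflink=always \
        /usr/lib/containers/overlay/abc123/diff/usr/bin/agent \
        /var/lib/containers/storage/overlay/abc123/diff/usr/bin/agent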

@dhellmann

The fact that the files being added to the ostree image are themselves related to container images is only relevant insofar as it makes it feel weird to also put those files into an RPM. If the thing being added to the filesystem wasn't already in a package of some sort, it would be natural to say "just make an RPM" (or whatever format).

Life cycling a container image with the ostree image (use case 2) would still mean we would have to package the content we want to add to the filesystem. We don't want the ostree image to consider the content of the "application image" itself, we just want it to place that application image in a specific place on the filesystem. We could do that by wrapping the application image in another image, which also feels a little odd but somehow less than using an RPM. Maybe because it's easy to imagine building and publishing a simple wrapper image using the same build pipeline and registry that are used for the application image.

@cgwalters (Member, Author)

> The fact that the files being added to the ostree image are themselves related to container images is only relevant insofar as it makes it feel weird to also put those files into an RPM.

My understanding was these files aren't "related to" container images. They are container images. They're expected to be run as a container.

@dhellmann

Yeah, that's true. I'm stumbling a bit trying to come up with a way to differentiate between a container image that should be treated as a layer of the ostree and an image that should be treated as any other file that would be added to the filesystem.

@cgwalters (Member, Author) commented May 6, 2022

> We don't want the ostree image to consider the content of the "application image" itself, we just want it to place that application image in a specific place on the filesystem.

I am not sure that makes sense. Broadly speaking, either it is managed, or it is not. If it is managed by ostree, it should be underneath the read-only bind mount, and we will apply transactional offline update semantics to it the same way we do for all other content. It should not be mutated in place by some other process (e.g. we can't support podman --root /usr/share/containers pull myimage), because that gets into "multiple owners", race conditions, and "source of truth" problems.

We already have a place for content not managed by ostree: /var, which is where e.g. /var/lib/containers lives. Content there is completely orthogonal to OS upgrades in general. We rely on this in OCP - updating the node OS does not affect container images. Choosing the previous bootloader entry (i.e. previous ostree image) does not revert /var/log/journal or /var/lib/containers etc.

For the latter case, what may be desired here is the ability for e.g. Image Builder to inject these images into an ISO for the initial installation - distinct from the ostree updates.
I've thought about supporting this for OCP - basically we could create a Live ISO that actually had the release image preloaded too, to support even more convenient air-gapped installs.

@cgwalters (Member, Author)

> RUN ostree container install quay.io/example/somecontainer

This would be possible but ugly - among other problems, we'd have to propagate any pull secrets and such for the image into the container build.

What seems more elegant here is a process that operates outside of the base image, something like:

    ostree container append-sub-container --root /usr/share/containers/examplecorp-agent quay.io/coreos/fedora-coreos:latest quay.io/examplecorp/agent:latest oci:exampleos

OCI container images are just layers of tarballs. This tool would download quay.io/coreos/fedora-coreos:latest; then, for each layer in quay.io/examplecorp/agent:latest, we'd parse the tarball, prepend /usr/share/containers/examplecorp-agent to each path, and append those rewritten layers to the final image.

No code would be executed as part of this; it's basically just writing the sub-container into an alternative root in a derived container image.
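
As a rough sketch of the layer rewriting in Rust (not an existing API in this crate; it uses the tar crate and skips details a real implementation would need, such as rewriting hardlink targets and handling whiteout entries):

    use std::io::{Read, Write};
    use std::path::Path;

    /// Re-root every entry of an OCI layer tarball under `root`, so e.g.
    /// `usr/bin/agent` becomes `usr/share/containers/examplecorp-agent/usr/bin/agent`.
    fn prefix_layer<R: Read, W: Write>(src: R, dest: W, root: &Path) -> std::io::Result<()> {
        let mut input = tar::Archive::new(src);
        let mut output = tar::Builder::new(dest);
        for entry in input.entries()? {
            let mut entry = entry?;
            let mut header = entry.header().clone();
            // Prepend the target root to the entry path (root must be a
            // relative path, e.g. "usr/share/containers/examplecorp-agent").
            let path = root.join(entry.path()?);
            let mut data = Vec::new();
            entry.read_to_end(&mut data)?;
            output.append_data(&mut header, path, data.as_slice())?;
        }
        output.finish()
    }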

@cgwalters (Member, Author)

Closing this for now as the focus is on the bootc side, where we now have trackers for

@cgwalters closed this as not planned on Oct 1, 2024