Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add container upgrade --check function #4486

Merged
merged 2 commits into from
Sep 21, 2023

Conversation

lukewarmtemp
Copy link
Contributor

@lukewarmtemp lukewarmtemp commented Jun 28, 2023

Fixes: #4176

Feature Implemented:
When using sudo rpm-ostree update --check, it will now output the manifest difference between the container image of the current system and the corresponding remote container image. More details can be found in the commit message.

Sample Output:

$ sudo rpm-ostree update --check
Note: --check and --preview may be unreliable.  See https://github.com/coreos/rpm-ostree/issues/1579
AvailableUpdate:
   Total layers: 2
           Size: 1.9 GB
 Removed layers: 1
           Size: 1.5 GB
   Added layers: 2
           Size: 1.9 GB

End-to-End Test
Follow the testing setup instructions: https://coreos.github.io/rpm-ostree/HACKING/

$ kola run ext.rpm-ostree.destructive.container-update-check

Unit Test

$ cargo test --package rpmostree-rust --lib -- sysroot_upgrade::test_container_manifest_diff --exact --nocapture

Manual Testing Procedure:

  1. Create a Containerfile for Fedora CoreOS:
# Containerfile
FROM quay.io/fedora/fedora-coreos:stable
  1. Build and push the image to quay.io
$ podman build -t localhost/my-custom-fcos .
$ podman push localhost/my-custom-fcos quay.io/<YOURUSER>/my-custom-fcos
  1. Create a VM for Fedora CoreOS
  2. Rebase the VM to use your custom image:
$ sudo systemctl stop zincati.service
$ sudo rpm-ostree rebase \
    ostree-unverified-registry:quay.io/<YOURUSER>/my-custom-fcos
$ sudo systemctl reboot
  1. Get the rpm-ostree binary generated through this custom rpm-ostree repo (ie. mount shared directories between the host and VM, use --bind-ro in cosa, etc)
  2. Enable write permissions on the Fedora CoreOS VM (sudo ostree admin unlock --hotfix)
  3. Replace the rpm-ostree binary on the VM with the custom rpm-ostree binary (sudo cp /path/to/custom/rpm-ostree /usr/bin/rpm-ostree)
  4. Run sudo rpm-ostree update --check. You should see that it outputs "No updates available. Commits are the same"
  5. Return back to your host machine and edit the Containerfile:
# Modified Containerfile
FROM quay.io/fedora/fedora-coreos:stable

RUN rpm-ostree install man
  1. Rebuild and push the image:
$ podman build -t localhost/my-custom-fcos .
$ podman push localhost/my-custom-fcos quay.io/<YOURUSER>/my-custom-fcos
  1. Return to the VM and run sudo rpm-ostree update --check again. It should now show the ManifestDiff

@openshift-ci
Copy link

openshift-ci bot commented Jun 28, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@jmarrero
Copy link
Member

Given: #4176 (comment)
I wonder if modifying the code to skip the granular checks would be a good first step. @travier do you know what Gnome software is looking at?
I see:
https://gitlab.gnome.org/GNOME/gnome-software/-/blob/main/plugins/rpm-ostree/gs-plugin-rpm-ostree.c#L1200-1225
Which I might be completely misunderstanding but looks to me if we just don't return an error it will attempt to upgrade?

@cgwalters WDYT?

@lukewarmtemp
Copy link
Contributor Author

@cgwalters Are we saving the current container image manifest anywhere on the system when we do the rebase? I think the approach we're looking at right now is to use the ManifestDiff ostree-rs-ext function, but I'm having trouble finding the manifest.json file of the current system that can be used to compare with the manifest of the remote container.

@cgwalters
Copy link
Member

@cgwalters Are we saving the current container image manifest anywhere on the system when we do the rebase?

Yes, it's stored in the ostree commit metadata. That's how e.g. rpm-ostree status works to show you things. There's also e.g. ostree container image list etc.

See a similar comment I made over here containers/bootc#3

@lukewarmtemp lukewarmtemp self-assigned this Jul 5, 2023
@cgwalters
Copy link
Member

I filed ostreedev/ostree-rs-ext#496 related to this

@lukewarmtemp lukewarmtemp force-pushed the 4176-detect-container-updates branch 5 times, most recently from f817e1d to 5484ee2 Compare July 11, 2023 15:13
rust/src/sysroot_upgrade.rs Outdated Show resolved Hide resolved
rust/src/sysroot_upgrade.rs Outdated Show resolved Hide resolved
@lukewarmtemp lukewarmtemp force-pushed the 4176-detect-container-updates branch from 6bbbf61 to ba53c46 Compare July 19, 2023 13:10
@lukewarmtemp lukewarmtemp force-pushed the 4176-detect-container-updates branch from a44a070 to 61faf20 Compare August 2, 2023 14:32
@lukewarmtemp lukewarmtemp changed the title Implement basic check for container updates Add container upgrade --check function Aug 2, 2023
@lukewarmtemp
Copy link
Contributor Author

Applied a temporary patch in Cargo.toml with the changes in ostree-rs-ext to pass the GitHub tests. Will need remove these changes once a new release of ostree-rs-ext is made.

@lukewarmtemp lukewarmtemp marked this pull request as ready for review August 2, 2023 15:31
rust/src/lib.rs Outdated Show resolved Hide resolved
rust/src/sysroot_upgrade.rs Outdated Show resolved Hide resolved
src/app/rpmostree-clientlib.cxx Outdated Show resolved Hide resolved
src/daemon/rpmostreed-deployment-utils.cxx Outdated Show resolved Hide resolved
src/daemon/rpmostreed-deployment-utils.cxx Outdated Show resolved Hide resolved
src/daemon/rpmostreed-deployment-utils.cxx Outdated Show resolved Hide resolved
src/daemon/rpmostreed-deployment-utils.cxx Outdated Show resolved Hide resolved
@cgwalters
Copy link
Member

ostreedev/ostree-rs-ext#511 is queued, then we can change this PR to just pull in the new release

@cgwalters
Copy link
Member

Thanks! There's a few other tweaks I'd like to do here too before we merge.

cgwalters added a commit to cgwalters/ostree-rs-ext that referenced this pull request Sep 13, 2023
Closes: ostreedev#496

In coreos/rpm-ostree#4486 we
were working on fixing `rpm-ostree upgrade --check` with containers.

However, what we really want here is to *persist* the updated
manifest (and config) that we fetch.  And if we do that, we might
as well just make it part of the current `prepare()` API so it
happens automatically.

In this change, we do so via detached commit metadata.  An important
thing here is that the data is then automatically lifecycle
bound to the merge commit - and the merge commit always
changes when we fetch a new manifest.

In order to do an offline query (e.g. in rpm-ostree we want to
re-synthesize a higher level summary of the queued update)
add an API which allows querying a previously saved cached update.

Hence a flow like this should work:

- OS boots
- OS updater does a background "check for updates" via calling `prepare()`
- OS updater finds an update, and renders metadata to the user
  or orchestration system
- <time passes; OS update is not downloaded - e.g. user is on
   metered data or whatever>
- system reboots for other reasons
- OS updater can re-render the fact that a queued update was
  found *without* touching the network
- User can initiate a full fetch (e.g. including image layers)
  targeting *exactly* the previously prepared fetch.  This
  makes things much more race-free; if the image was GC'd
  in the meantime we correctly fail.
cgwalters added a commit to cgwalters/ostree-rs-ext that referenced this pull request Sep 13, 2023
Closes: ostreedev#496

In coreos/rpm-ostree#4486 we
were working on fixing `rpm-ostree upgrade --check` with containers.

However, what we really want here is to *persist* the updated
manifest (and config) that we fetch.  And if we do that, we might
as well just make it part of the current `prepare()` API so it
happens automatically.

In this change, we do so via detached commit metadata.  An important
thing here is that the data is then automatically lifecycle
bound to the merge commit - and the merge commit always
changes when we fetch a new manifest.

In order to do an offline query (e.g. in rpm-ostree we want to
re-synthesize a higher level summary of the queued update)
add an API which allows querying a previously saved cached update.

Hence a flow like this should work:

- OS boots
- OS updater does a background "check for updates" via calling `prepare()`
- OS updater finds an update, and renders metadata to the user
  or orchestration system
- <time passes; OS update is not downloaded - e.g. user is on
   metered data or whatever>
- system reboots for other reasons
- OS updater can re-render the fact that a queued update was
  found *without* touching the network
- User can initiate a full fetch (e.g. including image layers)
  targeting *exactly* the previously prepared fetch.  This
  makes things much more race-free; if the image was GC'd
  in the meantime we correctly fail.
cgwalters added a commit to cgwalters/ostree-rs-ext that referenced this pull request Sep 13, 2023
Closes: ostreedev#496

In coreos/rpm-ostree#4486 we
were working on fixing `rpm-ostree upgrade --check` with containers.

However, what we really want here is to *persist* the updated
manifest (and config) that we fetch.  And if we do that, we might
as well just make it part of the current `prepare()` API so it
happens automatically.

In this change, we do so via detached commit metadata.  An important
thing here is that the data is then automatically lifecycle
bound to the merge commit - and the merge commit always
changes when we fetch a new manifest.

Then, add this "cached update" metadata to the existing structure
which has image state so it can be conveniently queried *without*
re-fetching.

Hence a flow like this should work:

- OS boots
- OS updater does a background "check for updates" via calling `prepare()`
- OS updater finds an update, and renders metadata to the user
  or orchestration system
- <time passes; OS update is not downloaded - e.g. user is on
   metered data or whatever>
- system reboots for other reasons
- OS updater can re-render the fact that a queued update was
  found *without* touching the network

There's one notable piece that is missing to do conveniently:

- User can initiate a full fetch (e.g. including image layers)
  targeting *exactly* the previously prepared fetch.  This
  makes things much more race-free; if the image was GC'd
  in the meantime we correctly fail.

But it can be done manually by e.g. using a digested pull spec
temporarily.
@cgwalters cgwalters force-pushed the 4176-detect-container-updates branch from df864bc to 91558cd Compare September 13, 2023 19:36
@cgwalters
Copy link
Member

OK, this now depends on ostreedev/ostree-rs-ext#537

Other changes:

  • Ported the "convert manifest diff to gvariant" to Rust which makes it way shorter and also memory safe
  • Significantly simplified the conditional logic in the cached diff generation - now it's just bool container_changed which matches the others

@cgwalters cgwalters force-pushed the 4176-detect-container-updates branch from 91558cd to 7ee1f4b Compare September 13, 2023 19:38
@cgwalters
Copy link
Member

Actually now that I think about it, we probably don't need ExportedManifestDiff at all! We can just pass the LayeredImageState from Rust -> C++ -> Rust now that the deployment-gvariant code is in Rust...

@cgwalters cgwalters force-pushed the 4176-detect-container-updates branch from 7ee1f4b to c8c9302 Compare September 13, 2023 19:45
@cgwalters
Copy link
Member

Actually now that I think about it, we probably don't need ExportedManifestDiff at all! We can just pass the LayeredImageState from Rust -> C++ -> Rust now that the deployment-gvariant code is in Rust...

Turns out no because of course ContainerImageState is itself a bridge, but we could probably just make that an opaque object with accessor methods instead.

@cgwalters cgwalters marked this pull request as ready for review September 19, 2023 18:25
@cgwalters cgwalters force-pushed the 4176-detect-container-updates branch from c8c9302 to 17f28a7 Compare September 19, 2023 18:25
@cgwalters cgwalters enabled auto-merge (rebase) September 19, 2023 18:26
@cgwalters
Copy link
Member

OK, rebased to pull in ostree-ext 0.12.1, so this should be good to go!

@jmarrero
Copy link
Member

jmarrero commented Sep 20, 2023

Seems to work but I see:

rpm-ostree upgrade --check
Note: --check and --preview may be unreliable.  See https://github.com/coreos/rpm-ostree/issues/1579
AvailableUpdate:
   Total layers: 67
           Size: 2.3 GB
 Removed layers: 21
           Size: 916.4 MB
   Added layers: 21
error: Retrieving cached update: Missing "added_size" key

is the error expected? or something I am doing wrong?

error: Retrieving cached update: Missing "added_size" key`

This is my system, so it's booting from my custom container build based on silverblue.

I do see:
src/app/rpmostree-clientlib.cxx'

 if (!g_variant_dict_lookup (&manifest_diff_dict, "added_size", "t", &added_size))
      return glnx_throw (error, "Missing \"added_size\" key");

I am just not sure if the intent is to print that error and still have it print the AvailableUpdate output or maybe this should be a warning?

@cgwalters cgwalters force-pushed the 4176-detect-container-updates branch 2 times, most recently from f008243 to 6a2e5b7 Compare September 20, 2023 13:29
@cgwalters
Copy link
Member

Oh man right, the way we were doing || true was just silently masking errors (like crashes!).

And this is really a footgun with --check in that it exits with 77 if nothing changed...which makes it annoying to differentiate "failure" from "nothing changed"...

This has a few fixes, including a new update check API.
@cgwalters cgwalters force-pushed the 4176-detect-container-updates branch from 6a2e5b7 to 53711a8 Compare September 20, 2023 17:42
Previously, OS based on native containers could not retrieve the
manifest difference between the current system container image and its
corresponding remote container image without performing an actual
upgrade. This PR solves this issue by allowing the manifest difference
to be outputted when using the `rpm-ostree upgrade --check` command.

A `ManifestDiff` struct needed to be retrieved using `ostree-rs-ext`,
requring several new functions in both Rust and C that were bridged
through CXX Bridge.

The `cached_update` object also needed to be modifed and updated with
the `ManifestDiff` in order to extend compatibility with other products
using rpm-ostree. However, a notable difference between how the
`--upgrade --check` function works for ostree based system and a native
container based system is that native containers skip over the use of
`checksums`. This is because rebasing to locally stored container images
(as opposed to a remote repository) does not create a valid ostree
refspec.

A Rust unit test was implemented to confirm that a difference between
two manifests (stored locally in the rust/test folder) can be
successfully retrieved. A `kola` test was also implemented to confirm
that running `rpm-ostree upgrade --check` returns the correct manifest
difference for a potential upgrade.
@cgwalters cgwalters force-pushed the 4176-detect-container-updates branch from 53711a8 to c2b3110 Compare September 21, 2023 18:59
@jmarrero
Copy link
Member

This looks good on my manual test now too!

$rpm-ostree upgrade --check
Note: --check and --preview may be unreliable.  See https://github.com/coreos/rpm-ostree/issues/1579
AvailableUpdate:
   Total layers: 67
           Size: 2.3 GB
 Removed layers: 1
           Size: 152.9 MB
   Added layers: 1
           Size: 152.9 MB
$rpm-ostree upgrade --check
Note: --check and --preview may be unreliable.  See https://github.com/coreos/rpm-ostree/issues/1579
No updates available.

@cgwalters cgwalters merged commit c055c50 into coreos:main Sep 21, 2023
13 checks passed
@travier
Copy link
Member

travier commented Sep 25, 2023

Woohoo! Thanks for fixing this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update checks not functional with new container format
5 participants