Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for building images from ostree containers and the Fedora CoreOS qcow2 image type #243

Closed
wants to merge 20 commits into from

Conversation

achilleas-k
Copy link
Member

@achilleas-k achilleas-k commented Nov 6, 2023

This PR adds support for the org.osbuild.ostree.deploy.container stage and a new image type that uses it called coreos-qcow2 for Fedora.

I tested the new image type locally with the container we added to the gitlab registry for testing the sample manifest in osbuild.
Fedora CoreOS booted

Some notes and open questions:

  • I use the same pipeline as the regular ostree deployment with a condition on the type of source. I think it wasn't worth making a whole separate pipeline for this, but we can revisit this decision if the two start diverging.
  • The new image type is added to Fedora which means that if this is merged and updated in osbuild-composer it will be available to build on-prem. If we want this, we'll have to figure out the command line options for specifying a container source and potentially a target-imgref.
  • When specifying a container source to deploy, you must specify the ref (otherwise the default will be assumed). This requirement also exists for ostree commits from a repo, but I wonder: Is there a way to determine the container's ostree ref without downloading it?

You can generate the manifest for this image with:

go run ./cmd/gen-manifests --distros fedora-38 --arches x86_64 --images coreos-qcow2 --output .

or build it directly with

go build -o bin/build ./cmd/build
sudo ./bin/build --distro fedora-38 --image coreos-qcow2 --output builds --config ./test/configs/ostree-container-pull.json

You can also add a user to the config file which will create a user after the deployment before finalising the image:

{
  "name": "ostree-container-pull",
  "ostree": {
    "ref": "ostree/1/1/0",
    "container": "registry.gitlab.com/redhat/services/products/image-builder/ci/images/fedora-coreos:stable",
    "tls-verify": true
  },
  "blueprint": {
    "customizations": {
      "user": [
        {
          "name": "<login>",
          "password": "<password>",
          "key": "<pubkey>"
        }
      ]
    }
  }
}

@achilleas-k achilleas-k force-pushed the ostree.deploy.container branch 2 times, most recently from b8b990f to b19437c Compare November 7, 2023 11:08
pkg/image/ostree_disk.go Outdated Show resolved Hide resolved
@achilleas-k achilleas-k force-pushed the ostree.deploy.container branch 5 times, most recently from 617ce72 to 25ca0c0 Compare November 10, 2023 10:35
Copy link
Member

@ondrejbudai ondrejbudai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm very happy with this PR and I haven't found any issues. However, it's a big one, so I would love a third pair of eyes to take a look. @achilleas-k should we ask someone explicitly? Any ideas whom? :)

@@ -689,6 +707,7 @@ func newDistro(version int) distro.Distro {
UEFIVendor: "fedora",
},
iotQcow2ImgType,
coreosQcow2ImgType,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about aarch64?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it at first but then realised I needed to get the base container in our CI registry to test it and decided to get this open so people can review and I can add it in a follow-up.

teg
teg previously approved these changes Nov 10, 2023
Copy link
Member

@teg teg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I noticed an oversight, but would rather fix that in a follow-up than rebasing it tbh, this is plenty useful as it is.

cgwalters added a commit to cgwalters/bootc-image-builder that referenced this pull request Nov 13, 2023
- Take the code from https://github.com/achilleas-k/images/tree/bifrost-image/cmd/osbuild-deploy-container
  and merge it into this repository, using the code from osbuild/images#243
  as a `replace`
- Also merge in osbuildbootc
pkg/osbuild/ostree_deploy_container_stage.go Outdated Show resolved Hide resolved
pkg/manifest/ostree_deployment.go Outdated Show resolved Hide resolved
pkg/manifest/ostree_deployment.go Outdated Show resolved Hide resolved
pkg/manifest/ostree_deployment.go Outdated Show resolved Hide resolved
pkg/image/ostree_disk.go Outdated Show resolved Hide resolved
@@ -72,6 +77,9 @@ type ImageOptions struct {
// Indicate if the 'org.osbuild.rhsm.consumer' secret should be added when pulling from the
// remote.
RHSM bool `json:"rhsm"`

// TLS verification for container sources.
TLSVerify *bool `json:"tls-verify"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since there are multiple things here that are "remote" artifacts it might be a little confusing to have this just apply to one of them and still name it TLSVerify.

One thing we could do here is break out sets of options that go together into subordinate objects.

i.e.

type ContainerImageOptions struct {
    Container string `json:"container"`
    TLSVerify *bool `json:"tls-verify"`
}
...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's two things going on here that I don't like and I'm unsure of:

  • The ostree.ImageOptions and more generally the distro.ImageOptions that exist separately from things like the blueprint customizations. These I want to merge into a build config.
  • The way that users will be specifying an ostree container for a build when this gets into osbuild-composer.

The way I added the options now sort of reflects things that might happen in composer, but that's just leftover internal structure from the split.

All that said, none of this is considered "stable API" (yet) in images, so I did the quickest thing that felt natural at the time. You're right though, a completely separate object is nicer. Getting the "build config" done soon would be even better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with whatever.

img.Compression = t.compression

return img, nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot going on here. Maybe @achilleas-k we can get together and step through some of it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can do that. For a high level description though: Almost all of this function is essentially creating the img

img := image.NewOSTreeContainerDiskImage(container, ref)

followed by setting options on the image based on image type settings and user customizations.

buildPipelines: []string{"build"},
payloadPipelines: []string{"ostree-deployment", "image", "qcow2"},
exports: []string{"qcow2"},
basePartitionTables: iotBasePartitionTables,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably need to dig into iotBasePartitionTables a little bit more to understand why they are being used here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I based the image off the iot-raw-image and reused that partition table. I would say, instead of digging, point me to the default partition table you use in coreos images and I'll replicate it here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is what I've been using for now.

There is one theme here, though, that I think we keep bumping up into, which is where definitions like this should live. I think it was also touched on in osbuild/bootc-image-builder#4
I

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Image Builder (osbuild-composer and the service), the defaults are defined here. For all the RHEL images we build for example as part of the composes in Brew and for all the images in c.rh.c/ib, this is the source of truth right now.

@@ -340,7 +340,8 @@ func (p *OSTreeDeployment) serialize() osbuild.Pipeline {
}
}

if !hasRoot {
if !hasRoot && p.ostreeSpec != nil {
// NOTE: fails on container-based deployments
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure either.. from my side this feels like something that would be done when composing the ostree commit or oscontainer rather than when composing an image. IOW I feel like this shouldn't be part of Image Builder at all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locking the root password during deployment is something that's done for the official Fedora IoT image: https://pagure.io/fedora-kickstarts/blob/main/f/fedora-iot.ks#_9

We were replicating that here.

@@ -46,6 +46,10 @@ func TestImageType_PackageSetsChains(t *testing.T) {
URL: "https://example.com", // required by some image types
},
}
if imageType.Name() == "coreos-qcow2" {
options.OSTree.Container = "https://registry.example.com/image"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this supposed to be a container pullspec? those don't often include https:// do they?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be any string since the test doesn't really do anything with it, but you're right. Might as well make it look like a real spec.

Copy link
Contributor

@mvo5 mvo5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how helpful my review here is, I did a quick scan to get a feeling for the code and just added tiny coments/ideas inline. As a bit of a meta-comment, would it make sense to split something like ae57912 into it's own PR? It seems generally useful on its own and would make this PR smaller. But there are probably good reasons that I just don't know yet :) Hope this is still useful in some small way.

pkg/osbuild/ostree_deploy_container_stage.go Show resolved Hide resolved
@@ -269,7 +269,8 @@ func (t *imageType) checkOptions(bp *blueprint.Blueprint, options distro.ImageOp
}
}

if t.name == "iot-raw-image" || t.name == "iot-qcow2-image" {
if t.name == "iot-raw-image" || t.name == "iot-qcow2-image" || t.name == "coreos-qcow2" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if if slices.Contains([]string{"iot-raw-image", "iot-qcow2-image", "coreos-qcow2"}, t.name) { might be slightly nicer? It seems we are using this already. But size of the line is almost the same so maybe not a win.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a good idea. This keeps growing and it'll be nicer to be able to append a single name to a list instead of needing to rewrite the comparison every time. Generally, we should have a nicer way to test these, but that's for another day.

pkg/image/ostree_disk.go Show resolved Hide resolved
Creates an org.osbuild.ostree.deploy.container stage.
Validates the options with a regular expression that matches the schema.
Takes a container as input.  Validates the length of the input
references to be exactly 1.

See osbuild/osbuild#1402
The pipeline either way only supported one ostree commit, enforced when
adding the resolved commits to the pipeline during serializeStart() and
by checking the array length during serialize().  To simplify the
requirement and avoid potential confusion when reading the struct
itself, replace the array on the pipeline struct definition with a
single commit spec (pointer) and keep the len == 1 check in the
serializeStart() function that, by its interface, must accept an array.

Also document the commit fields for OSTreeDeployment.  These are
private, so they wont show up in public API docs, but it's still useful
to have them documented in the code.
Support a container source for OSTreeDeployment as an alternative to the
ostree commit source.
Currently, if a container is defined, it fails with an error.

The commit source and the container source are now both pointers since
they can be unset, but (will) we need to make sure at least one is set
when serializing.
Move the ostree pull stage to be right before the ostree deploy stage.
This should have no functional impact on the build and will make it
easier to replace the two stages together.

Old vs new stage order:
  org.osbuild.ostree.init-fs  |  org.osbuild.ostree.init-fs
  org.osbuild.ostree.pull     |  org.osbuild.ostree.os-init
  org.osbuild.ostree.os-init  |  org.osbuild.mkdir
  org.osbuild.mkdir           |  org.osbuild.ostree.pull
  org.osbuild.ostree.deploy   |  org.osbuild.ostree.deploy
When creating the ostree deployment pipeline, add the appropriate stages
only when the ostree commit is specified.
The stages are:
  org.osbuild.pull
  org.osbuild.ostree.deploy
  org.osbuild.ostree.remote
  org.osbuild.ostree.fillvar
Rename the NewOSTreeDeployment() constructor to
NewOSTreeCommitDeployment() and add a second one called
NewOSTreeContainerDeployment().
Each takes a different kind of source spec (ostree commit or container).
The container constructor also requires the ref to be specified since
it's not part of the container spec.
Add the org.osbuild.ostree.deploy.container stage when the content
source for the deployment is a container.
Support both content source types on the OSTreeDiskImage ImageKind.
A second constructor is defined, just like with the deployment pipeline
of the image.
The original constructor is renamed to NewOSTreeDiskImageFromCommit to
differentiate and have a more informative name.
Only required properties should be specified through the constructor to
make sure they are not accidentally left empty or nil.  Optional
properties are added via public fields of the pipeline object.

Since the ignition platform is optional, change it to a public field.
Also, to simplify the interaction between the two ignition properties,
remove the boolean that signifies if ignition should be enabled and use
the IgnitionPlatform string instead.  Treat the empty string as the
disabled value.
A new image functions called coreosImage() that requires a container
source and creates a qcow2 image from a CoreOS ostree container.
Based on the example manifest added in osbuild/osbuild#1402
Define the new image type using the new coreosImage() function and
container deployment pipeline and add it to the x86_64 and aarc64
registries for Fedora.
We always aliased these types so that internal changes wouldn't always
require changing our tooling interfaces but it's actually more annoying
needing to change them every time something is added or changed in the
internal types.  These are development and testing tools.  It's fine to
use the internal types directly and keep them simple.
Use gpgkeys instead of concatenating multiple keys under gpgkey.
The singular gpgkey was a leftover from an old format and now we use an
array from the internal rpmmd.RepoConfig instead.
Modifying the root user fails when trying to modify a container
deployment.  Let's disable it for now until we figure out if it's
possible.
Use the latest version from the Fedora 38 repo snapshots.
@achilleas-k
Copy link
Member Author

Partially addressed comments (and rebased on main). Will resume addressing comments later.

@achilleas-k
Copy link
Member Author

Moved most of the commits to #272. I'll rebase this PR so that it only adds the coreos image once that's merged.

@supakeen
Copy link
Member

@achilleas-k should be able to rebase now as #272 has landed \o/

@achilleas-k
Copy link
Member Author

Will rebase and fix conflicts then start addressing the remaining comments.

Copy link

This PR is stale because it has been open 30 days with no activity. Remove "Stale" label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale label Dec 29, 2023
Copy link

github-actions bot commented Jan 5, 2024

This PR was closed because it has been stalled for 30+7 days with no activity.

@github-actions github-actions bot closed this Jan 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants