-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bib: add experimental cross arch image building #139
bib: add experimental cross arch image building #139
Conversation
One thing to consider here instead of invoking qemu ourselves is to just Experimenting with things around this I ran into containers/podman#15300 I think.
Is that true? Either way we're going to be executing the target OS code right? I think it's basically just that we're overall I/O bound in general. |
Thank you for the pointer and idea - I don't mind taking this route and will happily experiment with it. The reason I picked the approach in this PR is that most of the work that osbuild is doing happen in the buildroot. So making this native arch seemed expedient. However I think your point below is very valid, overall even during that stage osbuild is really not doing much (in a container world at least) so it maybe well be not worth the complexity and we can go all in on (slow) emulation for the buildroot work too.
What I perceive is that this approach cross builds an bootc fedora image from amd64 for aarch64 in under a minute - this is why I wrote this comment. But as I wrote above I did not actually test how long it takes to have the buildroot running in emulation or even to just use podman --arch. It may well be that this is still very much good enough because of the i/o bound nature of the workload. I will play with this in the next few days and report back. My main point with this PR was to socialize the idea and get feedback and that I certainly archived :) Thank you! |
I think this is wrong, it is going to be way too slow. We should run bootc-image-builder in the native arch and then pass into it the --arch flag. |
Everything runs really slowly in emulation mode, if we can run the commands in native and just pull and convert in alternative arch, the performance will be much better. |
When I ran this for testing it was suprisingly fast because the buildroot which does the bulk of the work runs as native code. Only the very few pieces that need to chroot into the build tree during the build run in emulation. |
I believe I was mistaken, when I read
My brain interpeted the --target-arch as the --arch option in Podman, meaning the BIB would be running in emulation. The --target-arch going to BIB is exactly what I want. |
This is exactly what we need. |
|
||
imgref := args[0] | ||
rpmCacheRoot, _ := cmd.Flags().GetString("rpmmd") | ||
configFile, _ := cmd.Flags().GetString("config") | ||
tlsVerify, _ := cmd.Flags().GetBool("tls-verify") | ||
imgType, _ := cmd.Flags().GetString("type") | ||
targetArch, _ := cmd.Flags().GetString("target-arch") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be handling "target-variant" as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to add more here - but pardon my ignorance, what does target-variant
mean in this context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some Arm platforms have multiple variants. So if BIB is calling skopeo or podman then it is needs to pass the variant as well as the Arch if the user specified a variant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a TODO here for now, if you could share some container or any other link where I can inspect this a bit more that would be super helpful. I will then add/fix this in a followup.
bib/cmd/bootc-image-builder/main.go
Outdated
@@ -296,6 +313,7 @@ func main() { | |||
manifestCmd.Flags().String("config", "", "build config file") | |||
manifestCmd.Flags().String("type", "qcow2", "image type to build [qcow2, ami]") | |||
manifestCmd.Flags().Bool("tls-verify", true, "require HTTPS and verify certificates when contacting registries") | |||
manifestCmd.Flags().String("target-arch", "", "build for the given target architecture") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add target-variant.
Containerfile
Outdated
@@ -13,7 +13,7 @@ FROM registry.fedoraproject.org/fedora:39 | |||
# - https://github.com/osbuild/bootc-image-builder/issues/9 | |||
# - https://github.com/osbuild/osbuild/pull/1468 | |||
COPY ./group_osbuild-osbuild-fedora-39.repo /etc/yum.repos.d/ | |||
RUN dnf install -y osbuild osbuild-ostree osbuild-depsolve-dnf podman qemu-img && dnf clean all | |||
RUN dnf install -y osbuild osbuild-ostree osbuild-depsolve-dnf podman qemu-img qemu-user-static && dnf clean all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
qemu-user-static is not available in RHEL, so if we ever move to a RHEL based image this will not work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, in the general case we need qemu-user
because some of the osbuild stages will chroot into the target. Bootupd is installed this way today, i.e. we essentially chroot into the deployment root and just call bootupd. Is that a showstopper for this approach or should we go with it for now (given that we have limited time until the demo etc)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing that might be an issue is the arch specific mkfs things that Colin pointed out in #153 (comment) - we need to see what we can do here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets get this working on Fedora first and then we can worry about RHEL.
81389b7
to
2c45d9b
Compare
I rebased this now and it works nicely when cross building from amd64->aarch64 with the centos-bootc:stream9 images. Once #148 has landed and we have a more flexible way to parametrize the tests I will add a automatic cross build test here as well.
So things looks fine there. |
2c45d9b
to
7e64339
Compare
I would like to get this merged to add the feature and then we can continue to work on some of the stuff Colin asked for. |
7e64339
to
9f8639a
Compare
This is ready now, it builds on top of #148 and does a "x86"->"arm" cross build with boot test. The boot test takes around 30s on my system, the test a total runtime of 5min. Once things looks good in here and this is merged I will work on the "arm"->"x86" tests. I know it's an important use-case but I find it a bit painful to test on mac as I need to do this remote only. |
The integration test fails right now with:
which means that the binfmt-misc handler needs to be registered from inside the container (we cannot rely on having this on the host OS). Hopefully just a SMOP. |
8c39f48
to
4981e82
Compare
Just curious, are we going to switch to use bootc install for bib by default or just as an option? does this cross arch build also works after switch over to bootc install? |
"-smp", "2", | ||
"-bios", "/usr/share/AAVMF/AAVMF_CODE.fd", | ||
] | ||
elif self._arch in ("amd64", "x86_64"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i also see the options here, so I'm assuming command line it should only accept amd64 or arm64?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should eventually support all of the platforms that RHEL supports, otherwise our parent company will not be happy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely need to test more scenarios here but if the container is available for both host and target arch and qemu-user is available for the target arch the existing code should DTRT even for e.g. ppc64le.
The plan is to move to bootc install, there is already work in osbuild/images happening for this. |
20d6ff2
to
7d4accb
Compare
7d4accb
to
023d962
Compare
FTR:
2:17.77 total
4:10.20 total This is awesome. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny comment about minor optimisations but happy to merge and follow-up in another PR.
Keeping un-approved until #148 is merged and this is rebased.
hostArch := arch.Current() | ||
resolverNative := container.NewResolver(hostArch.String()) | ||
resolverTarget := container.NewResolver(c.Architecture.String()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm reading this right, in the standard case, we'll get two identical resolvers and each one will get the same container it will have to resolve. It's not terrible but a bit wasteful. Maybe we could instead:
resolvers[hostArch.String()] = container.NewResolver(hostArch.String())
if hostArch.String() != c.Architecture.String() {
resolvers[c.Architecture.String()] = container.NewResolver(c.Architecture.String())
}
...
if plName == "build" {
rarch = hostArch.String()
} else {
rarch = c.Architecture.String()
}
for _, c := range sourceSpecs {
rarch.Add(c)
}
I now also realise that even in this case, we're resolving the same container twice (and have been since we started using the container as a build root).
Again, not terrible, but maybe we can be smarter about it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I will look at this, maybe in a followup as it's not that pressing but I agree that it's a bit wasteful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks again, I was thinking it could be something like:
index 5e9ed2d..9bb6a5b 100644
--- a/bib/cmd/bootc-image-builder/main.go
+++ b/bib/cmd/bootc-image-builder/main.go
@@ -115,13 +115,18 @@ func makeManifest(c *ManifestConfig, cacheRoot string) (manifest.OSBuildManifest
}
// resolve container
- hostArch := arch.Current()
- resolverNative := container.NewResolver(hostArch.String())
- resolverTarget := container.NewResolver(c.Architecture.String())
+ hostArch := arch.Current().String()
+ targetArch := c.Architecture.String()
+
+ resolverNative := container.NewResolver(hostArch)
+ resolverTarget := resolverNative
+ if hostArch != targetArch {
+ resolverTarget = container.NewResolver(targetArch)
+ }
containerSpecs := make(map[string][]container.Spec)
for plName, sourceSpecs := range manifest.GetContainerSourceSpecs() {
- var resolver container.Resolver
+ var resolver *container.Resolver
if plName == "build" {
resolver = resolverNative
} else {
which requires osbuild/images#433 (so that we get pointers to the resolver first) - I have no super strong opinion here though, happy to discuss more if you think this is less obvious or too verbose :)
This commit adds experimental cross arch building. I.e. you `bootc-image-builder` on a mac and target an `amd64` machine. The usage would be: ``` sudo podman run ... \ --type qcow2 \ --target-arch amd64 \ quay.io/centos-bootc/fedora-bootc:eln ``` and this is predicated on the idea that the target container is available for both host and target architecture. When that is the case the container for the host architecture is used as a buildroot, the container for the target architecture is the target tree. Note that some stages chroot into the target tree so `qemu-user` is essential for this to work so that target achitecture code can be run transparently (because only very small amounts of code need to run emulated this is pretty fast still). It will need qemu-user{,-static} installed on the host. Sadly we cannot just install qemu-user into the container because the binfmt_misc system is global and not namespaced. So while adding the binfmt_misc handler from inside the container would work with enough privs it would alter the global machine state and we should not do it.
ac95040
to
b3b5102
Compare
b3b5102
to
80acf87
Compare
Hm, this one is running out of diskspace again :/ |
Most go packages return pointers to structs when using `NewFoo()`. But `NewResolver()` does not and AFAICT there is no deep reason for this. I would prefer to return a pointer here, I got tripped up by this in osbuild/bootc-image-builder#139 (comment)
The main piece of this PR got merged via #179 and is available now. The testing got split into #181 mostly to avoid too much spam for this PR as I had to force push a bunch of times to figure the right GH runner setup with enough diskspace out. That seems to be fortunately solved now too. Once 181 is merged I will close this one as it's empty. Thanks everyone for your great insights and help getting this ready. I really appreciate that! |
Most go packages return pointers to structs when using `NewFoo()`. But `NewResolver()` does not and AFAICT there is no deep reason for this. I would prefer to return a pointer here, I got tripped up by this in osbuild/bootc-image-builder#139 (comment)
[This is based on https://github.com//pull/138 and only the last two commits are relevant]
This draft PR outlines how we could do cross architecture image building. I.e. you
bootc-image-builder
on a mac and target anamd64
machine.The usage would be:
and this is predicated on the idea that the target container is available for both host and target architecture. When that is the case the container for the host architecture is used as a buildroot, the container for the target architecture is the target tree. Note that some stages chroot into the target tree so
qemu-user
is essential for this to work so that target achitecture code can be run transparently (because only very small amounts of code need to run emulated this is pretty fast still).