
Add webhook support #256

Open
antheas opened this issue Oct 19, 2024 · 8 comments
Labels
enhancement New feature or request

Comments

@antheas
Contributor

antheas commented Oct 19, 2024

Currently, the kernel (e.g., fsync and bazzite) has to be built first, then kernel-cache has to be triggered manually, and only once that builds can akmods be triggered. This increases kernel iteration time 2-3x (assuming someone is monitoring the builds).

Add webhook support to the akmods repo, so that it can be triggered automatically for specific kernels and Fedora versions, if and only if the requesting kernel was updated.

The webhook will include the kernel version, mitigating the possibility of drift.

This should also reduce builder pressure by not scheduling dry runs for already-built kernels.
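
A minimal sketch of what the triggering side could look like, assuming the webhook ends up being a GitHub repository_dispatch event; the repo path, event name, and payload keys below are placeholders, not an agreed interface:

```python
# Hypothetical sketch: fire a repository_dispatch at the akmods repo once a
# kernel build finishes. Repo path, event name, and payload keys are
# placeholders, not an agreed-upon interface.
import json
import os
import urllib.request

def notify_akmods(kernel_version: str, fedora_version: str) -> None:
    token = os.environ["GITHUB_TOKEN"]  # needs permission to dispatch to the target repo
    req = urllib.request.Request(
        "https://api.github.com/repos/ublue-os/akmods/dispatches",
        data=json.dumps({
            "event_type": "kernel-built",          # placeholder event name
            "client_payload": {
                "kernel_version": kernel_version,  # e.g. "6.11.3-104.bazzite.fc40" (illustrative)
                "fedora_version": fedora_version,  # e.g. "40"
            },
        }).encode(),
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    urllib.request.urlopen(req)  # GitHub answers 204 No Content on success
```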

@castrojo castrojo added the enhancement New feature or request label Oct 19, 2024
@castrojo castrojo changed the title [Feature Request] Add webhook support Add webhook support Oct 19, 2024
@bsherman
Contributor

@m2Giles we've discussed merging kernel-cache workflows into akmods workflows or vice versa.
The resulting images would retain the same names, but akmods would be built immediately after kernel-cache succeeds.

This pre-existing idea only addresses part of this suggestion, however: currently, I don't believe we can trigger a kernel-cache or akmods build for a specific kernel/Fedora version, and that would still require a webhook to allow triggering from COPR builds, etc.

@antheas
Contributor Author

antheas commented Oct 20, 2024

We just deployed the following kernel on bazzite unstable:
https://github.com/hhd-dev/kernel-bazzite/actions/runs/11428122526

It builds on GitHub and supports webhooks. I do not believe building kernels in COPR for self-consumption is the future for us. It takes way too long (5-7 hours) and is unpredictable.

Whereas on GitHub it builds in 2 hours and we also have the option of signing it.

You will also notice that the above is an action. So after it finishes, I have to manually set the release to latest (as doing so automatically would be risky since users have to consume the kernel), ping one of the ublue members, and then they have to trigger kernel-cache and akmods accordingly.

Whereas with a webhook, it would automatically build the kmods for it based on the tag name, and the gate on bazzite could then be lifted when appropriate, with no manual intervention afterwards.
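
To illustrate the "based on the tag name" part, here is a rough sketch of how the receiving side could derive its build inputs from the tag, assuming a tag layout like "<kver>.<flavor>.fc<fedora>"; the actual tag scheme may well differ:

```python
# Hypothetical: derive akmods build inputs from a kernel release tag.
# Assumes a tag shaped like "6.11.3-104.bazzite.fc40"; the real scheme may differ.
import re

def parse_kernel_tag(tag: str) -> dict:
    m = re.fullmatch(r"(?P<kver>.+)\.(?P<flavor>[a-z]+)\.fc(?P<fedora>\d+)", tag)
    if m is None:
        raise ValueError(f"unrecognized kernel tag: {tag}")
    return {
        "kernel_version": tag,          # full string used to pull the matching kernel-cache
        "flavor": m["flavor"],          # e.g. "bazzite" or "fsync"
        "fedora_version": m["fedora"],  # e.g. "40"
    }

print(parse_kernel_tag("6.11.3-104.bazzite.fc40"))
# {'kernel_version': '6.11.3-104.bazzite.fc40', 'flavor': 'bazzite', 'fedora_version': '40'}
```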

@bsherman
Contributor

We just deployed the following kernel on bazzite unstable: https://github.com/hhd-dev/kernel-bazzite/actions/runs/11428122526

It builds on GitHub and supports webhooks. I do not believe building kernels in COPR for self-consumption is the future for us. It takes way too long (5-7 hours) and is unpredictable.

Whereas on GitHub it builds in 2 hours and we also have the option of signing it.

I mentioned COPR in my comment as an example, but I agree with your point. Any appropriate caller could trigger a webhook after completing a kernel build.

You will also notice that the above is an action. So after it finishes, I have to manually set the release to latest (as doing so automatically would be risky since users have to consume the kernel), ping one of the ublue members, and then they have to trigger kernel-cache and akmods accordingly.

Whereas with a webhook, it would automatically build the kmods for it, and the gate on bazzite could then be lifted when appropriate, with no manual intervention afterwards.

Yep, I see the value and want to proceed with a webhook implementation.

The one "must have" to enable a webhook the way this issue is worded:

  • workflow restructuring to support specific kernel builds; at the moment, we can only trigger per Fedora version

My question to @m2Giles is looking for agreement on merging the kernel-cache/akmods workflows before we implement the webhook and specific kernel/Fedora version builds. If we don't merge them, a caller (like the bazzite-kernel build) would need to call two webhooks and only call the second (akmods) if the first succeeded, and we'd have to do workflow restructuring in both repos.

@antheas
Contributor Author

antheas commented Oct 20, 2024

Ideally, kernel-cache would call a webhook to akmods after it succeeds, for each kernel that succeeded.

Kernel cache has to be somewhat separate as we want to rebuild akmods for a specific kernel multiple times in case one of them updates.

@m2Giles
Member

m2Giles commented Oct 20, 2024

Would definitely like to merge kernel-cache into akmods.

The two-button-press dance is unnecessary when they are inherently linked.

I also think we need to do something to prevent that intermediate time where kernel-cache is updated but akmods is not.

@bsherman
Contributor

bsherman commented Oct 20, 2024

@antheas

Kernel cache has to be somewhat separate as we want to rebuild akmods for a specific kernel multiple times in case one of them updates.

I believe your point here is to avoid unnecessary rebuilding of the kernel-cache if we only need to update the akmods.

The kernel-cache workflow is relatively fast so I think we could handle this a couple ways:

  1. maybe it's fine to always rebuild kernel-cache if we want to rebuild akmods as it is "fast enough"?
  2. maybe we skip rebuilding kernel-cache if we see that, for the kernel-version inputs, we have already published a good, signed image to the registry? (a rough sketch of this check follows below)
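
For option 2, a hedged sketch of what the "already published?" check could look like, probing GHCR with skopeo; the image name and tag layout are assumptions, not the repo's actual naming scheme:

```python
# Hypothetical: skip rebuilding kernel-cache when an image for this kernel
# version already exists in the registry. Image name and tag layout are
# assumptions, not the repo's actual naming scheme.
import subprocess

def kernel_cache_exists(fedora_version: str, kernel_version: str) -> bool:
    ref = f"docker://ghcr.io/ublue-os/kernel-cache:{fedora_version}-{kernel_version}"
    result = subprocess.run(
        ["skopeo", "inspect", ref],
        capture_output=True,
        text=True,
    )
    # skopeo exits non-zero when the tag does not exist (or on auth/network
    # errors, which a real workflow would want to distinguish).
    return result.returncode == 0

if kernel_cache_exists("40", "6.11.3-104.bazzite.fc40"):
    print("kernel-cache already published, skipping rebuild")
```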

Like @m2Giles, I agree we want them more tightly coupled.

I also think we need to do something to prevent that intermediate time where kernel-cache is updated but akmods is not

For this concern, I think we could restructure workflows like:

  1. build kernel-cache locally (or pull a known-good version if one already exists) but do NOT push to GHCR
  2. build all akmods dependencies against the host-builder's local copy of the kernel-cache image, but don't push images to GHCR
  3. when all builds are complete: push all images to GHCR (a rough sketch of this final push step follows below)
    • perhaps we add an artifact to track which local images need to be pushed in a "final step" of the workflow?
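
And a rough sketch of step 3, assuming the locally built images have been recorded (for example via the artifact mentioned above) and are pushed only once everything has succeeded; image names and tags are illustrative:

```python
# Hypothetical final step: push kernel-cache and all akmods images to GHCR
# only after every local build has succeeded. Image names are illustrative.
import subprocess

def push_all_or_fail(local_images: list[str]) -> None:
    # If any build failed, this step never runs, so the registry is never
    # left with a kernel-cache that has no matching akmods.
    for image in local_images:
        subprocess.run(
            ["podman", "push", image, f"ghcr.io/ublue-os/{image.split('/')[-1]}"],
            check=True,  # abort on the first failed push
        )

push_all_or_fail([
    "localhost/kernel-cache:40-6.11.3-104.bazzite.fc40",
    "localhost/akmods:40-6.11.3-104.bazzite.fc40",
    "localhost/akmods-nvidia:40-6.11.3-104.bazzite.fc40",
])
```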

@antheas
Contributor Author

antheas commented Oct 20, 2024

Kernel cache has to be somewhat separate as we want to rebuild akmods for a specific kernel multiple times in case one of them updates.

I believe your point here is to avoid unnecessary rebuilding of the kernel-cache if we only need to update the akmods.

I mean, having a dry run is one thing. The point here was being able to use an older kernel version that has expired, for reverts.

The kernel-cache workflow is relatively fast so I think we could handle this a couple ways:

  1. maybe it's fine to always rebuild kernel-cache if we want to rebuild akmods as it is "fast enough"?
  2. maybe we skip rebuilding kernel-cache if we see that for the kernel-version inputs we already have published a good, signed image into the registry?

Like @m2Giles, I agree we want them more tightly coupled.

I also think we need to do something to prevent that intermediate time where kernel-cache is updated but akmods is not

There is an inherent race condition here, in which akmods might be only partially updated. This is compounded by the fact that there are multiple akmod images.

For this concern, I think we could restructure workflows like:

  1. build kernel-cache locally (or pull known good version if already exists) but do NOT push to ghcr

  2. build all akmods dependencies against the host-builder's local copy of kernel-cache image, but don't push images to GHCR

  3. when all builds are complete: push all images to GHCR

    • perhaps we add an artifact to track which local images need to be pushed in a "final step" of the workflow?

I think a big part of the solution will be whether we want to be able to access older kernel builds. We have been bitten by COPR on bazzite multiple times in the past, and that is what created the kernel cache.

However, having a GitHub repo with a full kernel history and being able to build a fresh set of akmods from any previous kernel version is very powerful, in addition to being able to build multiple fresh kernels at the same time in just 2 hours.

If it is powerful enough to negate the need for a kernel cache, then we could just merge all akmods and the kernel that built them into a single output image and solve all concurrency concerns. Do we even get a perf benefit from splitting them apart? It seems most of the time is spent doing the same thing in all images.

@antheas
Contributor Author

antheas commented Oct 20, 2024

If we proceed this way, I think the best way forward is a new repo that merges the previous two. Anything else would break builds for a week or so at least, as any change of this type would induce breaking changes.
