Attempt to update kata-runtime to point to the main version #1596
Comments
After a lot of compilation issues, I've got it all compiling now and built a CAA OCI image from it, but when testing it doesn't work:
so I've clearly broken something during the change.
Beraldo has recommended regenerating the hypervisor protos with ttrpc rather than gRPC, so I've created a branch with that change in kata-runtime: https://github.com/kata-containers/kata-containers/compare/main...stevenhorsman:hypervisor-ttrpc?expand=1
I've updated my branch to use my fork of kata-runtime with the ttrpc changes, and I think I get further now, as when I try to start a peer pod the error is:
We've been tracking the status of this in a Slack thread. A rough summary is:
It also shows that nydus options aren't being set. This is for at least two reasons:
I think we are now blocked until these steps are done and can't go much further, but hopefully my kata-runtime PR, https://github.com/kata-containers/kata-containers/pull/8520/commits, can be merged in the meantime.
We've managed to get some of the steps required for this upstreamed now:
Remaining issues that will need resolving before we can be unblocked here (and there might be more after):
Just an update on this: now that the agent supports image pull on the guest, I've done a bunch of PoC work to push us further in the right direction. Where we are currently is that nydus_snapshotter isn't putting the correct annotation into the storage driver for us to pull on the guest. Fabiano is also seeing this on the local hypervisor, so hopefully we can work it out between us...
The problem we were hitting is noted in Issue 4 here: kata-containers/kata-containers#8407 (comment). On the worker, I ran the following script, kindly provided by Fabiano:
and after that the image pull on host worked and the container is up and running:
so we just need a way to do this more easily on the worker node...
OK, I'm going to try to describe how to reliably and reproducibly set up a dev environment for testing this. I'm using libvirt with a kcli cluster and a pretty chunky VM with 16 vCPUs and 32 GB of RAM, but I'm not sure it strictly needs to be that large. I will also do some steps, like pushing the podvm image, just so that people following along can use my images and save some time:
You should now have pods:
After waiting a little while you should see the pod running:
After chatting to Pradipta, we've realised that we need to remove all of the CAA install/kustomize installation of the caa-pod, now that the peerpodconfig-ctrl is deploying it. There are a lot of references to it, so I'm trying to go through, unpick them, and provide alternatives...
After a bunch of updates to resolve the double CAA DaemonSet, the latest instructions are a bit simpler. I'm going to try to describe how to reliably and reproducibly set up a dev environment for testing this. I'm using libvirt with a kcli cluster and a pretty chunky VM with 16 vCPUs and 32 GB of RAM, but I'm not sure it strictly needs to be that large. I will also do some steps, like pushing the podvm image, just so that people following along can use my images and save some time:
After waiting a little while, you should now have pods:
After waiting a little while you should see the pod running:
Hi @stevenhorsman, I can confirm this is working with a containerd cluster. I was able to reproduce it. Unfortunately, that's not true with cri-o. :(
It looks like the operator is not fully supporting cri-o yet.
A small update here: I've created new images with the 3.4 release of kata-containers:
This is going okay and I have some e2e passes locally, but I can't get them on the e2e CI as the ... I also can't fully test it on my fork, as it runs on the Azure runner that I can't access.
The workflow PR #1828 has been merged, so hopefully if I rebase the
As part of the merge-to-main effort, we have kata-containers/kata-containers#7046, which is adding the remote hypervisor feature to the kata runtime. Once this is merged, we should test whether the CAA can re-vendor on it and see what issues there are. We also know that, as part of these changes, we want to remove the gogoprotobuf workaround:
cloud-api-adaptor/go.mod
Lines 162 to 164 in eb1b368