Idea: Would it be possible to create the cgroup in advance? #869

Open
utam0k opened this issue Nov 4, 2022 · 6 comments
Labels
enhancement (New feature or request), question (Further information is requested)

Comments

@utam0k
Member

utam0k commented Nov 4, 2022

This is just an idea. I'd like to discuss this with the conmon-rs team.

As you may know, creating a cgroup is costly. It is actually one of the most time-consuming tasks in a container runtime that follows the OCI runtime spec.
Youki previously considered creating the cgroup asynchronously with io_uring, but this did not yield very good results.
However, a long-running daemon such as a server should have enough time to create the cgroup in advance. In that case, the container runtime should be able to skip the cgroup creation step by creating the process with clone3. Wdyt?
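
For illustration, here is a rough sketch of what "create the cgroup in advance, then spawn with clone3" could look like. It assumes cgroup v2 and Linux 5.7 or later, where clone3 accepts the CLONE_INTO_CGROUP flag together with a file descriptor for the target cgroup directory. It is written in Python with ctypes only to keep it short. `CloneArgs`, `build_clone_args`, and `spawn_into_cgroup` are hypothetical names and are not part of youki, crun, or conmon-rs.

```python
import ctypes
import os
import signal

CLONE_INTO_CGROUP = 0x200000000  # from <linux/sched.h>, cgroup v2 only
SYS_CLONE3 = 435  # assumption: an arch using the generic number (x86_64, aarch64)


class CloneArgs(ctypes.Structure):
    """Mirror of the kernel's struct clone_args (11 u64 fields, 88 bytes)."""
    _fields_ = [(name, ctypes.c_uint64) for name in (
        "flags", "pidfd", "child_tid", "parent_tid", "exit_signal",
        "stack", "stack_size", "tls", "set_tid", "set_tid_size", "cgroup")]


def build_clone_args(cgroup_fd):
    """Fill clone_args so that the child starts inside the given cgroup."""
    args = CloneArgs()
    args.flags = CLONE_INTO_CGROUP
    args.exit_signal = signal.SIGCHLD  # parent can wait for the child as after fork
    args.cgroup = cgroup_fd            # fd of the pre-created cgroup directory
    return args


def spawn_into_cgroup(cgroup_dir, argv):
    """Fork-like clone3 into an existing cgroup, then exec argv in the child."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.syscall.restype = ctypes.c_long
    fd = os.open(cgroup_dir, os.O_RDONLY | os.O_DIRECTORY)
    try:
        args = build_clone_args(fd)
        pid = libc.syscall(ctypes.c_long(SYS_CLONE3), ctypes.byref(args),
                           ctypes.c_size_t(ctypes.sizeof(args)))
        if pid < 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        if pid == 0:
            # Child: already a member of the cgroup, no cgroup.procs write needed.
            try:
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # only reached if exec failed
        return pid
    finally:
        os.close(fd)
```

Something like `spawn_into_cgroup("/sys/fs/cgroup/precreated", ["sleep", "1"])` (the path is made up) would start the child directly inside the cgroup, provided the directory already exists and the caller has sufficient permissions on it.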

This idea is inspired by:

> Thus reducing the amount of exec calls that must happen in the container engine, and reducing the amount of memory it uses.

@saschagrunert
Member

Hey @utam0k, thank you for reaching out!

I'm wondering if cgroup creation should really be a concern of conmon-rs. On the other hand, we're also thinking about moving parts of the namespace handling into it. What would the interface between (let's say) youki and conmon-rs look like?

Wouldn't it be possible to use clone3 directly within the runtime in the same way as crun does it?
containers/crun#1042

What are your thoughts on that, @giuseppe @haircommander?

saschagrunert added the enhancement and question labels Nov 4, 2022
@utam0k
Member Author

utam0k commented Nov 4, 2022

Thanks for your reaction 🙏

> I'm wondering if cgroup creation should really be a concern of conmon-rs. On the other hand, we're also thinking about moving parts of the namespace handling into it. What would the interface between (let's say) youki and conmon-rs look like?

I'm still too new to this project to judge that, so please close this issue if it is out of scope for the project.

> Wouldn't it be possible to use clone3 directly within the runtime in the same way as crun does it?
> containers/crun#1042

Of course, we can implement it. Sorry if I have misunderstood, but that assumes the sub-cgroup to which the container process has to belong is created beforehand by the caller of the OCI runtime, right?

@haircommander
Collaborator

> this did not yield very good results

What went wrong with this?

> the other hand we're also thinking about moving parts of the namespace handling into it

This is true, though the motivation is slightly different. conmon-rs will be taking responsibility for creating pod-level namespaces. However, in the Kubernetes world, CRI-O is not responsible for creating the pod-level cgroup (the kubelet is).

@utam0k
Member Author

utam0k commented Nov 6, 2022

@haircommander
> what went wrong with this?

Here are the details. In summary, at that time there was no way to create a directory with io_uring 😭
youki-dev/youki#327

> the kubelet is

Oh, really? I didn't know that. Thanks for telling me. In other words, when an OCI container runtime is used by the kubelet, it doesn't need to create a cgroup directory by itself, right?

@haircommander
Collaborator

Sort of. The kubelet uses libcontainer to create the pod-level cgroup. The OCI runtime doesn't care much about pod cgroups, as it only focuses on the container cgroup. If the pod cgroup has not been created, it is created by the OCI runtime. Otherwise it's treated as any other cgroup (like putting the container in system.slice).
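
To make that behaviour concrete, here is a small sketch in Python. It only mirrors the description above and is not taken from any runtime: `ensure_container_cgroup`, its parameters, and the returned flag are hypothetical. It relies on the fact that under cgroup v2 a cgroup is created by making a directory in the cgroup filesystem, and takes the root as a parameter so it can also be tried against a scratch directory.

```python
import os


def ensure_container_cgroup(cgroup_root, pod_cgroup, container_id):
    """Return (container cgroup path, whether the pod cgroup already existed).

    If the caller (for example the kubelet) has already created the pod
    cgroup, only the container cgroup is added below it. If not, the pod
    cgroup is created as well, like any other missing parent.
    """
    pod_path = os.path.join(cgroup_root, pod_cgroup)
    pod_preexisting = os.path.isdir(pod_path)
    container_path = os.path.join(pod_path, container_id)
    # On a real cgroup v2 mount each mkdir creates a cgroup; existing ones are kept.
    os.makedirs(container_path, exist_ok=True)
    return container_path, pod_preexisting
```

With a real cgroup v2 mount, `cgroup_root` would typically be `/sys/fs/cgroup` and the call would need the matching privileges.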

@giuseppe
Member

giuseppe commented Nov 8, 2022

I don't think that is really possible when using the systemd cgroup manager. systemd itself takes care of creating the cgroup, and to do that it first needs to know the PID to move into the new cgroup, so it is not possible to create an empty cgroup.

4 participants