Use persist dir for oom file #21523
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: umohnani8
Force-pushed from 67b526e to 0c224be.
Ephemeral COPR build failed. @containers/packit-build please check.
Cockpit tests failed for commit 67b526e0fc51538fff7220f649ff84e781aab34a. @martinpitt, @jelly, @mvollmer please check.
Cockpit tests failed for commit 0c224be3c7f9e2034f0819ffb45799db44df1f48. @martinpitt, @jelly, @mvollmer please check.
Force-pushed from d3c5dab to c7e8ec1.
Cockpit tests failed for commit d3c5dab3c28b1386c034c0451c93c20efef02d62. @martinpitt, @jelly, @mvollmer please check.
Cockpit tests failed for commit c7e8ec198eafe553d193993624d8c08319d9de4d. @martinpitt, @jelly, @mvollmer please check.
Force-pushed from c7e8ec1 to c2d7ab8.
Cockpit tests failed for commit c2d7ab84814f539a8d9529b372937833be95f047. @martinpitt, @jelly, @mvollmer please check.
Force-pushed from c2d7ab8 to 0d1383f.
@rhatdan @mheon @haircommander PTAL. This is a breaking change: we are moving from exit-dir to persist-dir for the exit and oom files. Conmon writes the exit file to the persist-dir, and also the oom file if the container is OOM killed, so we are switching to using only persist-dir. This also fixes the bug where we were not setting OOM killed to true when the container was OOM killed.
This makes sense to me. The exit dir is most useful for cri-o, which registers an inotify watcher on it, so having the exits centralized is helpful there. Podman, on the other hand, really only needs to read from one location, and having that be the persist path makes sense.
LGTM
/lgtm
Is the hyperv failure expected?
No, but it could be a flake.
Will this break all existing running containers? They can no longer read the exit file correctly.
I would love to have #21535 running on this before merging. If this doesn't work then we cannot really upgrade test properly.
Force-pushed from 0d1383f to d170c3d.
Sure, we can wait for #21535 to get in before merging this.
Ephemeral COPR build failed. @containers/packit-build please check.
Force-pushed from 8b4cf78 to ca960e8.
Conmon writes the exit file and oom file (if container was oom killed) to the persist directory. This directory is retained across reboots as well.

Update podman to create a persist-dir/ctr-id for the exit and oom files for each container to be written to. The oom state of container is set after reading the files from the persist-dir/ctr-id directory. The exit code still continues to read the exit file from the exits directory.

Signed-off-by: Urvashi Mohnani <[email protected]>
Force-pushed from ca960e8 to 667311c.
Code LGTM, but given that this touches the exec parts as well, I guess we should also test an OOM kill of a podman exec session here. I am not sure what happens when podman exec runs into an OOM kill: does it a) kill only the exec session, b) kill the whole container, c) do either, depending on what the OOM killer feels like, or something else entirely?
It should be the exec session, unless podman is setting the cgroupv2 knob. Though with v1 OOM killing, sometimes it's c :)
Nothing in the exec functionality has been changed by adding the persist dir option here. The only difference is that if an exec session is OOM killed, I believe it will write the oom file in the persist dir location and handle the exit code as it always has. I am trying to add a test regardless; it works as expected when I run it manually, but not in the Ginkgo test suite. These are the commands I am running
Any other possible examples that would work with the test suite?
/unhold If something has to be done for exec, let's do it in a separate PR.
/lgtm
Conmon writes the exit file and oom file (if container
was oom killed) to the persist directory. This directory
is retained across reboots as well.
Update podman to create a persist-dir/ctr-id for the exit
and oom files for each container to be written to. The oom
state of container is set after reading the files
from the persist-dir/ctr-id directory.
The exit code still continues to read the exit file from
the exits directory.
Fixes #13102
Does this PR introduce a user-facing change?