systests: cp: add wait_for_ready #20912
Sounds like a logical explanation to me, but I think you have overdone it a bit.
I half-agree. My first pass was addressing only the touch/mkdir containers. After some testing, and some thinking about it, I decided I never want to look at this flake again. I then applied
Harmful no, but it makes the diff here bigger than it needs to be and makes the tests slower as they now always call podman logs even when it is not needed.
OK. I'll repush once CI finishes.
Some of the tests were doing "podman run -d" without wait_for_ready.
This may be the cause of some of the CI flakes. Maybe even all?
It's not clear why the tests have been working reliably for years
under overlay, and only started failing under vfs, but shrug.
Thanks to Chris for making that astute observation.
Fixes: containers#20282 (I hope)
Signed-off-by: Ed Santiago <[email protected]>
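For readers unfamiliar with the pattern, here is a rough sketch of what "wait for ready" means after a detached run: poll the container's logs until a marker string shows up, and only then touch the container. This is not the helper from podman's test suite. The function name `wait_for_marker_sketch`, the `WAIT_TIMEOUT` variable, the `READY` marker, and the paths in the usage comment are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; NOT the helper used by podman's system tests.
# wait_for_marker_sketch and WAIT_TIMEOUT are hypothetical names.
#
# Usage: wait_for_marker_sketch MARKER COMMAND [ARGS...]
# Re-runs COMMAND about once per second until its output contains MARKER,
# giving up after WAIT_TIMEOUT attempts (default 5).
wait_for_marker_sketch() {
    local marker=$1
    shift
    local attempts=${WAIT_TIMEOUT:-5}
    local output
    while [ "$attempts" -gt 0 ]; do
        # Capture stdout and stderr; a failing command just means "not ready yet".
        output=$("$@" 2>&1) || true
        case "$output" in
            *"$marker"*) return 0 ;;
        esac
        attempts=$((attempts - 1))
        sleep 1
    done
    echo "timed out waiting for '$marker' from: $*" >&2
    return 1
}

# Intended use in a test (assumes the container command prints READY
# once its setup is complete, and that $IMAGE names a usable image):
#
#   cid=$(podman run -d "$IMAGE" sh -c 'mkdir /srv; echo READY; sleep 600')
#   wait_for_marker_sketch READY podman logs "$cid"
#   podman cp "$cid":/srv /tmp/dest
```

Without the wait, the `podman cp` in the usage comment can race the `mkdir` inside the container. The cost is the one raised in the review above: every wait adds at least one `podman logs` call, even for containers that would have been ready anyway.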
Done. Now
LGTM
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: edsantiago, Luap99
/lgtm
Merged a64cc98 into containers:main
Thanks for fixing this Ed, hopefully it was the cause.
If it helps, and this is a total guess. My feeling is the failure unpredictability is coming from the storage subsystem in the cloud context. All the CI VMs are running with (presumably multi-path) fiber-channel/network based storage. That in and of itself adds in a HUGE amount of complexity w/in the kernel and hardware-wise. Worse, both bandwidth and IOPS are "provisioned" (i.e. limited) based on what you pay for. Either/both of those aspects could easily result in randomly appearing "hiccups" in user-space. In other words, we should expect both the cloud "throttling" reads and/or writes, and occasional (transparent) hiccups w/in the hardware or network "fabric" itself.