Retire pending workers #820
Everything looks good, at least for now. I deployed the change to our production cluster and will provide updates if needed. Hopefully, everything works well 😄
Surprisingly, I had the most stable run ever. One note to mention is if a pod is restarting which means it's
So I was thinking of a way to execute the logic only on deployments that have pods in a
Done! Now the operator takes action based on pending pods rather than on the deployments.
f0bf687 to 3c2aaec
Thanks for taking the time here. A few thoughts.
not in [
    "ContainerCreating",
    "Initializing",
    "Terminating",
    "Running",
I think you might be confusing the Pod phases with the status that kubectl displays. The possible values for phase are Pending, Running, Succeeded, Failed and Unknown.
https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/#podstatus-v1-core
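To make the distinction concrete, here is a minimal sketch that reads the phase and the container waiting reason separately. It works on plain dicts shaped like a Pod API object; the helper names (`pod_phase`, `waiting_reason`, `is_pending`) and the `example_pod` data are hypothetical and are not taken from this PR or from the operator's code.

```python
from typing import Any, Mapping, Optional

# The five values the Kubernetes API allows for status.phase.
POD_PHASES = {"Pending", "Running", "Succeeded", "Failed", "Unknown"}


def pod_phase(pod: Mapping[str, Any]) -> str:
    """Return status.phase, falling back to "Unknown" when it is absent."""
    return pod.get("status", {}).get("phase", "Unknown")


def waiting_reason(pod: Mapping[str, Any]) -> Optional[str]:
    """Return the first container waiting reason (e.g. ContainerCreating), if any.

    Strings like this are container state reasons, not pod phases.
    """
    for container in pod.get("status", {}).get("containerStatuses", []):
        waiting = container.get("state", {}).get("waiting")
        if waiting and waiting.get("reason"):
            return waiting["reason"]
    return None


def is_pending(pod: Mapping[str, Any]) -> bool:
    """True only when the API-level phase is Pending."""
    return pod_phase(pod) == "Pending"


# Hypothetical pod, trimmed to the fields used above.
example_pod = {
    "metadata": {"name": "worker-0"},
    "status": {
        "phase": "Pending",
        "containerStatuses": [
            {"name": "worker", "state": {"waiting": {"reason": "ContainerCreating"}}}
        ],
    },
}

if __name__ == "__main__":
    print(pod_phase(example_pod), waiting_reason(example_pod))
```

In this sketch a pod whose container is still being created has the phase Pending, so comparing the phase against a string such as "ContainerCreating" would never match.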
Yup, you're correct. I've been debugging this using k9s.
Sorry for the long delay. Unfortunately, I don't have much time to work on this. Meanwhile, I'll close this PR to allow someone else to finish it.
@jacobtomlinson I believe this is ready to merge.
The CI appears to be hanging as a result of these changes. This needs looking into before we can merge.
It is something related to building some Go code. I tried to check what is wrong but couldn't figure it out. It appears to be related to the CI itself, as I see most of the PRs are failing too.
@jacobtomlinson If possible, can you suggest a solution to this issue?
Thanks for being patient here. I've nudged the CI back into a happy state and pulled main into this PR. So hopefully any failures now will only be related to this PR.
OK, the tests are still failing here, so they are certainly hanging due to the changes in this PR.
Seems like they're timeout issues, not something failing.
Sure, but I expect the CI is timing out because the tests are hanging. The tests on
Based on discussion at #817