Environment mutation causing pod to restart #586
Comments
It does work. I am using it currently in a local |
Hello, well it does work, I've just managed to successfully deploy (on minikube 1.21.0 and kubernetes 1.21.0) the operator along with Jenkins using the helm chart from the master branch. Could you provide some more details:
Jenkins uses |
We are using EKS (1.19.6) and we do not have issues with EmptyDir for other pods. See example pod with emptydir in the same workspace:
We are deploying via helm chart using the documentation here for the operator: Here's the values file for our operator:
This was pulled from here with very minor tweaks to Snippet of jenkins-operator logs:
Here's the
All the objects for the mounts are there: cm: secrets: |
@prryb We are using helm:
We're struggling to understand why it can't find the
When we do this the operator complains that we can't define our own jenkins-home. So it seems like the operator knows what's going on. Also when we describe the pod we see the As stated and shown above we've not had any other issues with EmptyDir until now. I've noticed that in the last few months a handful of people have posted essentially the same issue under another title:
I let the operator and pod run for a while and here is consistent snippet of the events log:
I'm also including the yaml for the failing jenkins-test pod:
|
So I was reading around a bit more and digesting what I'm seeing, and honestly I think the issue is more about the operator restarting Jenkins each time the env is set by the operator. The values it's calling out during the restart are here:
This spits out in the operator log every 3 - 5 seconds. The
So I assume that It's using the Jenkins object to set these values. So the way I see it (making some assumptions here):
|
So as @prryb and @thecooldrop stated in the beginning, it turned out to be environmental. So my last message about being stuck in an endless loop on env change was correct; however, what I was wrong about is that the message we had in the operator log showed:
We have newrelic installed and it injected the following env for every container including jenkins pod managed by the operator. Adding this to the
Is there a better way to handle this? My concern is that newrelic releases a new version that includes a new ENV or changes the format of one of the existing ones such that they don't map, and I'm back to square one again. Is there a way to set up an ignore list for ENV? If nothing else, the log message didn't include the NEW_RELIC_* ENVs in the list, so it was only after beating my head against the wall that I was able to find the root cause. In short, I think there's room for an improvement here. Thanks again for rubber-ducking this for me guys! |
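For anyone wondering what such an ignore list might look like, here is a minimal Python sketch of the idea. It is not the operator's code (the operator is written in Go) and none of these names exist in it: `IGNORED_ENV_PREFIXES`, `strip_ignored`, `env_matches`, and the sample variables and values are all made up for illustration. The only point is that variables injected by a third party are dropped from both sides before the desired and actual environments are compared.

```python
from typing import Dict, Iterable

# Hypothetical ignore list: name prefixes of env vars injected by other tooling.
IGNORED_ENV_PREFIXES = ("NEW_RELIC_",)


def strip_ignored(env: Dict[str, str], prefixes: Iterable[str]) -> Dict[str, str]:
    """Return a copy of env without variables whose name starts with an ignored prefix."""
    prefixes = tuple(prefixes)
    return {name: value for name, value in env.items() if not name.startswith(prefixes)}


def env_matches(desired: Dict[str, str], actual: Dict[str, str],
                prefixes: Iterable[str] = IGNORED_ENV_PREFIXES) -> bool:
    """Compare two environments, disregarding ignored variables on both sides."""
    return strip_ignored(desired, prefixes) == strip_ignored(actual, prefixes)


# Sample data (made up): what was declared vs. what the running pod carries.
desired = {"JENKINS_HOME": "/var/lib/jenkins"}
actual = {**desired, "NEW_RELIC_APP_NAME": "jenkins"}

strict = desired == actual              # naive comparison sees a difference
tolerant = env_matches(desired, actual)  # injected variable is ignored
print(strict, tolerant)
```

Matching on a prefix rather than on exact names is what would address the worry about a future agent release adding or renaming variables, at the cost of also ignoring any variable with that prefix that you set yourself.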
Thx to @thecooldrop & @prryb for chiming in about it working well for you. This caused us to redouble our efforts and led us to confirm it was an environmental issue with our cluster. We were about ready to throw in the towel as the error about jenkins home volume was misleading. Your comments gave us some renewed hope. I updated the title to reflect the nature of the issue better in case someone else encounters a similar situation. |
Thank you @doncorsean, your comment has made my evening and rekindled my hope that I will be able to completely integrate this product in my team in a professional setup as well. I had issues with the home volume too, and it was quite unclear to me why I cannot just mount some PV, but then I dug into older issues and discovered that it is actually a feature to retain immutability. Also a tip: upgrade the Kubernetes plugin version to 1.30.0; it seems like 1.29.6 (or 1.29.7, whichever is actually the latest 1.29.x version) has a build problem and runs into |
@thecooldrop thanks for the tip! Both @doncorsean and I went through the same mental shift ourselves. We think that with the CasC (Configuration as Code) plugin, an immutable jenkins home is a great idea. |
I don't understand what the solution to this issue is. If I try to set the following env variable in the values.yaml for helm, then jenkins goes into a restart loop.
Because the env changes.
What should I set in order for this to work with jenkins-operator not restarting jenkins? |
Just want to add an answer for the previous comment, in case anyone else winds up here.
to:
which fixes it. |
@adaphi Awesome. Thanks for taking the time to share that workaround. |
Why was this issue closed? The same thing happens with jenkins-operator deployed to EKS. We want the Jenkins master to have IRSA to integrate with AWS ASG, so the AWS credential-related ENVs and volumes are dynamically attached to the pod, which also causes a restart loop. Is there any fix? |
Attached by what? A mutating webhook, or something else? |
This sort of webhook: https://github.com/aws/amazon-eks-pod-identity-webhook/ |
I attempted to configure an ephemeral ebs volume under master in the CRD, but this has the same issue because the volume config triggers the ebs-csi-driver to generate the volume in AWS and then update the pod with the volume details for mounting, which triggers jenkins-operator to restart, and round and round it goes. Looks like it's configured here - https://github.com/jenkinsci/kubernetes-operator/blob/master/pkg/configuration/base/pod.go#L22 @brokenpip3 Would you be amenable to some configuration around disabling these checks individually? Maybe a list of disabled_reconciliations and a contains() check on each conditional? I scrolled through the issues and I believe these are all related - #361 #368 #733 |
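To make the `disabled_reconciliations` suggestion concrete, here is a rough Python sketch of the shape I have in mind. It does not mirror the Go code in `pod.go`; the check names (`env`, `volumes`), the `CHECKS` table, `restart_reasons`, and the sample pod data are all hypothetical. Each comparison is registered under a name, and a membership test skips the ones the user has switched off.

```python
from typing import Callable, Dict, List, Sequence

Pod = Dict[str, list]


def _names(items: List[dict]) -> List[str]:
    """Sorted names of env or volume entries, so ordering does not matter."""
    return sorted(item["name"] for item in items)


# Hypothetical registry: check name -> function returning True when desired and actual agree.
CHECKS: Dict[str, Callable[[Pod, Pod], bool]] = {
    "env": lambda desired, actual: desired["env"] == actual["env"],
    "volumes": lambda desired, actual: _names(desired["volumes"]) == _names(actual["volumes"]),
}


def restart_reasons(desired: Pod, actual: Pod,
                    disabled_reconciliations: Sequence[str] = ()) -> List[str]:
    """Names of the enabled checks that found a difference (empty list means no restart)."""
    reasons = []
    for name, check in CHECKS.items():
        if name in disabled_reconciliations:  # the contains() check
            continue
        if not check(desired, actual):
            reasons.append(name)
    return reasons


# Sample data (made up): a controller added a volume after the pod was created.
desired = {
    "env": [{"name": "JENKINS_HOME", "value": "/var/lib/jenkins"}],
    "volumes": [{"name": "jenkins-home"}],
}
actual = {
    "env": desired["env"],
    "volumes": desired["volumes"] + [{"name": "injected-volume"}],
}

default_result = restart_reasons(desired, actual)
relaxed_result = restart_reasons(desired, actual, disabled_reconciliations=["volumes"])
print(default_result, relaxed_result)
```

Disabling a whole check is coarse: with `volumes` off, a volume change you made yourself would no longer trigger a restart either, which is part of what would need agreeing on.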
Yes, it's in my plans. As soon as I find the time to complete 0.9.0, I will introduce several changes. |
No worries; I understand your priorities. I can assist here; just want to make sure we agree on the right course. :) |
I filed #1078 to discuss the handling of projected volumes, which is a special case of an environment mutation. |
After CONSTANT struggles to get any version of the operator to work, we tried using the latest release (0.6.0). We are getting an error related to a missing volume:
Error: cannot find volume "jenkins-home" to mount into container "jenkins-master"
Steps to reproduce the behavior:
Follow the documentation here for 0.6.0 release
https://jenkinsci.github.io/kubernetes-operator/docs/installation/
Additional information
Kubernetes version: 1.19.6
Jenkins Operator version: 0.6.0
EDIT:
After debugging by @ambrons we discovered our New Relic is injecting some env vars into the jenkins pod, causing a couple of potentially misleading error messages and the pod to restart indefinitely.
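In case it helps anyone debugging something similar: one quick way to spot injected variables is to diff the env names you declared against the env names on the running container. The Python sketch below assumes both sides are lists of `{"name": ..., "value": ...}` entries, which is how container env appears in a pod manifest; the function name `injected_env_names` and the sample entries are hypothetical. Variables that the operator adds itself would show up in the diff too, so treat the output as candidates rather than proof.

```python
from typing import Dict, List

EnvEntry = Dict[str, str]


def injected_env_names(declared: List[EnvEntry], observed: List[EnvEntry]) -> List[str]:
    """Sorted names present on the running container but absent from the declared env."""
    declared_names = {entry["name"] for entry in declared}
    return sorted(entry["name"] for entry in observed if entry["name"] not in declared_names)


# Sample data (made up): declared env vs. env found on the running pod.
declared = [{"name": "JENKINS_HOME", "value": "/var/lib/jenkins"}]
observed = declared + [
    {"name": "NEW_RELIC_LICENSE_KEY", "value": "redacted"},
    {"name": "NEW_RELIC_APP_NAME", "value": "jenkins"},
]

candidates = injected_env_names(declared, observed)
print(candidates)
```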