Added new retention policy "onJobFailure" #1265
base: master
Conversation
This new retention policy prevents the deletion of a pod when the job that was using the pod ends with a failed build result. This is useful for debugging failed jobs: you can still open a terminal in any pod and investigate log files or similar, without having to set the retention policy to 'Always' or resorting to workarounds like long timeouts.
Any update on this?
It's a good idea, but it needs more work before it can be merged.
Also consider adding an integration test under KubernetesPipelineTest.
```java
package org.csanchez.jenkins.plugins.kubernetes.pod.retention;

import hudson.Extension;
import hudson.model.*;
```
Please set your IDE to avoid wildcard imports.
```java
// small convenience function
private void LOG(Level level, String message) {
    LOGGER.log(level, () -> MODULENAME + ": " + message);
}
```
Unneeded, the class name is already printed as part of the standard formatter.
```java
@Override
public boolean shouldDeletePod(KubernetesCloud cloud, Pod pod) {
    if (cloud == null || pod == null) {
        LOG(Level.INFO, "shouldDeletePod called without actual cloud and pod");
```
Demote all this logging to FINE.
```java
Jenkins jenkins = Jenkins.getInstanceOrNull();
if (jenkins == null) {
    LOG(Level.INFO, "Couldn't get the current Jenkins reference");
    return true;
}
```
```java
Jenkins jenkins = Jenkins.getInstanceOrNull();
if (jenkins == null) {
    LOG(Level.INFO, "Couldn't get the current Jenkins reference");
    return true;
}
```

Suggested change:

```java
Jenkins jenkins = Jenkins.get();
```
```java
if (parts.length < 3) {
    LOG(Level.INFO, "runUrl has unknown format: " + runUrl);
    return null;
}
```
This is way too simplistic. If you need to look up the run, I would advise instead adding new annotations (one for the item full name, another for the build id).
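A minimal sketch of the annotation-based lookup the reviewer suggests, operating on a pod's annotations map. The annotation keys and class name here are assumptions for illustration, not established plugin conventions:

```java
import java.util.Map;

final class PodAnnotations {
    // Hypothetical annotation keys, following the reviewer's suggestion to
    // store the item full name and build id directly instead of parsing runUrl.
    static final String FULL_NAME = "jenkins/job-full-name"; // assumed key
    static final String BUILD_ID  = "jenkins/build-id";      // assumed key

    // Reads both annotations from a pod's metadata annotations map and
    // returns null when either is missing, so callers can fall back safely.
    static String[] lookup(Map<String, String> annotations) {
        if (annotations == null) {
            return null;
        }
        String fullName = annotations.get(FULL_NAME);
        String buildId = annotations.get(BUILD_ID);
        if (fullName == null || buildId == null) {
            return null;
        }
        return new String[] { fullName, buildId };
    }
}
```

With both values stored explicitly, the retention policy could resolve the run via `Jenkins.get().getItemByFullName(...)` without parsing a URL.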
Conflict with #1304
```diff
@@ -6,6 +6,7 @@
 <ol>
   <li>Never - always delete the agent pod.</li>
   <li>On Failure - keep the agent pod if it fails during the build.</li>
+  <li>On Job Failure - keep the agent pod if the build itself fails.</li>
```
It should rather be called On Build Failure.
Also, in Jenkins terminology "fails" means FAILURE: the build did not run to normal completion. UNSTABLE (ran to completion but with test failures) would be considered "successful". If you mean to use the current implementation (checking for SUCCESS), then be clear that an unstable build will also retain the pod.
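The distinction above can be sketched with a small self-contained predicate. The `Result` enum here is a stand-in for `hudson.model.Result` (not the real class), and the method names are hypothetical:

```java
// Stand-in for hudson.model.Result, covering the common build outcomes.
enum Result { SUCCESS, UNSTABLE, FAILURE, NOT_BUILT, ABORTED }

final class RetentionCheck {
    // "On Build Failure" as the reviewer describes it: keep the pod only when
    // the build did not run to normal completion (FAILURE). An UNSTABLE build
    // counts as "successful" in Jenkins terminology and the pod is deleted.
    static boolean keepPodOnBuildFailure(Result result) {
        return result == Result.FAILURE;
    }

    // The PR's current approach of checking for SUCCESS instead: anything
    // other than SUCCESS (including UNSTABLE) retains the pod.
    static boolean keepPodNotSuccess(Result result) {
        return result != Result.SUCCESS;
    }
}
```

The two predicates differ exactly on UNSTABLE, which is the reviewer's point: the help text should say which behavior the policy implements.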
The logic is comparatively simple: use the `runUrl` pod annotation to query the particular run from Jenkins and check its result. Sometimes the result isn't available yet, in which case we wait up to 30 seconds. To prevent having to import Jenkins classes into the test code, I'm only testing the utility methods.
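A self-contained sketch of the kind of `runUrl` parsing described, splitting a URL path such as `job/folder/job/my-pipeline/42/` into an item full name and a build number. Names are hypothetical, this is not the PR's exact code, and as noted in the review above, parsing `runUrl` this way is considered too simplistic compared with dedicated annotations:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

final class RunUrlParser {
    // Parses a runUrl annotation like "job/folder/job/my-pipeline/42/" into
    // (item full name, build number). Returns null for unrecognized formats.
    static Map.Entry<String, Integer> parse(String runUrl) {
        if (runUrl == null) {
            return null;
        }
        String[] parts = runUrl.split("/");
        if (parts.length < 3) {
            return null; // too short to contain a job segment and a build number
        }
        StringBuilder fullName = new StringBuilder();
        // Expect alternating "job"/<name> segments, then the build number last.
        for (int i = 0; i + 1 < parts.length - 1; i += 2) {
            if (!"job".equals(parts[i])) {
                return null;
            }
            if (fullName.length() > 0) {
                fullName.append('/');
            }
            fullName.append(parts[i + 1]);
        }
        try {
            int buildNumber = Integer.parseInt(parts[parts.length - 1]);
            return new SimpleEntry<>(fullName.toString(), buildNumber);
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

Keeping this logic in a plain utility class is what allows it to be unit-tested without importing Jenkins classes, as the description mentions.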
This fixes JENKINS-688969