Prevent StepExecutionIterator from leaking memory in cases where a single processed execution has a stuck CPS VM thread #347
Changes from 4 commits
ce222c4
9b6c0ae
b2c93a4
7100611
2de670d
b15fde4
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -38,12 +38,14 @@ | |
import hudson.model.TaskListener; | ||
import hudson.model.queue.QueueTaskFuture; | ||
import java.io.Serializable; | ||
import java.lang.ref.WeakReference; | ||
import java.time.Duration; | ||
import java.time.Instant; | ||
import java.util.Collections; | ||
import java.util.Set; | ||
import java.util.function.Supplier; | ||
import java.util.logging.Level; | ||
import jenkins.model.Jenkins; | ||
import org.hamcrest.Matcher; | ||
import org.jenkinsci.plugins.workflow.cps.CpsFlowDefinition; | ||
import org.jenkinsci.plugins.workflow.job.WorkflowJob; | ||
|
@@ -60,6 +62,7 @@ | |
import org.jvnet.hudson.test.Issue; | ||
import org.jvnet.hudson.test.LoggerRule; | ||
import org.jvnet.hudson.test.JenkinsSessionRule; | ||
import org.jvnet.hudson.test.MemoryAssert; | ||
import org.jvnet.hudson.test.TestExtension; | ||
import org.kohsuke.stapler.DataBoundConstructor; | ||
|
||
|
@@ -160,6 +163,32 @@ | |
}); | ||
} | ||
|
||
@Test public void stepExecutionIteratorDoesNotLeakBuildsWhenOneIsStuck() throws Throwable { | ||
sessions.then(r -> { | ||
var notStuck = r.createProject(WorkflowJob.class, "not-stuck"); | ||
notStuck.setDefinition(new CpsFlowDefinition("semaphore('wait')", true)); | ||
var notStuckBuild = notStuck.scheduleBuild2(0).waitForStart(); | ||
SemaphoreStep.waitForStart("wait/1", notStuckBuild); | ||
WeakReference<Object> notStuckBuildRef = new WeakReference<>(notStuckBuild); | ||
// Create a Pipeline that runs a long-lived task on its CpsVmExecutorService, causing it to get stuck. | ||
var stuck = r.createProject(WorkflowJob.class, "stuck"); | ||
stuck.setDefinition(new CpsFlowDefinition("echo 'test message'; Thread.sleep(Integer.MAX_VALUE)", false)); | ||
var stuckBuild = stuck.scheduleBuild2(0).waitForStart(); | ||
r.waitForMessage("test message", stuckBuild); | ||
Thread.sleep(1000); // We need Thread.sleep to be running in the CpsVmExecutorService. | ||
// Make FlowExecutionList$StepExecutionIteratorImpl.applyAll submit a task to the CpsVmExecutorService | ||
// for stuck #1 that will never complete, so the resulting future will never complete. | ||
StepExecution.applyAll(e -> null); | ||
Should this return value be kept in a local variable?
It's not necessary for the leak, if that's what you mean. Even if the |
||
// Let notStuckBuild complete and check that it can be GC'd. | ||
SemaphoreStep.success("wait/1", null); | ||
r.waitForCompletion(notStuckBuild); | ||
notStuckBuild = null; // Clear out the local variable in this thread. | ||
Jenkins.get().getQueue().clearLeftItems(); // We don't want to wait 5 minutes. | ||
MemoryAssert.assertGC(notStuckBuildRef, true); | ||
Revert the fix and this will fail. The PR description shows the reference path preventing the build from being cleaned up. |
||
// TODO: Test cleanup hangs for 1 minute in CpsFlowExecution.suspendAll because the checkpoint task can't run. | ||
dwnusbaum marked this conversation as resolved.
|
||
}); | ||
} | ||
|
||
public static class NonResumableStep extends Step implements Serializable { | ||
public static final long serialVersionUID = 1L; | ||
@DataBoundConstructor | ||
|
Yikes. Project Loom would be very welcome in code like this.
Looking more carefully through Guava's Javadoc, I think I can use FluentFuture to simplify this.
Actually, FluentFuture doesn't really make things any clearer, so I will leave it.
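For readers following the test in the diff above, here is a minimal sketch in plain Java of the kind of leak it guards against. This is not the plugin's code. `NEVER_COMPLETES`, `whenAllDone`, `trackBuild`, and `becomesCollectable` are hypothetical names; `CompletableFuture` stands in for the futures produced by the CPS VM executor; and the `System.gc()` loop is only a rough substitute for `MemoryAssert.assertGC`.

```java
import java.lang.ref.WeakReference;
import java.util.concurrent.CompletableFuture;

class StuckFutureLeakSketch {
    // Hypothetical stand-in for a task queued behind a stuck thread: its future never completes.
    static final CompletableFuture<Void> NEVER_COMPLETES = new CompletableFuture<>();

    // Registers a callback that runs once every future is done. The callback captures `build`,
    // so `build` stays strongly reachable for as long as any pending future holds the callback.
    static void whenAllDone(Object build, CompletableFuture<?>... futures) {
        CompletableFuture.allOf(futures).thenRun(() -> build.hashCode());
    }

    // Creates a "build", optionally ties it to the stuck future, and returns only a weak reference to it.
    static WeakReference<Object> trackBuild(boolean waitOnStuckFuture) {
        Object build = new Object();
        CompletableFuture<Void> finished = CompletableFuture.completedFuture(null);
        if (waitOnStuckFuture) {
            whenAllDone(build, finished, NEVER_COMPLETES); // callback never fires, build is retained
        } else {
            whenAllDone(build, finished); // callback fires immediately, nothing is retained
        }
        return new WeakReference<>(build);
    }

    // Best-effort check: requests GC a few times and reports whether the referent was cleared.
    // System.gc() is only a hint, so a false result here is not proof of a leak.
    static boolean becomesCollectable(WeakReference<?> ref) {
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("collectable while a stuck future holds the callback: "
                + becomesCollectable(trackBuild(true)));
        System.out.println("collectable when every future completes: "
                + becomesCollectable(trackBuild(false)));
    }
}
```

In the first case the object can never be cleared, because the static, never-completing future keeps the callback (and through it the object) strongly reachable. In the second case the object is typically cleared on a stock JVM, but that depends on the collector honoring `System.gc()`.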