Recover gracefully when a PlaceholderTask is in the queue but the associated build is complete #185

Merged
merged 6 commits into from
Dec 6, 2021
Changes from 5 commits
1 change: 1 addition & 0 deletions pom.xml
@@ -71,6 +71,7 @@
    <useBeta>true</useBeta>
    <gitHubRepo>jenkinsci/${project.artifactId}-plugin</gitHubRepo>
    <hpi.compatibleSinceVersion>2.40</hpi.compatibleSinceVersion>
    <jenkins-test-harness.version>1666.vd1360abbfe9e</jenkins-test-harness.version>
  </properties>
  <dependencyManagement>
    <dependencies>
@@ -422,6 +422,22 @@ public String getCookie() {
        }

        @Override public CauseOfBlockage getCauseOfBlockage() {
            Run<?, ?> run = runForDisplay();
Member Author

I think this is safe. If we get here for a step that is just starting or resuming, the run is already loaded, so this should complete quickly. It should only be slow after a Jenkins restart when the build has already completed: we then end up here without the build having been loaded via some other route, and we trigger the cancellation path.

            if (!stopping && run != null && !run.isLogUpdated()) {
                stopping = true;
                LOGGER.warning(() -> "Refusing to build " + this + " and cancelling it because associated build is complete");
                Timer.get().execute(() -> {
                    Queue.getInstance().cancel(this);
                });
dwnusbaum marked this conversation as resolved.
            }
            if (stopping) {
                return new CauseOfBlockage() {
                    @Override
                    public String getShortDescription() {
                        return "Stopping " + getDisplayName();
                    }
                };
Comment on lines +432 to +437
Member Author

Not sure if an anonymous class or hard-coded text is ok here. My thought was that this cause should not usually be around long enough for anyone to see it, but I guess we should set up localization just in case.

Member

An anonymous class should be fine here. As you say, it ought not appear in the GUI for more than a moment if at all.

            }
            return null;
        }
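
For readers who want to see the shape of this change outside Jenkins, here is a minimal sketch of the pattern the diff appears to use: a one-shot `stopping` flag plus a cancellation handed to a background executor instead of being run inline. Everything in it is hypothetical and for illustration only: `StaleTaskGuard`, `causeOfBlockage`, `buildComplete`, and `cancelAction` are stand-ins, not Jenkins APIs, and the `synchronized` keyword is an assumption of the sketch rather than something taken from the PR.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a queued task; not a Jenkins class. */
public class StaleTaskGuard {
    private final ExecutorService background; // plays the role of Timer.get() in the diff
    private final boolean buildComplete;      // plays the role of run != null && !run.isLogUpdated()
    private final Runnable cancelAction;      // plays the role of Queue.getInstance().cancel(this)
    private boolean stopping;                 // one-shot guard, like the stopping field in the diff

    public StaleTaskGuard(ExecutorService background, boolean buildComplete, Runnable cancelAction) {
        this.background = background;
        this.buildComplete = buildComplete;
        this.cancelAction = cancelAction;
    }

    /** Returns a blockage description, or null if the task is free to run. */
    public synchronized String causeOfBlockage() {
        if (!stopping && buildComplete) {
            // First time we notice the stale task: remember it and schedule the cancel once.
            stopping = true;
            background.execute(cancelAction);
        }
        return stopping ? "Stopping stale task" : null;
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        AtomicInteger cancels = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(1);
        StaleTaskGuard task = new StaleTaskGuard(pool, true, () -> {
            cancels.incrementAndGet();
            done.countDown();
        });
        // The caller may poll repeatedly; the cancel is still scheduled only once.
        for (int i = 0; i < 3; i++) {
            System.out.println(task.causeOfBlockage());
        }
        done.await(5, TimeUnit.SECONDS);
        pool.shutdown();
        System.out.println("cancellations run: " + cancels.get());
    }
}
```

Because the flag flips before the cancel is submitted, repeated calls keep reporting the task as blocked without queueing further cancellations.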

@@ -35,6 +35,7 @@
import hudson.model.FreeStyleProject;
import hudson.model.Item;
import hudson.model.Job;
import hudson.model.Label;
import hudson.model.Node;
import hudson.model.Queue;
import hudson.model.Result;
@@ -79,6 +80,7 @@

import hudson.util.VersionNumber;
import java.nio.charset.StandardCharsets;
import java.nio.file.StandardCopyOption;
import java.util.Set;
import jenkins.model.Jenkins;
import jenkins.security.MasterToSlaveCallable;
@@ -1264,6 +1266,46 @@ public void accessPermittedOnlyFromCurrentBuild() throws Throwable {
            r.buildAndAssertSuccess(main);
        });
    }

    @Test public void placeholderTaskInQueueButAssociatedBuildComplete() throws Throwable {
        logging.record(ExecutorStepExecution.class, Level.FINE).capture(50);
        Path tempQueueFile = tmp.newFile().toPath();
        sessions.then(r -> {
            WorkflowJob p = r.createProject(WorkflowJob.class, "p");
            p.setDefinition(new CpsFlowDefinition("node('custom-label') { }", true));
            WorkflowRun b = p.scheduleBuild2(0).waitForStart();
            // Get into a state where a PlaceholderTask is in the queue.
            while (true) {
                Queue.Item[] items = Queue.getInstance().getItems();
                if (items.length == 1 && items[0].task instanceof ExecutorStepExecution.PlaceholderTask) {
                    break;
                }
                Thread.sleep(500L);
            }
            // Copy queue.xml to a temp file while the PlaceholderTask is in the queue.
            r.jenkins.getQueue().save();
            Files.copy(sessions.getHome().toPath().resolve("queue.xml"), tempQueueFile, StandardCopyOption.REPLACE_EXISTING);
            // Create a node with the correct label and let the build complete.
            DumbSlave node = r.createOnlineSlave(Label.get("custom-label"));
            r.assertBuildStatusSuccess(r.waitForCompletion(b));
            // Remove node so that tasks requiring custom-label are stuck in the queue.
            Jenkins.get().removeNode(node);
        });
        // Copy the temp queue.xml over the real one. The associated build has already completed, so the queue now
        // has a bogus PlaceholderTask.
        Files.copy(tempQueueFile, sessions.getHome().toPath().resolve("queue.xml"), StandardCopyOption.REPLACE_EXISTING);
        sessions.then(r -> {
            WorkflowJob p = r.jenkins.getItemByFullName("p", WorkflowJob.class);
            WorkflowRun b = p.getBuildByNumber(1);
            assertFalse(b.isLogUpdated());
            r.assertBuildStatusSuccess(b);
            while (Queue.getInstance().getItems().length > 0) {
                Thread.sleep(100L);
            }
            assertThat(logging.getMessages(), hasItem(startsWith("Refusing to build ExecutorStepExecution.PlaceholderTask{runId=p#")));
        });
    }
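
The test above manufactures the stale state by saving `queue.xml` while the PlaceholderTask is queued and copying it back after the build has finished. The following sketch isolates just that snapshot-and-restore step with plain `java.nio.file` calls, so it can be run without a Jenkins home. The class name `SnapshotRestoreSketch`, its helper methods, and the file contents are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical illustration of snapshotting a state file and restoring it later. */
public class SnapshotRestoreSketch {

    /** Copies the live file to a fresh temp file and returns the snapshot path. */
    static Path snapshot(Path live) throws IOException {
        Path copy = Files.createTempFile("snapshot", ".xml");
        // createTempFile already created the target, so REPLACE_EXISTING is required.
        Files.copy(live, copy, StandardCopyOption.REPLACE_EXISTING);
        return copy;
    }

    /** Overwrites the live file with the earlier snapshot. */
    static void restore(Path snapshot, Path live) throws IOException {
        Files.copy(snapshot, live, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Runs the whole cycle on temp files and returns the restored content. */
    static String demo() throws IOException {
        Path live = Files.createTempFile("state", ".xml");
        Files.write(live, "<queue>placeholder-task</queue>".getBytes(StandardCharsets.UTF_8));
        Path saved = snapshot(live);

        // The system moves on: the live state no longer contains the task.
        Files.write(live, "<queue/>".getBytes(StandardCharsets.UTF_8));

        // Restoring the snapshot reintroduces the now-stale entry.
        restore(saved, live);
        String restored = new String(Files.readAllBytes(live), StandardCharsets.UTF_8);

        Files.delete(saved);
        Files.delete(live);
        return restored;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

In the real test the restore happens between the two `sessions.then` blocks, that is, while Jenkins is not running, so the second session starts from the stale file.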

Member

BTW the reason there was no newline here before is that this step was a @TestExtension of the test formerly above it, now with the new test intervening.

    public static final class WriteBackStep extends Step {
        static File controllerFile;
        static boolean legal = true;