Merging core PRs doesn't cancel PR builds #3807

Closed
NotMyFault opened this issue Oct 30, 2023 · 10 comments

Comments

@NotMyFault
Member

Service(s)

ci.jenkins.io

Summary

ref jenkinsci/jenkins#8659 (comment) and the couple of messages below.

Apparently that's the case for plugins, but it should be the case for core PRs too, especially when advancing a merge could save us a few hours of resources.

Reproduction steps

No response

@NotMyFault NotMyFault added the triage Incoming issues that need review label Oct 30, 2023
@dduportal dduportal self-assigned this Nov 1, 2023
@dduportal dduportal removed the triage Incoming issues that need review label Nov 1, 2023
@dduportal
Contributor

Hello @NotMyFault, it is a pipeline setting (so I believe a PR is needed on Core), as the job configuration only (and already) enables the "Abort Build" option for the "Orphan Item strategy".

For plugins, it is this property: https://github.com/jenkins-infra/pipeline-library/blob/ccc2b0a6ab573bb2381047aac9b495e8ad6257bb/vars/buildPlugin.groovy#L7

@dduportal
Contributor

It's weird: the behavior is already set in the Jenkinsfile: https://github.com/jenkinsci/jenkins/blob/01e30ea1df6dc87c0a891a7b42a4252bbce01f45/Jenkinsfile#L12

@dduportal
Contributor

And the configuration of https://ci.jenkins.io/job/Core/job/jenkins/ is below:

[Screenshot, 2023-11-01 at 11:02:56]

=> when a PR is merged, a webhook is sent to ci.jenkins.io, which triggers a "Scan Repository Now". The scan detects that the PR's reference branch is gone and considers the PR "Orphaned".

But there is something weird going on: I see PRs marked as "removed" that should have been deleted:

[Screenshot, 2023-11-01 at 11:06:07]

For instance: jenkinsci/jenkins#8669

Gotta check deeper

@dduportal
Contributor

After triggering a manual scan, these PRs have been GC-ed.

I wonder if the GC of orphaned builds is deferred or async: that would explain it.

@dduportal
Contributor

🤔 Checked in Datadog: no error at the time when the PR was merged.

Last recorded log line is:

[2023-10-29T22:32:02.872Z] Commit message: "Update dependabot branch for LTS 2.426.x (#8659)"

I'm not sure if the "Scan Repository" logs are collected, though.

@dduportal
Contributor

Ping @NotMyFault do you have more details to help us, it is hard to analyse the issue here (we're not sure of the exact behavior you had)

@NotMyFault
Member Author

NotMyFault commented Nov 7, 2023

Ping @NotMyFault do you have more details to help us, it is hard to analyse the issue here (we're not sure of the exact behavior you had)

I didn't merge any PRs with a running build yet, but I'll look out for one.

@zbynek

zbynek commented Nov 12, 2023

@dduportal "disable concurrent builds" will prevent two parallel builds on the master/main branch after merging, but will it abort the PR build?

@dduportal
Contributor

@dduportal "disable concurrent builds" will prevent two parallel builds on the master/main branch after merging, but will it abort the PR build?

Nope, it is unrelated. In the context of a Multi-branch pipeline, the mechanism in charge of aborting the PR build is the "Orphan Item Strategy": when the PR is merged, a webhook is sent to Jenkins, which triggers a repository scan. Jenkins then detects that the PR's source branch reference is gone and decides to remove the associated pipeline (e.g. the Pipeline sub-job from the "PR" tab). Hence the screenshot above about this strategy: it shows that Jenkins should abort the build immediately when it detects the change.

However, this is not a synchronous process, so several things might slow down the detection:

  • If the GitHub API rate limit threshold is crossed, the repository scan can take much longer than expected (10 - 20 minutes), delaying the detection of the merged PR.
  • If the webhook triggering the repository scan is lost (controller restart, network issue, etc.), then the Multibranch is not scanned until the next webhook for something else or the daily polling.
  • [Pure gut feeling] It looks like using the S3 artifact manager delays the "garbage collection": a job or a build is not deleted by the controller immediately if it is in the middle of uploading artifacts. There might be something to check in this area.
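
To make the reasoning above easier to follow, here is a toy Python model of the scan and orphan handling. It is only a sketch and does not use any Jenkins API: `PullRequestJob`, `scan_repository`, the `uploading_artifacts` flag and the PR names are all hypothetical, and the rule "a build that is uploading artifacts is not aborted or deleted by the current scan" encodes the unconfirmed gut feeling from the last bullet, not verified controller behavior.

```python
from dataclasses import dataclass


@dataclass
class PullRequestJob:
    """Hypothetical stand-in for one PR sub-job of a multibranch pipeline."""
    name: str
    build_running: bool = False
    uploading_artifacts: bool = False  # assumption: models the S3 upload hunch
    orphaned: bool = False


def scan_repository(jobs, open_heads, abort_builds=True):
    """Simulate one repository scan.

    jobs: dict mapping job name -> PullRequestJob known to the controller.
    open_heads: set of PR names whose source reference still exists remotely.
    Returns the names of the jobs deleted by this scan.
    """
    deleted = []
    for name in sorted(jobs):
        job = jobs[name]
        if name in open_heads:
            continue  # reference still exists, nothing to do
        job.orphaned = True  # shown as "removed" until it can be deleted
        # "Abort Build" option; assumed to be skipped during an artifact upload.
        if job.build_running and abort_builds and not job.uploading_artifacts:
            job.build_running = False
        if not job.build_running:
            deleted.append(name)
    for name in deleted:
        del jobs[name]
    return deleted


# Hypothetical state: PR-1 and PR-2 were merged, PR-3 is still open.
jobs = {
    "PR-1": PullRequestJob("PR-1", build_running=True, uploading_artifacts=True),
    "PR-2": PullRequestJob("PR-2", build_running=True),
    "PR-3": PullRequestJob("PR-3", build_running=True),
}
open_heads = {"PR-3"}

first_scan = scan_repository(jobs, open_heads)   # webhook-triggered scan
jobs["PR-1"].uploading_artifacts = False         # upload finishes later
second_scan = scan_repository(jobs, open_heads)  # manual or next scan
```

Under these assumptions, the first scan deletes only `PR-2` and leaves `PR-1` flagged as orphaned, and `PR-1` is collected by the second scan. That would be consistent with PRs showing as "removed" until a manual scan was triggered, but it does not prove the cause.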

@dduportal
Contributor

I'm gonna close this issue as we do not have an immediate reproduction, so the infra team may focus on something else.

Please feel free to reopen with screenshots (recordings), logs etc. if you see this behavior again on ci.jenkins.io.
