Build repeatedly triggers for Open PR where the source branch has been deleted #219

Closed
stevemuskiewicz opened this issue Mar 17, 2020 · 24 comments

Comments

@stevemuskiewicz

With plugin version 1.4.28, builds are repeatedly triggered for an open PR even though its source branch has been deleted (and obviously the builds all fail).

The plugin should probably check that the source branch exists before triggering a build.
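
The check suggested here could look roughly like the sketch below. It is not the plugin's code: `branch_url`, `should_trigger_build`, and `fetch_status` are hypothetical names, `acme/repo` is a placeholder, and it assumes the Bitbucket Cloud 2.0 endpoint `GET /2.0/repositories/{workspace}/{repo_slug}/refs/branches/{name}` answers 404 for a branch that no longer exists. The HTTP call is injected so the decision logic can be exercised without network access.

```python
from typing import Callable
from urllib.parse import quote

# Assumed API root for Bitbucket Cloud; adjust if your setup differs.
API_ROOT = "https://api.bitbucket.org/2.0"


def branch_url(workspace: str, repo_slug: str, branch: str) -> str:
    """Build the (assumed) URL that describes a single branch."""
    # quote() leaves "/" unescaped by default; whether Bitbucket expects
    # that for nested branch names is not verified here.
    return (
        f"{API_ROOT}/repositories/{quote(workspace)}/{quote(repo_slug)}"
        f"/refs/branches/{quote(branch)}"
    )


def should_trigger_build(
    workspace: str,
    repo_slug: str,
    source_branch: str,
    fetch_status: Callable[[str], int],
) -> bool:
    """Return False only when the source branch is reported missing (404)."""
    status = fetch_status(branch_url(workspace, repo_slug, source_branch))
    # Only a 404 is treated as "branch deleted". Other statuses, including
    # transient 5xx errors, do not prove the branch is gone.
    return status != 404


# Usage with a fake fetcher standing in for a real HTTP client:
print(should_trigger_build("acme", "repo", "deleted-branch", lambda url: 404))
print(should_trigger_build("acme", "repo", "main", lambda url: 200))
```

Treating only 404 as "missing" is a deliberate choice in this sketch, so that a temporary API failure does not silently suppress builds.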

@CodeMonk
Collaborator

CodeMonk commented Mar 17, 2020 via email

@stevemuskiewicz
Author

I will attempt if I get a chance, but currently we are forced to run with 1.4.28 due to the unresolved issues reported in #193

Hard for me to tell if 1.5.0 addressed the above issue or not.

@CodeMonk
Collaborator

CodeMonk commented Mar 17, 2020 via email

@stevemuskiewicz
Author

@CodeMonk understood and appreciated. However, as my previous comment in #193 indicated, I found a pretty specific combination where the issue was reproducing, and it seems another commenter offered feedback about seeing the issue as well.

I am willing to help reproduce further with the caveats that I am not a Java developer and that I am trying to maintain a sizable production build cluster with limited timeframes for "testing" new plugin versions, so aside from providing logs or attempting other tests, I am kind of at a loss as to how I can debug this further...if you have suggestions as to any specific things I can try, I will go and attempt to do that.

thanks!

@CodeMonk
Collaborator

CodeMonk commented Mar 17, 2020 via email

@stevemuskiewicz
Author

@CodeMonk thanks!

So for this bug, repro is pretty straightforward, just open a PR in Bitbucket cloud and delete the source branch from bitbucket.org without closing/declining the PR. New builds should get triggered on the PR for every single polling interval of the plugin.

For #193, we have the PR builder configured with "Rebuild if destination branch changes" and "Cancel outdated jobs" enabled in the job config. We get a particular PR building (our builds take close to an hour) and, while that is running, push another commit to the PR branch. What we observe is that the PR builder posts a Bitbucket "in progress" build status on the new commit SHA, but it does not cancel the prior (now outdated) job and no new job is triggered against the new commit (the build status in Bitbucket ends up just linking to our top-level Jenkins URL, not a specific job ID).

Hope that provides enough information, if not please let me know if you are looking for more specifics.

thanks!

@CodeMonk
Collaborator

CodeMonk commented Mar 25, 2020 via email

@CodeMonk
Collaborator

Did you delete the branch before the build started?

I deleted after it started, and the build happened to fail, but, it hasn't kicked off again.

@CodeMonk
Collaborator

Now trying a successful build, deleting the branch after the build has started.

@CodeMonk
Collaborator

> So for this bug, repro is pretty straightforward, just open a PR in Bitbucket cloud and delete the source branch from bitbucket.org without closing/declining the PR. New builds should get triggered on the PR for every single polling interval of the plugin.

I am trying, but I have been unable to reproduce this so far.

> For #193, we have the PR builder configured with "Rebuild if destination branch changes" and "Cancel outdated jobs" enabled in the job config. We get a particular PR building (our builds take close to an hour) and, while that is running, push another commit to the PR branch. What we observe is that the PR builder posts a Bitbucket "in progress" build status on the new commit SHA, but it does not cancel the prior (now outdated) job and no new job is triggered against the new commit (the build status in Bitbucket ends up just linking to our top-level Jenkins URL, not a specific job ID).

So, that behavior works perfectly for me. We often push updates to a PR, previous ones are aborted, and a new one starts. However:

  1. Jenkins sometimes cannot interrupt an individual instruction. So, if your Jenkinsfile does something like docker run ubuntu sleep 300, the docker process will usually ignore the Ctrl-C (because -i wasn't passed), and the instruction can take up to five minutes to finish, which delays the abort.
  2. When a build is first triggered, it reports the in-progress status to Bitbucket before it has an actual job or executor. While that is annoying, and a bug, it is low priority to fix and does not cause much harm.
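
The delay described in item 1 comes from a step that does not react to the interrupt. A general way to bound that delay, shown here outside Jenkins and Docker, is to ask the child process to stop and force-kill it after a grace period. This is a minimal sketch under stated assumptions: `stop_process` and `start_long_step` are hypothetical helpers, the Python sleeper only stands in for something like `sleep 300`, and nothing here reflects how Jenkins or the plugin aborts builds.

```python
import subprocess
import sys


def start_long_step() -> subprocess.Popen:
    """Start a stand-in for a slow build step (sleeps for 300 seconds)."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(300)"]
    )


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5.0) -> int:
    """Ask a child process to exit, then force-kill it if it does not."""
    if proc.poll() is None:
        # Polite request first: SIGTERM on POSIX, TerminateProcess on Windows.
        proc.terminate()
        try:
            proc.wait(timeout=grace_seconds)
        except subprocess.TimeoutExpired:
            # The child ignored the request; kill() cannot be ignored.
            proc.kill()
            proc.wait()
    return proc.returncode


# Usage: the step stops within the grace period instead of after 300 seconds.
step = start_long_step()
exit_code = stop_process(step, grace_seconds=2.0)
print("step stopped, exit code:", exit_code)
```

The grace period is the trade-off: a longer one gives the step time to clean up (for example, removing containers), a shorter one makes aborts take effect sooner.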

@stevemuskiewicz
Author

> Did you delete the branch before the build started?
>
> I deleted after it started, and the build happened to fail, but, it hasn't kicked off again.

I'm not certain of the timing. I think the branch/PR were up and ran normally, then after some period of time the developer deleted the branch without closing the PR. I think initially it didn't get triggered, but at some point something (maybe a Jenkins master restart or, more likely, a Bitbucket outage) caused the builder to start retriggering PRs, at which point it would trigger this PR during every single polling interval.

@CodeMonk
Collaborator

Still behaving well. Going to try a branch deletion BEFORE the job does checkout scm now.

@stevemuskiewicz
Author

> Still behaving well. Going to try a branch deletion BEFORE the job does checkout scm now.

Do you have a way to "simulate" a Bitbucket outage or something similar that seems to cause the builder to retrigger everything? I think that may be the key to repro...
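
If the outage theory is worth testing, one option is a local stub that answers every request with 503. The sketch below is only that stub: `OutageHandler`, `start_outage_server`, and `probe` are hypothetical names, the response body is made up, and `acme/repo` is a placeholder. How to route the plugin's requests to it (proxy, hosts entry, or similar) is not covered in this thread, and a real outage may look different, for example timeouts instead of 503 responses.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class OutageHandler(BaseHTTPRequestHandler):
    """Answer every request with 503, as an unavailable API might."""

    def _unavailable(self):
        # Made-up error body; a real outage response may look different.
        body = b'{"type": "error", "error": {"message": "Service unavailable"}}'
        self.send_response(503)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    do_GET = do_POST = do_PUT = do_DELETE = _unavailable

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_outage_server(port: int = 0) -> HTTPServer:
    """Start the stub on localhost; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), OutageHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def probe(url: str) -> int:
    """Return the HTTP status for a GET request, bypassing any proxy settings."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


# Usage: confirm the stub behaves like an unavailable service.
stub = start_outage_server()
stub_port = stub.server_address[1]
print(probe(f"http://127.0.0.1:{stub_port}/2.0/repositories/acme/repo/pullrequests"))
stub.shutdown()
stub.server_close()
```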

@stevemuskiewicz
Author

> > So for this bug, repro is pretty straightforward, just open a PR in Bitbucket cloud and delete the source branch from bitbucket.org without closing/declining the PR. New builds should get triggered on the PR for every single polling interval of the plugin.
>
> I am trying, but I have been unable to reproduce this so far.
>
> > For #193, we have the PR builder configured with "Rebuild if destination branch changes" and "Cancel outdated jobs" enabled in the job config. We get a particular PR building (our builds take close to an hour) and, while that is running, push another commit to the PR branch. What we observe is that the PR builder posts a Bitbucket "in progress" build status on the new commit SHA, but it does not cancel the prior (now outdated) job and no new job is triggered against the new commit (the build status in Bitbucket ends up just linking to our top-level Jenkins URL, not a specific job ID).
>
> So, that behavior works perfectly for me. We often push updates to a PR, previous ones are aborted, and a new one starts. However:
>
>   1. Jenkins sometimes cannot interrupt an individual instruction. So, if your Jenkinsfile does something like docker run ubuntu sleep 300, the docker process will usually ignore the Ctrl-C (because -i wasn't passed), and the instruction can take up to five minutes to finish, which delays the abort.
>   2. When a build is first triggered, it reports the in-progress status to Bitbucket before it has an actual job or executor. While that is annoying, and a bug, it is low priority to fix and does not cause much harm.

Fair enough. Our build is Docker-based and we do tend to have some issues with aborted builds leaving containers around. However, on 1.4.28 it definitely still seems to abort the build as expected and retrigger another within a 5-10 minute span, whereas with 1.4.30 this was definitely not what we were seeing. As I said, I unfortunately don't have much time to experiment, but I will check with our devs and see if maybe this weekend I can try 1.5.0 and reproduce a specific condition like this. It did seem to be mostly working; it was just the aborting of outdated builds that wasn't working the way it does with 1.4.28 for us. That is a common pattern with our PRs, so we really need that feature to work reliably so we don't end up testing outdated builds.

Thanks for your help and efforts with this one!

@CodeMonk
Collaborator

Got the build to fail because checkout_scm failed (because branch was deleted).

So far, no extra builds kicked off.

I'm trying a restart of jenkins next. If a restart doesn't make this do the loop, I'm going to have to ask for your logs.

@CodeMonk
Collaborator

Ok - I have been completely unable to reproduce this.

Do you have any other triggers you are using (other than the BBPRB kicking it off)?

I am ONLY able to get repeated builds for each change I push, or the comment trigger. Deleting the branch did not cause it to re-build, and definitely doesn't cause things to build multiple times. The result of the build, failed or not, just goes to the PR.

Can you please attach logs from a failure? From a cycle of two rebuilds? On 1.5.0 preferably?

@CodeMonk
Collaborator

CodeMonk commented Mar 25, 2020

Actually, go ahead and send me logs from your version. If I can find the root cause in any version, then I should be able to either reproduce on 1.5.0, or confirm whether or not the bug was fixed.

Log level FINE, if possible.

@stevemuskiewicz
Author

For now all I have is this (default log level); it just repeats over and over for each BBPRB polling interval:

2020-03-16 01:06:19.018+0000 [id=43] WARNING b.b.bitbucket.ApiClient#send: Response status: HTTP/1.1 404 Not Found | URI: https://bitbucket.org/api/2.0/repositories/plexxi/connect/pullrequests/7517/approve | Response body: {"type": "error", "error": {"message": "You haven't approved this pull request."}}
2020-03-16 01:06:19.238+0000 [id=165962] WARNING b.b.bitbucket.ApiClient#send: Response status: HTTP/1.1 404 Not Found | URI: https://bitbucket.org/api/2.0/repositories/plexxi/connect/commit/1d031bc435b9/statuses/build | Response body: {"data": {"shas": ["1d031bc435b9"]}, "type": "error", "error": {"message": "Commit not found", "data": {"shas": ["1d031bc435b9"]}}}
2020-03-16 01:07:08.357+0000 [id=165962] WARNING b.b.bitbucket.ApiClient#send: Response status: HTTP/1.1 404 Not Found | URI: https://bitbucket.org/api/2.0/repositories/plexxi/connect/commit/1d031bc435b9/statuses/build | Response body: {"data": {"shas": ["1d031bc435b9"]}, "type": "error", "error": {"message": "Commit not found", "data": {"shas": ["1d031bc435b9"]}}}
2020-03-16 01:07:08.363+0000 [id=165962] INFO j.p.s.l.SlackNotificationsLogger#info: [CFM - Pull Request #12291] found #12290 as previous completed, non-aborted build
2020-03-16 01:08:02.300+0000 [id=41] WARNING b.b.bitbucket.ApiClient#send: Response status: HTTP/1.1 404 Not Found | URI: https://bitbucket.org/api/2.0/repositories/plexxi/connect/commit/1d031bc435b9/statuses/build/jenkins-eff682f47713b6347ce8c9b8a172c1d9 | Response body: {"data": {"shas": ["1d031bc435b9"]}, "type": "error", "error": {"message": "Commit not found", "data": {"shas": ["1d031bc435b9"]}}}
2020-03-16 01:08:18.660+0000 [id=41] WARNING b.b.bitbucket.ApiClient#send: Response status: HTTP/1.1 404 Not Found | URI: https://bitbucket.org/api/2.0/repositories/plexxi/connect/pullrequests/7517/approve | Response body: {"type": "error", "error": {"message": "You haven't approved this pull request."}}
2020-03-16 01:08:18.864+0000 [id=165983] WARNING b.b.bitbucket.ApiClient#send: Response status: HTTP/1.1 404 Not Found | URI: https://bitbucket.org/api/2.0/repositories/plexxi/connect/commit/1d031bc435b9/statuses/build | Response body: {"data": {"shas": ["1d031bc435b9"]}, "type": "error", "error": {"message": "Commit not found", "data": {"shas": ["1d031bc435b9"]}}}
2020-03-16 01:09:03.841+0000 [id=165983] WARNING b.b.bitbucket.ApiClient#send: Response status: HTTP/1.1 404 Not Found | URI: https://bitbucket.org/api/2.0/repositories/plexxi/connect/commit/1d031bc435b9/statuses/build | Response body: {"data": {"shas": ["1d031bc435b9"]}, "type": "error", "error": {"message": "Commit not found", "data": {"shas": ["1d031bc435b9"]}}}
2020-03-16 01:09:03.848+0000 [id=165983] INFO j.p.s.l.SlackNotificationsLogger#info: [CFM - Pull Request #12292] found #12291 as previous completed, non-aborted build
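
To see at a glance which API calls keep failing in output like this, the entries can be grouped by status and URI. The following is a rough sketch, assuming the log format shown in this comment; `ENTRY_RE` and `count_failures` are hypothetical names and not part of the plugin. The pattern tolerates a line break after the logger name, so it works whether the entries are one per line or run together.

```python
import re
from collections import Counter

# Matches one ApiClient "Response status" entry in the format shown above.
ENTRY_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4}) "
    r"\[id=(?P<thread>\d+)\] (?P<level>[A-Z]+) (?P<logger>\S+):\s+"
    r"Response status: HTTP/\d\.\d (?P<status>\d{3}) [^|]*\| URI: (?P<uri>\S+)"
)


def count_failures(log_text: str) -> Counter:
    """Count API responses with status >= 400, keyed by (status, URI)."""
    counts: Counter = Counter()
    for match in ENTRY_RE.finditer(log_text):
        status = int(match.group("status"))
        if status >= 400:
            counts[(status, match.group("uri"))] += 1
    return counts


def report(log_text: str) -> None:
    """Print the failing calls, most frequent first."""
    for (status, uri), hits in count_failures(log_text).most_common():
        print(f"{hits:4d}  {status}  {uri}")
```

Calling `report()` on the pasted log text would show how often each URI failed within the captured window; entries that are not API responses, such as the SlackNotificationsLogger lines, are skipped.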

Probably not much to go on I'm sure but that's all we have at the moment. If I can attempt to repro this weekend will try to crank up the logging level with it.

@stevemuskiewicz
Author

one other piece of info, looking at the BB PR there was a prior successful build logged at commit ID 1d031bc435b9

@CodeMonk
Collaborator

You never mentioned that you have it set to approve the PR. Checking the approval code.

@CodeMonk
Collaborator

CodeMonk commented Mar 25, 2020

Try using the GUI to set logging to FINE for bbprb and see if you can get more logs from the console.
If the logs above were from the GUI, try to also get me the relevant lines from /var/log/jenkins/jenkins.log

I still don't see in the code how that could ever happen.

@CodeMonk
Collaborator

CodeMonk commented Apr 2, 2020

Did you ever manage to get FINE logs?

@stevemuskiewicz
Author

not yet, I'm just going to spin up another jenkins instance and then try to repro using a test forked repo so I can do this at my own pace rather than wait for windows when our jenkins cluster isn't in use by our dev team (which aren't very frequent anymore). Hopefully I can do this in the next couple of days, sorry for the delay on this.

@stevemuskiewicz
Author

This does appear to be resolved in 1.5.0

(of course now we are running into #193 which is more problematic for us than this issue so I'll probably need to downgrade again due to that issue...)
