Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(controller): git-push step: pull --rebase before push #3119

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

krancour
Copy link
Member

@krancour krancour commented Dec 11, 2024

Fixes #3071

Although this is WIP 👀 would still be helpful, @hiddeco. 🙏

Apart from the git bits of this, this PR is the catalyst for the step execution engine to start distinguishing between errors/failures that are terminal and ones that can be retried. This seemed like the time, because once a merge conflict that requires manual resolution has been detected, there is no point in performing any retries, no matter what the retry policy says.

I tried to implement what we discussed offline, but I think you'll have some good constructive feedback on this.

r4r

Copy link

netlify bot commented Dec 11, 2024

Deploy Preview for docs-kargo-io ready!

Name Link
🔨 Latest commit efa806f
🔍 Latest deploy log https://app.netlify.com/sites/docs-kargo-io/deploys/676073e931fa58000821bdb3
😎 Deploy Preview https://deploy-preview-3119.docs.kargo.io
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link

codecov bot commented Dec 11, 2024

Codecov Report

Attention: Patch coverage is 40.41096% with 87 lines in your changes missing coverage. Please review.

Project coverage is 51.18%. Comparing base (69493aa) to head (efa806f).

Files with missing lines Patch % Lines
internal/controller/git/work_tree.go 8.51% 40 Missing and 3 partials ⚠️
internal/directives/git_pusher.go 37.14% 20 Missing and 2 partials ⚠️
internal/directives/simple_engine_promote.go 61.22% 18 Missing and 1 partial ⚠️
internal/directives/errors.go 62.50% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3119      +/-   ##
==========================================
- Coverage   51.26%   51.18%   -0.08%     
==========================================
  Files         283      285       +2     
  Lines       25563    25680     +117     
==========================================
+ Hits        13104    13145      +41     
- Misses      11762    11834      +72     
- Partials      697      701       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@krancour krancour changed the title Krancour/pull rebase fix(controller): git-push step: pull --rebase before push Dec 11, 2024
@krancour krancour force-pushed the krancour/pull-rebase branch from 9b3979c to 0fac579 Compare December 11, 2024 05:03
@krancour krancour marked this pull request as draft December 11, 2024 05:06
internal/controller/git/errors.go Show resolved Hide resolved
internal/controller/git/work_tree.go Outdated Show resolved Hide resolved
@krancour krancour force-pushed the krancour/pull-rebase branch from 3d4ccb1 to efa806f Compare December 16, 2024 18:39
@krancour krancour marked this pull request as ready for review December 16, 2024 18:40
@krancour krancour requested a review from a team as a code owner December 16, 2024 18:40
// is indeed the maximum number of attempts.
backoff.Steps = int(*cfg.MaxAttempts)
} else {
backoff.Steps = 50
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This default seems pretty high, I think something like 10 would probably be better. Unless you have a specific reason to opt for this value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the following, and ~20 promos running concurrently and writing to the same branch, I was occasionally able to exhaust attempts:

wait.Backoff{
	Duration: 1 * time.Second,
	Factor:   2,
	Steps:    10,
	Cap:      2 * time.Minute,
	Jitter:   0.1,
}

Being aware of how extensive some users' monorepos are, I presumed 10 to be not nearly enough.

The effective defaults are now:

wait.Backoff{
	Steps:    50,
	Duration: 10 * time.Millisecond,
	Factor:   5.0,
	Jitter:   0.1,
}

I think I still like defaulting to a reasonably high number of steps, but what I definitely didn't account for here is that retry.DefaultBackoff has no cap -- which makes 50 steps absurd.

What do you think about leaving it 50, but capping it at something like 5s?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, I could use a mutex to make the pull + rebase + push a critical section of code.

Given that controllers can be sharded, it won't eliminate the need for retries entirely, but it would improve the situation considerably.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was using this default backoff because, iirc, you'd suggested it, but I'm rethinking that.

In general, the best way to avoid conflicts will be spacing out the retries of the multiple concurrent promos...

That's got me thinking about a larger starting interval and a larger jitter.

At the same time, increasing the interval by a factor of five at each step seems way too aggressive. Even two seems aggressive.

I'm contemplating something like this:

wait.Backoff{
	Steps:    10,
	Duration: time.Second,
	Factor:   1.5,
	Jitter:   0.5,
	Cap:   30 * time.Second,
}

I'll try that in the morning to see how it performs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

git-push subject to race conditions when many stages push to same mono repo
2 participants