Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for exponential backoff retry #220

Open
cen1 opened this issue Jan 25, 2018 · 21 comments
Open

Add support for exponential backoff retry #220

cen1 opened this issue Jan 25, 2018 · 21 comments
Assignees

Comments

@cen1
Copy link

cen1 commented Jan 25, 2018

Failsafe already has support for this.

Preferred and less wasteful policy than fixed period when the target service could be under heavy load and unresponsive for longer periods of time.

@Emily-Jiang
Copy link
Member

sounds useful. +1 from me.

@Emily-Jiang
Copy link
Member

We can add one more method to configure the backoffFactor with the default value of 2

@carlosdlr
Copy link
Contributor

carlosdlr commented Jan 31, 2018

+1

1 similar comment
@brunobat
Copy link
Contributor

brunobat commented Feb 7, 2018

+1

@antoinesd
Copy link
Contributor

Should be ok for our implementation. So +1 for me

@Emily-Jiang Emily-Jiang added this to the 1.1 milestone Feb 7, 2018
@jeanouii
Copy link

jeanouii commented Feb 7, 2018

Very useful. We can start agressive at first and then get slower and slower until we reach a max value and decide it's dead :)

@cen1
Copy link
Author

cen1 commented Sep 18, 2018

I see the pull request is like 20 LOC, what is the holdup for the merge?

@Emily-Jiang
Copy link
Member

We talked this PR and had some questions. @antoinesd is going to update his PR accordingly. I will try to take this over, if Antoine cannot get around to do it.

@Emily-Jiang Emily-Jiang self-assigned this Oct 15, 2019
@lordofthejars
Copy link

@Emily-Jiang I am really interested in this feature as well, but instead of focusing on backoff retry. I'd create some kind of interface called RetryPolicy or something like that, so users can implement their own strategies, notice that you could try to use the exponential backoff policy but maybe you prefer a Fibonacci one or an incremental one.

So the API should look like:

@Retry(retryPolicy=FibonacciRetryPolicy.class)

@Ladicek
Copy link
Contributor

Ladicek commented Dec 2, 2019

When I was looking at this a while ago, I realized that even adding just exponential backoff results in adding not-exactly-intuitive parameters to @Retry. Perhaps externalizing the backoff policy into a separate class like @lordofthejars suggests would be an elegant way to solve that, but unfortunately it's less declarative/configuration.

I personally would lean towards using more than one annotation. Just @Retry means constant backoff, configured by delay / delayUnit. We could add an annotation @ExponentialBackoff that, together with @Retry, would mean exponential backoff, with initial backoff configured by @Retry.delay, multiplication factor e.g. @ExponentialBackoff.factor, and max delay e.g. @ExponentialBackoff.maxDelay. If more backoff strategies are needed, we could add @FibonacciBackoff etc., but I think exponential is good enough.

@lordofthejars
Copy link

Well I have been checking in Microsoft patterns library and they support three (iterative, constant, exponential) but then if you check Awaitility open source project, you'll see that also supports Fibonacci, so I guess that these four strategies might be interesting to support.

@Emily-Jiang Emily-Jiang modified the milestones: Future, 3.0 Feb 11, 2020
Emily-Jiang added a commit to Emily-Jiang/microprofile-fault-tolerance that referenced this issue Apr 14, 2020
Signed-off-by: Emily Jiang <[email protected]>
@Emily-Jiang
Copy link
Member

Let's discuss further next Tuesday and agree on the solution.

@Azquelt
Copy link
Member

Azquelt commented May 12, 2020

We discussed this on the call.

We thought:

  • There isn't a clear consensus of what the spec should include
  • We can see that there's immediate benefit to the exponential backoff strategy, but that other strategies might make more sense for certain scenarios.
    • @Ladicek found a paper concluding that a fibonacci backoff strategy was better than exponential backoff for some of the scenarios it examined.
  • We didn't see much benefit in being able to tweak the parameters of the backoff strategy (whether the delay increases exponentially, linearly or with a fibonacci pattern is more important than the variables in the formula)

For these reasons, we thought we could cover most use cases by just adding one parameter to @Retry which specified the backoff strategy to use.

Usage:

@Retry(delay=500, delayGrowth="fibonacci")

This allows us to add new strategies to the spec in the future and allows implementers to provide extra strategies if desired.

@Ladicek
Copy link
Contributor

Ladicek commented May 13, 2020

Personally, I don't really like the string approach, and some of the strategy names are too easy to misspell (exponential, fibonacci). I think we should use an enum.

On the other hand, I can see how string is appealing -- it can possibly even include a FQN of a class implementing some interface, which allows for ultimate flexibility. But there's an easy way how to allow that with an enum -- extra enum value CUSTOM, which would require a configuration property (not expressible via annotation) pointing to a class.

@Emily-Jiang
Copy link
Member

As promised, I investigated this a bit more. Resilience4j has exponential backoff and random exponential backoff. Failsafe has also exponential backoff . I think we need to support similar exponential backoff as a minimum.

@Emily-Jiang
Copy link
Member

Emily-Jiang commented May 21, 2020

We can see that there's immediate benefit to the exponential backoff strategy, but that other strategies might make more sense for certain scenarios.

@Ladicek found a paper concluding that a fibonacci backoff strategy was better than exponential backoff for some of the scenarios it examined.

It is not 100% accurate. The statement says:

One of these algorithms is Fibonacci increment backoff (FIB), FIB algorithm achieves higher throughput than the exponential backoff that is used by the standard IEEE 802.11 when it used in a mobile ad hoc network.

The performance is better when used in a mobile ad hoc network, which is very narrow use case. Remember the exponential backoff is used by the standare IEEE 802.11, which means a lot and it is very widely adopted strategy, as demonstrated by both failsafe and Resilience4j.

@Ladicek
Copy link
Contributor

Ladicek commented May 21, 2020

That's an out-of-context quote, as the entire paper focuses on that particular scenario. There are other papers that examine various backoff strategies in different contexts, e.g. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5375778/

What I mainly wanted to say is that only adding exponential backoff in the proposed way means we won't be able to add other backoff strategies in the future.

Currently, I even think that the amount of time we spent debating this shows that the design of @Retry is actively hostile against supporting more than 1 or 2 backoff strategies and should probably be revisited entirely.

@cen1
Copy link
Author

cen1 commented May 21, 2020

I would give a +1 to the single property with .class proposal since it seems to cover everything. It is type safe and extensible. Enum is nicer for regular case, uglier for custom case.

Another thought that came on my mind is whether the policy mechanism has wider use outside of @Retry, for example in @CircuitBreaker or anywhere else in fault tolerance where delays are defined.

@Ladicek
Copy link
Contributor

Ladicek commented May 26, 2020

At this point, I can think of several options that make some sense to me. I wanted to summarize them here. Beware, I don't fear of breaking changes :-)

  1. @Retry is already pretty big, and already includes one backoff strategy (constant) which has 4 (four!) configuration attributes. Backoff strategies should be separated out of the @Retry annotation:

    @Retry(maxRetries, maxDuration, maxDurationUnit, retryOn, abortOn)
    @RetryWithConstantBackoff(delay, delayUnit, jitter, jitterUnit)
    @RetryWithExponentialBackoff(initialDelay, initialDelayUnit, factor, [maxDelay, jitter, jitterUnit])
    @RetryWithFibonacciBackoff(initialDelay, initialDelayUnit, [maxDelay, jitter, jitterUnit])
    possibly others (polynomial, LILD, LIMD, MILD, MIMD, PLEB, PFB)

    When only @Retry is used, there's no delay between retries. Adding one of the annotations above defines how individual retries should be delayed. Adding more than 1 is tricky to handle, because of the interactions between method-level and class-level annotations and inheritance, but that's a problem we already have.

  2. Variant of above where the @*Backoff annotations are folded into the @Retry annotation. All backoff annotations have default values that correspond to "nothing":

    @Retry(maxRetries, maxDuration, maxDurationUnit, retryOn, abortOn,
        constantBackoff=@ConstantBackoff(...),
        exponentialBackoff=@ExponentialBackoff(...),
        fibonacciBackoff=@FibonacciBackoff(...)
    )

    Obvious downside is that it isn't possible to have sensible default values.

  3. Elaboration on the Class<? extends BackoffStrategy> attribute. At this point, I actually think this could be made pretty reasonable, using MP Config.

    @Retry(maxRetries, maxDuration, maxDurationUnit, retryOn, abortOn, delay, delayUnit, jitter, jitterUnit,
        delayGrowth = ConstantBackoff.class)
    
    interface BackoffStrategy {
        // delay and delayUnit is exactly what was configured for @Retry
        // I'm not married to this method signature (in fact I can see few better ones already), just something to demonstrate
        // jitter is applied outside of this
        Duration nextDelay(long delay, ChronoUnit delayUnit);
    }

    We would then ship these:

    class ConstantBackoff implements BackoffStrategy {
        Duration nextDelay(long delay, ChronoUnit delayUnit) {
            return Duration.of(delay, delayUnit);
        }
    }
    
    class ExponentialBackoff implements BackoffStrategy {
        final long factor;
        final long maxDelay;
    
        long last = -1;
    
        class ExponentialBackoff() {
            this.factor = ConfigProvider.getConfig().getOptionalValue(this.getClass().getName() + ".factor", Long.class).orElse(2L);
            this.maxDelay = ConfigProvider.getConfig().getOptionalValue(this.getClass().getName() + ".maxDelay", Long.class).orElse(Long.MAX_VALUE);
        }
    
        Duration nextDelay(long delay, ChronoUnit delayUnit) {
            last = (last < 0 ? delay : last) * factor;
            if (last > maxDelay) {
                last = maxDelay;
            }
            return Duration.of(last, delayUnit);
        }
    }
    
    class FibonacciBackoff implements BackoffStrategy {
        final long maxDelay;
    
        long lastA = 1;
        long lastB = 1;
    
        class ExponentialBackoff() {
            this.maxDelay = ConfigProvider.getConfig().getOptionalValue(this.getClass().getName() + ".maxDelay", Long.class).orElse(Long.MAX_VALUE);
        }
    
        Duration nextDelay(long delay, ChronoUnit delayUnit) {
            long current = lastB;
            long next = lastA + lastB;
            lastA = lastB;
            lastB = next;
            current *= delay;
            if (current > maxDelay) {
                current = maxDelay;
            }
            return Duration.of(current, delayUnit);
        }
    }

    This is a bit monstrous and requires using MP Config to configure the additional strategies. I have intentionally used this.getClass().getName() as part of the config key so that people can just create an empty subclass if they want different configuration for specific cases.

I would personally prefer option 1, obviously :-)

@Emily-Jiang Emily-Jiang removed this from the 3.0 milestone Oct 20, 2020
@asantaga
Copy link

Hey guys, can we have an update on this ER, it is something we would need for our project.

@Ladicek
Copy link
Contributor

Ladicek commented Jun 10, 2022

There's been no movement on this on the specification front. Depending on which implementation of MicroProfile Fault Tolerance you use, you may already have additional backoff strategies at your disposal. (Shameless plug: SmallRye Fault Tolerance provides exponential backoff, Fibonacci backoff, and completely custom backoff since version 5.2.0, see https://smallrye.io/blog/fault-tolerance-5-2/)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants