
Separate retry settings for different exception types on the same method #436

Closed
tlvlp opened this issue Jun 9, 2021 · 8 comments

@tlvlp

tlvlp commented Jun 9, 2021

Problem summary
Currently, only one @Retry annotation can be placed on each method.
Although the annotation can filter for multiple exceptions, they can only be handled with the same set of retry parameters.

E.g. it would be useful to handle an exception that signals that a remote service is unavailable with a longer retry delay,
and another exception that is thrown for a race condition with a much shorter one.

Workaround
In our case it was possible to extract the relevant section into a separate method with its own @Retry annotation,
but it would improve readability to have both retries next to each other on the original method, instead of hiding one of them away.

Also, there might be cases where two or more exceptions thrown from the same, potentially third-party, library require different handling, and this workaround would fall short there.

Possible solutions

  1. Allowing multiple retry annotations.
@Retry(...)
@Retry(...)
void retried() {...}
  2. Grouping retry annotations (a possible shape for such a container annotation is sketched after this list)
@RetryWith(retries = {
    @Retry(...),
    @Retry(...)
})
void retried() {...}
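
For reference, a minimal sketch of what the grouping variant could look like; @RetryWith is hypothetical and does not exist in MicroProfile Fault Tolerance or SmallRye:

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

import org.eclipse.microprofile.faulttolerance.Retry;

// Hypothetical container annotation grouping several @Retry configurations;
// the implementation would pick one of them based on the thrown exception.
@Target({ElementType.METHOD, ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
@interface RetryWith {
    Retry[] retries();
}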

Possible caveats
An exception could match multiple @Retry annotations. In this case, only the first matching @Retry should be used.

@Ladicek
Contributor

Ladicek commented Jun 10, 2021

I'd be reluctant to do this (just like I'm reluctant to support multiple @Fallback annotations for different exceptions), because I really don't want to invent a new syntax for try/catch. Fault tolerance annotations are not a new mechanism for exception handling.

Also, as you acknowledge, there's a problem selecting the correct handler. Java's try/catch uses "first match wins", but other options are possible ("most specific match wins" or "last match wins"), and they are all kinda sensible.
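
To make the divergence concrete (hypothetical, since @Retry is not repeatable today, so this would not even compile), imagine the two annotations list exception types related by inheritance:

// A thrown FileNotFoundException matches both annotations (it is an IOException).
// "First match wins" applies the 1-minute rule, "most specific match wins" applies the 5 ms rule,
// and "last match wins" would also pick the 5 ms rule here, but only because of the declaration order.
@Retry(delay = 1, delayUnit = ChronoUnit.MINUTES, retryOn = IOException.class)
@Retry(delay = 5, delayUnit = ChronoUnit.MILLIS, retryOn = FileNotFoundException.class)
byte[] readRemoteFile(String path) {
    ...
}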

On the other hand, this is like the 2nd or 3rd time someone asks for [something like] this. In retrospect, I guess these requests started after we added the ability to skip/ignore certain fault tolerance strategies for certain exceptions. So I'm not really sure what I think about it...

@tlvlp
Author

tlvlp commented Jun 10, 2021

try / catch comparison + clarification:
I understand your reluctance, and indeed twisting an API into something it was not meant for should be avoided.
With @Fallback it is much easier to misuse it as a catch block, but in the case of @Retry the handling logic would remain the same for all cases; the difference would only be in the resource efficiency of the retries.

Covering more than one case where a retry is needed (service not available + service returned an error code) with different retry intervals is no less relevant than the single case that the annotation covers.
It just leaves the other, equally important case(s) to be handled by e.g. a retry loop inside the same method.

That would leave certain retry use cases of the same call nice and clean, with the safety of an identical context and state between retries, while the rest of the use cases would be more error-prone, harder to read and, most importantly, inconsistent.
Then again, the whole fault tolerance logic could be handled in such a retry loop, but the whole point of this approach is not having to do that.

Exception matching:
With more than one acceptable matching strategy, picking one and adding a line of documentation to the interface seems to be a working approach.

@Ladicek
Contributor

Ladicek commented Jun 11, 2021

I thought about this a little and I think repeatable @Retry is bad.

Imagine a service guarded with @Retry that can throw 3 exceptions: BusinessException, where retry doesn't make sense because the operation is intrinsically wrong and no number of retries can possibly help; TransientException, where retrying in a short while has a high chance of helping; and RateLimitException, where you have to wait longer, but retrying after some time may succeed.

Say I declare it like this:

@Retry(delay = 1, delayUnit = ChronoUnit.MINUTES, retryOn = RateLimitException.class, abortOn = BusinessException.class)
@Retry(delay = 5, delayUnit = ChronoUnit.SECONDS, retryOn = TransientException.class, abortOn = BusinessException.class)
Result myService(Parameters params) {
    ...
}

Sounds like something you're after. However, @Retry has some other attributes. The interesting ones here are maxRetries and maxDuration. What happens if these are configured differently on each annotation instance? Do we force users to always keep them in sync? What other attributes have to be in sync?

You might say that this problem already exists if I declare a @Retry on one method, which immediately calls another method that also declares @Retry:

@Retry(delay = 1, delayUnit = ChronoUnit.MINUTES, retryOn = RateLimitException.class, abortOn = BusinessException.class)
Result myService(Parameters params) {
    // note that for this to work, self-interception must be supported (it is in Quarkus,
    // but only for non-private methods, so `doMyService` must not be `private`)
    // otherwise, I'd have to use self-injection and call this method on the injected instance of this class
    return doMyService(params);
}

@Retry(delay = 5, delayUnit = ChronoUnit.SECONDS, retryOn = TransientException.class, abortOn = BusinessException.class)
Result doMyService(Parameters params) {
    ...
}

And indeed it is true, the same problem exists. But here, it's much easier to understand what actually happens, because these are 2 different retries, one nested inside the other. In fact, in this setup, having different maxRetries or maxDuration may even make sense. You could, say, allow at most 20 retries on doMyService but at most 5 retries on myService, and it's still quite easy to understand.
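
Concretely, adding maxRetries to the same (hypothetical) snippet, the nested variant with different limits still reads naturally:

// Outer retry: at most 5 retries of myService on RateLimitException;
// each of those attempts internally allows up to 20 retries of doMyService on TransientException.
@Retry(maxRetries = 5, delay = 1, delayUnit = ChronoUnit.MINUTES, retryOn = RateLimitException.class, abortOn = BusinessException.class)
Result myService(Parameters params) {
    return doMyService(params);
}

@Retry(maxRetries = 20, delay = 5, delayUnit = ChronoUnit.SECONDS, retryOn = TransientException.class, abortOn = BusinessException.class)
Result doMyService(Parameters params) {
    ...
}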

With repeatable @Retry, not so much.

@tlvlp
Author

tlvlp commented Jun 14, 2021

Annotating nested methods:
The proposed solution of immediately calling another method with its own annotation should work in many cases, and it is similar to the workaround that we ended up using before I wrote this ticket, but it has some flaws besides the readability and consistency concerns mentioned above:

  • If an exception listed in one of the @Retry annotations of the lower methods also matches, by inheritance, one or more of the annotations on the upper methods (which would make this case hard to detect), the retries might multiply, but I guess from the API design perspective this is unavoidable either way.
  • When chaining methods like this while having to keep all of them public, it is easier to misuse the class's API.
    E.g. with the above example it is impossible to ensure that doMyService(..) will not be called instead of myService(..), which would silently alter the retry behaviour without changing the business logic or the output.

Multiple Retry annotations would probably not have worked:
I just realized that to use multiple instances of the same annotation, it needs to have the @Repeatable meta-annotation, and since @Retry does not belong to this project but to the upstream MicroProfile Fault Tolerance, there is no chance to implement it this way.

But looking at the current implementation, we would have to store all the retry alternatives during annotation collection, and since we are not yet aware of the exception, we cannot populate all the attributes up front like we do with a single @Retry. So we would have to do that at runtime and add the selection of the correct @Retry to the method that actually handles the exception.
On the first run it would have to populate all the attributes (maxRetries, etc.). But if the annotated method starts to throw a different exception in the middle of the retries, the attributes would change and strange things could happen, e.g. the retry counter is at 13 and maxRetries suddenly changes to 5, and there are many other cases as well.
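
A rough sketch of that hazard, purely illustrative (this is not SmallRye's actual retry engine; RetryConfig and selectConfig are made-up placeholders):

// Illustrative only: re-selecting the matching config per thrown exception makes the
// counter/limit comparison inconsistent once the exception type changes mid-retry.
static <T> T retryCall(java.util.concurrent.Callable<T> invocation) throws Exception {
    int attempt = 0;
    while (true) {
        try {
            return invocation.call();
        } catch (Exception e) {
            RetryConfig config = selectConfig(e); // hypothetical: picks the @Retry matching `e`
            if (attempt >= config.maxRetries()) { // hazard: attempt may already be at 13 while the
                throw e;                          // newly selected config says maxRetries == 5
            }
            attempt++;
            Thread.sleep(config.delayMillis());
        }
    }
}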

So it feels like with the current implementation, which was clearly created with a single @Retry annotation in mind, it might be too much effort to add this feature.

@Ladicek
Contributor

Ladicek commented Jun 14, 2021

So it feels like with the current implementation, which was clearly created with a single @Retry annotation in mind

That is correct, but it isn't set in stone. What I'm more interested in is...

But looking at the current implementation, we would have to store all the retry alternatives during annotation collection, and since we are not yet aware of the exception, we cannot populate all the attributes up front like we do with a single @Retry. So we would have to do that at runtime and add the selection of the correct @Retry to the method that actually handles the exception.
On the first run it would have to populate all the attributes (maxRetries, etc.). But if the annotated method starts to throw a different exception in the middle of the retries, the attributes would change and strange things could happen, e.g. the retry counter is at 13 and maxRetries suddenly changes to 5, and there are many other cases as well.

semantics. Imagine you could add multiple @Retry annotations. How would maxRetries etc. work? In the quoted paragraph, you talk about it in terms of implementation details. Can you describe the desired behavior from the API user's perspective? Surely being able to limit the number of retries (maxRetries) or the time one can spend retrying (maxDuration) is useful, so we don't want to drop those.

At this point, I think what you actually want isn't repeatable @Retry. You want the causing exception to affect the delay between individual retries. This is actually very close to another topic that has been discussed several times: backoff strategies. Currently, MicroProfile Fault Tolerance has a single backoff strategy: constant (with a jitter). There's been several requests for additional backoff strategies, notably exponential and Fibonacci. If we had an API for completely custom backoff strategies, in addition to the few common ones we want to provide out of the box, that should be enough for you. (You can take a look at the existing discussions in #394 and eclipse/microprofile-fault-tolerance#220, and perhaps also quarkusio/quarkus#17207.)
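
For illustration, one possible shape for such an SPI; the names below are made up and this is not an existing SmallRye or MicroProfile API (RateLimitException is the hypothetical exception from the earlier example), but giving the strategy access to the causing exception would cover the original request:

// Hypothetical SPI sketch, not an existing API.
public interface BackoffStrategy {
    // Delay in milliseconds before the given (1-based) retry attempt,
    // computed from the attempt number and the exception that triggered it.
    long nextDelayMillis(int attempt, Throwable cause);
}

// Example: long waits for rate limiting, short waits for transient failures.
public class CauseAwareBackoff implements BackoffStrategy {
    @Override
    public long nextDelayMillis(int attempt, Throwable cause) {
        if (cause instanceof RateLimitException) {
            return 60_000; // back off for a minute before retrying a rate-limited call
        }
        return 5_000; // transient failures: retry quickly
    }
}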

@tlvlp
Author

tlvlp commented Jun 14, 2021

Interesting proposition, and yes, in our case an exponential delay should cover both of our cases:

  1. Fast retries at the beginning would not place unnecessary load on a service that is currently offline or has not yet passed its readiness checks, and they would work well for race-condition-related retries that should be resolved within a few fast consecutive attempts.
  2. And it would (ideally) leave only the service-outage-related exceptions to reach the retries with longer intervals.

I'm generally in favour of explicitly declaring different behaviours for different scenarios, like having two separate retry configurations of some kind for the above cases, but I admit that yours would be an elegant solution to my problem, while also adding a feature that is useful in other ways as well.
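
For reference, a simple doubling backoff with a cap (illustrative numbers only, not tied to any existing annotation attribute) produces exactly that kind of schedule:

// Illustrative: 100 ms base delay doubling with each attempt, capped at 1 minute.
// Early attempts (100, 200, 400, 800 ms, ...) resolve race conditions quickly,
// while later attempts approach the cap and mostly cover longer outages.
static long exponentialDelayMillis(int attempt) { // attempt is 0-based
    long base = 100;
    long cap = 60_000;
    return Math.min(cap, base << Math.min(attempt, 20)); // clamp the shift to avoid overflow
}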

@Ladicek
Contributor

Ladicek commented Jun 14, 2021

Thanks! I'd like to point out that while exponential/Fibonacci backoff would generally be an OK solution for your issue, my proposal (which gives a custom backoff strategy access to the causing exception) would also let you implement the behavior you described originally.

Now, just need to find the time to implement it :-)

@tlvlp
Author

tlvlp commented Jun 14, 2021

Hmm, indeed, it could be more flexible than I first imagined :)
Closing this one in favour of #394

@tlvlp tlvlp closed this as completed Jun 14, 2021