Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[🐛 Bug]: serverErrorPolicy does retry on Exception / Error #14917

Open
joerg1985 opened this issue Dec 19, 2024 · 5 comments
Open

[🐛 Bug]: serverErrorPolicy does retry on Exception / Error #14917

joerg1985 opened this issue Dec 19, 2024 · 5 comments

Comments

@joerg1985
Copy link
Member

joerg1985 commented Dec 19, 2024

What happened?

The serverErrorPolicy inside the org.openqa.selenium.remote.http.RetryRequest does not only handle the server errors.
It basically handles all Exceptions / Errors, i guess this is due tue a missunderstanding of the API of the library used here.
This increases the retries on ConnectException to 5, instead of only 3.

In general i think the use of this library is problematic, due to:

  • A miss match between the naming of method / things inside the javadoc and what is actually happening (this is also the reason for this issue...).
    e.g. ExecutionAttemptedEvent.getLastException does return a Throwable, so we have a potential class cast exception in
    Exception exception = (Exception) executionAttemptedEvent.getLastException();

    e.g. the javadoc of RetryPolicy.handleIf speaks about "exception" but meant is "throwable"
  • It implies handling failues is super easy, just put it inside a retry and don't think about it.
    This will probably lead to leaks, e.g. leaking sub in case pub failes in
    Failsafe.with(retryPolicy)
    .run(
    () -> {
    sub = context.createSocket(SocketType.SUB);
    sub.setIPv6(isSubAddressIPv6(publishConnection));
    sub.connect(publishConnection);
    sub.subscribe(new byte[0]);
    pub = context.createSocket(SocketType.PUB);
    pub.setIPv6(isSubAddressIPv6(subscribeConnection));
    pub.connect(subscribeConnection);
    });
  • I have debugged into to the code and it makes things complex compared to a simple loop. And complexity does bring bugs with it e.g. before PR [java] Ensure retry mechanism does not swallow an exception #12838 the responses of concurrent failed request could get mixed.
  • No response to most issues raised in the issue tracker, including one related to selenium java.lang.NoClassDefFoundError: dev/failsafe/Policy in java modul failsafe-lib/failsafe#386

So i would like to ask: Glue some patches on the code and hope the code does what it should or remove the library with a custom implementation aka. a simple loop?

How can we reproduce the issue?

Run `RetryRequestTest.canThrowUnexpectedException` in debug mode to see the unexpected retries in the logs.

Or run the code below to see the 5 retries instead of 3.
HttpHandler handler =
        new RetryRequest()
            .andFinally(
                (HttpRequest request) -> {
                  throw new WebDriverException("oops", new ConnectException("Testing"));
                });

Relevant log output

Dez. 19, 2024 11:50:53 AM org.openqa.selenium.remote.http.RetryRequest lambda$static$4
INFORMATION: Failure due to server error #1. Retrying.
Dez. 19, 2024 11:50:56 AM org.openqa.selenium.remote.http.RetryRequest lambda$static$4
INFORMATION: Failure due to server error #2. Retrying.

Operating System

Win 10 x64

Selenium version

4.27.0

What are the browser(s) and version(s) where you see this issue?

N/A

What are the browser driver(s) and version(s) where you see this issue?

N/A

Are you using Selenium Grid?

No response

Copy link

@joerg1985, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

@joerg1985
Copy link
Member Author

@pujagani @diemol @shs96c what do you think about this?

@diemol
Copy link
Member

diemol commented Dec 19, 2024

We used the library because it made sense at the moment.

If you think a custom implementation is enough, I would be okay with that. I would not replace it all in a single PR, I would do the custom implementation, and remove the library bit by bit.

@joerg1985
Copy link
Member Author

I did not wanted to judge about using it in the past, i only wanted to show my currenty view on it.

@diemol
Copy link
Member

diemol commented Dec 19, 2024

I did not mean it like that. We did not oversee any of the current issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants