Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add request ID to client lib spec #417

Closed
mattheworiordan opened this issue Apr 17, 2018 · 6 comments
Closed

Add request ID to client lib spec #417

mattheworiordan opened this issue Apr 17, 2018 · 6 comments
Labels
content-request A request for new content, as opposed to changing/fixing existing content no-sync Do not sync ticket to JIRA

Comments

@mattheworiordan
Copy link
Member

mattheworiordan commented Apr 17, 2018

We have previously discussed ways to help us debug request failures for customers as it can be quite difficult when there is a very high volume of requests globally.

We have a working undocumented solution in Ruby at ably/ably-ruby#133.

We have previously also discussed:

We should consider adding this to the next minor release of our spec as it will help us resolve issues more quickly for customers.

┆Issue is synchronized with this Jira Task by Unito

@mattheworiordan mattheworiordan added content-request A request for new content, as opposed to changing/fixing existing content client lib spec labels Apr 17, 2018
@mattheworiordan
Copy link
Member Author

We need to additionally require that the request ID remains the same across all fallback host retries. See #417 for a related issue.

@mattheworiordan
Copy link
Member Author

mattheworiordan commented Apr 30, 2018

I was thinking about this a bit more, and I think we should seriously consider putting this in the spec sooner rather than later. My thinking is that adding this functionality to client libraries is trivial, yet gives us far more insight into issues when they arise, but also may in fact be able to deal with the speculative issue we have of idempotency.

If the spec stated:

  • All IDs generated are effectively base64 encoded i.e. we have 64 characters to use safely in URLs
  • Every client when instanced gets a 8 char session ID i.e. 281 trillion variations so not likely to get a collision in the time windows we care about (i.e. days)
  • Every HTTP or WS request includes a session ID, as well as a request ID. The request ID could be a base64 encoded long i.e. we could then easily determine the sequence of requests from a single client. This must be a param such as reqid=ABCDEFGH-C1
  • Any request that is retried will have a retry attempt sequence appended such as reqid=ABCDEFGH-1.1

Then we could now:

  • Quickly search for failed requests reported by customers in our own logs
  • See where requests are retried across the global cluster easily

Then we could later when we have time:

Notes:

  • I did think that it could be useful to let customers choose their own request IDs, but I am concerned with that approach our useful request IDs that have meaning to use will no longer be useful. We may want to allow customers to add their custom request IDs if we wanted that are simply an extra param and reported in exceptions, and later could also be used to help with idempotency potentially.

@SimonWoolf @paddybyers @funkyboy WDYT?

@paddybyers
Copy link
Member

The publish idempotency requirement isn't really related because that I think needs to be addressed by putting a library-specified messageId in the message that's published. The mechanism in realtime already exists to prevent the duplicate publish. Simply putting an id in the request params won't allow us to de-dupe across frontend instances, without some additional thing being persisted per-request.

@mattheworiordan
Copy link
Member Author

The publish idempotency requirement isn't really related.

No, not necessarily. In most cases, customers who publish over REST probably don't want to think about this level of detail, and only publish from one client. What we are mostly trying to defend against is the most common problem of an in-flight HTTP request's connection being abruptly terminated, the client detecting that, and retrying the request to another region, which in turn can theoretically result in two publishes of the same message.

because that I think needs to be addressed by putting a library-specified messageId in the message that's published

Sure, but as above, that's a different requirement and adds a new layer of complexity for customers. We want to avoid that for the bulk of customers who won't haVe unique IDs, and don't care about this.

The mechanism in realtime already exists to prevent the duplicate publish

Yup, but we're only talking about REST requests here for idempotency. Additionally, that is not 100% true I am afraid either as a connection could termiate like an in-flight HTTP request, the connection could be re-established in another region, and republished. That's very very unlikely, but still plausible.

Simply putting an id in the request params won't allow us to de-dupe across frontend instances, without some additional thing being persisted per-request.

Sure, but if we add it to the protocol, we can worry about that later. If we don't, as we know, when we do want to fix that issue we will have our hands tied.

Also, de-duping would most likely never be done on frontends, it would be done on the channels themselves. Why would the frontends need to do this?

@MarkWoulfeAbly MarkWoulfeAbly added the no-sync Do not sync ticket to JIRA label Sep 2, 2020
@mattheworiordan
Copy link
Member Author

@paddybyers @SimonWoolf isn't this now covered with the idempotency support?

@SimonWoolf
Copy link
Member

SimonWoolf commented Apr 12, 2022

This was done in #763 with an add_request_ids client option (basically just writing down what you did in ably-ruby), separate from idempotency

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
content-request A request for new content, as opposed to changing/fixing existing content no-sync Do not sync ticket to JIRA
Development

No branches or pull requests

5 participants