Improve documented errors to include a mandatory useful description #279

Open
mattheworiordan opened this issue Nov 25, 2024 · 0 comments

Problem

Currently when developers receive error messages from SDKs and the realtime system, they are presented with a response/error body such as the following that is auto-generated from https://rest.ably.io/404:

{
	"error": {
		"message": "Could not find path: /404.json. (See https://help.ably.io/error/40400 for help.)",
		"code": 40400,
		"statusCode": 404,
		"nonfatal": false,
		"href": "https://help.ably.io/error/40400",
		"serverId": "frontend.245e.6.eu-central-1-A.i-0722416a82570e265.e91l9RmkABiwds"
	}
}

This error message follows the format https://sdk.ably.com/builds/ably/specification/main/features/#TI1, and according to spec #TI3, will always map back to a predefined error code in ably-common, see https://github.com/ably/ably-common/blob/main/protocol/errors.json.

This approach tends to work well for common errors as we have FAQs that match those error codes. For example, if you follow the href error URL https://help.ably.io/error/40400, we have an app that runs at help.ably.io, which checks if the error code 40400 has a matching FAQ (see https://github.com/ably/ably-common/blob/main/protocol/errorsHelp.json#L51), which redirects the user to https://faqs.ably.com/error-code-40400-not-found.
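The lookup described above can be sketched roughly as follows. This is not the code of the app running at help.ably.io; it is a minimal illustration that assumes errorsHelp.json is a flat mapping from error code to FAQ URL. The function name `resolve_help_url` and the in-memory `ERRORS_HELP` excerpt are hypothetical.

```python
import re

# Hypothetical excerpt standing in for errorsHelp.json.
# Assumed shape: error code (as a string) -> FAQ URL.
ERRORS_HELP = {
    "40400": "https://faqs.ably.com/error-code-40400-not-found",
}

# Fallback page shown when no FAQ exists for the error code.
UNKNOWN_ERROR_URL = "https://faqs.ably.com/unknown-error-code"


def resolve_help_url(path, errors_help=ERRORS_HELP):
    """Return the redirect target for a help path such as /error/40400."""
    match = re.fullmatch(r"/error/(\d+)", path)
    if match is None:
        # Not an error-code path at all: fall back to the generic page.
        return UNKNOWN_ERROR_URL
    # Use the matching FAQ if one is registered, else the generic page.
    return errors_help.get(match.group(1), UNKNOWN_ERROR_URL)
```

With this sketch, `/error/40400` resolves to its FAQ, while any code without an entry falls through to the unknown-error page.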

Unfortunately, for less frequent error codes, this is less helpful, and the experience is pretty poor for developers in spite of us having prior knowledge of what that error code might imply.

Take for example this recent commit that adds some error codes. The experience for developers seeing this error code could be as follows:

  1. They see error 80023 from their SDK, potentially with a custom error message, given that the title in errors.json is not required to be used when issuing an error to the end user (this is intentional: take 40400, for example, where the message is far more intuitive than a generic title). Either way, seeing the error "unable to resume connection from a different site" may be accurate, but it does not provide any useful context on why this happened.
  2. The developer may not understand the error message, especially when it has a title that is really designed for internal use (see https://github.com/ably/ably-common/blob/main/protocol/errors.json, many are), and will want to find out more. They follow the link to the error https://help.ably.io/error/80023.
  3. As there is no FAQ, they are taken to https://faqs.ably.com/unknown-error-code, which tells the customer we have no docs for this error code.

However, at the time of writing the error code, the developer working on this would have had an opportunity to provide a little more context on why this error may be happening, whether the developer seeing it needs to worry about it or not, and what, if anything, they could do to rectify it. Adding this context when the error code is added would also help the docs / support team, who typically pick up the task of writing a more exhaustive FAQ for that error code.

Other related issues

  • The process for writing an FAQ is convoluted: a support article needs to be created in HS, and a matching error and URL need to be added to errorsHelp.json.
  • We have no clear indication in the error codes of whether these are expected to be 4xx or 5xx errors for users. Whilst we were able to do that historically because we grouped all 4xx errors under the 40000 -> 50000 range, that no longer applies to error codes outside of the 40000 -> 59999 range, such as those for products like Chat.
  • We don't have a simple CI build stage to validate that the URLs and error codes entered are valid (see related Add CI checks for error help URLs (knowledge.ably.com) #66 and Rake task to validate URLs in errorsHelp.json #63).
  • We could consider adding a canonical name for each error (see Include error constants in includes in this repo? #32) at the time of generating error codes to make SDK development more intuitive and easier to maintain.
  • Documentation for error codes is outside of the docs repo and this repo and not under Git control. We should fix that.
  • We are leaking some internal error state to end users, such as the field nonfatal, which is not documented anywhere (see https://rest.ably.io/foo for example, which shows nonfatal: false). We also send empty fields such as cause, which has little value in these HTML error messages.
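As a rough illustration of the CI validation mentioned in the list above, the following sketch cross-checks help entries against known error codes without making any network requests. It assumes both errors.json and errorsHelp.json are flat mappings keyed by error code; the function name `check_help_entries` is hypothetical and this is not an existing task in the repo.

```python
from urllib.parse import urlparse


def check_help_entries(errors, errors_help):
    """Return a list of problems found in the help entries.

    Assumed shapes (not confirmed by the source):
      errors:      error code -> title, as loaded from errors.json
      errors_help: error code -> FAQ URL, as loaded from errorsHelp.json
    """
    problems = []
    for code, url in errors_help.items():
        # Every help entry should refer to an error code that exists.
        if code not in errors:
            problems.append(f"{code}: help URL defined for an unknown error code")
        # Only a syntactic check; reachability is not tested here.
        parsed = urlparse(url)
        if parsed.scheme != "https" or not parsed.netloc:
            problems.append(f"{code}: malformed help URL {url!r}")
    return problems
```

A CI stage could load both files with `json.load`, call this function, and fail the build if the returned list is non-empty. Checking that each URL actually resolves would need an additional HTTP step.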

Solution

TBD.

However, some initial thinking:

  • Introduce a schema for errors.json that requires an error description along with potentially a canonical error ID (used for SDK constant definitions). Potentially consider a YAML file instead if easier to read / structure? Also, consider guidelines for what is needed in the description field to make it useful to end users. Potentially even consider different fields such as "Why is this happening", "Is this an Ably error or something I have done wrong" etc.
  • Move FAQs into docs repo, and introduce a naming convention that ensures FAQs are automatically pulled so that errorsHelp.json becomes unnecessary.
  • When an FAQ does not exist, generate a help page that shows the documented error message, description, and useful generic information on how to get in touch / debug problems. Ensure the error tracking views continue to be tracked as they are now (see https://github.com/ably/help) so that the customer support teams have insights into which error codes need more full FAQs.
  • Introduce an action to validate the structure of the errors to ensure sufficient information is provided.
  • Ensure errors are categorised and presented as user (4xx) and internal (5xx) errors to end users.
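To make the schema and validation ideas above more concrete, here is one possible shape for the check, written in plain Python so it has no dependencies. All field names (`id`, `title`, `description`, `category`), the category values, and the minimum description length are assumptions made for illustration; none of them are decided in this issue.

```python
# Hypothetical required fields for each entry in errors.json.
REQUIRED_FIELDS = ("id", "title", "description", "category")

# Hypothetical categories: "user" for 4xx-style errors, "internal" for 5xx-style.
ALLOWED_CATEGORIES = {"user", "internal"}

# Arbitrary threshold to discourage one-word descriptions.
MIN_DESCRIPTION_LENGTH = 40


def validate_error_entry(code, entry):
    """Return a list of problems for a single error entry (empty if valid)."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = entry.get(field)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{code}: missing or empty '{field}'")

    description = entry.get("description")
    if isinstance(description, str) and 0 < len(description.strip()) < MIN_DESCRIPTION_LENGTH:
        problems.append(f"{code}: description is too short to be useful")

    category = entry.get("category")
    if isinstance(category, str) and category.strip() and category not in ALLOWED_CATEGORIES:
        problems.append(f"{code}: unknown category {category!r}")
    return problems
```

A length threshold is only a crude proxy for a useful description; the guidelines suggested above (for example, separate "why is this happening" fields) would likely be a better fit once agreed.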

Related
