
Towards common error handling

Hongsuda edited this page Jun 18, 2016 · 11 revisions

Error codes

  • 400 bad request (i.e. the request is badly structured, e.g. invalid JSON or CSV input)
  • 401 unauthorized (i.e. the user needs to log in, or the session has expired)
    • redirect user to login page
  • 403 forbidden (i.e. user doesn't have permission)
    • display an error box, and upon clicking ok, direct them to the project landing page.
  • 404 not found (i.e. malformed URL)
  • 405 method not allowed
    • the HTTP method is not supported on the URL
  • 409 conflict on all data methods, unless otherwise stated
    • violation of foreign key constraints, including invalid vocabulary references
      • data entry: (default) display error, upon clicking ok, lead them to the data entry form and reload the foreign keys drop-down.
    • unique key violation on entity POST (e.g. rows with the same unique keys already exist)
      • data entry: iterate over all the unique key constraints; for those where the client has all the values, do a GET to check whether the entry already exists. If so, there is a collision.
    • data type violation (e.g. wrong type, out-of-range numbers, etc.)
      • data entry: use default (e.g. display error). Nice to have: save the draft states so they persist.
    • bad queries on GET/DELETE (e.g., a range query on a json column etc)
      • data entry: use default
    • unknown column names in defaults query parameter, path expression, filters
      • data entry: reload the model, check the tags, restart the app if they are different.
        • ermrest: add an ETag to the schema object it returns
    • unknown schema name or table name
      • ermrestjs: check whether there is a schema/table not found error.
      • data entry: generate an error message (e.g. bookmark doesn't work) and go to a landing page.
    • model conflicts, which could be due to a UI bug or to changes made to the model since the UI bootstrapped itself.
      • data entry: use default
  • 500 internal server error (e.g. an ermrest or other server-side bug, or a mysterious Apache stack error 500)
  • 503 temporarily unavailable
  • status 0 and no response text, which could be due to a connection timeout or refusal, an SSL certificate error, or a DNS error
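
As a rough illustration of the list above, the default handling could be collapsed into a single lookup keyed on the response status. This is only a sketch: `defaultAction` and the action names it returns are hypothetical labels, not part of ermrestjs or Chaise, and a 409 still needs the finer-grained checks described in the list.

```javascript
// Hypothetical helper: maps a response status to a default action label.
// The labels are illustrative names for the behaviors listed above.
function defaultAction(status) {
  if (status === 0) return "connection-error";        // timeout, refused, SSL, DNS
  if (status === 401) return "redirect-to-login";
  if (status === 403) return "error-box-then-landing-page";
  if (status === 409) return "conflict";              // needs per-cause handling
  if (status >= 500) return "server-error";           // 500, 503, ...
  return "client-error";                               // 400, 404, 405, anything else
}
```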

HTTP Error Diagram

Error handling strategies

[Spreadsheet listing error handling strategies](https://docs.google.com/spreadsheets/d/1frkZJhkD-WUgHghsWRHJIuN3sZoU72qxh-JRDsP0hXo)

Automatic retry within ermrestjs

  • Only in response to 500-series responses, status 0 responses, or connection errors for GET and DELETE (see also "Idempotent delete" below)
  • Retry up to 10 times before declaring a fatal error
  • Exponential delay series for each retry (i.e. 100ms, 200ms, 400ms, 800ms, 1.6s, 3.2s, 6.4s, 12.8s, 25.6s, 51.2s)
  • On fatal error, invoke application error callback
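
A minimal sketch of this retry policy, assuming a hypothetical `send` function that returns a Promise resolving to an object with a numeric `status` (0 for connection failures). The names `retryDelayMs`, `isRetryable`, `requestWithRetry`, and the injectable `sleep` option are illustrative and are not the ermrestjs API.

```javascript
// Delay before retry number `attempt` (0-based): 100 ms, doubled each time.
function retryDelayMs(attempt, baseMs = 100) {
  return baseMs * 2 ** attempt;
}

// Auto-retry only GET and DELETE, and only on status 0 or 500-series.
function isRetryable(method, status) {
  const m = method.toUpperCase();
  return (m === "GET" || m === "DELETE") && (status === 0 || status >= 500);
}

// `send` (hypothetical) performs one request and resolves to { status, ... },
// using status 0 for connection errors. `onFatal` is the application callback.
async function requestWithRetry(send, method, onFatal, options = {}) {
  const maxRetries = options.maxRetries ?? 10;
  const sleep =
    options.sleep ?? ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt++) {
    const response = await send();
    if (!isRetryable(method, response.status)) return response;
    if (attempt >= maxRetries) {
      onFatal(response); // retries exhausted: declare a fatal error
      return response;
    }
    await sleep(retryDelayMs(attempt));
  }
}
```

With the defaults, `retryDelayMs(0)` through `retryDelayMs(9)` produce the ten delays listed above, so a request that keeps failing waits a little over 102 seconds in total before the callback fires.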

Applications SHOULD make a best effort to offer degraded functionality while waiting for GET results rather than freezing the UI (e.g. show a spinner, show retry status, allow the user to cancel). On a fatal error, the application might signal the user and/or leave those same degraded UI elements in place?

Idempotent delete

Consider a 404 response to DELETE as OK and change app state to show resource as now missing? Allows for recovery from lost responses if we use automatic retry and from concurrent delete by other tabs/clients.
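
One way to picture this, as a sketch only: the app treats a 404 on DELETE exactly like a success when it updates its own state. The helper names and the `{ id }` row shape below are assumptions made for illustration.

```javascript
// A DELETE counts as done if the server confirmed it (2xx) or the
// resource was already gone (404), e.g. after a lost response and a retry.
function deleteSucceeded(status) {
  return (status >= 200 && status < 300) || status === 404;
}

// Hypothetical app-state update: returns the row list without the deleted
// row when the delete counts as done, and the unchanged list otherwise.
function applyDeleteResult(rows, rowId, status) {
  if (!deleteSucceeded(status)) return rows;
  return rows.filter((row) => row.id !== rowId);
}
```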

Degraded presentation

  • In response to 404, 403 for GET of inline resources like thumbnails?
  • Try to function without resource if possible, showing warning/missing data indication...

Manual retry within apps

  • In response to 500-series responses or connection errors for PUT and POST, which should never auto-retry
  • Ideally: do GET tests if possible to see whether new data exists on the server to disambiguate, i.e. the response was lost after a successful operation
  • Tell the user there was a problem and they may want to retry
  • Allow user to return to app context where PUT or POST was attempted w/ minimal loss of data-entry state etc.

Possibly also in response to 409 on data submission? Or is this a restart case for now (see next)?

Manual restart of apps

  • Only in response to 409 responses where it is plausible that we are out of sync w/ the catalog model and can recover
  • Ideally: check whether catalog model is newer than what we have already retrieved to disambiguate
  • Tell the user the catalog may have changed and we need to restart to see if we can recover...
  • Offer link back to initial app state to reinitialize app state from minimal bookmark state
  • Offer link to support if the problem is persisting for the user?

Bad URL, go to landing page

  • Only in response to model elements in app URL not found in local model structures or other parse errors
  • Tell the user the URL is bad and to see the project page for valid links to enter our apps
  • Offer link back to landing page
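
The check could look roughly like the sketch below. It assumes a fragment of the shape `#<catalogId>/<schema>:<table>` followed by optional further path segments, and a local model represented as a plain object mapping schema names to arrays of table names. Both the shape and the helper names are assumptions made for illustration, not the actual Chaise parser.

```javascript
// Parses the assumed fragment shape "#<catalogId>/<schema>:<table>[/...]".
function parseFragment(hash) {
  const [catalogId, tablePart] = hash.replace(/^#/, "").split("/");
  if (!catalogId) return { ok: false, reason: "missing catalog ID" };
  const [schema, table] = (tablePart || "").split(":");
  if (!schema || !table) {
    return { ok: false, reason: "missing schema or table name" };
  }
  return { ok: true, catalogId, schema, table };
}

// Checks a parsed target against the local model (hypothetical structure:
// { schemaName: ["tableName", ...] }). Any failure means "bad URL".
function checkAgainstModel(target, model) {
  if (!target.ok) return target;
  const tables = model[target.schema] || [];
  if (!tables.includes(target.table)) {
    return { ok: false, reason: "schema or table not found in model" };
  }
  return target;
}
```

Any result with `ok: false` would lead to the "bad URL" message and the link back to the landing page.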

Fatal error: get support

These cases differ in their triggering conditions and possibly in their error message, but share the same basic flow: show a message with links to support (e.g. to ask for more privileges), to the previous app state (to try other activities that may still be permitted), and to the project landing page.

  1. Lack of permission
    • 403 response to any operation
  2. Probable bugs in our GUI
    • Any other error not covered above

We might even merge this with manual restart response, if we always want to offer the same set of links and choices to users with a variable error message...

Data entry: 409:

  1. Check that the schema/table exists. If not, generate an error message (e.g. bookmark doesn't work) and go to a landing page.
  2. reload the model, check the tags, restart the app if they are different.
    • ermrest: add an ETag to the schema object it returns
  3. iterate over all the unique key constraints; for those where the client has all the values, do a GET to check whether the entry already exists. If so, there is a collision. This could be part of validation.
  4. reload the foreign keys drop-down
  5. display error, upon clicking ok, lead them to the data entry form with existing state and validation errors. (MVP)
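
The unique key step can be sketched as follows. The example assumes each unique key is given as an array of column names and that the draft row is a plain object. The filter string (`col=value` pairs joined with `&`) is an assumed syntax for the follow-up GET, and the helper names are hypothetical.

```javascript
// Returns only the unique keys for which the draft row has every value,
// since only those can be checked for a collision.
function checkableKeys(uniqueKeys, row) {
  return uniqueKeys.filter((cols) =>
    cols.every((c) => row[c] !== undefined && row[c] !== null && row[c] !== "")
  );
}

// Builds one equality filter per checkable key, e.g. "accession=A1&version=2"
// (assumed syntax). Each filter would drive one GET; a non-empty result
// means the key collides with an existing row.
function collisionFilters(uniqueKeys, row) {
  return checkableKeys(uniqueKeys, row).map((cols) =>
    cols
      .map((c) => `${encodeURIComponent(c)}=${encodeURIComponent(String(row[c]))}`)
      .join("&")
  );
}
```

For a draft row with `accession` and `version` filled in but no `id` yet, only the `["accession", "version"]` key would be checked. The `["id"]` key is skipped because the client cannot know its value before the insert.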

Some thoughts on common error handling across Chaise apps.

Errors that might be encountered:

  • network error
  • status 0 and no response text. In this case, attempt a few retries and report an error if the failure is persistent.
  • malformed URL or more precisely, malformed fragment identifier in the URL (e.g., .../#/legacy:dataset... which is missing the catalog ID)
  • catalog does not exist on the server (so the catalog ID is correctly found in the URL, but the catalog doesn't actually exist on the server)
  • schema or table do not exist in the catalog
  • column names referenced in the fragment identifier (filters section) don't exist
  • bad queries (e.g., a range query on a json column etc)
  • data changes (a row is deleted by a different user)
  • model changes (even the schema like a table definition could change with columns dropped or added)
  • internal errors (an ermrest or other server-side bug)
  • transient internal errors (mysterious apache stack error 500)