Commit
Handling Unicode on annotation creation
This is an odd bug! On a specific browser, it has been observed that creating an annotation in Label Studio fails with the following error:

```
psycopg2.errors.InvalidTextRepresentation: invalid input syntax for type json
DETAIL: Unicode low surrogate must follow a high surrogate
```

We work around this by round-tripping the payload through JSON. Passing `ensure_ascii=False` ensures that any Unicode characters are preserved, and therefore escaped on the round trip back.

Note: we don't care about any Unicode in the annotation text, as we don't use the text to reference our annotations.

Test Plan:

We can reproduce this by forcing the annotation endpoint to consume Unicode.

1. Use the `copy as cURL` option in your browser to copy the "update annotation" call from Label Studio.
2. Inject some Unicode into the cURL command, e.g.

```
curl 'http://localhost:8080/api/annotations/126?taskID=5979&project=43' \
  -X 'PATCH' \
  #.... snipped
  --data-raw $'{"result":[{"value":{"value":{"start":"/div[1]/div[1]/text()[1]","startOffset":0,"end":"/div[1]/div[1]/text()[1]","endOffset":88,"globalOffsets":{"start":0,"end":88},"text":"Some unicode: \udfff " ...],"draft_id":0,"parent_prediction":null,"parent_annotation":null,"project":"43"}'
```

3. Run it against the dev server. Previously this caused a 500; with the fix it should return 200.
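For illustration, the JSON round trip described in this message could look roughly like the sketch below. It is not the code from this commit: the helper name `sanitize_annotation` is hypothetical, and the `errors="replace"` encoding step is an assumption about how an unpaired surrogate might be neutralised once `ensure_ascii=False` has kept it as a raw character.

```python
import json


def sanitize_annotation(payload):
    """Hypothetical helper: neutralise unpaired surrogates via a JSON round trip."""
    # ensure_ascii=False keeps non-ASCII characters (including a lone
    # surrogate) as raw characters instead of \uXXXX escapes.
    raw = json.dumps(payload, ensure_ascii=False)
    # Assumption: encoding with errors="replace" turns each unencodable
    # lone surrogate into "?"; valid Unicode is left untouched.
    cleaned = raw.encode("utf-8", errors="replace").decode("utf-8")
    return json.loads(cleaned)


payload = {"result": [{"value": {"text": "Some unicode: \udfff caf\u00e9"}}]}

# Default serialization emits the lone surrogate as the escape \udfff,
# which appears to be the form the PostgreSQL json type rejects.
escaped = json.dumps(payload)

safe = sanitize_annotation(payload)
safe_text = safe["result"][0]["value"]["text"]
```

In this sketch the lone low surrogate is replaced while ordinary non-ASCII text such as `é` survives, which fits the note that the annotation text itself is not used for referencing.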