Unicode patch email rejected #502
Comments
The Unicode updates are certainly going to exercise UTF-8 parsing in the entire pipeline. We are not going to have invalid UTF-8, but we may have UTF-8 code points that cannot be displayed correctly because we are updating to code points that have just been added to the standard.
We assume that the database can store UTF-8. I suspect you didn't set the encoding for the table before running
You can validate this by running
At the risk of stating the obvious, you can fix this after the fact, but it'll take some time. First, modify the default:
...followed by the following for each table:
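The commands from the comment above did not survive in this copy of the thread. As a rough sketch only, and not necessarily what was originally posted, the "change the default, then convert each table" approach could be scripted like this. The database name, table names, and collation are placeholders (assumptions), while the `ALTER DATABASE ... CHARACTER SET` and `ALTER TABLE ... CONVERT TO CHARACTER SET` forms are standard MySQL/MariaDB syntax.

```python
def charset_fix_statements(database, tables,
                           charset="utf8mb4",
                           collation="utf8mb4_unicode_ci"):
    """Build SQL that sets the database default charset, then converts each table.

    The names passed in are placeholders; review the output before running it.
    """
    statements = [
        f"ALTER DATABASE `{database}` CHARACTER SET {charset} COLLATE {collation};"
    ]
    for table in tables:
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET {charset} "
            f"COLLATE {collation};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical database and table names, for illustration only.
    for stmt in charset_fix_statements("example_db", ["example_table_a", "example_table_b"]):
        print(stmt)
```

Printing the statements rather than executing them keeps the conversion reviewable, which matters because converting large tables can take a while.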
Ah, wait, I think I misunderstood the issue. So you're submitting content that is currently recognized as invalid Unicode? Interesting. 🤔 Should we consider storing patches as binary strings? If so, are we going to be able to decode them on pull if they are "invalid" (to Python) code points? I'm not sure what to do here...
@stephenfin No, if we need to exercise invalid Unicode then we need to create it by hand carefully and it shouldn't show up in a patch. I don't think that patchwork should need to handle invalid UTF-8.
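To illustrate the distinction being drawn here, the following minimal sketch shows that a recently assigned code point is still valid UTF-8 as far as Python is concerned, even if nothing can render it yet, while a hand-crafted malformed byte sequence is rejected. The helper name `is_valid_utf8` and the sample values are made up for this example.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# U+1FAE0 was added in Unicode 14.0; it encodes to four bytes and is valid
# UTF-8 whether or not a font exists to display it.
recent = "\U0001FAE0".encode("utf-8")

# 0xC3 opens a two-byte sequence, but 0x28 is not a continuation byte,
# so this sequence is malformed.
broken = b"\xc3\x28"

if __name__ == "__main__":
    print(is_valid_utf8(recent), is_valid_utf8(broken))
```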
Okay, then what (if anything) is the ask here? I don't know for sure, but I'm guessing MySQL/MariaDB rejected that patch content because it thought it was invalid Unicode?
It's unlikely to be a table encoding issue since they're at utf8mb4. I altered AFAICT, the email doesn't have invalid Unicode either, so I'm not sure what's going on.
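One quick diagnostic that may help rule the encoding theory in or out: MySQL's legacy `utf8` (`utf8mb3`) character set stores at most three bytes per character, so only code points above U+FFFF need `utf8mb4`. The sketch below lists such characters in a decoded patch body. The function name and the sample string are assumptions for illustration, not anything from patchwork.

```python
def non_bmp_characters(text):
    """Return the distinct characters above U+FFFF, sorted by code point.

    These are the characters that need four bytes in UTF-8.
    """
    return sorted({ch for ch in text if ord(ch) > 0xFFFF})


if __name__ == "__main__":
    # Hypothetical patch snippet containing one supplementary-plane character.
    sample = "localedata: add \U0001FAE0 to the test"
    for ch in non_bmp_characters(sample):
        print(f"U+{ord(ch):04X}")
```

If this returns an empty list for the rejected mail, four-byte characters are not the cause and the problem lies elsewhere in the pipeline.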
This patch on libc-alpha did not appear in patchwork as it crashed during parsing:
https://sourceware.org/pipermail/libc-alpha/2022-October/142506.html
Here's the backtrace at the point of the crash:
The backtrace was obtained using the following change to the sourceware instance; maybe something like this could be useful for general-purpose logging too:
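The actual change is not preserved in this copy of the issue. As a generic sketch of the idea, and not the sourceware patch itself, a parse call can be wrapped so that any exception is logged with its full traceback before being re-raised. The wrapper name, logger name, and the `parse` callable are all hypothetical.

```python
import logging

logger = logging.getLogger("parsemail_sketch")  # hypothetical logger name


def parse_with_logging(parse, raw_mail):
    """Call parse(raw_mail); on failure, log the traceback and re-raise.

    `parse` stands in for whatever function actually parses the mail.
    """
    try:
        return parse(raw_mail)
    except Exception:
        # logger.exception() logs at ERROR level and appends the traceback.
        logger.exception("Failed to parse mail (%d bytes)", len(raw_mail))
        raise


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(parse_with_logging(lambda raw: raw.decode("utf-8"), b"Subject: test"))
```

Re-raising keeps the existing failure behaviour intact, so the wrapper only adds a log record and does not hide the crash.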