Commit
Resolve Heisenbug in StringHashTable._unique
When processing an invalid Unicode string, the exception handler for UnicodeEncodeError called `get_c_string` with an ephemeral repr value that could be garbage-collected the next time an exception was raised. Issue #45929 demonstrates the problem.

This commit fixes the problem by retaining a Python reference to the repr value that underlies the C string until after all `values` are processed.

Wisdom from StackOverflow suggests that there is very little performance difference between pre-allocating the array and appending to it when the array ends up completely filled. Because references are only needed when an exception occurs, we expect to append very few elements in the usual case, which makes appending faster than pre-allocation.

Signed-off-by: Michael Tiemann <[email protected]>
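
The keep-alive pattern described above can be illustrated with a minimal pure-Python sketch. This is not the code changed by this commit: `encode_values` and `keep_alive` are hypothetical names, and `bytes` objects stand in for the C strings that `get_c_string` would return. It only shows the shape of the fix, where the fallback repr is appended to a list that outlives the loop and the list grows only on the exceptional path.

```python
def encode_values(values):
    """Encode each string to UTF-8, falling back to its repr on failure.

    Hypothetical illustration: `keep_alive` holds a reference to every
    fallback repr so that it outlives the loop, mirroring how the fix
    keeps the object behind a borrowed C string alive.
    """
    keep_alive = []  # appended to only when an exception occurs
    encoded = []
    for val in values:
        try:
            data = val.encode("utf-8")
        except UnicodeEncodeError:
            # A lone surrogate such as "\ud800" cannot be encoded as UTF-8.
            fallback = repr(val)
            keep_alive.append(fallback)  # retain the reference
            data = fallback.encode("utf-8")
        encoded.append(data)
    return encoded, keep_alive


encoded, keep_alive = encode_values(["a", "\ud800", "b"])
print(len(keep_alive))  # one entry, for the single invalid string
```

With mostly valid input, `keep_alive` stays nearly empty, which is the reasoning the message gives for appending rather than pre-allocating one slot per value.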