utf8n_to_uv_msgs: Avoid unnecessary work
The outermost block here is executed when any of three types of
problematic Unicode code points is encountered, and the caller has
indicated special handling of at least one of those types.  Before this
commit, we always set a flag, and later code checked whether what was
encountered matched a type the caller had specified.  This commit moves
that check to the point where the flag is set, and sets the flag only
if necessary.  This may completely avoid the later work, which has
set-up overhead, and it will make future commits simpler.
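
The idea described above can be sketched in isolation. This is a minimal illustration, not the code in utf8.c: the DEMO_* flag values and the demo_classify() and demo_handle_surrogate() functions are hypothetical names invented for this sketch, and only two of the three problem types are shown to keep it short. It shows the problem bit being set only when the caller asked for that type to be disallowed or warned about, so that the later handling can rely on that as an invariant.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values for illustration only; these are NOT Perl's
 * actual UTF8_* definitions. */
#define DEMO_DISALLOW_SURROGATE 0x01u
#define DEMO_WARN_SURROGATE     0x02u
#define DEMO_DISALLOW_SUPER     0x04u
#define DEMO_WARN_SUPER         0x08u
#define DEMO_GOT_SURROGATE      0x10u
#define DEMO_GOT_SUPER          0x20u

/* Classify a code point.  A problem bit is recorded only when the caller
 * asked for that type to be disallowed or warned about, so uninteresting
 * problems never reach the later (more expensive) handling. */
uint32_t demo_classify(uint32_t cp, uint32_t flags)
{
    uint32_t possible_problems = 0;

    if (cp >= 0xD800 && cp <= 0xDFFF) {        /* surrogate */
        if (flags & (DEMO_DISALLOW_SURROGATE | DEMO_WARN_SURROGATE)) {
            possible_problems |= DEMO_GOT_SURROGATE;
        }
    }
    else if (cp > 0x10FFFF) {                  /* above Unicode */
        if (flags & (DEMO_DISALLOW_SUPER | DEMO_WARN_SUPER)) {
            possible_problems |= DEMO_GOT_SUPER;
        }
    }

    return possible_problems;
}

/* Later handling for a surrogate.  Returns 1 if the input must be
 * rejected, 0 if it is only to be warned about.  Because of the check in
 * demo_classify(), at least one of the two flags is known to be set. */
int demo_handle_surrogate(uint32_t flags)
{
    assert(flags & (DEMO_DISALLOW_SURROGATE | DEMO_WARN_SURROGATE));
    return (flags & DEMO_DISALLOW_SURROGATE) ? 1 : 0;
}
```

With this arrangement, a call such as demo_classify(0xD800, 0) returns 0, so a caller that does not care about surrogates skips the later handling entirely.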
khwilliamson committed Dec 4, 2024
1 parent e6954d8 commit a3b31bf
Showing 1 changed file with 22 additions and 3 deletions.
25 changes: 22 additions & 3 deletions utf8.c
@@ -1858,13 +1858,19 @@ Perl_utf8_to_uv_msgs_helper_(const U8 * const s0,
      * loop just above. */
     if (isUNICODE_POSSIBLY_PROBLEMATIC(uv)) {
         if (UNLIKELY(UNICODE_IS_SURROGATE(uv))) {
-            possible_problems |= UTF8_GOT_SURROGATE;
+            if (flags & (UTF8_DISALLOW_SURROGATE|UTF8_WARN_SURROGATE)) {
+                possible_problems |= UTF8_GOT_SURROGATE;
+            }
         }
         else if (UNLIKELY(UNICODE_IS_SUPER(uv))) {
-            possible_problems |= UTF8_GOT_SUPER;
+            if (flags & (UTF8_DISALLOW_SUPER|UTF8_WARN_SUPER)) {
+                possible_problems |= UTF8_GOT_SUPER;
+            }
         }
         else if (UNLIKELY(UNICODE_IS_NONCHAR(uv))) {
-            possible_problems |= UTF8_GOT_NONCHAR;
+            if (flags & (UTF8_DISALLOW_NONCHAR|UTF8_WARN_NONCHAR)) {
+                possible_problems |= UTF8_GOT_NONCHAR;
+            }
         }
     }
 }
@@ -2039,6 +2045,10 @@ Perl_utf8_to_uv_msgs_helper_(const U8 * const s0,

 case UTF8_GOT_SURROGATE:

+    /* Code earlier in this function has set things up so we don't
+     * get here unless at least one of the two top-level 'if's in
+     * this case are true */
+
     if (flags & UTF8_WARN_SURROGATE) {
         *errors |= UTF8_GOT_SURROGATE;

@@ -2068,6 +2078,10 @@ Perl_utf8_to_uv_msgs_helper_(const U8 * const s0,

 case UTF8_GOT_NONCHAR:

+    /* Code earlier in this function has set things up so we don't
+     * get here unless at least one of the two top-level 'if's in
+     * this case are true */
+
     if (flags & UTF8_WARN_NONCHAR) {
         *errors |= UTF8_GOT_NONCHAR;

@@ -2202,6 +2216,11 @@ Perl_utf8_to_uv_msgs_helper_(const U8 * const s0,

 case UTF8_GOT_SUPER:

+    /* We get here when the input is for an above Unicode code
+     * point, but it does not use Perl extended UTF-8, and the
+     * caller has indicated that these are to be disallowed and/or
+     * warned about */
+
     if (flags & UTF8_WARN_SUPER) {
         *errors |= UTF8_GOT_SUPER;
