From 5325a083b3183aacc7824ceabe786a8c0e744f83 Mon Sep 17 00:00:00 2001
From: Jordan Harband
Date: Fri, 22 Mar 2024 15:18:43 -0700
Subject: [PATCH] [spec] hex-escape punctuators

Fixes 65
---
 spec.emu | 35 +++++++++++++----------------------
 1 file changed, 13 insertions(+), 22 deletions(-)

diff --git a/spec.emu b/spec.emu
index fc1e600..4f1dad3 100644
--- a/spec.emu
+++ b/spec.emu
@@ -15,26 +15,6 @@ contributors: Jordan Harband

RegExp (Regular Expression) Objects

- -

Patterns

- -

Syntax

-

Each `\\u` |HexTrailSurrogate| for which the choice of associated `u` |HexLeadSurrogate| is ambiguous shall be associated with the nearest possible `u` |HexLeadSurrogate| that would otherwise have no corresponding `\\u` |HexTrailSurrogate|.

-
-        HexNonSurrogate ::
-          Hex4Digits [> but only if the MV of |Hex4Digits| is not in the inclusive interval from 0xD800 to 0xDFFF]
-
-        IdentityEscape[UnicodeMode] ::
-          [+UnicodeMode] SyntaxCharacter
-          [+UnicodeMode] `/` `,` `-` `=` `<` `>` `#` `&` `!` `%` `:` `;` `@` `~` `'` `"` `\``
-          [+UnicodeMode] WhiteSpace
-          [~UnicodeMode] SourceCharacter but not UnicodeIDContinue
-
-        DecimalEscape ::
-          NonZeroDigit DecimalDigits[~Sep]? [lookahead ∉ DecimalDigit]
-
-
-

Properties of the RegExp Constructor

@@ -55,9 +35,20 @@ contributors: Jordan Harband
           1. Append code unit U+005C (REVERSE SOLIDUS) to _escapedList_.
           1. Append code unit U+0078 (LATIN SMALL LETTER X) to _escapedList_.
           1. Append code unit U+0033 (DIGIT THREE) to _escapedList_.
+          1. Append _c_ to _escapedList_.
         1. Else if _toEscape_ contains _c_ or _c_ is matched by |WhiteSpace|, then
-          1. Append code unit U+005C (REVERSE SOLIDUS) to _escapedList_.
-          1. Append _c_ to _escapedList_.
+          1. Let _hex_ be Number::toString(𝔽(_c_), *16*𝔽).
+          1. If the length of _hex_ is 1 or 2, then
+            1. Set _hex_ to StringPad(_hex_, 2, *"0"*, ~start~).
+            1. Append code unit U+0078 (LATIN SMALL LETTER X) to _escapedList_.
+            1. Append the code units in _hex_ to _escapedList_.
+          1. Else,
+            1. Assert: The length of _hex_ is at most 4.
+            1. Set _hex_ to StringPad(_hex_, 4, *"0"*, ~start~).
+            1. Append code unit U+0075 (LATIN SMALL LETTER U) to _escapedList_.
+            1. Append the code units in _hex_ to _escapedList_.
+        1. Else,
+          1. Append _c_ to _escapedList_.
         1. Return CodePointsToString(_escapedList_).
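
Reviewer note: the added hex-escape steps can be sketched in TypeScript roughly as below. This is an illustrative sketch, not the spec text. The name `hexEscapeCodePoint` is hypothetical. The leading backslash is an assumption: the steps as shown in this hunk append only `x` or `u` followed by the hex digits. The up-front range check stands in for the "at most 4" assertion.

```typescript
// Hypothetical helper mirroring the added steps in the hunk above.
// Assumption: a leading REVERSE SOLIDUS is wanted so the result is a usable
// escape; the hunk as shown appends only "x"/"u" and the hex digits.
function hexEscapeCodePoint(c: number): string {
  // Stand-in for "Assert: The length of hex is at most 4".
  if (Math.floor(c) !== c || c < 0 || c > 0xffff) {
    throw new RangeError("expected an integer in the range 0..0xFFFF");
  }

  // Let hex be Number::toString(c, 16); this yields lowercase digits.
  const hex = c.toString(16);

  if (hex.length <= 2) {
    // StringPad(hex, 2, "0", start), then the \x form.
    return "\\x" + ("00" + hex).slice(-2);
  }

  // StringPad(hex, 4, "0", start), then the \u form.
  return "\\u" + ("0000" + hex).slice(-4);
}

console.log(hexEscapeCodePoint(0x2f));   // U+002F SOLIDUS
console.log(hexEscapeCodePoint(0x2028)); // U+2028 LINE SEPARATOR
```

Under these assumptions, U+002F (`/`) becomes `\x2f` and U+2028 becomes `\u2028`. A three-digit value such as U+0100 falls into the second branch and is padded to `\u0100`.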