According to its documentation, text-icu's collation algorithm uses incremental normalization. This is very helpful in collation: when you're comparing two strings, the decision about how to order them is generally one you can make after the first few characters, so no need to normalize the whole thing.
Could unicode-transforms provide a function that does this? For my purposes, an ideal interface would be
normalizeStreaming :: NormalizationMode -> Text -> [Int]
where the Ints are code points, and the list is produced lazily.
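To illustrate why a lazily produced list would be enough for collation, here is a minimal sketch using only base. The name `fakeNormalizeStreaming` is hypothetical and is not part of unicode-transforms. It works on `String` rather than `Text` to avoid extra dependencies, and it performs no real normalization: it only models the laziness of the proposed interface.

```haskell
import Data.Char (ord)

-- Hypothetical stand-in for the proposed normalizeStreaming.
-- ASSUMPTION: it maps each character independently to its code point.
-- Real Unicode normalization has to buffer combining sequences, so this
-- only models the lazy, incremental shape of the output.
fakeNormalizeStreaming :: String -> [Int]
fakeNormalizeStreaming = map ord

-- List comparison is lazy: it stops at the first differing code point,
-- so the rest of either input is never normalized.
compareLazily :: String -> String -> Ordering
compareLazily a b =
  compare (fakeNormalizeStreaming a) (fakeNormalizeStreaming b)
```

For example, `compareLazily ("ab" ++ undefined) ("ac" ++ undefined)` evaluates to `LT` without ever touching the undefined tails, because the decision is made at the second character.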
The best way to deal with this would be to use stream based normalization. Streamly is going to support that using a signature like this:
normalize :: (IsStream t, Monad m) => NormalizationMode -> t m Char -> t m Char
See this PR composewell/streamly#698 for a working implementation of the above. We are also going to break streamly into several packages so that it can be a lightweight dependency and also have a streamly-unicode package for stream based unicode algorithms (see composewell/streamly#533).
To work with text, we can convert it to a stream, normalize the stream, and convert the result back to text. In fact, this approach works with any streamable type, not just text.
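The round trip described above could be sketched roughly as follows. This uses plain lazy lists as a stand-in for a stream type and does not use streamly's API. `normalizeVia` and its parameter names are hypothetical, and the normalizer itself is passed in as an argument rather than implemented.

```haskell
-- Hypothetical helper: lift a Char-stream normalizer to any container
-- that can be converted to and from a stream of Char.
-- ASSUMPTION: a lazy list [Char] stands in for the stream type.
normalizeVia
  :: (s -> [Char])        -- convert the container to a stream
  -> ([Char] -> s)        -- rebuild the container from a stream
  -> ([Char] -> [Char])   -- stream-based normalizer (supplied by the caller)
  -> s
  -> s
normalizeVia toStream fromStream normalizeChars =
  fromStream . normalizeChars . toStream
```

For `Text`, one would pass `Data.Text.unpack` and `Data.Text.pack` as the two conversions; for `String`, both are simply `id`.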