@davidkna
Contributor

unicode_normalization provides fast-path functions (is_nfc_quick/is_nfd_quick) that quickly check whether (de-)composition is necessary. These functions can return an ambiguous result (Maybe) in certain cases. To avoid this, is_nfc/is_nfd exist, which fall back to a slow check when the quick functions return Maybe (matching the current code in gix-utils).

For strings that do not require normalisation, is_nfc/is_nfd are about 75% faster, but they are 20% slower for strings that do require normalisation. As far as I can tell, the slowdown can be avoided by skipping the slow check, e.g. is_nfc_quick(s.chars()) == IsNormalized::Yes, but this may come at the expense of unnecessary allocations, since a string that yields Maybe is then normalised even if it was already in the target form.
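
The trade-off can be sketched with the standard library alone. This is only an illustration of the control flow, not the code in this PR or in unicode_normalization: the IsNormalized enum below is a local stand-in for the crate's type, and toy_quick, toy_slow, toy_normalize, normalize_strict and normalize_eager are hypothetical names that only handle "e" followed by U+0301.

```rust
use std::borrow::Cow;

/// Local stand-in for a three-valued quick-check result (Yes / No / Maybe).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum IsNormalized {
    Yes,
    No,
    Maybe,
}

/// Toy quick check (assumption): ASCII-only text is always normalised,
/// anything else is reported as ambiguous.
fn toy_quick(s: &str) -> IsNormalized {
    if s.is_ascii() {
        IsNormalized::Yes
    } else {
        IsNormalized::Maybe
    }
}

/// Toy slow check (assumption): only knows about the combining acute accent.
fn toy_slow(s: &str) -> bool {
    !s.contains('\u{301}')
}

/// Toy normaliser (assumption): composes "e" + U+0301 into U+00E9.
fn toy_normalize(s: &str) -> String {
    s.replace("e\u{301}", "\u{e9}")
}

/// Strict variant: resolve Maybe with the slow check, allocate only if needed.
fn normalize_strict<'a>(
    s: &'a str,
    quick: impl Fn(&str) -> IsNormalized,
    slow: impl Fn(&str) -> bool,
    normalize: impl Fn(&str) -> String,
) -> Cow<'a, str> {
    let already_normalized = match quick(s) {
        IsNormalized::Yes => true,
        IsNormalized::No => false,
        IsNormalized::Maybe => slow(s),
    };
    if already_normalized {
        Cow::Borrowed(s)
    } else {
        Cow::Owned(normalize(s))
    }
}

/// Eager variant: skip the slow check and treat Maybe like No.
/// This may allocate for input that was already normalised.
fn normalize_eager<'a>(
    s: &'a str,
    quick: impl Fn(&str) -> IsNormalized,
    normalize: impl Fn(&str) -> String,
) -> Cow<'a, str> {
    if quick(s) == IsNormalized::Yes {
        Cow::Borrowed(s)
    } else {
        Cow::Owned(normalize(s))
    }
}
```

With these stand-ins, a precomposed input such as "caf\u{e9}" is borrowed by normalize_strict but copied by normalize_eager, which is the unnecessary allocation mentioned above. Both variants return the same text.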

Member

@Byron Byron left a comment

Wow, that's great, thanks so much!

@Byron Byron merged commit 20b7c20 into GitoxideLabs:main Aug 17, 2025
23 checks passed
@davidkna davidkna deleted the norm-fast-path branch August 17, 2025 13:37