Skip to content

Conversation

@kyleconroy
Copy link
Collaborator

T-SQL treats many characters as whitespace beyond standard ASCII spaces,
including control characters (0x01-0x1F) and Unicode spaces. This adds
proper handling for multi-byte UTF-8 whitespace sequences including
Zero Width Space (U+200B) which Go's unicode.IsSpace doesn't include.

Also fixes isLetter to only handle ASCII letters, preventing UTF-8
leading bytes from being incorrectly treated as identifier characters.

T-SQL treats many characters as whitespace beyond standard ASCII spaces,
including control characters (0x01-0x1F) and Unicode spaces. This adds
proper handling for multi-byte UTF-8 whitespace sequences including
Zero Width Space (U+200B) which Go's unicode.IsSpace doesn't include.

Also fixes isLetter to only handle ASCII letters, preventing UTF-8
leading bytes from being incorrectly treated as identifier characters.
@kyleconroy kyleconroy merged commit 94a3c62 into main Dec 23, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants