Suggestion: include Unicode codepoint standard names to sym.txt #6
Comments
Note that I'd like to avoid adding dependencies in build.rs just for validation, as these will be pulled in by any user of the library.
I see. Another approach is a dev-dependency or even a non-build.rs validation/generation script.
Or potentially keeping it under a feature flag. I think that will work quite well. Would you be in favor of that? I'm asking so I can decide whether to start work on implementing it.
It may make sense to allow specifying a Unicode name instead of using the
That was certainly the intention, only to replace
Many symbols in sym.txt are specified as their Unicode codepoint in the form U+XXXX rather than as a plain character, because the character itself would be hard to parse or notice when reading the file later. I believe using the Unicode-assigned name of such characters would be more useful and self-documenting than simply entering the code point.

Ideally, these names would be machine-checked in build.rs rather than just act as informative comments, so that reviewers do not have to verify by hand that the right name is provided for each character. These names could also then be used to look up the wanted Unicode codepoint, thereby entirely replacing the U+ scalar reference.

Either way, we could opt to include the names even on characters that are directly embedded in the txt files, just to have more context directly available when editing them (though this is definitely more of a bonus/personal preference change and should be discussed separately).
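To make the idea concrete, here is a rough sketch of what the check and the name lookup could look like. It is only an illustration under several assumptions: the function names (`parse_scalar`, `lookup_name`, `check_entry`) and the tiny `NAMES` table are made up for this example, and how a name would actually be written next to an entry in sym.txt is left open. A real version would take the names from the Unicode Character Database rather than a hand-written table, and where that data comes from (build.rs, a dev-dependency, a feature flag, or a separate script) is exactly the open question in this thread.

```rust
/// Hypothetical excerpt of Unicode names, hand-written for this sketch.
/// A real check would use the full Unicode Character Database instead.
const NAMES: &[(&str, char)] = &[
    ("RIGHTWARDS ARROW", '\u{2192}'),
    ("MULTIPLICATION SIGN", '\u{00D7}'),
    ("MINUS SIGN", '\u{2212}'),
];

/// Parse a `U+XXXX` scalar reference into a `char`.
/// Returns `None` if the prefix is missing, the digits are not valid
/// hexadecimal, or the value is not a Unicode scalar value.
fn parse_scalar(s: &str) -> Option<char> {
    let hex = s.strip_prefix("U+")?;
    let value = u32::from_str_radix(hex, 16).ok()?;
    char::from_u32(value)
}

/// Look up a character by its Unicode name in the sketch table.
fn lookup_name(name: &str) -> Option<char> {
    NAMES.iter().find(|(n, _)| *n == name).map(|(_, c)| *c)
}

/// Check that a name annotation agrees with the scalar reference it
/// accompanies, returning the character on success.
fn check_entry(scalar: &str, name: &str) -> Result<char, String> {
    let from_scalar = parse_scalar(scalar)
        .ok_or_else(|| format!("invalid scalar reference: {scalar}"))?;
    let from_name = lookup_name(name)
        .ok_or_else(|| format!("unknown Unicode name: {name}"))?;
    if from_scalar == from_name {
        Ok(from_scalar)
    } else {
        Err(format!("{scalar} is not named {name}"))
    }
}
```

With something like this, `check_entry("U+2192", "RIGHTWARDS ARROW")` succeeds while a mismatched name is rejected, and `lookup_name` alone would be enough if the name fully replaced the U+ scalar reference.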