-
Notifications
You must be signed in to change notification settings - Fork 237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement char extension in Rust tools #254
Conversation
fa015ba
to
d404fda
Compare
Awesome; thanks for getting this going! And for catching those problems in the text formatter; it does make sense to escape characters on output IMO (so we end up with a valid source file). Before we merge this with a I could be wrong, but I think the existing JS implementation would still be correct-ish; it just would not support Unicode scalars beyond the BMP. We could create a bug for supporting beyond-BMP characters in the JavaScript stuff and consider this a bug (but not too worrisome of a bug). |
Nice! I'm really glad that the char extension will have support in other tools.
The idea was to use the control characters escaped only in the text format since this is the human-friendly version, so I left it out from
Totally agree! I'm willing to make these changes! |
It's certainly easier to implement on my end so that would be fine with me. I've added a commit with this change.
Would need to disallow lone "Surrogate code points" and allow larger "Unicode scalar values" as with the failing |
This does make sense though I would side on being consistent with the handling of control characters if possible.
Maybe representing brili's char as a u32 would be easiest given Javascript's weirdness with unicode? Not totally sure. |
Here is something I don't quite understand yet, about shells in general: for that command line, does the The reason is that, IMO, the backslash-escapes should only be the domain of the text format, which |
I've found out you can do something like
I don't necessarily agree since the text format and command line arguments are generally both forms of user-provided input and I think they should have similar semantics. However it's kind of a niche case so an error is probably fine. Usually that makes this a "bad program" and then I would fall back to |
Ah, sure, I dig that notion. It is pretty cool that you can do |
6137ccc
to
a2889d2
Compare
6928f0e
to
8e80c7f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good so far! Is the reason store-char.bril
is disabled because the new type doesn't interact with the memory extension yet?
Specifically, the This extension should be fully supported and compose well with the memory extension in the (rust) |
Ah, right; that explains it! Many thanks as ever!!! |
I've gone and implemented the char extension #253 in
bril-rs
and it's corresponding Rust tools:bril2txt
,bril2json
, andbrilirs
.The main implementation difference is that I've used a
u16
as the internal representation of a char which I believe is an equivalent representation. It is possible that implementation differences could lead to slightly different behavior though the rust tools currently pass all char tests.I have a couple of notes for what they are worth:
bril2json < test/interp/char/control_char.bril | bril2txt
in the sense that the outputted text file doesn't have the control characters escaped. In this example, the resulting file isn't valid. Similarly,bril2json < test/interp/char/control_char.bril
will output some of the control values as\u000b
which doesn't seem to cause an issue but might be surprising given that the larger set of escape characters is unsupported(maybe this is fine, I'm not sure).bril2json < ../test/interp/char/char_args.bril | brili '\n' e y
errors out and probably shouldn't?brili
's check for the range ofint2char
should be more strict to disallow the following program? `@tmaradc do you have any thoughts on this?