RFC: Single-quoted strings as singletons. #143
| First, and arguably most importantly: The inference we already have is performing quite well! I could only find one ticket that points out a bug in the system: https://github.com/luau-lang/luau/issues/1483. This bug is a little bit esoteric and can probably be fixed in a reasonable timeframe, so it is not itself strong evidence that we should change the language. |
| Secondly, most code formatters in modern use will automatically change all string literals to use double quotes because that is considered good style. If this RFC is implemented, those tools will need to be updated so they do not break type inference. |
The tooling side of this makes the whole thing a dealbreaker for me. This is the current default behavior of StyLua:

Notably:
- It enforces double quoted strings
- It converts strings that only escape double quotes into single-quoted strings
This feature being added to Luau would kill both of those formatter options. While that's not its own justification, I think the loss of them would be enough of a downside that I can't support this.
I want my formatter to be able to enforce string-style, and to be able to make things easier to read without changing the semantic meaning of my program, including the type system. I don't want to have to review pull requests to double check that they're using the right quote style for the right situation. That's work that is pointless and completely avoidable.
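
To make the formatter concern concrete, here is a minimal sketch of the two rewrites listed above. It assumes a formatter whose default quote handling prefers double quotes and falls back to single quotes when that avoids escapes. The variable names are illustrative only, and the type notes in the comments describe what this RFC proposes, not current Luau behavior.

```lua
-- Literals as the author wrote them.
local before_plain = 'hello'
local before_escaped = "say \"hi\""

-- The same literals after a formatter applies the two rules listed above
-- (assumed output; check your own formatter configuration).
local after_plain = "hello"        -- single quotes rewritten to double quotes
local after_escaped = 'say "hi"'   -- switched to single quotes to drop the escapes

-- At runtime, quote style makes no difference: each pair is the same string.
assert(before_plain == after_plain)
assert(before_escaped == after_escaped)

-- Under this RFC, however, the inferred types would differ (per the proposal):
--   'hello'  would infer the singleton type "hello"
--   "hello"  would infer string
-- so both rewrites above would silently change the types of these locals.
```

The runtime assertions pass today in either quote style, which is exactly why a formatter treats the rewrite as safe. The objection is that the RFC would make it a type-level change.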
|
Note that I don't use a formatter for Luau, so I've used this without the problems others have had. This has been wonderful to use. It hasn't caused any problems for me when writing or reading code. The worst that happens is that I type a singleton with

I for one do not think the loss in formatting options is enough to eliminate this feature, nor do I think it will be possible to infer singleton vs string in every position, so some kind of singleton syntax is needed. This syntax is good, and there are other good options too. |
|
I am not opposed to this personally, as I think it would make sense for single quotes to have different type semantics than double quotes. And the syntax is pretty great, but the bar is low considering that it is standing next to casting to literals. That said, the alternative of fixing the bugs and allowing bounded generics to have a strict subtype constraint makes more sense to me. It would likely involve less work for my own codebases. |
|
I like the distinction between stringletons and regular strings afforded by

I think that quote style should affect (but not necessarily completely override) type inference -- you should still be able to do

Here's an example from seal's datetime library illustrating why I'd want this feature:

local datetime = {
    common_formats = {
        ISO_8601 = "%Y-%m-%d %H:%M" :: "%Y-%m-%d %H:%M",
        RFC_2822 = "%a, %d %b %Y %H:%M:%S %z" :: "%a, %d %b %Y %H:%M:%S %z",
        RFC_3339 = "%Y-%m-%dT%H:%M:%S%:z" :: "%Y-%m-%dT%H:%M:%S%:z",
        SHORT_DATE = "%Y-%m-%d" :: "%Y-%m-%d",
        SHORT_TIME = "%H:%M" :: "%H:%M",
        FULL_DATE_TIME = "%A, %B %d, %Y %H:%M:%S" :: "%A, %B %d, %Y %H:%M:%S",
        LOGGING_24_HR = "%a %b %e %H:%M:%S %Z %Y" :: "%a %b %e %H:%M:%S %Z %Y",
        LOGGING_12_HR = "%a %b %e %I:%M:%S %p %Z %Y" :: "%a %b %e %I:%M:%S %p %Z %Y",
        ["MM/DD/YYYY"] = "%m/%d/%Y" :: "%m/%d/%Y",
        ["MM/DD/YYYY HH:MM (AM/PM)"] = "%m/%d/%Y %I:%M %p" :: "%m/%d/%Y %I:%M %p",
        ["MM/DD/YY"] = "%m/%d/%y" :: "%m/%d/%y",
        ["HH:MM (AM/PM)"] = "%I:%M %p" :: "%I:%M %p",
        AMERICAN_FULL_DATE_TIME = "%A, %B %d, %Y %I:%M:%S %p" :: "%A, %B %d, %Y %I:%M:%S %p",
    }
}

I don't think this is an uncommon paradigm; it's currently pretty messy and would be surely made better by dedicated stringleton inference syntax? Like Jack, I don't rely on a formatter for Luau, but I also think that Luau code formatters should be automatically handling Luau semantics such as |
|
As an alternative (as discussed on Discord), we could provide a symbol string literal syntax à la Ruby, allowing users to explicitly choose literal inference without casting or breaking existing tooling:

type Animal = :Cat | :Dog | :Seal | :"Snow Leopard"
local animal: Animal = :Cat
local common_formats = {
    ISO_8601 = :"%Y-%m-%d %H:%M"
}
local entry: (:File | :Dir)? = nil
entry = "File" :: "File" -- continues to work
-- this unfortunately becomes legal
local meow::meow=:meow

Symbol string literals would act like regular single/double-quoted strings at runtime. Like interpolated string literals, symbol string literals cannot be used in non-parenthesized function calls (to avoid ambiguity). Single/double-quoted strings cast to themselves will continue to act like stringletons for backwards compatibility (we can add a lint). We can flesh this out into a full RFC if there's interest? |
It doesn't have to be. Try doing |
|
If you're going to start inferring some string literals as singletons, why not all of them? Almost every style guide, like Roblox's, prefers double quotes, and it seems oddly alienating not to support them. My guess is there's some concern with backwards compatibility with the likes of

Regardless, having two different inference rules for what are normally interchangeable style preferences seems a little surprising. Whatever the inference rule is, it should stay consistent for all styles in my opinion. |
| Given that this is the most common use case, it makes quite a lot of sense to offer separate syntax to allow developers to be precise about what they want. |
| In this RFC, we propose the syntax `"foo"` for a string of type `string`, and `'bar'` for a string with the singleton type `"bar"`. |
Earlier, the RFC says "This strategy is broadly successful, but fails in a few ways" and also has another supporting argument, but the RFC as proposed is removing this nice ergonomic feature without a reason?
There's a half-way solution here.
| In this RFC, we propose the syntax `"foo"` for a string of type `string`, and `'bar'` for a string with the singleton type `"bar"`. |
| In this RFC, we propose the syntax `'foo'` for a string with the singleton type `'foo'`, and `"bar"` to retain its current behavior standing for either the singleton type `'bar'` or `string`, depending on the bounds. |
| return table.freeze {
|     semicolon = new_token(';' :: any) :: cst.TokenKind<';'>,
|     equals = new_token('=' :: any) :: cst.TokenKind<'='>,
|     colon = new_token(':' :: any) :: cst.TokenKind<':'>,
|     comma = new_token(',' :: any) :: cst.TokenKind<','>,
|     dot = new_token('.' :: any) :: cst.TokenKind<'.'>,
|     endd = new_token('end' :: any) :: cst.TokenKind<'end'>,
| }
For what it's worth, this could also be solved with explicit instantiation: `new_token<<";">>(";")`. But I must say, I don't understand why `new_token(';' :: any) :: cst.TokenKind<';'>` is necessary here when `new_token(';' :: ';')` would work just as well?
| ```lua
| local function new_token<Kind>(kind: Kind, text: string?): cst.TokenKind<Kind>
|     return { kind = kind, text = text or kind :: any, span = span, trivia = trivias }
This specific cast is arguably needed because of the lack of generic bounds. If `TokenKind` defines `text: string`, then `Kind` must be bounded by `string` for `text or kind` to be valid, so this specific example is unsound because you can write `new_token(5)`. If we could utter `Kind: string`, then `string | Kind` always reduces to `string`, and so `text = text or kind` will type check.
If you write `type TokenKind<Kind> = { read kind: Kind, read text: string, .. }`, then `new_token(";")` simply infers `TokenKind<";">` because of polarity. This should already work today, and if it doesn't then I would consider that a different bug orthogonal to this RFC.
Because that's pathologically horrendous. If you have a type

The problem here is this: if you write

Even worse, the approach will not work across functions.

function make()
    return {"a"}
end
local t = make()

With your approach here, you can't even say

That's why the RFC is proposing one knob, have single quote strings infer the singleton type (to replace the stupid
This is funny to me because in the old type system, I put in a lot of work to fix various unification bugs to force singletons to generalize into their top primitive type, like this:

function f(t, x)
    if x ~= "x" then
        return
    end
    table.insert(t, x)
end

In the old solver, unification of two types was more or less an equality constraint (the story is more complicated than that), so if the subtype or the supertype was a free metavariable, it would bind one to the other and be done. That meant the above snippet would infer the type

That means adding a rule in unification where if the subtype is a singleton (distributive over union) and the supertype is a free metavariable, then we generate a new "replaced subtype" where the singletons are replaced by their top primitive types (

If we did make every string literal infer the string singleton type in the new solver, we'd be going back to that world for the second time. The current strategy we have in the new solver is subtly different, which is why it doesn't seem to have any glaring DX issues and doesn't clash with invariances. I was very happy when I had the brainwave to suggest that singleton inference should be |
|
If that's the case, I'd rather have literals inferred consistently as a boring

These variance issues won't go away if you push them over to someone's personal preference of which quotation marks to use. Someone's bound to do

There's got to be a better way to handle the core issue here, which is avoiding the |
|
Personally, I think this is something that really needs its own syntax, rather than retconning existing syntax. For those of us in the know, this RFC would be a very useful and powerful change, but how many of the existing end users of the language are going to be aware that |