133 changes: 133 additions & 0 deletions docs/syntax-single-quotes-for-singleton-strings.md
@@ -0,0 +1,133 @@
# Use `'` for Singleton Strings

## Summary

Adjust type inference so that string literals written with single quotes (e.g. `'string'`) are always inferred to have a singleton type, and string literals written with `"double quotes"` are always typed `string`.

## Motivation

It's quite tricky for us to infer a singleton type in exactly the places where the developer intended one, and to infer `string` everywhere else.

Today, our inference strategy is to assume that a string literal should have type `string` if possible, but that type can be constrained to a singleton if necessary.
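
As a minimal sketch of that strategy (the variable names and the `Direction` alias are illustrative, not taken from the RFC), the same kind of literal ends up with a different type depending on whether anything constrains it:

```lua
--!strict

-- Nothing constrains this literal, so it is inferred as `string`.
local greeting = "hello"

-- The annotation constrains the literal to the singleton type "ok".
local status: "ok" = "ok"

-- The expected type is a union of singletons, so the literal
-- is checked against the singleton "left" rather than `string`.
type Direction = "left" | "right"
local heading: Direction = "left"
```

The runtime values are ordinary strings in all three cases; only the static type differs.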

This strategy is broadly successful, but fails in a few ways.

### Bugs

We have bugs that prevent us from deducing the proper type in certain cases. For example: https://github.com/luau-lang/luau/issues/1483

```lua
type Form = "do-not-register" | (() -> ())

local function observer(register: () -> Form) end

observer(function()
    if math.random() > 0.5 then
        return "do-not-register" -- TypeError: Type 'string' could not be converted into '"do-not-register" | (() -> ())' Luau(1000)
    end
    return function() end
end)
```

### Singleton Strings and Generics

The currently implemented inference strategy doesn't always do what the developer intended in the presence of generics. Whenever a string literal is passed to a generic, it is possible for it to be typed `string`, and so that's what Luau infers.

In the following snippet using the [luaup](https://github.com/jackdotink/luaup) project, a developer clearly wanted `Kind` to be bound to a string singleton and not `string`. They were instead forced to write a bunch of casts:

```lua
local function new_token<Kind>(kind: Kind, text: string?): cst.TokenKind<Kind>
    return { kind = kind, text = text or kind :: any, span = span, trivia = trivias }
end

return table.freeze {
    semicolon = new_token(';' :: any) :: cst.TokenKind<';'>,
    equals = new_token('=' :: any) :: cst.TokenKind<'='>,
    colon = new_token(':' :: any) :: cst.TokenKind<':'>,
    comma = new_token(',' :: any) :: cst.TokenKind<','>,
    dot = new_token('.' :: any) :: cst.TokenKind<'.'>,
    endd = new_token('end' :: any) :: cst.TokenKind<'end'>,
}
```

Review comment by @alexmccord (Contributor), Oct 5, 2025, on the `return { kind = kind, ... }` line:

This specific cast is arguably because of the lack of generic bounds. If `TokenKind` defines `text: string`, then `Kind` must be bounded by `string` for `text or kind` to be valid, so this specific example is unsound because you can write `new_token(5)`. If we could utter `Kind: string`, then `string | Kind` always reduces to `string`, and so `text = text or kind` will type check.

Review comment by @alexmccord (Contributor), Oct 5, 2025, on the same line:

If you write `type TokenKind<Kind> = { read kind: Kind, read text: string, .. }`, then `new_token(";")` simply infers `TokenKind<";">` because of polarity. This should already work today, and if it doesn't then I would consider that a different bug orthogonal to this RFC.

Review comment by a contributor, on lines +43 to +50:

For what it's worth, this could also be solved with explicit instantiation too. `new_token<<";">>(";")`. But I must say, I don't understand why `new_token(';' :: any) :: cst.TokenKind<';'>` is necessary here when `new_token(';' :: ';')` would work just as well?

## Design

Languages like Lisp and Erlang offer a concept called an _atom_, which is essentially an interned string. Atoms and Luau string singletons are largely used for the same purpose: people use them to tag their unions and to build enumerations.

Given that this is the most common use case, it makes quite a lot of sense to offer separate syntax to allow developers to be precise about what they want.

In this RFC, we propose the syntax `"foo"` for a string of type `string`, and `'bar'` for a string with the singleton type `"bar"`.

Review comment by @alexmccord (Contributor), Oct 5, 2025:

Earlier, the RFC says "This strategy is broadly successful, but fails in a few ways" and also has another supporting argument, but the RFC as proposed is removing this nice ergonomic feature without a reason?

There's a half-way solution here.

Suggested change
In this RFC, we propose the syntax `"foo"` for a string of type `string`, and `'bar'` for a string with the singleton type `"bar"`.
In this RFC, we propose the syntax `'foo'` for a string with the singleton type `'foo'`, and `"bar"` to retain its current behavior standing for either the singleton type `'bar'` or `string`, depending on the bounds.


This makes a whole bunch of inference scenarios completely trivial.

```lua
-- fruits : {string}
local fruits = {}
table.insert(fruits, "avocado") -- These are just strings
table.insert(fruits, "hazelnut")
table.insert(fruits, "carolina reaper")
```

```lua
type Ok<T> = {
    tag: 'ok', -- the tag of a tagged union should use single quotes
    data: T
}
type Err<E> = {
    tag: 'error',
    error: E
}
type Result<T, E> = Ok<T> | Err<E>

function try_very_hard()
    local n = math.random()
    if n > 0 then
        return {tag='ok', data=n}
    else
        -- The tag must match the 'error' singleton declared in Err<E>.
        return {tag='error', error="I failed"} -- no ambiguity: The tag is a singleton and the error message is not.
    end
end

local r: Result<number, string> = try_very_hard()
```

And

```lua
local function new_token<Kind>(kind: Kind, text: string?): cst.TokenKind<Kind>
    return { kind = kind, text = text or kind :: any, span = span, trivia = trivias }
end

return table.freeze {
    semicolon = new_token(';'), -- The string here uses single quotes, and so the generic Kind can only be ';'
    equals = new_token('='),
    colon = new_token(':'),
    comma = new_token(','),
    dot = new_token('.'),
    endd = new_token('end'),
}
```

### Implementation

This is a very simple adjustment to the constraint generation phase of type checking. Most of the actual work involves disabling the code to infer when a particular string literal must be a singleton.

It is so easy to implement, in fact, that we've already done it as an experiment. As of this writing, you can try it out by setting the flag `DebugLuauStringSingletonBasedOnQuotes`.

## Drawbacks

There are a number of pretty serious drawbacks to this proposal:

First, and arguably most important: the inference we already have is performing quite well! I could only find one ticket that points out a bug in the system: https://github.com/luau-lang/luau/issues/1483. This bug is a little bit esoteric and can probably be fixed in a reasonable timeframe, so it is not itself strong evidence that we should change the language.

Secondly, most code formatters in modern use will automatically change all string literals to use double quotes because that is considered good style. If this RFC is implemented, those tools will need to be updated so they do not break type inference.
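
To make the formatter concern concrete, here is a hedged sketch of what a quote-normalizing formatter would do to inference under the rules this RFC proposes. The `identity` helper is hypothetical and exists only for illustration. The type comments describe the proposed behavior, not what Luau does today (today, as noted above, a literal passed to a generic is inferred as `string` regardless of quote style).

```lua
-- Hypothetical generic helper, used only to observe what T is bound to.
local function identity<T>(value: T): T
    return value
end

-- As written by the developer.
-- Under this RFC, T would be bound to the singleton type 'ok'.
local before = identity('ok')

-- After a formatter rewrites the literal to use double quotes.
-- Under this RFC, T would be bound to `string` instead.
local after = identity("ok")
```

The two calls produce the same value at runtime, so such a rewrite would change only the inferred types, which is why it could slip through unnoticed.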

Review comment by a contributor:

The tooling side of this makes this whole thing a dealbreaker to me. This is the current default behavior of StyLua:
[image: ezgif-8317ad5f9596de]

Notably:

  • It enforces double quoted strings
  • It converts strings that only escape double quotes into single-quoted strings

This feature being added to Luau would kill both of those formatter options. While that's not its own justification, I think the loss of them would be enough of a downside that I can't support this.

I want my formatter to be able to enforce string-style, and to be able to make things easier to read without changing the semantic meaning of my program, including the type system. I don't want to have to review pull requests to double check that they're using the right quote style for the right situation. That's work that is pointless and completely avoidable.


This is also a change to type inference, and not type checking. It will therefore also affect nonstrict mode. We probably won't report any spurious errors in this case, but autocomplete quality might suffer for code that hits this.

Lastly, this is probably going to cause some developer confusion unless we are very clear in our communication. We are going to have to document and explain this to developers and we are going to have to announce it fairly loudly so that people know about this, and the necessary changes (if any) to their code.

## Alternatives

Put simply, we could do nothing and fix the remaining bugs with our singleton inference. It's almost there.

The interaction between singletons and generics is unfortunate, but could be resolved another way. One idea is for bounded generics to afford a "strict subtype" constraint. Developers could use this to write a function with a generic that must be some string singleton.