Skip to content

Conversation

@Bottersnike
Copy link

Rendered

Adds type guards of the form

function isFoo(x): x is Foo
    return x.type == "foo"
end

which can be used in control statements and assertions.

@deviaze
Copy link

deviaze commented May 31, 2025

Could we get this as a sort of typecast as well/instead? This could allow user-defined narrowing in a more natural inline manner.

For example:

type Entry = File | Directory | Symlink

local entry = getentry()
if str.endswith(entry.path, ".luau") :: entry is File then
    print(entry:read()) -- a method only on `File`
end

In this case we want to assume the entry is a File if it ends with .luau, but that isn't something inherent to the function str.endswith (which is just a substring check), but inherent to the logic of our conditional itself.

@Bottersnike
Copy link
Author

In terms of as well, I think that would be cool, though potentially worth a separate RFC. In terms of instead, you then lose many of the benefits of having dedicated guard functions, so I don't think I'd want to not have those.

@gaymeowing
Copy link
Contributor

gaymeowing commented Jun 10, 2025

Is there a backwards compat issue with using == instead? As I think that'd be more fitting, as types currently purely use symbols and not words. Plus I think it's nice how luau separates type syntax by not having it be keyword heavy.

Edit: On second thought is is nicer because it does a better job distinguishing in the example, and it is somewhat more runtime adjacent sorta like typeof().

@alexmccord
Copy link
Contributor

alexmccord commented Jun 10, 2025

Unfortunately you can't use is or any contextual keywords like that. See https://github.com/luau-lang/rfcs/blob/master/docs/disallow-proposals-leading-to-ambiguity-in-grammar.md. x is T is an extension of the F T case described in that. x :: T in the return type, however, is not a bad idea. The alternative would be to wrap in parenthesis so that in syntactic type-land there is no ambiguity, but that logic was ruled out when we used :: for casting.

But, eugh. is_leaf(node) :: node :: Leaf. That's a hard pass.

Another case not being mentioned here is the ambiguity when a field of some table is a refinement target in these types. t.p :: T could refer to a type named p. An example: local M = require("/mod.luau") where M exports some type function F<T>, we run into the case of ambiguity when parsing any arbitrary expression upfront, M.F<"hello">.

We would therefore need to hand-write new syntax that limits what gets parsed to just identifiers, no fields. This also avoids the left recursion problem when reusing :: as a type and trying to parse any arbitrary expression.

```lua
type Tree = { value: number, left: Tree?, right: Tree? } | nil

function isLeaf(x: Tree): x is { left: nil, right: nil }
Copy link
Contributor

@alexmccord alexmccord Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, see this part. It parses as a function call for some global is, with a table literal whose fields are obviously not a valid parse.

function isLeaf(x: Tree): x
  is({ left: nil, right: nil }) -- error
  -- no backtracking, and ambiguous even if
  -- there were for some types, like strings.
  return ...
end

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an outstandingly good point and I feel silly for not having noticed it. Do you think foo(...): (x is Bar) would work? It feels a little icky, but might be the simplest approach. I'm not sure I'd want to introduce any new special symbols for this, and a type function wouldn't properly convey the special semantics.

Copy link

@itsmath itsmath Sep 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe x = Bar?

function isLeaf(x: Tree): x = { left: nil, right: nil }
    return ...
end

I don't like to use = for types, but maybe we can justify this because the left-hand side is still a variable?


Assigning the value of a type guard to a variable (`local foo = isCat(x)`) or its use in a more complex expression (`foo(isCat(x))`) is not disallowed, though the predicate returned by the type guard is demoted to a simple boolean value and no longer serves the narrow the type of the subject variable. It is suggested that lint rules may be used to warn about these cases.

Type guard functions are not permitted to have multiple return values. `function foo(x): (x is number, string)` is disallowed, as is `function foo(x): (x is number, x is string)`.
Copy link
Contributor

@alexmccord alexmccord Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks a bit like an arbitrary choice.

function foo(x): (x is number, x is string)
  return typeof(x) == "number", typeof(x) == "string"
end

function bar(x: unknown)
  local is_num, is_str = foo(x)
  if is_num then
    -- x : number
  elseif is_str then
    -- x : string
  else
    -- x : ~number & ~string
  end
end

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to allow predicates to be stored as normal variables, but consider what happens if you make a modification to x after the call to foo but before the use of your predicate booleans. Your predicates are now meaningless. We can't simply track things that might change, as the purpose of type guards is to also allow more complex logic than can be trivially identified by the type narrowing system. This is why the use of type functions was exclusively restricted to control flow statements, and as such any return values other than the first would be ignored. Disallowing multiple return values nips this footgun at the bud.

Copy link
Contributor

@alexmccord alexmccord Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's actually okay. Type refinements store a proposition to refine a specific version of a variable, so even if you update some variable and then apply an outdated proposition, it doesn't affect any variable whose version did not match. This gives us the effect of invalidating any outdated propositions (even if the type system will still commit those refinements)

end
```

This is an acceptable compromise, as situations like these are uncommon. We also have no guarantee that `pet` was not further mutated to remove `bark()`, so `wasDog` can no longer be relied upon. `isDog()` could instead be replaced with a `canBark()` type guard, giving `if isCat(pet) and canBark(pet) then`.
Copy link
Contributor

@alexmccord alexmccord Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, effect systems are necessary to know which refinements survives any invalidation from side effects, but the top effect in the lattice of effects describes every effects. This means all functions, indexing, even something as basic as equality, will invalidate all refinements in relation to any globals and possibly all locals (except primitives that doesn't have mutation, e.g. strings and numbers, but not tables) since they could be reachable by some insidious function that mutates everything. Obviously that's just unusable, so you need to make a pragmatic call to assume the user isn't doing anything crazy.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why the decision was made to exclusively allow type guard calls to be used in control flow (and similar statements like assert). I was unable to come up with any system of rules to allow their use elsewhere that wouldn't immediately break if you looked at it wrong; incorrect types are more dangerous than no types at all.

@deviaze
Copy link

deviaze commented Jun 11, 2025

Looking at this RFC again, I don't like that it's mixing up the runtime return type of the guard function (a boolean), with what it actually means to the user (that it refines x into Foo). Since type guard functions work moreso like type casts than type annotations, I feel that annotation syntax here might be confusing for users.

In terms of as well, I think that would be cool, though potentially worth a separate RFC. In terms of instead, you then lose many of the benefits of having dedicated guard functions, so I don't think I'd want to not have those.

I think a many if not most of the usecases of type guard functions can already be solved (albeit less ergonomically) by casts, putting methods like isFoo: (self: Foo) -> true on table types, using demarking fields on unions of table types { is: "Foo" } | { is: "Boo" }, etc. That's why I feel a new refinement cast syntax would be a more general-purpose ergonomic alternative to type guard functions.

@alexmccord
Copy link
Contributor

alexmccord commented Aug 26, 2025

Looking at this RFC again, I don't like that it's mixing up the runtime return type of the guard function (a boolean), with what it actually means to the user (that it refines x into Foo). Since type guard functions work moreso like type casts than type annotations, I feel that annotation syntax here might be confusing for users.

Think of it this way, the runtime value of type boolean is a witness for the type x is T which lives in a different kind of universe called propositions (for the sake of discussion, let's just ignore the syntactic impossibility). The type system would still track them as a different type from boolean, because they are different, so something like local is_string = typeof(x) == "string" gives us is_string: x is string. x is string is still a subtype of boolean, but the reverse is not true. If you have a union (x is string) | boolean, this normalizes to boolean because you can no longer preserve the provenance of the witness.

@gaymeowing
Copy link
Contributor

gaymeowing commented Sep 11, 2025

Would it be better to do a special type function instead called insure<T, U>? As I feel like that'd be more consistent with the language rather than a new keyword that is sorta jank in its use.

Also I think assert is a more clear term rather than is, as is just at surface level sounds like its doing what :: does.

@alexmccord
Copy link
Contributor

I don't think we can represent this as a type function. Parameters like x is a term, not a type, so insure<T, U> would just break that. This requires a new language construct. On top of that, you will eventually want things like function Animal:isDog(): self is Dog or you want to refine a subterm e.g., I dunno, function Foo:isSomething(): self.something is Something.

@alexmccord
Copy link
Contributor

alexmccord commented Sep 11, 2025

One thing this RFC needs is a different kind of is: implication. Right now, if you say is_number(x) = typeof(x) == "number" where is_number : (x: unknown) -> x is number, then not is_number(x) gives x : ~number. That's fine and dandy, but this breaks when you have a nontrivial clause e.g. is_positive(x) = typeof(x) == "number" and x > 0. is_positive(x) should imply x : number whereas not is_positive(x) does not imply x : ~number. Something like function is_positive(x): boolean => x is number could work as syntax for implications.

@itsmath
Copy link

itsmath commented Sep 11, 2025

Something like function is_positive(x): boolean => x is number could work as syntax for implications.

Wouldn't that mean any x given to is_positive would be classified as a number? Since I'm assuming we're going to return false if x isn't a number OR isn't positive. That function would actually be something like false => x is number | unknown and true => x is number, but that just sounds ridiculous to describe in the type system (maybe not?).

@alexmccord
Copy link
Contributor

alexmccord commented Sep 11, 2025

Something like function is_positive(x): boolean => x is number could work as syntax for implications.

Wouldn't that mean any x given to is_positive would be classified as a number? Since I'm assuming we're going to return false if x isn't a number OR isn't positive. That function would actually be something like false => x is number | unknown and true => x is number, but that just sounds ridiculous to describe in the type system (maybe not?).

I don't think so. If x: string then is_positive(x) == true implies x: string & number which normalizes to x: never. The correct way to read it is "returns false if x isn't a number OR (x is a number AND is not positive)," so false => x is unknown for two independent reasons: 1) because ~number | number is unknown, and 2) there are no such propositions to express x > 0 in any way or shape, so negation of a proposition like this cannot be inverted. Then true => x is number.

The key is that some propositions can be applied positively and negatively (logical equivalence, the same kind of refinements this RFC wanted to see implemented), and some others can only be applied positively and never negatively (logical implications, e.g. is_positive/is_nonempty_string/etc).

@itsmath
Copy link

itsmath commented Sep 11, 2025

The key is that some propositions can be applied positively and negatively (logical equivalence, the same kind of refinements this RFC wanted to see implemented), and some others can only be applied positively and never negatively (logical implications, e.g. is_positive/is_nonempty_string/etc).

Right. I think I understand it now and there is actually only two cases here: the case where false implies that the type isn't something, and the case where false doesn't imply anything at all.

Your syntax using => is actually really good I think, but if we're using it, I think we would NEED to write it as is_positive : (x: unknown) -> true => x is number to actually get the behavior you described. At least in the way I read the syntax here, boolean => x is number sounds like getting a false would also make x necessarily a number (I'm reading it as "returning a boolean implies x is a number")

@alexmccord
Copy link
Contributor

alexmccord commented Sep 11, 2025

Ah. Yeah. I 100% should have written down true => x is T. While we fiddle with the notation here a bit, do you interpret boolean == x is T as true => x is T + false => x is ~T? Maybe that last one with == would give us the syntax for positive and negative propositions we need that doesn't break backward compatibility.

@itsmath
Copy link

itsmath commented Sep 11, 2025

While we fiddle with the notation here a bit, do you interpret boolean == x is T as true => x is T + false => x is ~T?

I think that's an interesting syntax for it, but I'm just wondering what happens if you have another type that isn't boolean on the left-hand side. Would it keep the same behavior but truthy and falsy instead? But if that's the case then I don't see why even have the boolean there, since any value can be truthy or falsy.

With the implication syntax you proposed, I could also see how the left-hand side could be extended to other types as well, like maybe a function that only returns a number if an argument x is of type T, would be like number => x is T

@alexmccord
Copy link
Contributor

That's kind of the same idea that I had with overloaded functions. typeof : ((string) -> "string") & ((number) -> "number") & ((boolean) -> "boolean) & ..., but you could rephrase it as (x: unknown) -> "string" => x is string | "number" => x is number | "boolean" => x is boolean | ....

@itsmath
Copy link

itsmath commented Sep 11, 2025

That's kind of the same idea that I had with overloaded functions. typeof : ((string) -> "string") & ((number) -> "number") & ((boolean) -> "boolean) & ..., but you could rephrase it as (x: unknown) -> "string" => x is string | "number" => x is number | "boolean" => x is boolean | ....

That's beautiful. I really like that syntax.

I also think the arrows work really well for implication since that's also the symbol for implies (⇒) in mathematics (I immediately read it as implies). But I think we definitely need some better syntax for the other case

@alexmccord
Copy link
Contributor

alexmccord commented Sep 11, 2025

Crazy idea. Drop the is. Write function is_number(x): x: number ... end. This extends to "string" => x: string etc. Let the notation x: T be a proposition with a witness.

@itsmath
Copy link

itsmath commented Sep 11, 2025

Crazy idea. Drop the is. Write function is_number(x): x: number ... end. This extends to "string" => x: string etc. Let the notation x: T be a proposition with a witness.

I think this works even for type casts. Using @deviaze's example:

local entry = getentry()
if str.endswith(entry.path, ".luau") :: entry: File then
    print(entry:read()) -- a method only on `File`
end

Is there any ambiguity here? I don't think so?

@alexmccord
Copy link
Contributor

No ambiguity, I'm pretty sure.

@gaymeowing
Copy link
Contributor

Crazy idea. Drop the is. Write function is_number(x): x: number ... end. This extends to "string" => x: string etc. Let the notation x: T be a proposition with a witness.

I think this works even for type casts. Using @deviaze's example:

local entry = getentry()
if str.endswith(entry.path, ".luau") :: entry: File then
    print(entry:read()) -- a method only on `File`
end

Is there any ambiguity here? I don't think so?

It is kinda, as it looks like its casting onto the return of str.endswith so I would prefer allowing this:

local entry = getentry()
if str.endswith(entry.path, ".luau") then
      entry :: File
      print(entry:read()) -- a method only on `File`
end

@itsmath
Copy link

itsmath commented Sep 12, 2025

Oh yeah, type casts as statements are definitely interesting, but it's just nice to know that the syntax works even in casts. And yeah it would actually be casting the return of str.endswith, and that's exactly what we want it to do in this case, since we're essentially saying: "if this return value is true, then entry is a file". It's more complicated than just casting entry after, because, for example:

if condition(entry) or str.endswith(entry.path, ".luau") :: entry: File then
    -- .. snip ..
end

In this case, entry is only a file if condition(entry) is also a type that implies that, but casting right after the if statement would always make entry a file, regardless of the conditions of the if statement.

@alicesaidhi
Copy link
Contributor

alicesaidhi commented Sep 12, 2025

How would this work with user defined type functions?

@itsmath
Copy link

itsmath commented Sep 12, 2025

How would this work with user defined type functions?

-- For x: T syntax
function types.implies(variable: string, is: type): type

-- For T => x: U syntax
function type:implies(variable: string, is: type): type

This should probably only work for variable names defined in user-defined type functions, for example with #137

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

6 participants