PEP 747: Fix rules related to UnionType (T1 | T2). Contrast TypeExpr with TypeAlias. Apply other feedback. #3856
peps/pep-0747.rst
Outdated
Relationship with UnionType
'''''''''''''''''''''''''''

``TypeExpr[U]`` is a subtype of ``UnionType`` iff ``U`` is a non-empty union type:
Why does this need to be specified? It feels like asking type checkers to understand details of runtime implementation.
From §"Backward Compatibility":

> As a value expression, `X | Y` previously had type `UnionType` (via :pep:`604`) but this PEP gives it the more-precise static type `TypeExpr[X | Y]` (a subtype of `UnionType`) while continuing to return a `UnionType` instance at runtime. Preserving compatibility with `UnionType` is important because `UnionType` supports `isinstance` checks, unlike `TypeExpr`, and existing code relies on being able to perform those checks.

Rephrasing:

- `type.__or__` (and other methods) now have return type `TypeExpr[X | Y]` rather than `UnionType`.
- Static type checkers need to treat `TypeExpr[X | Y]` as assignable to `UnionType` so that existing methods like `isinstance` which expect a `UnionType` continue to pass type checking when given an `X | Y` expression.
  - For example, `isinstance('words', int | str)` needs to still pass type checking even though `int | str` is now a `TypeExpr[int | str]` and `isinstance` expects a `UnionType` as its second argument.
peps/pep-0747.rst
Outdated
- ``TypeExpr[X | Y | ...]`` is a subtype of ``UnionType``.
- ``TypeExpr[Union[X, Y, ...]]`` is a subtype of ``UnionType``.
- ``TypeExpr[Optional[X]]`` is a subtype of ``UnionType``.
It isn't at runtime (currently)
Indeed technically I don't think it is, but I don't think the lack of a runtime subtype relationship is observable. §"Interactions with isinstance() and issubclass()" says:
> The `TypeExpr` special form cannot be used as any argument to `issubclass`:

So I'd expect the following behavior:

    issubclass(TypeExpr[int | str], UnionType)
    TypeError: issubclass() arg 1 must be a class

Edit: And indeed I see that behavior with the current implementation of `TypeExpr` in `typing_extensions`.
Are there other ways you can think of that a lack of a runtime subtype relationship could be observable?
    from types import UnionType
    from typing_extensions import Optional, TypeExpr

    def f(x: UnionType):
        assert isinstance(x, UnionType)

    def g(x: TypeExpr[Optional[int]]):
        f(x)

    g(Optional[int])  # boom

I'm not sure there is much reason for TypeExpr to ever be a subtype of UnionType; I don't think it will significantly help users of TypeExpr.
Clever example. OK it seems that if `TypeExpr` is to (sometimes) be a subtype of `UnionType` then there are ways to observe it at runtime.

> I'm not sure there is much reason for TypeExpr to ever be a subtype of UnionType; I don't think it will significantly help users of TypeExpr.

I also don't think it will help a ton, but I think it's necessary for backward compatibility for existing functions that accept `UnionType` so long as there's no other way to spell "a TypeExpr that is a non-empty union type". Currently §["Rejected Ideas" > "Support pattern matching on type expressions"] does not provide a spelling that can replace existing usage of `UnionType`.
Do you have an alternative suggestion in mind that both:

- gives a `TypeExpr[int | str]` result for the value expression `int | str` and
- continues to allow `isinstance('words', int | str)` to pass a type checker?

I believe it should be possible to make `TypeExpr[U]` be conditionally considered a subtype of `UnionType` at runtime (via an `isinstance` check) by overriding `__instancecheck__` on the metaclass of ~~`TypeExpr`~~ `UnionType`.
Aside: The current implementation of `int | str` gives a `UnionType`, but neither `Union[int, str]` nor `Optional[int]` give `UnionType`s. If we wanted to more-strictly preserve the existing behavior, I'd be open to narrowing the rules in this section to only make `TypeExpr[X | Y | ...]` a subtype of `UnionType`:

- `TypeExpr[X | Y | ...]` is a subtype of `UnionType`.
- `TypeExpr[Union[X, Y, ...]]` is not a subtype of `UnionType`.
- `TypeExpr[Optional[X]]` is not a subtype of `UnionType`.
- `TypeExpr[Never]` is not a subtype of `UnionType`.
- `TypeExpr[NoReturn]` is not a subtype of `UnionType`.

Edit: However I will note that `(X | None) == Union[X, None] == Optional[X]` at runtime so it could be confusing to users if `Union` and `Optional` couldn't be used in the same place as an `X | Y` expression.
I think the backwards compatibility problem is an issue of type checker inference that we don't have to specify exactly. It's also a very limited problem, mostly applying to `isinstance()`, which is necessarily special-cased by type checkers anyway.
The following code can be written by users today:

    from types import UnionType

    def accept_union(u: UnionType):
        pass

    accept_union(int | str)

Are you saying we shouldn’t worry about breaking this code so long as we avoid breaking code related to `isinstance`?
I think the way to avoid breaking that code can be up to type checkers. We don't need to prescribe it exactly; different type checkers can use different approaches, and adapt it to changes in how the runtime works. For example, type checkers could store something in their internal representation of a TypeExpr type to indicate the runtime construct that was used to create it.
> type checkers could store something in their internal representation of a TypeExpr type to indicate the runtime construct that was used to create it.

Yes, this description is consistent with the implementation approach I had in mypy: A single bit like `is_uniontype`.

But the overall specification rule being implemented is still:

- `TypeExpr[X | Y | ...]` is a subtype of `UnionType`

I'll update the diff to include only this rule, and not the extraneous ones for `Union` and `Optional`.
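To illustrate what "a single bit" might look like, here is a toy sketch. It is not mypy's actual code: `TypeExprType`, its fields, and `assignable_to_uniontype` are hypothetical names made up for this example, and the wrapped type is stored as a string purely for readability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeExprType:
    """Hypothetical checker-internal representation of TypeExpr[inner]."""

    inner: str                  # textual form of the wrapped type, for illustration
    is_uniontype: bool = False  # True only when built from an `X | Y` expression


def assignable_to_uniontype(t: TypeExprType) -> bool:
    # The single remaining rule: only TypeExpr[X | Y | ...] is a UnionType.
    return t.is_uniontype


from_pipe = TypeExprType("int | None", is_uniontype=True)
from_optional = TypeExprType("Optional[int]")
print(assignable_to_uniontype(from_pipe), assignable_to_uniontype(from_optional))
```

Under this sketch the flag records how the value was spelled, which is exactly the information that the static type `int | None` alone does not carry.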
With only the remaining rule, the code you mentioned before fails (correctly) at type checking time and at runtime:

    from types import UnionType
    from typing_extensions import Optional, TypeExpr

    def f(x: UnionType):
        assert isinstance(x, UnionType)  # AssertionError

    def g(x: TypeExpr[Optional[int]]):
        f(x)  # ERROR: TypeExpr[Optional[int]] is not a UnionType

    g(Optional[int])

And similar code involving `TypeExpr[X | Y | ...]` passes (correctly) both at type checking time and at runtime:

    from types import UnionType
    from typing_extensions import Optional, TypeExpr

    def f(x: UnionType):
        assert isinstance(x, UnionType)  # OK (runtime)

    def g(x: TypeExpr[int | None]):
        f(x)  # OK (type checking time)

    g(int | None)

Co-authored-by: Jelle Zijlstra <[email protected]>
…rd compatibility with: X | Y | ...
I still feel this may specify too many details, but let's leave that for the PEP discussion.
- As a value expression, ``x | y`` has type equal to the return type of ``type(x).__or__``
  if ``type(x)`` overrides the ``__or__`` method.

- When ``x`` has type ``builtins.type``, ``types.GenericAlias``, or the
I'm not sure how to interpret this statement. The phrase "has type" isn't clear. Are you talking about type equivalence? Assignability?

Also, are these static types or runtime types? I presume it's static types, but if that's the case, then I don't know why `GenericAlias` is mentioned because that's not a type a static type checker would ever evaluate. It's a runtime implementation detail.

What if the static type of `x` is a union, and some of the subtypes have a custom `__or__` override and some do not? Presumably, this formulation assumes that an expansion of the types of `x` and `y` has already been performed, and `x` and `y` are not union types?

What if the `__or__` method is present, but evaluating it generates a type error (e.g. because `y`'s type is incompatible with the signature)?

We can try to hammer out all of these details, but this is getting really complex. One option is to say that unions never evaluate to `TypeExpr` unless you use a `TypeExpr` constructor (i.e. `TypeExpr(x | y)`). This would also avoid the issue with `UnionType`.
  and ``y`` has type ``TypeExpr[t2]`` (or ``type[t2]``).
- As a value expression, ``x | y`` has type ``int``
  if ``x`` has type ``int`` and ``y`` has type ``int``
- As a value expression, ``x | y`` has type ``UnionType``
This rule says "in all other situations". What other situations are not covered in the above rules? I think they cover everything, right? Can you give an example of types `x` and `y` where `UnionType` would be evaluated?
To simplify static type checking, a ``Literal[...]`` value is *not*
considered assignable to a ``TypeExpr`` variable even if all of its members
spell valid types:
A value of ``Literal[...]`` type is *not* considered assignable to
The way this is phrased, it still sounds like you're talking about the expression `Literal[...]` (where `...` is some legal literal value like `1` or `"hi"`). I think what you mean here is "a value expression whose evaluated type is a literal string expression". If I'm interpreting this correctly, then I agree with the rule, but I think it needs to be reworded because that's not what it currently says.
considered assignable to a ``TypeExpr`` variable even if all of its members
spell valid types:
A value of ``Literal[...]`` type is *not* considered assignable to
a ``TypeExpr`` variable even if all of its members spell valid types because
What is a "`TypeExpr` variable"? Is it a variable whose type is declared to be `TypeExpr[T]` (or a union that includes such a subtype)? If so, what does it mean for a variable to have "members"?
Relationship with UnionType
'''''''''''''''''''''''''''

``TypeExpr[U]`` is a subtype of ``UnionType`` iff ``U`` is
Hmm. I don't think this is a good solution. There are many ways that union types can be formed in static analysis. For example, they arise from "joins" in code flow. This doesn't necessarily mean that at runtime the value is implemented with `UnionType`.

Consider the following:

    x: TypeExpr[int | str]
    if random() > 0.5:
        x = int
        reveal_type(x)  # type[int]
    else:
        x = str
        reveal_type(x)  # type[str]
    reveal_type(x)  # TypeExpr[int | str]
    print(x)  # Will print either "<class 'int'>" or "<class 'str'>", never "UnionType" or "int | str"
    y: UnionType = x  # This would be problematic!

As another example, `Literal[1, 2]` and `Literal[1] | Literal[2]` are equivalent types. They are completely interchangeable from the perspective of a static type checker, but they have very different runtime representations. One is implemented as an instance of `typing._LiteralGenericAlias`, and the other is a `typing._UnionGenericAlias`.
@@ -711,12 +735,38 @@ assigned to variables and manipulated like any other data in a program:
``TypeExpr[]`` is how you spell the type of a variable containing a
type annotation object describing a type.

``TypeExpr[]`` is similar to ``type[]``, but ``type[]`` can only used to
``TypeExpr[]`` is similar to ``type[]``, but ``type[]`` can only
I understand the point you're trying to make here, but it's a little misleading because `type` (when used in a type expression) works with many of the examples in this list including `type[list[int]]` or `type[int | None]`.
spell simple **class objects** like ``int``, ``str``, ``list``, or ``MyClass``.
``TypeExpr[]`` by contrast can additionally spell more complex types,
including those with brackets (like ``list[int]``) or pipes (like ``int | None``),
and including special types like ``Any``, ``LiteralString``, or ``Never``.

A ``TypeExpr`` variable looks similar to a ``TypeAlias`` definition, but
What is a "`TypeExpr` variable"? I think what you mean is that a variable (or parameter) can be statically evaluated to have a type of `TypeExpr[T]`?

By "`TypeAlias` definition", are you talking about a statement of the form `<name>: TypeAlias = <type expression>`, as defined in PEP 613? I'm not sure how this is related to `TypeExpr`.

Perhaps you're talking about PEP 484 type aliases that have the syntactic form `<name> = <expression>` and numerous (undocumented) semantic rules and heuristics that distinguish it from a regular variable assignment? If that's the case, then I agree there's potential overlap with the `TypeExpr` concept. In particular, I was thinking that we could leverage the definitions in this PEP to (at last!) formalize the rules for PEP 484 type aliases. I'm now less sure of this given some of the other limitations we've needed to add to this PEP, such as the requirement that certain ambiguous forms must use an explicit `TypeExpr` constructor call.
spell simple **class objects** like ``int``, ``str``, ``list``, or ``MyClass``.
``TypeExpr[]`` by contrast can additionally spell more complex types,
including those with brackets (like ``list[int]``) or pipes (like ``int | None``),
and including special types like ``Any``, ``LiteralString``, or ``Never``.

A ``TypeExpr`` variable looks similar to a ``TypeAlias`` definition, but
can only be used where a dynamic value is expected.
I don't understand what "only be used where a dynamic value is expected" means. What is a "dynamic value" in this context?
A ``TypeExpr`` variable looks similar to a ``TypeAlias`` definition, but
can only be used where a dynamic value is expected.
``TypeAlias`` (and the ``type`` statement) by contrast define a name that can
be used where a fixed type is expected:
The term "fixed type" isn't defined anywhere. I think what you mean is that a type alias can be used in a type expression whereas variables cannot?
::

    maybe_float: TypeExpr = float | None
Ah, I think I now understand what you were trying to say above. This is all a (confusing) way to reiterate that a variable cannot be used in a type expression. That rule is already spelled out clearly in the "Type Annotations" section of the spec, so I don't think it needs to be repeated in this PEP.
📚 Documentation preview 📚: https://pep-previews--3856.org.readthedocs.build/