PEP 747: Fix rules related to UnionType (T1 | T2). Contrast TypeExpr with TypeAlias. Apply other feedback. #3856
Changes from all commits
34a7b65
1cb02d0
46842ba
7b7e45e
515ecb1
26204a7
71b5c1a
c20c9b1
463a133
7863e2f
af8ea3d
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -254,9 +254,8 @@ A ``TypeExpr`` value represents a :ref:`type expression <typing:type-expression> | |
such as ``str | None``, ``dict[str, int]``, or ``MyTypedDict``. | ||
A ``TypeExpr`` type is written as | ||
``TypeExpr[T]`` where ``T`` is a type or a type variable. It can also be | ||
written without brackets as just ``TypeExpr``, in which case a type | ||
checker should apply its usual type inference mechanisms to determine | ||
the type of its argument, possibly ``Any``. | ||
written without brackets as just ``TypeExpr``, which is treated the same as | ||
``TypeExpr[Any]``. | ||
|
||
|
||
Using TypeExprs | ||
|
@@ -278,7 +277,6 @@ or a variable type: | |
:: | ||
|
||
STR_TYPE: TypeExpr = str # variable type | ||
assert_type(STR_TYPE, TypeExpr[str]) | ||
|
||
Note however that an *unannotated* variable assigned a type expression literal | ||
will not be inferred to be of ``TypeExpr`` type by type checkers because PEP | ||
|
@@ -352,7 +350,7 @@ not spell a type are not ``TypeExpr`` values. | |
:: | ||
|
||
OPTIONAL_INT_TYPE: TypeExpr = TypeExpr[int | None] # OK | ||
assert isassignable(Optional[int], OPTIONAL_INT_TYPE) | ||
assert isassignable(int | None, OPTIONAL_INT_TYPE) | ||
|
||
.. _non_universal_typeexpr: | ||
|
||
|
@@ -442,14 +440,29 @@ so must be disambiguated based on its argument type: | |
- As a value expression, ``Annotated[x, ...]`` has type ``object`` | ||
if ``x`` has a type that is not ``type[C]`` or ``TypeExpr[T]``. | ||
|
||
**Union**: The type expression ``T1 | T2`` is ambiguous with the value ``int1 | int2``, | ||
so must be disambiguated based on its argument type: | ||
**Union**: The type expression ``T1 | T2`` is ambiguous with | ||
the value ``int1 | int2``, ``set1 | set2``, ``dict1 | dict2``, and more, | ||
so must be disambiguated based on its argument types: | ||
|
||
- As a value expression, ``x | y`` has type equal to the return type of ``type(x).__or__`` | ||
if ``type(x)`` overrides the ``__or__`` method. | ||
|
||
- When ``x`` has type ``builtins.type``, ``types.GenericAlias``, or the | ||
internal type of a typing special form, ``type(x).__or__`` has a return type | ||
in the format ``TypeExpr[T1 | T2]``. | ||
|
||
- As a value expression, ``x | y`` has type equal to the return type of ``type(y).__ror__`` | ||
if ``type(y)`` overrides the ``__ror__`` method. | ||
|
||
- When ``y`` has type ``builtins.type``, ``types.GenericAlias``, or the | ||
internal type of a typing special form, ``type(y).__ror__`` has a return type | ||
in the format ``TypeExpr[T1 | T2]``. | ||
|
||
- As a value expression, ``x | y`` has type ``TypeExpr[x | y]`` | ||
if ``x`` has type ``TypeExpr[t1]`` (or ``type[t1]``) | ||
and ``y`` has type ``TypeExpr[t2]`` (or ``type[t2]``). | ||
- As a value expression, ``x | y`` has type ``int`` | ||
if ``x`` has type ``int`` and ``y`` has type ``int`` | ||
- As a value expression, ``x | y`` has type ``UnionType`` | ||
Review comment: This rule says "in all other situations". What other situations are not covered in the above rules? I think they cover everything, right? Can you give an example of types
Reply: There are no other situations that I am aware of.
||
in all other situations. | ||
|
||
- This rule is intended to be consistent with the preexisting fallback rule | ||
used by static type checkers. | ||
|
||
The **stringified type expression** ``"T"`` is ambiguous with both | ||
the stringified annotation expression ``"T"`` | ||
|
@@ -466,71 +479,24 @@ New kinds of type expressions that are introduced should define how they | |
will be recognized in a value expression context. | ||
|
||
|
||
Implicit Annotation Expression Values | ||
''''''''''''''''''''''''''''''''''''' | ||
|
||
Although this PEP is mostly concerned with *type expressions* rather than | ||
*annotation expressions*, it is straightforward to extend the rules for | ||
:ref:`recognizing type expressions <implicit_typeexpr_values>` | ||
to similar rules for recognizing annotation expressions, | ||
so this PEP takes the opportunity to define those rules as well: | ||
|
||
The following **unparameterized annotation expressions** can be recognized unambiguously: | ||
|
||
- As a value expression, ``X`` has type ``object``, | ||
for each of the following values of X: | ||
|
||
- ``<TypeAlias>`` | ||
|
||
The following **parameterized annotation expressions** can be recognized unambiguously: | ||
|
||
- As a value expression, ``X`` has type ``object``, | ||
for each of the following values of X: | ||
|
||
- ``<Required> '[' ... ']'`` | ||
- ``<NotRequired> '[' ... ']'`` | ||
- ``<ReadOnly> '[' ... ']'`` | ||
- ``<ClassVar> '[' ... ']'`` | ||
- ``<Final> '[' ... ']'`` | ||
- ``<InitVar> '[' ... ']'`` | ||
- ``<Unpack> '[' ... ']'`` | ||
|
||
**Annotated**: The annotation expression ``Annotated[...]`` is ambiguous with | ||
the type expression ``Annotated[...]``, | ||
so must be :ref:`disambiguated based on its argument type <recognizing_annotated>`. | ||
|
||
The following **syntactic annotation expressions** | ||
cannot be recognized in a value expression context at all: | ||
|
||
- ``'*' unpackable`` | ||
- ``name '.' 'args'`` (where ``name`` must be an in-scope ParamSpec) | ||
- ``name '.' 'kwargs'`` (where ``name`` must be an in-scope ParamSpec) | ||
|
||
The **stringified annotation expression** ``"T"`` is ambiguous with both | ||
the stringified type expression ``"T"`` | ||
and the string literal ``"T"``, and | ||
cannot be recognized in a value expression context at all: | ||
|
||
- As a value expression, ``"T"`` continues to have type ``Literal["T"]``. | ||
|
||
No other kinds of annotation expressions currently exist. | ||
|
||
New kinds of annotation expressions that are introduced should define how they | ||
will (or will not) be recognized in a value expression context. | ||
|
||
|
||
Literal[] TypeExprs | ||
''''''''''''''''''' | ||
|
||
To simplify static type checking, a ``Literal[...]`` value is *not* | ||
considered assignable to a ``TypeExpr`` variable even if all of its members | ||
spell valid types: | ||
A value of ``Literal[...]`` type is *not* considered assignable to | ||
Review comment: The way this is phrased, it still sounds like you're talking about the expression
Reply: Hopefully the example following the paragraph helps clarify the meaning.
||
a ``TypeExpr`` variable even if all of its members spell valid types because | ||
Review comment: What is a "
Reply: Again: Hopefully the example following the paragraph helps clarify the meaning.
||
dynamic values are not allowed in type expressions: | ||
|
||
:: | ||
|
||
STRS_TYPE_NAME: Literal['str', 'list[str]'] = 'str' | ||
STRS_TYPE: TypeExpr = STRS_TYPE_NAME # ERROR: Literal[] value is not a TypeExpr | ||
|
||
However ``Literal[...]`` itself is still a ``TypeExpr``: | ||
|
||
:: | ||
|
||
DIRECTION_TYPE: TypeExpr[Literal['left', 'right']] = Literal['left', 'right'] # OK | ||
|
||
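The distinction between a value whose static type is ``Literal[...]`` and the ``Literal[...]`` form itself can be illustrated at runtime. This is a sketch only, with lowercase variable names chosen for illustration; it does not use ``TypeExpr``, which is not importable from the standard library:

```python
from typing import Literal, get_args, get_origin

# A plain string value; its static type could be Literal['str', 'list[str]'].
strs_type_name = "str"

# The Literal[...] form itself is an object that spells a type.
direction_type = Literal["left", "right"]

name_is_plain_str = type(strs_type_name) is str   # just a str at runtime
form_origin = get_origin(direction_type)          # the Literal special form
form_members = get_args(direction_type)           # the literal members

print(name_is_plain_str, form_origin, form_members)
```

The first object is a string that happens to contain the name of a type. The second is a type annotation object that can be introspected.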
|
||
Static vs. Runtime Representations of TypeExprs | ||
''''''''''''''''''''''''''''''''''''''''''''''' | ||
|
@@ -569,19 +535,35 @@ Subtyping | |
Whether a ``TypeExpr`` value can be assigned from one variable to another is | ||
determined by the following rules: | ||
|
||
Relationship with type | ||
'''''''''''''''''''''' | ||
|
||
``TypeExpr[]`` is covariant in its argument type, just like ``type[]``: | ||
|
||
- ``TypeExpr[T1]`` is a subtype of ``TypeExpr[T2]`` iff ``T1`` is a | ||
subtype of ``T2``. | ||
- ``type[C1]`` is a subtype of ``TypeExpr[C2]`` iff ``C1`` is a subtype | ||
of ``C2``. | ||
|
||
A plain ``type`` can be assigned to a plain ``TypeExpr`` but not the | ||
other way around: | ||
An unparameterized ``type`` can be assigned to an unparameterized ``TypeExpr`` | ||
but not the other way around: | ||
|
||
- ``type[Any]`` is assignable to ``TypeExpr[Any]``. (But not the | ||
other way around.) | ||
|
||
Relationship with UnionType | ||
''''''''''''''''''''''''''' | ||
|
||
``TypeExpr[U]`` is a subtype of ``UnionType`` iff ``U`` is | ||
Review comment: Hmm. I don't think this is a good solution. There are many ways that union types can be formed in static analysis. For example, they arise from "joins" in code flow. This doesn't necessarily mean that at runtime the value is implemented with Consider the following:

::

    x: TypeExpr[int | str]
    if random() > 0.5:
        x = int
        reveal_type(x)  # type[int]
    else:
        x = str
        reveal_type(x)  # type[str]
    reveal_type(x)  # TypeExpr[int | str]
    print(x)  # Will print either "<class 'int'>" or "<class 'str'>", never "UnionType" or "int | str"
    y: UnionType = x  # This would be problematic!

As another example,
Reply: Interesting. It sounds like I'll have to remove the following rule entirely:
A consequence is that a function parameter annotated as
|
||
the type expression ``X | Y | ...``: | ||
|
||
- ``TypeExpr[X | Y | ...]`` is a subtype of ``UnionType``. | ||
|
||
``UnionType`` is assignable to ``TypeExpr[Any]``. | ||
Review comment: Hmm, this seems problematic because
||
|
||
Relationship with object | ||
'''''''''''''''''''''''' | ||
|
||
``TypeExpr[]`` is a kind of ``object``, just like ``type[]``: | ||
|
||
- ``TypeExpr[T]`` for any ``T`` is a subtype of ``object``. | ||
|
@@ -623,11 +605,33 @@ Changed signatures | |
'''''''''''''''''' | ||
|
||
The following signatures related to type expressions introduce | ||
``TypeExpr`` where previously ``object`` existed: | ||
``TypeExpr`` where previously ``object`` or ``Any`` existed: | ||
|
||
- ``typing.cast`` | ||
- ``typing.assert_type`` | ||
|
||
The following signatures transforming union type expressions introduce | ||
``TypeExpr`` where previously ``UnionType`` existed so that a more-precise | ||
``TypeExpr`` type can be inferred: | ||
|
||
- ``builtins.type[T].__or__`` | ||
Review comment: Type checkers already special-case the
Reply: I was assuming that static type checkers actually read typeshed to figure out what to do with the Thus the typeshed modifications I've proposed here are intended to get the correct effect (such that the value expression
||
|
||
- Old: ``def __or__(self, value: Any, /) -> types.UnionType: ...`` | ||
- New: ``def __or__[T2](self, value: TypeExpr[T2], /) -> TypeExpr[T | T2]: ...`` | ||
|
||
- ``builtins.type[T].__ror__`` | ||
|
||
- Old: ``def __ror__(self, value: Any, /) -> types.UnionType: ...`` | ||
- New: ``def __ror__[T1](self, value: TypeExpr[T1], /) -> TypeExpr[T1 | T]: ...`` | ||
|
||
- ``types.GenericAlias.{__or__,__ror__}`` | ||
Review comment: Type checkers never look at
||
- «the internal type of a typing special form»``.{__or__,__ror__}`` | ||
|
||
However the implementations of those methods continue to return ``UnionType`` | ||
instances at runtime so that runtime ``isinstance`` checks like | ||
``isinstance('42', int | str)`` and ``isinstance(int | str, UnionType)`` | ||
continue to work. | ||
|
||
|
||
Unchanged signatures | ||
'''''''''''''''''''' | ||
|
@@ -662,12 +666,32 @@ not propose those changes now: | |
|
||
- Returns annotation expressions | ||
|
||
The following signatures accepting union type expressions continue | ||
to use ``UnionType``: | ||
|
||
- ``builtins.isinstance`` | ||
Review comment: Type checkers already need to do significant special-casing for
Reply: Sounds good to me. I think Jelle made a similar suggestion elsewhere in the comments for this PR.
||
- ``builtins.issubclass`` | ||
- ``typing.get_origin`` (used in an ``@overload``) | ||
|
||
The following signatures transforming union type expressions continue | ||
to use ``UnionType`` because it is not possible to infer a more-precise | ||
``TypeExpr`` type: | ||
|
||
- ``types.UnionType.{__or__,__ror__}`` | ||
|
||
|
||
Backwards Compatibility | ||
======================= | ||
|
||
Previously the rules for recognizing type expression objects | ||
in a value expression context were not defined, so static type checkers | ||
As a value expression, ``X | Y`` previously had type ``UnionType`` (via :pep:`604`) | ||
but this PEP gives it the more-precise static type ``TypeExpr[X | Y]`` | ||
(a subtype of ``UnionType``) while continuing to return a ``UnionType`` instance at runtime. | ||
Preserving compatibility with ``UnionType`` is important because ``UnionType`` | ||
supports ``isinstance`` checks, unlike ``TypeExpr``, and existing code relies | ||
on being able to perform those checks. | ||
|
||
The rules for recognizing other kinds of type expression objects | ||
in a value expression context were not previously defined, so static type checkers | ||
`varied in what types were assigned <https://discuss.python.org/t/typeform-spelling-for-a-type-annotation-object-at-runtime/51435/34>`_ | ||
to such objects. Existing programs manipulating type expression objects | ||
were already limited in manipulating them as plain ``object`` values, | ||
|
@@ -711,12 +735,38 @@ assigned to variables and manipulated like any other data in a program: | |
``TypeExpr[]`` is how you spell the type of a variable containing a | ||
type annotation object describing a type. | ||
|
||
``TypeExpr[]`` is similar to ``type[]``, but ``type[]`` can only used to | ||
``TypeExpr[]`` is similar to ``type[]``, but ``type[]`` can only | ||
Review comment: I understand the point you're trying to make here, but it's a little misleading because
Reply: Huh. Indeed I see the following passes typechecking in both mypy and pyright:
It's quite surprising to me that lines So maybe
Reply: I'm inclined to think that both the Similar bugs were opened against mypy where
Reply: These are not special forms. From the perspective of a static type checker, the type expression
||
spell simple **class objects** like ``int``, ``str``, ``list``, or ``MyClass``. | ||
``TypeExpr[]`` by contrast can additionally spell more complex types, | ||
including those with brackets (like ``list[int]``) or pipes (like ``int | None``), | ||
and including special types like ``Any``, ``LiteralString``, or ``Never``. | ||
|
||
A ``TypeExpr`` variable looks similar to a ``TypeAlias`` definition, but | ||
Review comment: What is a " By " Perhaps you're talking about PEP 484 type aliases that have the syntactic form
Reply: I'll alter the original line with inline examples like:
Hopefully also the examples following the paragraph help clarify the meaning.
||
can only be used where a dynamic value is expected. | ||
Review comment: I don't understand what "only be used where a dynamic value is expected"? What is a "dynamic value" in this context?
||
``TypeAlias`` (and the ``type`` statement) by contrast define a name that can | ||
be used where a fixed type is expected: | ||
Review comment: The term "fixed type" isn't defined anywhere. I think what you mean is that a type alias can be used in a type expression whereas variables cannot?
||
|
||
- Okay, but discouraged in Python 3.12+: | ||
|
||
:: | ||
|
||
MaybeFloat: TypeAlias = float | None | ||
def sqrt(n: float) -> MaybeFloat: ... | ||
|
||
- Yes: | ||
|
||
:: | ||
|
||
type MaybeFloat = float | None | ||
def sqrt(n: float) -> MaybeFloat: ... | ||
|
||
- No: | ||
|
||
:: | ||
|
||
maybe_float: TypeExpr = float | None | ||
Review comment: Ah, I think I now understand what you were trying to say above. This is all a (confusing) way to reiterate that a variable cannot be used in a type expression. That rule is already spelled out clearly in the "Type Annotations" section of the spec, so I don't think it needs to be repeated in this PEP.
Reply: 2 non-expert commentators have been confused about when to use
Reply: I understand the intent, but I don't think it's achieving the clarity you're seeking. I'm finding it to be very confusing. I think it could be made clear if it were reworked, but I'm not convinced this paragraph is needed. Type alias definitions and the
||
def sqrt(n: float) -> maybe_float: ... # ERROR: Can't use TypeExpr value in a type annotation | ||
|
||
It is uncommon for a programmer to define their *own* function which accepts | ||
a ``TypeExpr`` parameter or returns a ``TypeExpr`` value. Instead it is more common | ||
for a programmer to pass a literal type expression to an *existing* function | ||
|
@@ -891,8 +941,9 @@ The following will be true when | |
`mypy#9773 <https://github.com/python/mypy/issues/9773>`__ is implemented: | ||
|
||
The mypy type checker supports ``TypeExpr`` types. | ||
A reference implementation of the runtime component is provided in the | ||
``typing_extensions`` module. | ||
|
||
A reference implementation of the runtime component is provided in the | ||
``typing_extensions`` module. | ||
|
||
|
||
Rejected Ideas | ||
|
Review comment: I'm not sure how to interpret this statement. The phrase "has type" isn't clear. Are you talking about type equivalence? Assignability?

Also, are these static types or runtime types? I presume it's static types, but if that's the case, then I don't know why ``GenericAlias`` is mentioned because that's not a type a static type checker would ever evaluate. It's a runtime implementation detail.

What if the static type of ``x`` is a union, and some of the subtypes have a custom ``__or__`` override and some do not? Presumably, this formulation assumes that an expansion of the types of ``x`` and ``y`` has already been performed, and ``x`` and ``y`` are not union types?

What if the ``__or__`` method is present, but evaluating it generates a type error (e.g. because ``y``'s type is incompatible with the signature)?

We can try to hammer out all of these details, but this is getting really complex. One option is to say that unions never evaluate to ``TypeExpr`` unless you use a ``TypeExpr`` constructor (i.e. ``TypeExpr(x | y)``). This would also avoid the issue with ``UnionType``.

Reply: I'm trying to say ``x | y`` is evaluated (still) as a normal value expression. If the type of ``x`` is ``type`` or ``GenericAlias`` then the signature in typeshed will be used, which will say that a ``TypeExpr[X | Y]`` is returned, as described in §"Changed signatures".

Use whatever behavior is used now.

Reply: Evaluated by whom? A static type checker? If so, the expression will never be evaluated as ``GenericAlias`` because that's not something a static type checker knows or cares about (nor should it).

Reply: Based on your later comments, I now understand that at least pyright special-cases the ``|`` operator rather than using typeshed's definitions (for ``__or__`` and ``__ror__``) and using the regular rules for calling an overloaded method.

If there's a desire to continue special-casing the ``|`` operator, I might need some help from you to transcribe all the current rules for the ``|`` operator (which I doubt are in any specification) so that I can effectively propose a minimal diff to those rules.