Feat/builtin models types #2590

Open
wants to merge 16 commits into
base: master

Conversation

FallenDeity

@FallenDeity FallenDeity commented Mar 30, 2025

I have made things!

This PR aims to add type hints to built-in model fields, for example the models in contrib, admin, auth, etc.

Base generic fields are modified to use the default= parameter from PEP 696, which allows the following behaviour for models:

_ST_IntegerField = TypeVar("_ST_IntegerField", default=float | int | str | Combinable)
_GT_IntegerField = TypeVar("_GT_IntegerField", default=int)

class IntegerField(Field[_ST_IntegerField, _GT_IntegerField]):
    _pyi_private_set_type: float | int | str | Combinable
    _pyi_private_get_type: int
    _pyi_lookup_exact_type: str | int
class Redirect(models.Model):
    id: models.AutoField
    pk: models.AutoField
    site: models.ForeignKey[Site | Combinable, Site]
    site_id: int
    old_path: models.CharField
    new_path: models.CharField

It eliminates the need to add generic arguments explicitly at each step whenever defining models, and ensures all the models being used internally have type hints.

This PR is not complete yet. I am not entirely clear about the whole process, so here are a few points I had doubts about and wanted feedback on before proceeding:

  1. Is there any particular reason the models for contrib/gis/db/backends are typed Any and not included in allowlist_todo.txt? They seem to have types here, so should I type out the model fields as such?
    edit: models typed in line with source
class OracleGeometryColumns(models.Model):
    table_name: models.CharField
    column_name: models.CharField
    srid: models.IntegerField
    objects: ClassVar[Manager[Self]]
  2. I noticed that most of the built-in model fields and methods are listed in allowlist_todo.txt. How should that be handled? Do I write tests under the assert_type and typecheck/contrib folders for these models and move them to allowlist.txt manually once they are type hinted and tested?
    edit: Resolved, created a category

  3. Final question: there is one case where the current model field typing system might need explicit type hints from the user, namely when fields become optional with null=True. With the current system I don't think there is a way to infer types from the field parameters passed. Here is an example:

edit: Resolved using __new__ overloads

text: models.CharField # get or return type is str, as expected
# But when params like blank=True, null=True are passed, there is no way to auto-infer that and change _GT, _ST,
# i.e. without explicit type hints or generic params it would still show `str` if we were to do `reveal_type`
# Redundant verbose example
text_nullable = models.CharField[Optional[Union[str, int, Combinable]], Optional[str]](max_length=100, null=True)
# A typical user might instead have to do this
text_nullable = models.CharField[str | None, str | None](max_length=100, null=True)

This is fixable if we modify the django-mypy-plugin code, but I'm not sure that's the best option. Here is how one might go about it:

def set_descriptor_types_for_field(
    ctx: FunctionContext, *, is_set_nullable: bool = False, is_get_nullable: bool = False
) -> Instance:
    default_return_type = cast(Instance, ctx.default_return_type)

    # Check for null_expr and primary key stuff
    ...

+  # We get expected nullable types here
    set_type, get_type = get_field_descriptor_types(
        default_return_type.type,
        is_set_nullable=is_set_nullable or is_nullable,
        is_get_nullable=is_get_nullable or is_nullable,
    )

    # reconcile set and get types with the base field class
    base_field_type = next(base for base in default_return_type.type.mro if base.fullname == fullnames.FIELD_FULLNAME)
    mapped_instance = map_instance_to_supertype(default_return_type, base_field_type)
+  # But the mapped types give us the generic types declared on our fields, without None
    mapped_set_type, mapped_get_type = tuple(get_proper_type(arg) for arg in mapped_instance.args)

    # bail if either mapped_set_type or mapped_get_type have type Never
    if not (isinstance(mapped_set_type, UninhabitedType) or isinstance(mapped_get_type, UninhabitedType)):
        # always replace set_type and get_type with (non-Any) mapped types
        set_type = helpers.convert_any_to_type(mapped_set_type, set_type)
        get_type = get_proper_type(helpers.convert_any_to_type(mapped_get_type, get_type))

-       # the get_type must be optional if the field is nullable
+      # Instead, if the nullable expression is set to True, we make our mapped types optional
        if (is_get_nullable or is_nullable) and not (
            isinstance(get_type, NoneType) or helpers.is_optional(get_type) or isinstance(get_type, AnyType)
        ):
+            get_type = helpers.make_optional_type(get_type)
+           set_type = helpers.make_optional_type(set_type)
-            ctx.api.fail(
-               f"{default_return_type.type.name} is nullable but its generic get type parameter is not optional",
-                ctx.context,
-           )

    return helpers.reparametrize_instance(default_return_type, [set_type, get_type])

This was the only way I could think of going about it, and it requires the user to have the plugin installed.

Related issues

TODO

  • Add more tests
  • Work on the contrib/gis/backends models (most probably)
  • Add overloads per field for __new__ with null

@sobolevn sobolevn requested review from adamchainz and sobolevn March 30, 2025 08:57
Member

@sobolevn sobolevn left a comment

Thanks, this is not a full review.

@FallenDeity
Author

FallenDeity commented Mar 30, 2025

Resolved most of the comments so far

Additionally, I wanted to ask: should I go ahead and also add types for the contrib/db/gis/backend models, with respect to here?

Edit: Added types and tests

@FallenDeity
Author

FallenDeity commented Apr 11, 2025

@sobolevn I think we should be all done. Should we also update the django_mypy_plugin to handle null=True?

there is one case where the current model field typing system might need explicit type hints from the user, namely when fields become optional with null=True. With the current system I don't think there is a way to infer types from the field parameters passed. Here is an example:

text: models.CharField # get or return type is str, as expected
# But when a param like null=True is passed, there is no way to auto-infer that and change _GT, _ST,
# i.e. without explicit type hints or generic params it would still show `str` if we were to do `reveal_type`
# Type hinting for null=True would require manual type hints like the following:
# Redundant verbose example
text_nullable = models.CharField[Optional[Union[str, int, Combinable]], Optional[str]](max_length=100, null=True)
# A typical user might instead have to do this
text_nullable = models.CharField[str | None, str | None](max_length=100, null=True)

@sobolevn
Member

null=True should already work 🤔

@sobolevn sobolevn requested review from intgr and flaeppe April 11, 2025 18:39
@FallenDeity
Author

FallenDeity commented Apr 11, 2025

null=True should already work 🤔

That's because of the extension too. Here is how the extension handles nulls: first it checks whether the expression is nullable, then based on that it fetches set_type and get_type, making them nullable.

set_type, get_type = get_field_descriptor_types(
        default_return_type.type,
        is_set_nullable=is_set_nullable or is_nullable,
        is_get_nullable=is_get_nullable or is_nullable,
    )

After that it fetches the mapped types, which are the types given when generic params are provided, and replaces the default types obtained above with the mapped types:

# bail if either mapped_set_type or mapped_get_type have type Never
    if not (isinstance(mapped_set_type, UninhabitedType) or isinstance(mapped_get_type, UninhabitedType)):
        # always replace set_type and get_type with (non-Any) mapped types
        set_type = helpers.convert_any_to_type(mapped_set_type, set_type)
        get_type = get_proper_type(helpers.convert_any_to_type(mapped_get_type, get_type))

The reason it works currently is that the mapped types are Never, so just the default get_type and set_type are returned.

from django.db import models
from datetime import date as Date
from typing_extensions import reveal_type

class MyUser(models.Model):
    date = models.DateField(null=True)
    my_date = models.DateField[Date, Date](null=True)

u = MyUser()
reveal_type(u.date) # unknown for pyright/pylance, date | None for mypy
reveal_type(u.my_date) # type date for both
mypy .\test.py
null_expr=<mypy.nodes.NameExpr object at 0x00000115725D7CE0>, is_nullable=True
set_type=Union[builtins.str, datetime.date, django.db.models.expressions.Combinable, None], get_type=Union[datetime.date, None]
mapped_set_type=Never, mapped_get_type=Never
null_expr=<mypy.nodes.NameExpr object at 0x00000115725D7F60>, is_nullable=True
set_type=Union[builtins.str, datetime.date, django.db.models.expressions.Combinable, None], get_type=Union[datetime.date, None]
mapped_set_type=datetime.date, mapped_get_type=datetime.date
set_type=datetime.date, get_type=datetime.date
test.py:7: error: DateField is nullable but its generic get type parameter is not optional  [misc]
test.py:11: note: Revealed type is "Union[datetime.date, None]"
test.py:12: note: Revealed type is "datetime.date"

As you can see, if types are mapped, the default expression types are overwritten. That is what's happening here, since all our fields are now inherently mapped with generic type params.

So to fix this, I think we can just make the mapped types optional too if the field is nullable.
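The flow described above can be sketched as a toy model in plain Python. This is only an illustration of the reasoning, not the plugin's code: `NEVER`, `is_optional`, `resolve_get_type` and the `patch_mapped` flag are hypothetical names, and real mypy types are replaced by ordinary `typing` objects.

```python
from datetime import date
from typing import Any, Optional, get_args

# Hypothetical marker standing in for mypy's UninhabitedType ("Never").
NEVER: Any = object()


def is_optional(tp: Any) -> bool:
    """Return True if None is one of the members of the (union) type."""
    return type(None) in get_args(tp)


def resolve_get_type(default_get: Any, mapped_get: Any, *, nullable: bool, patch_mapped: bool) -> Any:
    # Step 1: start from the field's default get type, widened when null=True.
    get_type = Optional[default_get] if nullable else default_get
    # Step 2: no generic parameters were mapped, so the default stands.
    if mapped_get is NEVER:
        return get_type
    # Step 3: mapped generic parameters take priority over the default.
    get_type = mapped_get
    if nullable and not is_optional(get_type):
        if not patch_mapped:
            # Mirrors the error message shown in the mypy output above.
            raise TypeError("field is nullable but its generic get type parameter is not optional")
        # Proposed change: widen the mapped type instead of failing.
        get_type = Optional[get_type]
    return get_type


# DateField(null=True): nothing mapped, the nullable default is used.
unmapped = resolve_get_type(date, NEVER, nullable=True, patch_mapped=False)
# DateField[Date, Date](null=True) with the proposed widening enabled.
widened = resolve_get_type(date, date, nullable=True, patch_mapped=True)
```

With `patch_mapped=False` the second call would raise instead, which corresponds to the `[misc]` error in the log above.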

@FallenDeity FallenDeity requested a review from sobolevn April 11, 2025 19:59
@FallenDeity
Author

@sobolevn any thoughts on this? Should we have the plugin enforce null for us if null=True is passed and make the mapped types nullable? I don't see any drawbacks to this approach.

@sobolevn
Member

should we have the plugin enforce null for us if null=True is passed and make the mapped types nullable?

We 100% should, but I am pretty sure that it already works. If it does not, please open a new issue.

@FallenDeity
Author

FallenDeity commented Apr 17, 2025

should we have the plugin enforce null for us if null=True is passed and make the mapped types nullable?

We 100% should, but I am pretty sure that it already works. If it does not, please open a new issue.

It works with the current field types because the generic parameters are unknown. As soon as you add generic type parameters, they overwrite the nullable expression, which is what's happening here since each field now has specified generic params such as _ST_IntegerField.

I'll open a new issue regarding this and reference it here

@sobolevn
Member

Sorry, I missed this message #2590 (comment)

Reading!

@sobolevn
Member

Oh, now I see: my_date = models.DateField[Date, Date](null=True) is the corner case that does not quite work. I think that we should not change Date here, because it is explicit. But we can raise an error in our plugin saying that | None is expected if null=True is specified.

@FallenDeity
Author

FallenDeity commented Apr 17, 2025

Oh, now I see: my_date = models.DateField[Date, Date](null=True) is the corner case that does not quite work. I think that we should not change Date here, because it is explicit. But we can raise an error in our plugin saying that | None is expected if null=True is specified.

But currently our internal fields are all typed with default types, meaning our stubs already define IntegerField as, for example, IntegerField[int, int]. In such a case, if someone writes IntegerField(null=True), we need to make it nullable (which currently doesn't happen, for the reasons mentioned above). So as a solution, either

  • the user adds explicit type hints wherever null is used, e.g. IntegerField[int | None, int | None](null=True), in their own code (user side)
  • or the plugin handles it and makes the field nullable

The only corner case in solution 2 (preferred, to reduce redundancy) is that if the user defines explicit type hints, our plugin may override them and make them nullable when null=True is defined (in my opinion that's expected behaviour). Currently I don't think there is a way to tell whether the generic type params come from the django-stubs types or from the user's codebase, so raising an error might not work unless we can determine where these explicit type hints come from.

Maybe we can use inspect.getfile() or .__module__ on the generic type params, though I'm not sure it's reliable or a good way.

@FallenDeity
Author

One thing I noticed is that if the generic types are user defined, the following mypyc attrs are populated in the plugin with information such as line, column, etc. If they are instead defined on our side in the stubs' __init__.pyi, these fields are -1. This can be a way to identify whether generic params were explicitly defined by the user: if so, we leave them alone; if they are defined in the stubs, we make them nullable when null=True.

Not sure how foolproof or reliable this method is. Any thoughts @sobolevn?

https://pastebin.com/7MMsJTMs Here is an example where the types were user defined; you can see that the line number and column are populated, whereas in stubs both would just show -1.

  • user mapped types
from django.db import models
from typing_extensions import reveal_type


class TestModel(models.Model):
    my_field = models.CharField[str, str](null=True)


t = TestModel()
reveal_type(t.my_field)
  • stub mapped types
from django.db import models
from typing_extensions import reveal_type


class TestModel(models.Model):
    my_field = models.CharField(null=True)


t = TestModel()
reveal_type(t.my_field)

The difference between the two kinds of type definition can be seen here: https://www.diffchecker.com/0ujD9s30/
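To make the heuristic concrete, here is a minimal sketch of the decision it implies. It rests entirely on the observation above that stub-provided type arguments carry -1 for line and column. `TypeArg`, `is_user_supplied` and `resolve_get_type` are hypothetical stand-ins, not mypy or plugin APIs, and types are modelled as plain strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypeArg:
    """Hypothetical stand-in for a generic type argument seen by the plugin."""

    name: str
    line: int = -1    # assumed to stay -1 when the argument comes from the stubs
    column: int = -1  # same assumption as for `line`


def is_user_supplied(arg: TypeArg) -> bool:
    # Assumption: only arguments written in user code have a real source position.
    return arg.line >= 0 and arg.column >= 0


def resolve_get_type(arg: TypeArg, *, nullable: bool) -> str:
    if not nullable or arg.name.endswith("| None"):
        return arg.name
    if is_user_supplied(arg):
        # Explicit user annotation: leave it alone and report the mismatch.
        raise TypeError(f"null=True but explicit get type {arg.name!r} is not optional")
    # Stub default: widen it on the user's behalf.
    return f"{arg.name} | None"


from_stub = resolve_get_type(TypeArg("str"), nullable=True)
```

Whether source positions are a dependable signal is exactly the open question raised in this comment, so the sketch should be read as a statement of the idea rather than a recommendation.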

@FallenDeity
Author

@sobolevn any ideas? What are your thoughts on the different options we have here? I feel like using some meta properties from mypyc might be too hacky a solution. Alternatively, we could just make them nullable regardless of where the generic params come from, and users can force a typing.cast if they are sure the value is not nullable by virtue of their code, since as far as Django internals go the returned model field typing is accurate if null=True.

@sobolevn
Member

Couple of thoughts:

  • This should be an error -> models.CharField[str, str](null=True)
    Just like this case:
class A[T]:
    def __init__(self, x: T) -> None:
        self.x = x
        
A[int]('a')

https://mypy-play.net/?mypy=latest&python=3.12&gist=ccb14fd0cc0bf9711ba4d513c16adf5a

  • Checking for this in a plugin looks doable. We can start with this in a separate PR.

  • We should not modify type params in the plugin, ever. It is hard, pretty unstable, error-prone.

@FallenDeity
Author

FallenDeity commented Apr 30, 2025

@sobolevn I think there is a misunderstanding here. Check this mypy example for context:
https://mypy-play.net/?mypy=latest&python=3.12&gist=420f15b0129632baeaeae2225ce8299b

What I am trying to say is that, as of now, the mypy plugin uses the fields _pyi_private_get_type and _pyi_private_set_type to infer types for fields, and the plugin currently makes them nullable if null=True. Check this:

descriptor_type = make_optional_type(descriptor_type)

This works well because the mapped types for generics are currently always Unknown or Never, since we didn't provide the generic params before. If the user does provide them in their code, the types are remapped to the mapped get/set types, since those take priority.

if not (isinstance(mapped_set_type, UninhabitedType) or isinstance(mapped_get_type, UninhabitedType)):
    # always replace set_type and get_type with (non-Any) mapped types
    set_type = helpers.convert_any_to_type(mapped_set_type, set_type)
    get_type = get_proper_type(helpers.convert_any_to_type(mapped_get_type, get_type))

Now I completely agree with you that the cases you mentioned above should raise errors, but the dilemma right now is that we are no longer using the default get/set types (i.e. the ones inferred from _pyi_private_get_type, for example):

set_type, get_type = get_field_descriptor_types(
Instead, we already have default mapped types from the TypeVar's default= parameter, so the issue arises in the following cases:

  • What happens if null=True and the user has not supplied any types, as in the following? In this case the types are already provided internally from _ST_CharField and _GT_CharField, which are defined as follows:
    _ST_CharField = TypeVar("_ST_CharField", default=str | int | Combinable)
    _GT_CharField = TypeVar("_GT_CharField", default=str)
class A(models.Model):
    text_nullable = models.CharField(null=True)
    
a = A()
# So now this is just str instead of str | None
reveal_type(a.text_nullable)

The above is the main issue we are encountering: we need to make our own built-in mapped types nullable somehow via the plugin, which in my opinion is a valid use case for the plugin. As a side effect, though, this decision can give rise to the following edge case:

  • Say, hypothetically, we make the mapped types nullable as well based on null=, which the plugin already does for the default types inferred from _pyi_private_get/set_type. If the user supplies null=True and maps their own types, we risk making their mapped types nullable as well, since I am unable to find a reliable way of telling whether the types come from the user or from the stubs (i.e. _ST/GT_CharField).
class A(models.Model):
    # No reliable way to tell if the mapped types are from user code here, or from the stubs .pyi file
    text_nullable = models.CharField[str, str](null=True)  # wrong typehint we should raise an error
   
a = A()
reveal_type(a.text_nullable)

I hope this clears it up a bit. This is kind of a chicken-and-egg situation 😅

@sobolevn
Member

It feels like we should not use default= from TypeVar, because it does not really work in this case, with null=True modifying the default.

@FallenDeity
Author

It feels like we should not use default= from TypeVar, because it does not really work in this case, with null=True modifying the default.

This isn't really a default= issue either. The mapped types would be present no matter which method we use to pass generic arguments to the built-in model fields, even if we don't use default= and manually type out the fields. This is more of a plugin issue: the plugin doesn't expect mapped generic types from the stub files, and uses internal logic to extract types from _pyi_private_get/set_type and treat them as the defaults.
For pyright/pylance to pick up the types, we would still need to map the fields to their generic arguments at some point, and the plugin does not mesh well with that in this particular edge case.

@intgr @flaeppe you folks have had some prior experience with this in #2214 any ideas on how to approach this?

@FallenDeity
Author

FallenDeity commented Apr 30, 2025

One solution can be to use typing.overload and write overloads for Field's __new__ method, like the following. This also eliminates any plugin logic for handling nulls.

from __future__ import annotations

import typing as t


_ST = t.TypeVar("_ST")
_GT = t.TypeVar("_GT")


class Field(t.Generic[_ST, _GT]):
    @t.overload
    def __new__(cls, *, null: t.Literal[True]) -> Field[_ST | None, _GT | None]:
        ...

    @t.overload
    def __new__(cls, *, null: t.Literal[False] = False) -> Field[_ST, _GT]:
        ...

    def __new__(cls, *, null: bool = False) -> Field[_ST, _GT] | Field[_ST | None, _GT | None]:
        return super().__new__(cls)
    
    def __init__(self, *, null: bool = False) -> None:
        super().__init__()


_ST_CharField = t.TypeVar("_ST_CharField", default=str | int)
_GT_CharField = t.TypeVar("_GT_CharField", default=str)


class CharField(Field[_ST_CharField, _GT_CharField]):
    ...
    

x = CharField(null=False)
reveal_type(x)
y = CharField(null=True)
reveal_type(y)
z = CharField()
reveal_type(z)

edit: This seems to work with pyright but not mypy (python/mypy#15220 😭, also python/mypy#17251). If we change the order of the methods and redefine the overloads locally for each field individually, mypy seems to work (this is due to mypy using the MRO to pick between __init__ and __new__ for return types 😭).

Overall, after a bunch of edits, this works on both mypy and pyright: https://mypy-play.net/?mypy=latest&python=3.12&gist=1f9176a8c2bef850b4e9a5d9af2e8211 The only downside is that the __new__ overloads need to be defined for every field individually, and any existing __init__ methods need to be switched to __new__, since mypy prefers __init__ over __new__ for the return type in case of a tie, which doesn't make a lot of sense:
https://github.com/python/mypy/blob/c724a6a806655f94d0c705a7121e3d671eced96d/mypy/typeops.py#L148-L149
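As a rough illustration of the per-field arrangement described above, the sketch below keeps `__init__` on the base class only and declares the `null` overloads on the subclass's `__new__`, so the subclass's `__new__` comes first in the MRO. The classes are simplified stand-ins written for this example, not the actual django-stubs definitions, and the return types are spelled out explicitly instead of relying on `default=` TypeVars.

```python
from __future__ import annotations

from typing import Any, Generic, Literal, Optional, TypeVar, overload

_ST = TypeVar("_ST")
_GT = TypeVar("_GT")


class Field(Generic[_ST, _GT]):
    # The base class keeps a plain __init__; it is later in the MRO than
    # the __new__ defined on the subclass below.
    def __init__(self, *, null: bool = False) -> None:
        self.null = null


class CharField(Field[_ST, _GT]):
    # Overloads repeated on the concrete field, one per nullability case.
    @overload
    def __new__(cls, *, null: Literal[False] = False) -> CharField[str, str]: ...

    @overload
    def __new__(cls, *, null: Literal[True]) -> CharField[Optional[str], Optional[str]]: ...

    def __new__(cls, *, null: bool = False) -> CharField[Any, Any]:
        # Runtime behaviour is unchanged: allocate the instance and let
        # Field.__init__ receive the same keyword arguments.
        return super().__new__(cls)


plain = CharField()
nullable = CharField(null=True)
```

At runtime both calls behave like ordinary construction. The intent is that a type checker reads `nullable` as the `Optional[str]` variant and `plain` as the `str` variant; the exact inference should be confirmed with `reveal_type` on the checker versions in use.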

@FallenDeity
Author

@sobolevn I migrated the fields to overload __new__ based on null, and I think all the objectives of the PR as mentioned initially are completed. With this PR, all models and fields are typed out of the box for pyright/pylance and mypy with just django-stubs installed. The plugin is no longer necessary for field types. Please review whenever you are free 😅

Also thanks for your incredible patience, much appreciated 😄

Contributor

@UnknownPlatypus UnknownPlatypus left a comment

I had a look because this is really exciting, and left a few minor comments on things I noticed.

@huynguyengl99

Hi @FallenDeity, I suggest defining the __new__ overloads only for the fields that define or override __init__, like Field, DecimalField, CharField, etc. (not IntegerField, SmallIntegerField, etc., because they do not override __init__). For example, here is what the definition of Field's __new__ would look like (we keep __init__ as it is):

@overload
def __new__(
    cls,
    null: Literal[False] = False,
    # ... other params
) -> Self: ...

@overload
def __new__(
    cls,
    null: Literal[True] = True,
    # ... other params
) -> Field[_ST | None, _GT | None]: ...

If we do that, IntegerField and the other fields that do not override __init__ do not need to override __new__ again and again, so the code will be more compact.

The problem I see is that fields which inherit and define extra info might not work, because we override the return type as Field. I tried to use a generic Type[_F] for cls and return _F[_ST | None, _GT | None], but unfortunately that does not work. So in that case, users need to define their own __new__ again to return their class.

Also I tried to use TypedDict to define params, like:

class BaseFieldParams(TypedDict, total=False):
    verbose_name: _StrOrPromise | None
    name: str | None
    primary_key: bool
    max_length: int | None

class NonNullableParams(BaseFieldParams):
    null: Literal[False]

class NullableParams(BaseFieldParams):
    null: Literal[True]

and use like:

def __init__(self, **kwargs: Unpack[BaseFieldParams]) -> None: ...

@overload
def __new__(cls, **kwargs: Unpack[NonNullableParams]) -> Self: ...

@overload
def __new__(cls: type[_F], **kwargs: Unpack[NullableParams]) -> Field[_ST | None, _GT | None]: ...

But that does not work either. So I think the only working option for now is to define __new__ for all Field classes that declare __init__. How do you feel about this suggestion?

@FallenDeity
Author

@huynguyengl99 What you suggested is what I would have preferred myself, to avoid repeating the __new__ signatures, but unfortunately mypy requires explicit repetition, i.e. overloaded signatures are not inherited.

@huynguyengl99

huynguyengl99 commented May 13, 2025

Hi @FallenDeity, I'm not sure I understand you correctly, but I think we only need to redefine __new__ when we redefine __init__. For normal inheritance, like IntegerField inheriting from Field, we don't redefine __init__, and it should still work fine.

For example, this case will work following the mypy example from the link, due to not reimplementing fun:

from typing import overload
class Foo:
    @overload
    def fun(self, s: str) -> str:
        ...
    @overload
    def fun(self, i: int) -> int:
        ...
    def fun(self, x):
        ...
class Bar(Foo):
    ...
Bar().fun(0.1)

#  error: No overload variant of "fun" of "Foo" matches argument type "float"

So, in our case, for a field that does not reimplement __init__, like IntegerField:

class IntegerField(Field[_ST, _GT]):
    _pyi_private_set_type: float | int | str | Combinable
    _pyi_private_get_type: int
    _pyi_lookup_exact_type: str | int

It should work normally because we don't reimplement the __init__, so it will inherit all the overloads of the base class, aka the Field.

For fields like DecimalField, which redefine __init__, we need to redefine all the __new__ overloads.

Is that correct, or am I misunderstanding your point?

@FallenDeity
Author

@huynguyengl99 there are a few issues with what you are suggesting if I am interpreting it correctly, you are suggesting something similar to this

The second limitation is that if we define all the __new__ overloads in the base Field class, the return types of all subclassed fields would now be Field[_ST_WhateverField, _GT_WhateverField]. Self was introduced for this kind of case, but Self cannot take type arguments, and the recommendation is to go with explicit types: https://peps.python.org/pep-0673/#use-in-generic-classes

So overall I think explicit annotations are probably the best move, given the similar scenarios mentioned in the typing specs and related issues. In a related scenario, django-types, a fork of django-stubs, also seems to be doing something very similar to support built-in model type hints: https://github.com/sbdchd/django-types/blob/main/django-stubs/db/models/fields/__init__.pyi

lemme know if I mentioned something inaccurate anywhere 😅

@huynguyengl99

Hi @FallenDeity, you are right. It's my fault because I only tested using VSCode/Pyright instead of mypy. It's quite interesting that Pyright can detect variable y as a nullable field.

However, in my opinion, re-annotating the __new__ method for all fields, which would significantly increase the size of the type file (__init__.pyi), is not a good approach. So should we find a better solution for this?

@FallenDeity
Copy link
Author

FallenDeity commented May 14, 2025

I doubt a 1-2 kB increase is going to affect the type checker (pylance/pyright) by any noticeable amount. Even so, the pyi files are indexed at the start when opening the workspace, so it's a one-time, negligible cost. If you do stumble upon a better solution or alternative, we could try that out too, but unfortunately I don't think any other method keeps both mypy and pyright compatibility at the same time. There aren't many advantages to an alternative other than reducing the amount of repeated boilerplate code (if that's possible).
