✨: add CanArrayX protocols #32


Open. nstarman wants to merge 30 commits into main from has_x.

Conversation

nstarman
Collaborator

@nstarman nstarman commented Jun 22, 2025

No description provided.

@nstarman
Collaborator Author

OK, this PR is doing too much. Let me pare it down to just a few Protocols and do the rest as a series of follow-ups.

@nstarman nstarman force-pushed the has_x branch 5 times, most recently from 96067a4 to a1be18e June 23, 2025 19:19
@nstarman nstarman marked this pull request as ready for review June 23, 2025 21:59
@nstarman nstarman requested a review from jorenham June 23, 2025 21:59
@nstarman
Collaborator Author

Ping @NeilGirdhar, given related discussions.

@nstarman nstarman changed the title ✨: add HasArrayX protocols ✨: add CanArrayX protocols Jun 23, 2025
...


class CanArrayAdd(Protocol):
Collaborator Author


I was thinking about parametrizing by dtype. Self, other, output. Bit of a mess. Maybe tackle parametrizing as a followup?

@nstarman
Collaborator Author

Should all the Protocols inherit from HasArrayNamespace?
Also, should it be renamed to CanArrayNamespace?

@NeilGirdhar
Contributor

NeilGirdhar commented Jun 23, 2025

Should all the Protocols inherit from HasArrayNamespace?
Also, should it be renamed to CanArrayNamespace?

I don't know what Joren will say, but I would guess no and no? (I think you got it right in this PR?)

Also, I'm guessing you're aware that int | float is float, and you're intentionally specifying both?

@nstarman
Collaborator Author

nstarman commented Jun 24, 2025

I don't know what Joren will say, but I would guess no

My thought was about building stuff like the following, which is wrong:

class Positive(Protocol):
    def __call__(self, array: CanArrayPos, /) -> CanArrayPos: ...

It should be something like

class Positive(Protocol):
    def __call__(self, array: HasArrayNamespace, /) -> HasArrayNamespace: ...

But I think we want

class Positive(Protocol):
    def __call__(self, array: CanArrayPos, /) -> HasArrayNamespace: ...

Which I think works best if it's

class CanArrayPos(HasArrayNamespace, Protocol): ...

Also, I'm guessing you're aware that int | float is float, and you're intentionally specifying both?

Yes. :).

@NeilGirdhar
Contributor

I see, you're kind of using it as a poor man's intersection?

Also, I'm guessing you're aware that int | float is float, and you're intentionally specifying both?

Yes. :).

Okay, is that because you're going to generate some documentation from these annotations? Or you find it less confusing?

Also, are you going to add complex to the union?

@nstarman
Collaborator Author

Okay, is that because you're going to generate some documentation from these annotations? Or you find it less confusing?

It's for 2 reasons: the Array API does it in their docs, and I think the Python numerical tower is a mess. Since ints and floats aren't subclasses of each other, it makes little sense for them to be interchangeable at the static type level. 😤😆

Also, are you going to add complex to the union?

Worth discussing. The array api does not.

@NeilGirdhar
Contributor

NeilGirdhar commented Jun 24, 2025

It's for 2 reasons: the array api does it in their docs

The docs are that way to help beginners who might be confused. (At least that was the argument that was presented.) But you aren't expecting beginners to read your code, are you?

And, you aren't using this repo to build docs?

The downside of populating the unions unnecessarily is overcomplicated type errors. So from a user standpoint, I think this is worse.

From a developer standpoint, it's a matter of taste. Personally, I think more succinct is easier to understand.

because I think the Python numerical tower is a mess and since ints and floats aren't subclasses of each other, it makes little sense for them to be interchangeable at the static type level.

As much as you might like to turn back time and change the typing decisions that were made, the fact is that the static type int is a subclass of float as far as type checkers are concerned, and that will not change for the foreseeable future.

I think I understand what you're doing and why. I spent years writing if x != 0 for a similar reason. But I think this is a fact that you just have to accept even if you dislike it.
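
A small sketch of the two views being contrasted here; the function name `half` is made up for the illustration. Type checkers apply the typing spec's special case for `float` annotations, while the runtime class hierarchy says something different.

```python
def half(x: float) -> float:
    return x / 2


# Static view: type checkers accept an int argument for a `float` parameter,
# which is why `int | float` is redundant from their point of view.
value = half(3)

# Runtime view: int is neither an instance nor a subclass of float.
runtime_facts = (isinstance(3, float), issubclass(int, float))
```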

Worth discussing. The array api does not.

Does it not?

array.__add__(other: int | float | complex | array, /) → array

Have I misunderstood the documentation?

@nstarman
Collaborator Author

nstarman commented Jun 24, 2025

Ah. We're building towards v2021 first.
A release branch for every major version.
The versions have almost been entirely additive, so it's not too onerous.
This also makes backporting easier.

Member

@jorenham jorenham left a comment


It might be easier to use optype for this, as it already provides single-method generic protocols for each of the special dunders:

https://github.com/jorenham/optype/blob/master/optype/_core/_can.py

There's even documentation: https://github.com/jorenham/optype#binary-operations

And of course it's tested and thoroughly type-checked and stuff
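
For readers unfamiliar with the pattern, here is a rough idea of what a single-method generic protocol looks like. This is a stdlib-only stand-in written for illustration, not optype's actual source; it assumes the two-parameter shape (operand type, result type) that `opt.CanAdd[...]` is used with later in this thread. `add_one` is a hypothetical helper.

```python
from typing import Protocol, TypeVar

_T_contra = TypeVar("_T_contra", contravariant=True)
_R_co = TypeVar("_R_co", covariant=True)
_R = TypeVar("_R")


class CanAdd(Protocol[_T_contra, _R_co]):
    """Stand-in: describes any type that supports `self + rhs`."""

    def __add__(self, rhs: _T_contra, /) -> _R_co: ...


def add_one(x: CanAdd[int, _R]) -> _R:
    # Works for any type whose __add__ accepts an int.
    return x + 1
```

Using such a protocol as a parameter type lets a function accept ints, floats, arrays, or anything else with a compatible `__add__`, without naming those types.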

@nstarman
Collaborator Author

nstarman commented Jun 24, 2025

Sounds good to me...
It's good to have in-house expertise.


@nstarman
Collaborator Author

nstarman commented Jul 1, 2025

@jorenham is this prep for using optype?

@nstarman nstarman mentioned this pull request Jul 1, 2025
@jorenham
Member

jorenham commented Jul 1, 2025

@jorenham is this prep for using optype?

Yea, pretty much.

nstarman and others added 5 commits July 6, 2025 16:17
Support right power operator for array classes
Support right modulo operator for array classes

Signed-off-by: Nathaniel Starkman <[email protected]>
@nstarman
Collaborator Author

nstarman commented Jul 9, 2025

And, you aren't using this repo to build docs?

I think the plan is to build docs, which would show the types.

@nstarman
Collaborator Author

nstarman commented Jul 9, 2025

@jorenham do you want to switch some of these to be optype objects, or do the Self annotations and docstrings mean we should go ahead with rolling our own Protocols?

@jorenham
Member

jorenham commented Jul 9, 2025

@jorenham do you want to switch some of these to be optype objects, or do the Self annotations and docstrings mean we should go ahead with rolling our own Protocols?

I've thought about this, but I'm not sure what the best approach is. I considered four approaches:

  1. Use optype but monkeypatch the __doc__ of the protocols. The downside is that we'd pollute these protocols, which might be annoying for users that use optype for other things as well.
  2. Bundle optype as git submodule, so that we can monkeypatch __doc__ without polluting the "actual" optype protocols.
  3. We write our own protocols (copy-pasting those of optype). This won't pollute optype, but we'd have to do quite a lot of work to write, test, and maintain them.
  4. Use optype, but ignore the docstrings. If we later want docstrings after all, then we can revisit the 3 options above.

Now that I've written these down, I think I lean most towards option 4. As far as I'm concerned, docstrings are a "should-have", not a "must-have" (MoSCoW jargon). By postponing worrying about docstrings, we can focus on building the actual functionality first. This feels like the most agile approach to me.

Thoughts?

@nstarman
Collaborator Author

For magic dunder methods I agree we can start with 4.

What about doing

@modify_docstring("", __float__="")
class CanFloat(opt.CanFloat): ...

@modify_docstring("", __int__="")
class CanInt(opt.CanInt[R]): ...
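
One way such a decorator could work is sketched below. `modify_docstring` does not exist yet, so the signature is inferred from the usage above (class docstring first, then per-method docstrings as keywords) and the implementation is an assumption. `CanFloat` is a local stand-in for `optype.CanFloat`, and `CanArrayFloat` is a hypothetical name.

```python
import types
from typing import Callable, Protocol, TypeVar

C = TypeVar("C", bound=type)


class CanFloat(Protocol):
    # Stand-in for optype.CanFloat (assumed: a single `__float__` method).
    def __float__(self) -> float: ...


def _copy_function(func: types.FunctionType) -> types.FunctionType:
    # Build a fresh function object sharing the same code, so that setting
    # __doc__ on the copy leaves the inherited original untouched.
    copy = types.FunctionType(
        func.__code__,
        func.__globals__,
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )
    copy.__kwdefaults__ = func.__kwdefaults__
    copy.__annotations__ = dict(func.__annotations__)
    copy.__qualname__ = func.__qualname__
    copy.__module__ = func.__module__
    return copy


def modify_docstring(class_doc: str, /, **method_docs: str) -> Callable[[C], C]:
    def decorator(cls: C) -> C:
        cls.__doc__ = class_doc
        for name, doc in method_docs.items():
            # Shadow the inherited method with a documented copy on `cls`.
            method = _copy_function(getattr(cls, name))
            method.__doc__ = doc
            setattr(cls, name, method)
        return cls

    return decorator


@modify_docstring(
    "Array-like that converts to a Python float.",
    __float__="Return the value as a Python float.",
)
class CanArrayFloat(CanFloat, Protocol): ...
```

Two design points: the method is copied rather than mutated so the upstream protocol is not polluted (the concern raised in option 1), and `Protocol` is repeated in the bases because a subclass of a protocol is only itself a protocol when it lists `Protocol` explicitly.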

@jorenham
Member

For magic dunder methods I agree we can start with 4.

What about doing

@modify_docstring("", __float__="")
class CanFloat(opt.CanFloat): ...

@modify_docstring("", __int__="")
class CanInt(opt.CanInt[R]): ...

I like that!

@nstarman
Collaborator Author

nstarman commented Jul 11, 2025

We still have the problem of Self in the type annotations.

E.g.

class CanArrayAdd(Protocol):
    def __add__(self, other: Self | int | float, /) -> Self: ...

which isn't compatible with optype.CanAdd.

Edit: the closest I can get is

opt.CanAdd["HasArrayNamespace[NS_contra] | int | float", "Array[NS_contra]"],

Doing

opt.CanAdd["Array[NS_X] | int | float", "Array[NS_X]"],

doesn't seem to work.

@nstarman nstarman closed this Jul 11, 2025
@nstarman nstarman reopened this Jul 11, 2025
@jorenham
Member

We still have the problem of Self in the type annotations.

E.g.

class CanArrayAdd(Protocol):
    def __add__(self, other: Self | int | float, /) -> Self: ...

which isn't compatible with optype.CanAdd.

I'll add them to optype then

@nstarman
Collaborator Author

nstarman commented Jul 11, 2025

I'll add them to optype then

Awesome, so then it'll be...

CanAddSelf[T, R=Self] = CanAdd[Self | T, Self | R]

so we can do CanAddSelf[int | float]?

@jorenham
Member

Something like this, @nstarman?

class CanAddSelf(Protocol[_T_contra]):
    def __add__(self, rhs: Self | _T_contra, /) -> Self: ...

@nstarman
Collaborator Author

Great! I guess the return type probably isn't necessary.

@jorenham
Member

Great! I guess the return type probably isn't necessary.

Yea indeed. And if anyone needs it after all, then we can always add it as optional type parameter later on.

@jorenham
Member

E.g.

class CanArrayAdd(Protocol):
    def __add__(self, other: Self | int | float, /) -> Self: ...

BTW, this wouldn't work in case of boolean arrays.

@nstarman
Collaborator Author

E.g.

class CanArrayAdd(Protocol):
    def __add__(self, other: Self | int | float, /) -> Self: ...

BTW, this wouldn't work in case of boolean arrays.

Yeah. I noticed that. It's in the signature of the Array API, but without a way to detect boolean dtypes, how else do we write this statically?

Also we need CanRAddSelf, etc.

@nstarman
Collaborator Author

nstarman commented Jul 11, 2025

I don't think we need to do single-method Protocols now that we're using optype

@docstring_setter(
    __pos__ = """...""",
    ...
)
class Array(
    HasArrayNamespace[NS_co],
    opt.CanPosSelf,
    opt.CanNegSelf,
    opt.CanAddSelf[int | float],
    opt.CanIAddSelf[int | float],
    opt.CanRAddSelf[int | float],
    opt.CanSubSelf[int | float],
    opt.CanISubSelf[int | float],
    opt.CanRSubSelf[int | float],
    opt.CanMulSelf[int | float],
    opt.CanIMulSelf[int | float],
    opt.CanRMulSelf[int | float],
    opt.CanTrueDivSelf[int | float],
    opt.CanRTrueDivSelf[int | float],
    opt.CanFloorDivSelf[int | float],
    opt.CanIFloorDivSelf[int | float],
    opt.CanRFloorDivSelf[int | float],
    opt.CanModSelf[int | float],
    opt.CanIModSelf[int | float],
    opt.CanRModSelf[int | float],
    opt.CanPowSelf[int | float],
    opt.CanIPowSelf[int | float],
    opt.CanRPowSelf[int | float],
    Protocol,
):
    ...

@jorenham
Member

jorenham commented Jul 11, 2025

It's in the signature of the Array API

Then that should be changed 🤷🏻‍♂️

how else do we write this statically?

I'd make it generic:

class CanAddSelf(Protocol[_T_contra]):
    def __add__(self, rhs: Self | _T_contra, /) -> Self: ...

😏

Also we need CanRAddSelf, etc.

Yea I'll add *Self variants for all binops 👌🏻.

But I'm thinking of leaving out the Self as input for the reflected ops, so it'll be

def __radd__(self, rhs: _T_contra, /) -> Self: ...

because it shouldn't be needed, ...right?
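
The reasoning can be checked at runtime with a small sketch. `ToyArray` and the `calls` log are hypothetical and exist only to record which dunder Python dispatches to.

```python
calls: list[str] = []


class ToyArray:
    """Hypothetical array that logs which dunder Python dispatched to."""

    def __init__(self, value: float) -> None:
        self.value = value

    def __add__(self, rhs: "ToyArray | float", /) -> "ToyArray":
        calls.append("__add__")
        other = rhs.value if isinstance(rhs, ToyArray) else rhs
        return ToyArray(self.value + other)

    def __radd__(self, lhs: float, /) -> "ToyArray":
        # Only reached when the left operand cannot handle the addition.
        calls.append("__radd__")
        return ToyArray(lhs + self.value)


a, b = ToyArray(1.0), ToyArray(2.0)
array_plus_array = a + b  # handled by a.__add__(b)
scalar_plus_array = 3.0 + a  # float.__add__ gives up, so a.__radd__(3.0) runs
```

One caveat worth keeping in mind: when the right operand's type is a subclass of the left operand's type and provides a different reflected method, Python tries the reflected method first, so a subclass's `__radd__` can still receive an array.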

@jorenham
Member

jorenham commented Jul 11, 2025

I don't think we need to do single-method Protocols now that we're using optype

We'll still need some for the non-python dunders like __array_namespace_info__ and attributes like dtype
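
As an example of the attribute case, a protocol for `dtype` might look like the sketch below. The names `HasDType`, `ToyArray`, and `dtype_of` are hypothetical, and the dtype is modelled as a plain string purely for illustration.

```python
from typing import Protocol, TypeVar, runtime_checkable

_DType_co = TypeVar("_DType_co", covariant=True)
_DType = TypeVar("_DType")


@runtime_checkable
class HasDType(Protocol[_DType_co]):
    # A read-only attribute is declared as a property in the protocol.
    @property
    def dtype(self) -> _DType_co: ...


class ToyArray:
    """Hypothetical array whose dtype is just a string."""

    def __init__(self, dtype: str) -> None:
        self._dtype = dtype

    @property
    def dtype(self) -> str:
        return self._dtype


def dtype_of(x: HasDType[_DType]) -> _DType:
    return x.dtype


found = dtype_of(ToyArray("float64"))
```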

@nstarman
Collaborator Author

We'll still need some for the non-python dunders like __array_namespace_info__ and attributes like dtype

Yes, ones that don't have a natural fit in optype.

@jorenham
Member

We don't care about __divmod__, right?

@nstarman
Collaborator Author

how else do we write this statically?
I'd make it generic:

That's a good idea. We can define a generic Array[InputT] and then also provide some common-sense defaults, like (names TBD)

Array[InputT]
NumericArray = Array[int | float]
BoolArray = Array[bool]

Signed-off-by: Nathaniel Starkman <[email protected]>
@nstarman
Collaborator Author

Pushing a commit that won't work, but does most of the things.

Labels
None yet
Projects
None yet
3 participants