Unsupported NumPy features and other differences w.r.t. NumPy #73

Open
ev-br opened this issue Feb 28, 2023 · 8 comments
Labels
documentation Improvements or additions to documentation

Comments

@ev-br
Collaborator

ev-br commented Feb 28, 2023

  • We only aim to support numeric dtypes, which are understood by pytorch. This rules out

    • object arrays
    • datetimes
    • strings, chars and void dtypes
    • structured dtypes and recarrays
    • np.longdouble and np.clongdouble (a.k.a. np.float128 and np.complex256, respectively)
  • ndarray subclasses are out of scope.

  • masked arrays are out of scope

  • numpy polynomials are out of scope, both np.poly1d and np.polynomial

  • __array_function__ protocol is out of scope. As a consequence, non-default like=... arguments raise.

  • __array_interface__ is out of scope

  • The ndarray.ctypes attribute is not supported.

  • Negative strides: tnp.flip and slicing with a negative step return a copy (see the sketch below).
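
For illustration, a small sketch of the last point. This assumes the wrapper is importable as torch_np and that tnp.flip and item assignment behave like their NumPy counterparts; in NumPy, flip returns a view, while here it is expected to return a copy:

import numpy as np
import torch_np as tnp   # assumed import name of the wrapper

a_np = np.arange(4)
flipped_np = np.flip(a_np)    # NumPy: a view sharing memory with a_np
a_np[0] = 100
print(flipped_np)             # [  3   2   1 100] -- reflects the change

a = tnp.arange(4)
flipped = tnp.flip(a)         # here: expected to be a copy (pytorch has no negative strides)
a[0] = 100
print(flipped)                # expected to stay [3, 2, 1, 0]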


These differences exist currently, but might be fixable if desired:

  • We do not distinguish between 0D arrays and scalars. That is, tnp.float32(3) creates a zero-dim array.

    • One corollary is that scalars never silently decay into Python scalars, as NumPy scalars do in some situations:
In [1]: np.int32(2) * [1, 2, 3]         # scalar decays to a python int
Out[1]: [1, 2, 3, 1, 2, 3]

In [2]: np.asarray(2) * [1, 2, 3]     # zero-dim array is an array-like
Out[2]: array([2, 4, 6])

In our implementation, tnp.int32(2) behaves identically to tnp.asarray(2).

  • We do not implement value-based casting. NumPy itself is phasing it out in NumPy 2.0, as per NEP 50.

  • __array_wrap__ protocol is currently not implemented.

  • The gufunc machinery is not implemented, e.g. the axes=... argument of generalized ufuncs with signatures like (n,k),(k,m)->(n,m).

  • ufunc methods (np.add.reduce etc) are not implemented.

  • Fortran-ordered arrays in general, and the order="C"/"F"/"A"/"K" arguments of various creation functions, are not implemented.

  • numpy.linalg handles zero-size arrays (more or less) uniformly, while pytorch does not handle them at all; we do not currently implement this.

  • The various estimators for np.histogram bin selection are not implemented.

  • For ufuncs with nout=2, positional out1=..., out2=... arguments do not work; the out=(out1, out2) tuple keyword does (see the sketch after this list).

  • Sorting/ordering of complex data: numpy defines an ordering for complex values, while pytorch errors out; we follow pytorch. Relevant functions are min/max, argmin/argmax, sort and searchsorted; cf. min/max for complex inputs #67 for discussion.

  • tril_indices_from/triu_indices_from return tensors rather than a tuple of arrays, to avoid a graph break.
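
A small illustration of the out= point above. This is a sketch, assuming tnp.divmod and tnp.empty_like are wrapped with NumPy-like signatures:

import torch_np as tnp   # assumed import name of the wrapper

a = tnp.arange(1, 5)
q = tnp.empty_like(a)
r = tnp.empty_like(a)

# the out= tuple keyword is expected to work for a two-output ufunc
tnp.divmod(a, 2, out=(q, r))

# positional out1, out2 arguments are expected to raise
# tnp.divmod(a, 2, q, r)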

@rgommers
Member

__array_wrap__ protocol is out of scope. This way, non-default like=... arguments raise.

This is something that torch.Tensor itself uses - should that really be out of scope? It's unrelated to like= arguments AFAIK, that'd be __array_function__.

@ev-br
Collaborator Author

ev-br commented Feb 28, 2023

Thanks, edited! I have to admit I'm not up to date on the various __array_*__ dunder protocols; I'm not sure which of them does what, so I don't have a sensible suggestion on which ones we want here.

@rgommers
Member

rgommers commented Feb 28, 2023

Is there any test case like:

import torch
import numpy as np
from numpy.testing import assert_raises

t1 = torch.arange(3)
t2 = torch.ones(3)
t3 = np.add(t1, t2)
assert isinstance(t3, torch.Tensor)

# Test mixing torch tensors and numpy arrays
x1 = np.full((3,), 2)
assert isinstance(t1 + x1, torch.Tensor)  # calls numpy, which is happy to accept tensors
assert_raises(TypeError, lambda: x1 + t1)  # calls torch, which doesn't accept numpy arrays

That requires __array_wrap__. We don't want to encourage mixing tensors and arrays too much, but this one thing is used annoyingly often. So if it's possible to support with a minor amount of effort, then that's probably worth it.
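
For reference, a minimal sketch of how the __array_wrap__ hook operates, using a hypothetical toy container (not the actual torch.Tensor or tnp.ndarray implementation):

import numpy as np

class Wrapped:
    # higher priority than ndarray, so numpy defers result wrapping to this class
    __array_priority__ = 1000.0

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array__(self, dtype=None):
        # lets numpy ufuncs extract the raw ndarray
        return self.data if dtype is None else self.data.astype(dtype)

    def __array_wrap__(self, result, context=None):
        # numpy ufuncs call this to re-wrap their ndarray result
        return Wrapped(result)

w = Wrapped([1, 2, 3])
out = np.add(w, np.ones(3))
print(type(out))   # <class '__main__.Wrapped'>

This is roughly the mechanism by which np.add(t1, t2) returns a torch.Tensor in the snippet above.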

@ev-br
Collaborator Author

ev-br commented Feb 28, 2023

I'm not quite sure what the desired behavior is in your snippet above if import numpy as np is replaced by the torch_np import; could you clarify? I'm probably just being dense.

On a tangentially related note, is there a way to control how numpy arrays and our wrapper arrays mix:

In [28]: tnp.array([1, 2, 3]) + np.array([4, 5, 6])              # tnp.array wins over
Out[28]: array_w([5, 7, 9])

In [29]: np.array([1, 2, 3]) + tnp.array([4, 5, 6])
Out[29]: array([array_w(5), array_w(7), array_w(9)], dtype=object)     #  ugh, object arrays! 

@rgommers
Member

I'm not quite sure what the desired behavior is in your snippet above if import numpy as np is replaced by the torch_np import; could you clarify? I'm probably just being dense.

The exact same - you either get a torch.Tensor back (from numpy), or you get an exception (from pytorch).

On a tangentially related note, is there a way to control how numpy arrays and our wrapper arrays mix:

There is, e.g. by implementing __array__ to control how wrapper ndarrays get turned into numpy ndarrays (or raise). This mix should never happen though?
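
A minimal sketch of that suggestion, using a hypothetical toy class (not the actual tnp.ndarray):

import numpy as np

class WrapperArray:
    def __array__(self, dtype=None):
        # refuse implicit conversion to a numpy ndarray
        raise TypeError("wrapper arrays do not implicitly convert to numpy.ndarray")

np.asarray(WrapperArray())   # now raises TypeError rather than converting silently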

@ev-br
Collaborator Author

ev-br commented Feb 28, 2023

Re your example: is the expected behavior python/numpy/pytorch version dependent? (I sincerely hope it isn't!) Here's what I get locally with x1 and t1 from #73 (comment):

In [38]: x1 + t1
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [38], in <cell line: 1>()
----> 1 x1 + t1

TypeError: Concatenation operation is not implemented for NumPy arrays, use np.concatenate() instead. Please do not rely on this error; it may not be given on all Python implementations.

In [39]: t1 + x1
Out[39]: tensor([2, 3, 4])

In case it matters,

In [40]: np.__version__
Out[40]: '1.24.1'

In [41]: torch.__version__
Out[41]: '1.13.0'

In [42]: import sys; sys.version
Out[42]: '3.9.0 | packaged by conda-forge | (default, Nov 26 2020, 07:57:39) \n[GCC 9.3.0]'

@rgommers
Member

rgommers commented Mar 1, 2023

No, I just edited the wrong line during our call. Fixed now by swapping x1 and t1.

@ev-br
Collaborator Author

ev-br commented Mar 15, 2023

The current behavior is that the wrapper ndarray wins in both __add__ and __radd__:

In [1]: import torch_np as tnp

In [2]: import torch

In [3]: torch.ones(3) + tnp.ones(3)
Out[3]: array_w([2., 2., 2.], dtype=float64)

In [4]: tnp.ones(3) + torch.ones(3)
Out[4]: array_w([2., 2., 2.], dtype=float64)

Here I think what happens is that torch.Tensor does not recognize tnp.ndarrays and gives up; tnp.ndarray, in turn, calls tnp.asarray on the other argument, which happily accepts a Tensor. I'd think this is the behavior we want, no?
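
A toy illustration of that dispatch (hypothetical classes, not the real torch.Tensor / tnp.ndarray): when the left operand's __add__ returns NotImplemented, Python falls back to the right operand's __radd__.

class Tensorish:
    def __add__(self, other):
        if not isinstance(other, Tensorish):
            return NotImplemented       # "does not recognize" the wrapper and gives up
        return Tensorish()

class WrapperArray:
    def __add__(self, other):
        # an asarray-style coercion of `other` would go here
        return "wrapper array wins"

    __radd__ = __add__

print(Tensorish() + WrapperArray())     # 'wrapper array wins', via WrapperArray.__radd__
print(WrapperArray() + Tensorish())     # 'wrapper array wins', via WrapperArray.__add__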

ev-br added the documentation label on Mar 16, 2023
lezcano added a commit to pytorch/pytorch that referenced this issue Aug 22, 2023

…tem__"

In this case, we copy, but this is part of the set of divergences
described in Quansight-Labs/numpy_pytorch_interop#73.

This does not work with dynamic shapes, but it's not clear to me what
would be the best fix

pytorchmergebot pushed a commit to pytorch/pytorch that referenced this issue Aug 23, 2023
In this case, we copy, but this is part of the set of divergences
described in Quansight-Labs/numpy_pytorch_interop#73.

This does not work with dynamic shapes, but it's not clear to me what
would be the best fix

Pull Request resolved: #107688
Approved by: https://github.com/ezyang
ghstack dependencies: #107687