-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bugfix: ensure different KeysView/ValuesView type names for int16_t/int32_t or float/double. Fix #4529 #4972
Conversation
…nt32_t or float/double. Fix pybind#4529
@aimir could you help reviewing this PR?
I'm super sleep-deprived at the moment ... but anyway: it sounds like the right thing to do? Instead of less of it (this PR), could we alternatively do more of it? Concretely, could we work on this condition instead?
Please don't hold back letting me know if you think this line of thinking is completely off the tracks. I'll look again after catching some sleep. |
Hi @rwgk thanks for your quick feedback.
This line seems to be not related to the string name for the python binding. This particular line is also mentioned in issue #4529, and after mentioning this line the issue author also wrote "but in this case two different typeids have the same string name in pybind11". To make it clearer, the problem is only related to the "same string name" of the two different types, which is, a naming problem. The behavior that there are "two different typeids" is correct. So issue #4529 cannot be fixed by changing this
Let me provide some more rationale for why I have to deleted the lines with Now let's say we keep the lines from PR #4353, which basically says
To solve this problem, we should name the container for py::bind_vector<vector<int32_t>>(m, "vector_int");
py::bind_vector<vector<int64_t>>(m, "vector_long"); rather than py::bind_vector<vector<int32_t>>(m, ("vector_" + detail::make_caster<int32_t>::name.text).c_str()); // the type name will be "vector_int"
py::bind_vector<vector<int64_t>>(m, ("vector_" + detail::make_caster<int64_t>::name.text).c_str()); // the type name will be "vector_int" (but should actually be "vector_long") which will cause the In summary, for this naming purpose, the C++ to Python caster @aimir Please correct me if I misunderstood something. |
What seems unfortunate is that we get a mix of Python type names and C++ type names. See:
Specifically (from python/mypy#16306 (comment)):
With this PR, we'd get e.g. How could we make this work nice also for the purposes of stubgen?
Does a Python user even want to know that detail? I'm thinking I think a better direction would be to
Then add Using That's a bigger project, unfortunately. And probably I'm overlooking a thing or two. ... what do you think? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I tried to create wrapper names that would be pythonic in nature - specifically motivated by auto-generating typing hints for mypy
using stubgen
.
That's why I opted to always use the python type names - as far as python is concerned, those really are the same types. Unfortunately I didn't think about this leading to issues with multiple wrappers of different integer types.
@rwgk I agree that the right thing is to treat them as the same python wrapper, but make sure they are wraped only once - this would have resolved issue #4529 while keeping the typing info pythonic.
This might be a larger task that I'd love to work on, but I don't see myself having the time for it soon. The solution suggested by @hczhai seems like the right thing to do for now. This is a significant issue, and fixing this sooner rather than later is more important than having nice typing hints.
tests/test_stl_binders.py
Outdated
assert map_string_double.values().__class__.__name__ == "ValuesView[double]" | ||
assert map_string_float.values().__class__.__name__ == "ValuesView[float]" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep the tests for keys()
and items()
to make sure that those are also named as expected
Thanks a lot @aimir for the feedback! Looks good to me then, as far as I understand, what we'll have is still better than the state before #4353. To deal with the mix of Python type names and C++ type names, this super simple idea crossed my mind: (I.e. we can take care of that separately.) @hczhai I'll approve this PR after you add the |
@rwgk @aimir Thanks a lot for the feedback! @aimir The reason I deleted the tests for In the example used for testing, we have assert (map_string_double.keys().__class__.__name__ in [
"KeysView[std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >]",
"KeysView[std::basic_string<char,std::char_traits<char>,std::allocator<char> >]",
"KeysView[std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >]"
] instead of @rwgk So I am thinking of the alternative way that you suggested - namely, how to make it work with the python names. So if we |
I'm really not sure, but offering what comes to mind: In my understanding, Now, that doesn't always hold true, but with the backtick trick I'm experimenting with under #4888, at least we're clearly exposing when we're falling back to a C++ type name. That's "good enough" IMO, hoping that stubgen will be changed in the future to process the backtick notation in a meaningful way. Does that help? @sizmailov for awareness. |
The goal of #4353 was to fix #3986. What we're getting with this PR doesn't solve the original problem. — Is that a fair assessment? In any case, I initiated Google global testing with this PR, to see how many stubgen tests are failing, if any. I'll report back when I have the results. |
@rwgk Thanks for the comments.
In my opinion, #3986 is still about "the right type signature" rather than a real bug. #4529 is a severe bug, generating runtime error messages and breaking many existing applications. Nevertheless, based on our previous discussions, I found a working way to generate correct Python type names for KeysView, etc. See PR #4983, It is a slightly more complicated way, but should solve both #4529 and #3986 and is compatible with your #4888. I am happy to work on either this PR or PR #4983. But I do hope that we can fix the bug in #4529 soon. |
To report the results of the Google global testing with this PR: There was only one failing test, comparing existing stubgen results with new results. The reduced diff is below. (For my own easy reference, the test ID was OCL:591943630:BASE:592407555:1703042759080:eb7a4a92.) I'm not sure how many similar stubgen tests we have, my best guess is on the order of 100. — To put this in perspective, the total number of tests is many millions. This regression is not great of course, but just from a practical viewpoint, it would be defendable. -class ItemsView[int, object]:
+class ItemsView[long, object]:
-class ItemsView[str, object]:
+class ItemsView[std::__u::basic_string<char, std::__u::char_traits<char>, std::__u::allocator<char>>, object]:
-class KeysView[int]:
+class KeysView[long]:
-class KeysView[str]:
+class KeysView[std::__u::basic_string<char, std::__u::char_traits<char>, std::__u::allocator<char>>]:
- def items(self) -> ItemsView[int,object]: ...
- def keys(self) -> KeysView[int]: ...
+ def items(self) -> ItemsView[long,object]: ...
+ def keys(self) -> KeysView[long]: ...
- def items(self) -> ItemsView[str,object]: ...
- def keys(self) -> KeysView[str]: ...
+ def items(self, *args, **kwargs) -> Any: ...
+ def keys(self, *args, **kwargs) -> Any: ... |
Closing this after #4985 was merged. |
Description
This PR fixes issue #4529. The problem was introduced in PR #4353.
The problem is the following:
Here we want
type(map_int_int.keys())
to beKeysView[int]
andtype(map_long_int.keys())
to beKeysView[long]
. But in after PR #4353 they are all assigned the same type nameKeysView[int]
which is caused by the usage ofdetail::make_caster<KeyType>::name
for determining the type name. It maps all C++ integer types (int16_t, int32_t, ...) to the same python type nameint
and C++ float/double types to the same namefloat
, etc.The solution is to always use the C++ type name in
KeyView[...]
andValuesView[...]
.Note that before PR #4353, the type names were
KeysView[map_int_int]
andKeysView[map_long_int]
for the above example (which is okay).Suggested changelog entry: