Make struct _typeobject a plain C struct and use it to build immutable, shareable PyTypeObjects #672
Comments
We can add the function |
I think this would have to be debated by the C API WG. It feels like a big deal to propose these changes (with a long migration trajectory) that ought to be considered and planned carefully. Casually adding one new C API in 3.13 and hoping the rest will be easy sounds, um, optimistic. :-) |
In the |
FWIW: I'd definitely make that two APIs: one API for the |
If I read this correctly this (making PyTypeObject opaque) would break PyObjC, which subclasses Functionally I need a low-cost way to get additional C data associated with a Python class. I could probably find a way to store that data out of line, but that's likely a lot more expensive because this would require looking up the additional data in a separate data structure. And to be clear: I can and will adjust to a new API when available, but would prefer to keep a cheap way to get at the associated data. |
I'm strongly in favour of the idea, but as Guido says this isn't going to be a fast transition. I believe we (@ericsnowcurrently mainly) cleaned up a decent amount of our own state on type objects so they are close to being immutable, but going the rest of the way is going to be tougher. Is there a way we can get benefit from easily being able to tell that a type object is immutable (checking the extra size for 0)? If we can start telling people that not allocating extra bytes is a significant benefit for certain operations, they have some motivation to migrate. If we haven't already, we ought to be able to add an API that hides whether the extra data is stored out of line (returns a pointer to it) and then migrate it behind the scenes to interpreter state rather than directly on the type object. Like Ronald, I would also quite like a cheap way to get data associated with any object (having just spent some time writing native profiling code, that needs to store data against any arbitrary callable). It's possible that our HAMT implementation may suit, and it could be worth exposing that as an alternative for per-interpreter, per-object data? The first step is certainly adding the APIs to access members of PyTypeObject indirectly rather than directly, even if we don't make anything opaque yet. |
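A rough sketch of the "attached data stored out of line" idea from the comment above: a per-interpreter table keyed by object address, behind a lookup function that returns a pointer to the data. Everything here (toy_interp, toy_attach, toy_lookup, the fixed table size) is a hypothetical stand-in for illustration, not a CPython API, and a real implementation would need resizing, deletion, and lifetime handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_TABLE_SLOTS 64   /* power of two; fixed size keeps the sketch short */

typedef struct {
    const void *key;   /* object the data is attached to; NULL marks an empty slot */
    void *data;        /* caller-owned attached data */
} toy_entry;

/* Hypothetical stand-in for per-interpreter state; not a CPython structure. */
typedef struct {
    toy_entry slots[TOY_TABLE_SLOTS];
    size_t used;
} toy_interp;

static size_t toy_hash(const void *key)
{
    /* Drop the low alignment bits, then mix with a multiplicative constant. */
    uintptr_t h = (uintptr_t)key >> 4;
    h *= (uintptr_t)0x9E3779B1u;
    return (size_t)h & (TOY_TABLE_SLOTS - 1);
}

/* Attach data to obj in this interpreter. Returns 0 on success, -1 if full. */
static int toy_attach(toy_interp *interp, const void *obj, void *data)
{
    size_t start = toy_hash(obj);
    for (size_t n = 0; n < TOY_TABLE_SLOTS; n++) {
        toy_entry *e = &interp->slots[(start + n) & (TOY_TABLE_SLOTS - 1)];
        if (e->key == obj) {            /* already present: overwrite */
            e->data = data;
            return 0;
        }
        if (e->key == NULL) {           /* free slot: insert (linear probing) */
            if (interp->used == TOY_TABLE_SLOTS - 1) {
                return -1;              /* always keep one slot empty */
            }
            e->key = obj;
            e->data = data;
            interp->used++;
            return 0;
        }
    }
    return -1;
}

/* Return the data attached to obj in this interpreter, or NULL if none. */
static void *toy_lookup(const toy_interp *interp, const void *obj)
{
    size_t start = toy_hash(obj);
    for (size_t n = 0; n < TOY_TABLE_SLOTS; n++) {
        const toy_entry *e = &interp->slots[(start + n) & (TOY_TABLE_SLOTS - 1)];
        if (e->key == obj) {
            return e->data;
        }
        if (e->key == NULL) {
            return NULL;
        }
    }
    return NULL;
}
```

Because the table lives in the interpreter stand-in and the object itself is never written to, the same object can carry different attached data in different interpreters. The cost is a hash lookup per access, which is the overhead the PyObjC comment is worried about.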
@markshannon wrote:
FWIW, there is probably room to explore solutions that extend beyond @markshannon wrote:
Alternately, we could add a public equivalent to @colesbury wrote:
I'd think heap allocated. @zooba wrote:
+1 |
If I hadn't just had an in-person chat with @ericsnowcurrently about this I would still be utterly confused. It appears that types have exactly three mutable attributes that are a problem: For static types that are part of the CPython implementation, this is currently solved by storing those three attributes (and only those) in the interpreter state, in the array I am nevertheless still confused about the motivation of the proposal. The motivational section above seems high on advantages of immutable, immortal objects (I have no argument there) but doesn't go into specifics about current pain points. Is the goal to have the And if we use the It would seem that (assuming we eat our own dogfood) this proposal would defeat the advantage of Finally, @zooba wrote:
But in @ronaldoussoren's post I only see a need for data associated with specific type objects, not with arbitrary objects (which is how I understand @zooba's post). All in all I really hope that @markshannon can clarify his motivation and proposed implementation (at least at a high level so I can reason about some properties of the new type objects). |
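To make the scheme described above concrete (a shared type that never changes, with its few mutable attributes kept in an array in the interpreter state), here is a toy model. All names are hypothetical, and the three mutable attributes are left as anonymous slots because the comment's list of them did not survive in the text above. This is an illustration of the indexing idea only, not CPython's actual layout.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_STATIC_TYPES 8

/* Shared between interpreters and never written after creation:
   it carries an index into per-interpreter storage, not the state itself. */
typedef struct {
    const char *name;
    size_t state_index;
} toy_type;

/* Per-interpreter mutable state for one type.
   The three slots are placeholders for the three mutable attributes. */
typedef struct {
    void *slots[3];
} toy_type_state;

/* Hypothetical stand-in for the interpreter state and its array. */
typedef struct {
    toy_type_state static_types[TOY_MAX_STATIC_TYPES];
} toy_interp_state;

/* Find this interpreter's mutable state for a shared type. */
static toy_type_state *toy_get_state(toy_interp_state *interp, const toy_type *tp)
{
    assert(tp->state_index < TOY_MAX_STATIC_TYPES);
    return &interp->static_types[tp->state_index];
}
```

Lookup is a single array index, so it is cheap, but the array has to be sized (or grown) for every type that uses it. That is presumably easy for a fixed set of built-in types and harder for types created by arbitrary extensions.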
That's correct. PyObjC dynamically creates (a fairly large number of) subclasses of class NSObject(type):
native_class: ... The Finally: PyObjC's use case generates these type objects at runtime. I currently don't support subinterpreters, but when I do add that support this will not involve sharing these type objects between sub interpreters. |
In June, I took notes on PyTypeObject members and how they are used: python/cpython#105970 |
I guess I was too general with the API. Per-interpreter types can be mutable and need to have attached data as @ronaldoussoren and @zooba point out. So let's focus on the immortal, immutable case for now. This should cover the Cython generator type, Dropping the Most of the machinery is already present in
/* Sketch only: the class definition reuses the existing type struct layout. */
typedef struct _typeobject PyClassDefinition;
int
Py_MakeClass(PyClassDefinition *def, PyTypeObject **result)
{
    /* Heap-allocate the type and copy the whole definition into it. */
    *result = malloc(sizeof(PyTypeObject));
    if (*result == NULL) return -1;
    memcpy(*result, def, sizeof(PyTypeObject));
    return _PyStaticType_InitBuiltin(*result);
} |
You understood correctly. I skipped the bit about "by storing a dynamic index in the so-called 'immutable' type struct" straight to "what if we just had a per-interpreter data structure that could look up indirectly-attached data for anything". That works both for the current attached data on static types (which, I'll note, are not performance critical members) and for other tasks that may require storing attached data against any object. As far as I can tell, the only benefit of doing this at all is if we can make the entire type object completely immutable. It doesn't actually matter if it's opaque or not - I don't think we can avoid supporting native "subclasses" of Also, opaque structs are just generally good for other languages that integrate with CPython. That really just means having all the APIs you need to be able to treat them as opaque if you want, but long-term I do hope that using those APIs becomes the default even for C developers. |
Sounds like there's quite a number of different motivations, solutions, and properties getting mixed in here. (E.g. Do we do something for all objects, or just for types? Are types going to be opaque or not?) It sure feels like it's going to take a PEP to sort out what we're going to do and why, what impact it will have, and how the APIs involved will be able to evolve. |
In my mind I now summarize the motivation here as a simpler alternative to |
Well, that's the init machinery. The hard part here is the teardown, which AFAIK currently involves a carefully curated list of types, known at compile-time. (The reason for that eludes me; I've successfully avoided that rabbit hole so far.) If this can be made to work for arbitrary extension types, it should work for "regular"/"legacy" static types as well.
How would that work in this proposal? A C global pointer? How do you know that
I see several points the proposal could be broken down to, each one pretty good but with subtle issues (and unknown unknowns):
Yeah, this does feel like a pre-PEP discussion :) |
The mutability of PyTypeObject (builtin classes) and the shareability of statically allocated (C) objects makes safe handling of builtin types fragile, difficult, and ultimately unsafe.

Immutable objects, by contrast, are great for free-threading and multiple interpreters. They can be freely shared with no race conditions and, for immortal objects, no contention when using them.

Immortal, immutable objects are also much easier to use from C code. Accessing the int class is easy: &PyLong_Type. But accessing array.array is a real pain, requiring API calls.

We are making unreasonable demands of third-party code to support multiple interpreters, for code that used to be simple.

We can provide immutable, shareable PyTypeObjects with a simple, easy to use API that makes it easy to port old code with the following few API changes:

- Strip the PyObject_VAR_HEAD from the start of struct _typeobject, making it a plain C struct, struct _classdef, not a Python object.
- Make PyTypeObject an opaque struct.
- Provide an API to create a PyTypeObject from a struct _classdef: int Py_MakeClass(struct _classdef def, PyInterpreterState *interp, PyTypeObject **result)
  - interp == NULL. The result would be a pointer to an immortal and immutable class.
  - interp != NULL. The result would be a new reference to a mortal, mutable class belonging to the given interpreter.

The sooner we can make PyTypeObject opaque, the better, but we might need to keep it open for a release or two for backwards compatibility reasons.
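The proposed call shape can be modelled without any CPython headers. The sketch below is a toy: toy_classdef, toy_type, toy_interp and toy_make_class are invented stand-ins for struct _classdef, the opaque PyTypeObject, PyInterpreterState and Py_MakeClass, and the fields are assumptions chosen only to show how the interp argument would select between the two kinds of result. The proposal passes the definition by value; the sketch takes a pointer, which is a choice made here and not part of the proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Plain C data with no object header: stands in for struct _classdef. */
typedef struct {
    const char *name;
    size_t basicsize;
} toy_classdef;

/* Hypothetical stand-in for PyInterpreterState. */
typedef struct {
    int id;
} toy_interp;

/* Stands in for the opaque type object: header fields plus a private
   copy of the definition, so the caller's struct is never mutated. */
typedef struct {
    long refcnt;
    bool immortal;
    toy_interp *owner;   /* NULL when shared by all interpreters */
    toy_classdef def;
} toy_type;

/* Mirrors the shape of the proposed Py_MakeClass.
   Returns 0 on success, -1 on allocation failure. */
static int toy_make_class(const toy_classdef *def, toy_interp *interp,
                          toy_type **result)
{
    toy_type *tp = malloc(sizeof(*tp));
    if (tp == NULL) {
        *result = NULL;
        return -1;
    }
    tp->def = *def;                   /* copy the plain struct */
    tp->owner = interp;
    tp->immortal = (interp == NULL);  /* shared classes are never freed */
    tp->refcnt = 1;                   /* meaningful only for mortal classes */
    *result = tp;
    return 0;
}
```

With a NULL interpreter the result models a class that any interpreter may point at directly; with a non-NULL interpreter it models an ordinary refcounted class owned by that interpreter. The toy does not address teardown of such classes.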