@gmcatsf feel free to open a PR for that. I am not sure how the proposal of using a sequential id would pan out in practice. We need to try it and see whether the existing tests pass unchanged. We also need new tests to specify what we mean by deterministic pickle files.
We might need a thread-safe counter increment, probably with a lock. We cannot rely on the GIL because we would like this code to work with the nogil fork of CPython (and also PyPy).
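To make the locking idea concrete, here is a minimal sketch of a lock-protected sequential id generator. The class name `SequentialIdGenerator` and its `next_id` method are hypothetical and are not part of cloudpickle; the sketch only illustrates the explicit lock, and does not show how such a counter would be wired into `class_tracker_id`.

```python
import threading


class SequentialIdGenerator:
    """Hypothetical helper: hands out 0, 1, 2, ... without relying on the GIL."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_value = 0

    def next_id(self):
        # The read-and-increment must be atomic, so both steps happen
        # while holding the lock.
        with self._lock:
            value = self._next_value
            self._next_value += 1
        return value
```

Note that ids produced this way would only be reproducible across runs if classes are pickled in the same order each time, which is one of the things new tests would have to pin down.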
An alternative to sequential ids would be to hash the contents of the class definition, but that might be too complex or expensive, because it might imply scanning the reference graph of the class object twice instead of once.
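For illustration only, a deliberately shallow version of the hashing idea might look like the sketch below. The function `class_content_id` is hypothetical and not a cloudpickle API. It hashes only the class name, the base class names, and the bytecode and referenced names of plain functions defined on the class; it ignores constants, closures, globals, and everything else in the reference graph, which is exactly the part that would make a real implementation expensive.

```python
import hashlib


def class_content_id(cls):
    """Hypothetical, shallow content hash of a class definition."""
    digest = hashlib.sha256()
    digest.update(cls.__qualname__.encode())
    for base in cls.__bases__:
        digest.update(base.__qualname__.encode())
    # Sort attribute names so the result does not depend on definition order.
    for name in sorted(vars(cls)):
        code = getattr(vars(cls)[name], "__code__", None)
        if code is None:
            continue  # Assumption: only plain functions contribute to the hash.
        digest.update(name.encode())
        digest.update(code.co_code)
        digest.update(repr(code.co_names).encode())
    return digest.hexdigest()


def make_point():
    # Each call creates a distinct dynamic class with identical contents.
    class Point:
        def norm1(self):
            return abs(self.x) + abs(self.y)

    return Point
```

Under these assumptions, two classes created by separate calls to `make_point()` are different objects but get the same id, whereas random uuids would differ between them.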
cloudpickle generates random uuids to track dynamic classes, and those random uuids are added to outputs. For example, if there is a class serialized by value, the following lines are found with pickletools.dis
This string comes from `class_tracker_id` in cloudpickle and makes binary outputs different even though there is no code change.
Can random ids be replaced with deterministic ids, say a sequential number, for `class_tracker_id`?
This could be part of existing #453