Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dill fails to access global dict on windows #323

Closed
mmckerns opened this issue Jun 30, 2019 · 7 comments
Closed

dill fails to access global dict on windows #323

mmckerns opened this issue Jun 30, 2019 · 7 comments
Labels
Milestone

Comments

@mmckerns
Copy link
Member

Using multiprocess with dill-0.3.0 on windows, it seems that dill fails to access the global dict... however, older versions of dill (e.g. 0.2.x) work as expected. These errors are not seen on non-windows systems. See original reporting here:
See: uqfoundation/multiprocess#61

@mmckerns mmckerns added the bug label Jun 30, 2019
@mmckerns
Copy link
Member Author

Also see: uqfoundation/multiprocess#65 for some possibly related context

@molinav
Copy link
Contributor

molinav commented Apr 24, 2020

I also include a link to this comment in the pathos issue tracker, I think that the NameError that I get in my minimal working example is triggered by this issue.

In my case, the error also appears with dill 0.2.9 on Windows. If I downgrade dill to 0.2.8.2, then the error disappears.

@molinav
Copy link
Contributor

molinav commented Apr 24, 2020

Just for the record, I cloned the repo and tested my minimal working example for all the commits in the master branch between tag dill-0.2.8.2 and tag dill-0.2.9. Starting with commit 3b043d6 the example fails. So the problematic change comes here: 9fe100b7..3b043d67.

@molinav
Copy link
Contributor

molinav commented Apr 25, 2020

Let's copy here the minimal working example that I was mentioning:

# test_example_multiprocess.py


def func1():
    print("Hello world!")


def func2():
    func1()


if __name__ == "__main__":

    from multiprocess import Process
    proc = Process(target=func2)
    proc.start()
    proc.join()
(py37) D:\vic\Desktop>python test_example_multiprocess.py
Process Process-1:
Traceback (most recent call last):
  File "C:\GNU\Anaconda3\envs\py37\lib\site-packages\multiprocess\process.py",
  line 297, in _bootstrap
    self.run()                                                                      
  File "C:\GNU\Anaconda3\envs\py37\lib\site-packages\multiprocess\process.py",
  line 99, in run
    self._target(*self._args, **self._kwargs)
  File "test_example_multiprocess.py", line 9, in func2
    func1()
NameError: name 'func1' is not defined

(py37) D:\vic\Desktop>

The function that takes care of serialising the module variables is save_module_dict. If I print the obj dictionary for the file above when entering save_module_dict, it looks like below:

{
    '__name__': '__main__',
    '__doc__': None,
    '__package__': None,
    '__loader__': <_frozen_importlib_external.SourceFileLoader object at 0x000002BC66F1C108>,
    '__spec__': None,
    '__annotations__': {},
    '__builtins__': <module 'builtins' (built-in)>,
    '__file__': 'test_example_multiprocess.py',
    '__cached__': None,
    'func1': <function func1 at 0x000002BC66F6E288>,
    'func2': <function func2 at 0x000002BC66F6E318>,
    'Process': <class 'multiprocess.context.Process'>,
    'proc': <Process(Process-1, initial)>
}

And this obj dictionary looks the same if run in dill-0.2.8.2 or dill-0.3.1.1. The function func1 that causes the NameError is still there.

Inside save_module_dict, there are 4 ways of serialising this module dictionary (labelled in the code as "D1", "D3", "D4" and "D2"). All of them depend on the boolean returned by the function is_dill applied on the pickler object being used. This function is the one that changed in commit 3b043d67.

So before commit 3b043d67 (the same as implemented in dill-0.2.8.2), it is:

# use to protect against missing attributes
def is_dill(pickler):
    "check the dill-ness of your pickler"
    return 'dill' in pickler.__module__
   #return hasattr(pickler,'_main')

Starting with commit 3b043d67 (the same as implemented in dill-0.3.1.1), it is:

# use to protect against missing attributes
def is_dill(pickler, child=None):
    "check the dill-ness of your pickler"
    if (child is False) or PY34 or (not hasattr(pickler.__class__, 'mro')):
        return 'dill' in pickler.__module__
    return Pickler in pickler.__class__.mro()

In this minimal working example, the pickler object looks like below:

<multiprocess.reduction.ForkingPickler object at 0x000001C83AE07808>

so it is an instance of multiprocess.reduction.ForkingPickler, then pickler.__module__ is the string "multiprocess.reduction". So until dill-0.2.8.2 the evaluation of is_dill(pickler) was returning False and the module variables were saved through the clause "D3". Starting with dill-0.2.9, the evaluation of is_dill(pickler) returns True because it uses the second return statement, then the module dictionary starts going into the clause "D1", and the module dictionary is serialised differently, using 'c__builtin__\n__main__\n' instead of 'c__main__\n__dict__\n'.

@molinav
Copy link
Contributor

molinav commented Apr 25, 2020

I would like to contribute with a pull request but I do not feel sure enough about the patch. I could check that changing the default value of child from None to False does the trick, but I do not know about possible side effects. I was searching where is_dill is used inside the library, and it is always called just with the first argument, so child is always being redirected to None, but I do not have a deep understanding of the insides of dill.

@mmckerns
Copy link
Member Author

mmckerns commented Apr 25, 2020

@molinav: Thanks for your investigations. I had some time to check out the issue, edit the code, and run some tests. You are definitely right -- I looked into it w/o seeing your full thread above, and came to the same conclusion as you. The patch I made is to be explicit with child=True everywhere except in save_module_dict, where I used child=False (so three False, and the rest True). If you make a PR with those changes, I'll accept it. Otherwise, I can patch it. Up to you. Either way, thanks, I appreciate the find. I've tested pathos and multiprocess and dill on Mac, linux, and windows... and it looks like the patch doesn't create any new issues as far as I can tell. The reason the default of child essentially is True was to enable ForkingPickler and other derived classes to better use dill (inside multiprocess, for example).

@mmckerns
Copy link
Member Author

Should be fixed by #363. Closing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants