Function dependencies are not pickled, resulting in a NameError #58

Closed
matthewgdv opened this issue Jun 11, 2019 · 3 comments

matthewgdv commented Jun 11, 2019

So, it seems that unlike the multiprocessing module, multiprocess loses track of module-level names (such as imports) that a function uses but does not define in its own body.

A very simple example that replicates this problem would be:

import multiprocess as mp
import time


def test(seconds=5):
    time.sleep(seconds)
    print(f"Slept for {seconds} second(s)!")


if __name__ == '__main__':
    p = mp.Pool(processes=1)
    p.apply_async(test, (3,)).get()

This will result in: NameError: name 'time' is not defined
The exact same piece of code works if you replace the multiprocess import with a multiprocessing import.

This seems to be the result of the pickling that multiprocess does, but I don't really understand why this has to be the case, since dill (which this library uses) allows for modules to be pickled.

I've also found a resolved issue in the dill repo which specifies how function dependencies can be pickled alongside the function:

uqfoundation/dill#176

Would it not be possible to make an enhancement to use this functionality?

Currently, things are really awkward, since re-declaring your imports within every single function is verbose and confusing: it's easy to forget one somewhere, which leads to errors you later have to track down. It also causes all sorts of problems with linters, which believe you're redeclaring an unused module and cannot warn you when you've forgotten a local import, since they believe the name is already in the namespace.

Any progress on this would be hugely appreciated.

Cheers!

mmckerns (Member) commented Sep 27, 2019

Which version of Python are you using? Which versions of multiprocess and dill? What operating system? Your test code works for me on every combination I've tried. However, there is an issue with the Python 3.8 beta series on macOS, where multiprocessing changed the default context from fork to spawn. When using spawn with the recurse=False setting on dill, you can get effects like the NameError you are seeing, because (I believe) dill doesn't handle the global dict correctly within the context under those settings.
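To make the failure mode concrete, here is a minimal stdlib-only sketch of what a missing global dict looks like. It is an analogy, not dill's actual code path: the function is rebuilt by hand with `types.FunctionType`, and `nap` is a hypothetical stand-in for the reporter's `test` function.

```python
import builtins
import time
import types


def nap(seconds=0):
    # `time` is looked up in the function's globals at call time.
    time.sleep(seconds)
    return f"Slept for {seconds} second(s)!"


# Rebuild the function around a globals dict that lacks `time`, standing in
# for a worker process that received the code but not its module-level names.
stripped = types.FunctionType(
    nap.__code__, {"__builtins__": builtins}, "nap", nap.__defaults__
)

try:
    stripped(0)
except NameError as exc:
    error_message = str(exc)

# Rebuild it again with the referenced global supplied alongside the code,
# which is roughly the effect that shipping the globals with the function has.
restored = types.FunctionType(
    nap.__code__, {"__builtins__": builtins, "time": time}, "nap", nap.__defaults__
)
result = restored(0)
```

The first rebuilt function fails on the `time` lookup, while the second runs normally, because the only difference between them is the contents of the globals dict.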

If you are using Python 3.8 on macOS, you can either update to a version of multiprocess at 7257130 or later, or alternatively set dill.settings['recurse'] = True. See more details in #65.

mmckerns added this to the multiprocess-0.70.10 milestone Jun 16, 2020

mmckerns (Member) commented

Should be fixed by: uqfoundation/dill#323. Hard to tell since there's not much information coming from the reporter. Closing.

anshuchen commented

Interestingly, I had exactly the same issue using the test example provided here.
Windows 10
Python 3.10.8
Multiprocess 0.70.14
Dill 0.3.6

I was able to get around the problem by setting dill.settings['recurse'] = True.
