So, it seems that unlike the multiprocessing module, multiprocess loses track of any external dependency used within a function that isn't declared inside it.
A very simple example that replicates this problem would be:
    import multiprocess as mp
    import time

    def test(seconds=5):
        time.sleep(seconds)
        print(f"Slept for {seconds} second(s)!")

    if __name__ == '__main__':
        p = mp.Pool(processes=1)
        p.apply_async(test, (3,)).get()
This will result in: NameError: name 'time' is not defined
This exact same piece of code works if you replace the multiprocess import with a multiprocessing import.
This seems to be the result of the pickling that multiprocess does, but I don't really understand why this has to be the case, since dill (which this library uses) allows for modules to be pickled.
I've also found a resolved issue in the dill repo which describes how function dependencies can be pickled alongside the function: uqfoundation/dill#176
Would it not be possible to make an enhancement to use this functionality?
Currently, things are really awkward. Re-declaring your imports within every single function is verbose and confusing: it's easy to forget one somewhere, which produces errors you later have to track down. It also causes problems with linters, which think you're redeclaring an unused module and cannot warn you when you've forgotten a local import, since they believe the name is already in the namespace.
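For reference, the workaround being described (repeating the import inside the function) looks roughly like the sketch below. It reuses the test function from the example above. The return value is an addition made here only so the result can be checked; it is not part of the original snippet.

```python
def test(seconds=5):
    # Local import: `time` is bound each time the function runs, so the
    # function no longer depends on a module-level global being available
    # in the worker process.
    import time

    time.sleep(seconds)
    message = f"Slept for {seconds} second(s)!"
    print(message)
    return message  # added for illustration; the original only prints
```

This would be passed to apply_async exactly as before. It avoids the NameError at the cost of the repetition and linter noise described above.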
Any progress on this would be hugely appreciated.
Cheers!
Which version of Python are you using? Which versions of multiprocess and dill? What operating system? Your test code works for me on every combination I've tried. However, there is an issue with the Python 3.8 beta series on macOS, where multiprocessing changed the default context from fork to spawn. When using spawn with the recurse=False setting on dill, you can get effects like the NameError you are seeing, as (I believe) dill doesn't handle the global dict correctly within the context under those settings.
If you are using Python 3.8 on macOS, you can either update to a version of multiprocess at 7257130 or later, or (alternatively) set dill.settings['recurse'] = True. See more details in #65.