Pickling in parallel.py #122

Open
mileslucas opened this issue Dec 31, 2018 · 5 comments

@mileslucas
Member

I'm not sure what the history of parallel.py is, but I tried running it with some data files and found that nothing works as is because of pickling.

As I understand it, the problem is that, for instance, in the initialize function we start a process with target model.brain, but Python cannot pickle bound methods. Bound methods are methods that belong to a class and aren't classmethods; in other words, any method that takes self as one of its arguments.
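
For anyone checking this on their own setup: on Python 3 (assuming 3.4 or later), pickling a bound method pickles the instance it is bound to together with the method name, so whether it succeeds depends on what the instance holds. Below is a minimal sketch, not taken from the Starfish code; `Model`, `order`, and `brain` are hypothetical stand-ins used only for illustration.

```python
import pickle


class Model:
    """Hypothetical stand-in for a model class; holds only plain data."""

    def __init__(self, order):
        self.order = order

    def brain(self):
        return self.order * 2


model = Model(order=21)

# Pickling a bound method serializes the instance it is bound to,
# plus the name of the method to look up again on unpickling.
payload = pickle.dumps(model.brain)
restored = pickle.loads(payload)

# The restored method is bound to a copy of the original instance,
# not to the original object itself.
print(restored())
print(restored.__self__ is model)
```

If the instance carried something pickle cannot serialize (a lock, an open file, a socket), the same `pickle.dumps` call would fail on that attribute rather than on the method.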

I was easily able to edit the code from scripts/star.py to avoid this issue, but I'm curious if/how this has worked before, since this pickle problem has existed for as long as I've used Python.

@mileslucas mileslucas added the bug label Dec 31, 2018
@iancze
Collaborator

iancze commented Dec 31, 2018

Hrm, that's strange. The reason why parallel.py and star.py looked the way they did was to avoid the pickling issue. I looked through the commit history and nothing stood out as an obvious cause for it to stop working. It sounds like you have it working now, but if you post some more error messages maybe I can think of something else that might be causing it.

Granted, the confusing nature of parallel.py was something I wasn't very happy about in the long term, since as we are finding out, it's pretty brittle code. The main reason we had it was to enable the nested Gibbs sampling with multiple echelle orders running simultaneously.

@mileslucas
Member Author

I will try a minimal example using a VM to see if I can recreate my error.

@mileslucas
Member Author

I cannot reproduce the error on an Ubuntu VM (using Docker). I will investigate further.

@iancze
Collaborator

iancze commented Dec 31, 2018

Ok that's pretty strange. I've done all my developing and testing on Arch Linux, if that helps as a point of reference.

@mileslucas
Member Author

mileslucas commented Jan 1, 2019

So on Windows, if I call

star.py --optimize=Theta

I get

TypeError: can't pickle _thread.RLock objects
.
.
.
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

when calling Process.start() from within parallel.initialize()

When I run the exact same code mounted on an Ubuntu VM, I have no problems.
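
The platform difference is consistent with how multiprocessing starts children: on Windows it uses the "spawn" start method, which pickles the Process object (including its target and arguments) to send it to the child, while Linux has traditionally defaulted to "fork", which copies the parent's memory and skips that pickling step. Here is a rough sketch of a pre-flight check that emulates the pickling step without starting any process. It is an illustration under assumptions, not the Starfish code: `Model`, `brain_from_order`, and `survives_spawn` are hypothetical names, and the RLock attribute only stands in for whatever unpicklable object the real model holds.

```python
import threading
from multiprocessing.reduction import ForkingPickler


class Model:
    """Hypothetical stand-in: a model that holds a lock, as objects that
    wrap loggers, file handles, or connections often do."""

    def __init__(self, order):
        self.order = order
        self.lock = threading.RLock()

    def brain(self):
        return self.order


def brain_from_order(order):
    """Module-level target: builds the model inside the child process,
    so only the plain `order` value has to cross the process boundary."""
    return Model(order).brain()


def survives_spawn(target, args=()):
    """Return True if (target, args) can be pickled with the pickler
    that multiprocessing uses when it spawns a child."""
    try:
        ForkingPickler.dumps((target, args))
        return True
    except Exception:
        return False


# A bound method drags its instance (and the RLock) into the pickle.
print(survives_spawn(Model(3).brain))
# A module-level function plus an int contains nothing unpicklable.
print(survives_spawn(brain_from_order, (3,)))
```

The design choice sketched here is to pass a module-level function and plain data to the process and construct the heavy object inside the child. Under "spawn", the script that starts the processes also needs the `if __name__ == '__main__':` guard quoted in the error message above.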

Edit: I've opened a question on Stack Overflow since this seems like a platform issue. As we rewrite parallel.py we will definitely need to discuss multiprocessing. For our base use-case, though, we can defer the issue.
