Nasty segfault when running experiments just from run.py #8
Comments
So does the original run.py work (without the changes)? |
yes, the problem only happens if you try running two experiments from one |
Here's a full log:
|
gdb bt |
I see. In our test setup, each run.py invocation is for only one test (one dot in the figure). It was not written to run multiple times. I assume there will be a lot of bugs if you try to run it multiple times within the same python process, and the amount of work to fix those would be huge... So my suggestion is to simply call run.py as a subprocess (adding enough sleep time in between to make sure it dies), not as a function call. |
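A minimal sketch of that suggestion, assuming each experiment can be selected through command-line arguments. The helper name run_experiments, the argument lists, and the pause length are hypothetical and not part of Janus.

```python
import subprocess
import sys
import time


def run_experiments(script, arg_sets, pause_s=5.0):
    """Run `script` once per argument list, each in a fresh Python process."""
    return_codes = []
    for i, args in enumerate(arg_sets):
        # subprocess.run blocks until the child interpreter has exited, so no
        # native (C/C++) state can carry over from one experiment to the next.
        done = subprocess.run([sys.executable, script, *args])
        return_codes.append(done.returncode)
        if i < len(arg_sets) - 1:
            # Grace period so that anything the run left behind (helper
            # processes, listening ports) can go away before the next run.
            time.sleep(pause_s)
    return return_codes
```

Because subprocess.run already waits for the child to exit, the sleep only matters for resources the child started outside its own process. A negative return code on POSIX means the child was killed by a signal, so a segfault shows up as -11 instead of taking down the caller.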
ah ok, I suspected that might be the case. thanks for looking at it |
my hope was that I could minimize the amount of data serialization that needed to be done, and keep the data as python dictionaries as long as possible |
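For what it's worth, the serialization cost of the subprocess route can stay small if each run exchanges one JSON document with the caller. The sketch below assumes a hypothetical protocol in which the script reads its config as JSON from stdin and prints only a JSON result to stdout; the existing run.py does not do this, and run_and_collect is a made-up name.

```python
import json
import subprocess
import sys


def run_and_collect(script, config):
    """Send `config` (a dict) to `script` on stdin; return the dict it prints."""
    done = subprocess.run(
        [sys.executable, script],
        input=json.dumps(config),  # dict -> JSON text on the child's stdin
        capture_output=True,       # collect the child's stdout and stderr
        text=True,                 # exchange str instead of bytes
        check=True,                # raise CalledProcessError on a crash
    )
    # Assumes the child wrote nothing but one JSON document to stdout.
    return json.loads(done.stdout)
```

The data is a plain dictionary on both sides, and a crash in the child surfaces as a CalledProcessError in the caller.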
I agree with Shuai that it may not be worth the effort, but another route to take could be to force the simplerpc module to re-initialize itself by shutting down, removing the module, and importing again. However, I have not tried this method. |
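A rough sketch of the "remove the module and import again" step, with a hypothetical helper name fresh_import. It does not cover whatever shutdown simplerpc would need first. One caveat: CPython does not unload an extension's shared library, so static state inside a compiled module such as _pyrpc would most likely survive this, and the approach is only reliable for Python-level module state.

```python
import importlib
import sys


def fresh_import(name):
    """Drop `name` and its submodules from sys.modules, then import it again."""
    for loaded in list(sys.modules):
        if loaded == name or loaded.startswith(name + "."):
            del sys.modules[loaded]
    importlib.invalidate_caches()
    # With the cache entry gone, the module's top-level code runs again
    # and a new module object is returned.
    return importlib.import_module(name)
```

Any references to the old module object held elsewhere keep pointing at the old state, so callers would need to use the returned module from then on.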
https://github.com/DSEF/janus. Here is my updated code for janus. I've just ported things to python3, which involved pretty minimal changes. I needed to change the cpp python module stuff, update the build system, and change ps.py. Running ./dsef/run.py will run two experiments. If you take a look at that code, you will see it's basically the original run.py but reorganized a bit, and I added some RPyC stuff to make it easier to call from the DSEF server and send data back and forth. The current version has all the RPyC server code commented out to try and isolate this segmentation fault.
I believe that the segmentation fault comes from something in _pyrpc not resetting state properly between runs. Before, this wasn't a problem because the python instance was completely exiting between each experiment.
I would greatly appreciate any insight into what might be causing this.
P.S. This is built from an old version of Janus, commit f45fd04