Is PyCall thread safe? #96
Comments
No, the current pycall isn't thread-safe on either the Ruby or the Python side.
Does this mean that if I queue, say, a lot of background jobs in Ruby, which can obviously run concurrently, those jobs cannot call
@ziaulrehman40 No, they can't call Python via pycall at the same time.
Well, that's a deal breaker for my use case. Most workloads are web-related in today's world and require concurrency. I wonder what makes it fail in those scenarios, and whether there is any way the community can help fix it. For my use case, I had to use the https://github.com/camelot-dev/camelot Python library in my Ruby code; we have now decided to just use its CLI option and call that CLI with Thanks for the great effort in PyCall though, and I really hope we can enjoy this in concurrent scenarios soon.
Any news on this topic? We want to call Python libraries within a background job (Sidekiq + Rails).
Is there any info on where the pycall gaps are on this so that others can contribute? Like @simonfranzen, we need to use the same stack.
@simonfranzen you could consider Rails + Resque, which uses a separate process for each job.
It definitely feels like this should be thread-safe, given how common threaded usage is these days, with web servers and background jobs. But if it's not thread-safe, it also feels like the main README should have a big warning about that. We've spent a lot of time trying to figure out why our application is segfaulting! Has anyone worked around this by wrapping pycall usage in a semaphore? That's what we are trying now. We are also calling this from Sidekiq background jobs.
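For readers wondering what "wrapping pycall usage in a semaphore" might look like, here is a minimal sketch using a process-wide `Mutex`. The names `PYTHON_LOCK` and `with_python` are hypothetical and not part of PyCall, and a counter stands in for the Python call so the example runs without Python installed. A lock like this only serializes calls. It does not pin them to a single thread, so it may not be enough if PyCall also cares which thread is calling.

```ruby
# Hypothetical helper names; nothing here is part of the PyCall API.
PYTHON_LOCK = Mutex.new

# Run the given block while holding the global lock, so at most one
# thread at a time executes code wrapped in with_python.
def with_python
  PYTHON_LOCK.synchronize { yield }
end

# Stand-in for PyCall work: several threads update shared state.
counter = 0
threads = 8.times.map do
  Thread.new do
    1000.times { with_python { counter += 1 } }
  end
end
threads.each(&:join)
puts counter
```

In an application, every call into Python would have to go through `with_python`; a single unguarded call defeats the lock.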
@jeremyhaile I would accept your pull request if you can make pycall thread-safe, as long as it doesn't introduce any overhead for single-threaded applications. By the way, I'm currently working on streamlit-julia-call at my job. In that project I've succeeded in bridging a multithreaded Python application and Julia, and I believe we can employ a similar approach here. When I have more time to tackle this issue again, I want to try it. Unfortunately, I am currently very busy and cannot dedicate time to this project, so I hope someone eager to resolve this issue quickly can take over for me.
Note that if you are using Puma or Rails, even if you set thread=1 it may not be safe to use PyCall from a web request handler, because the request thread is STILL a different thread from the main thread. If you have thread>1, it is always unsafe to use PyCall from a web request handler.
Related but different issue: even if you use PyCall from only one thread, if that thread is not the main thread, the process will not exit when the main thread exits (even though the side thread exited): #186
Workaround for PyCall safety in Rails / Puma
I made a helper gem called pycall_thread (rubygems) that helps work around PyCall's lack of thread-safety. You use it like:
All it does is initialize PyCall on an inner thread, and pass blocks to the inner thread for execution. This keeps PyCall happy and thread-safe, and lets you use it without too much issue from Rails or Puma. It has a few guard-rails:
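The inner-thread idea described above (one dedicated thread owns the interpreter, and other threads hand it blocks to run) can be sketched roughly as follows. This is an illustration of the general pattern only: `SingleThreadExecutor` and its methods are hypothetical names, not the pycall_thread API, and plain Ruby blocks stand in for PyCall calls so the sketch runs without Python.

```ruby
# Hypothetical sketch; not the pycall_thread gem's API and not part of PyCall.
class SingleThreadExecutor
  def initialize
    @jobs = Queue.new
    @worker = Thread.new do
      # In a real setup, `require "pycall"` and every Python call would
      # happen only on this thread.
      while (job = @jobs.pop) # a nil job means "shut down"
        block, reply = job
        begin
          reply << [:ok, block.call]
        rescue StandardError => e
          reply << [:error, e]
        end
      end
    end
  end

  # Run the block on the worker thread and wait for its result.
  def run(&block)
    raise ArgumentError, "block required" unless block
    # A nested call made from the worker itself runs inline to avoid deadlock.
    return block.call if Thread.current == @worker

    reply = Queue.new
    @jobs << [block, reply]
    status, value = reply.pop
    raise value if status == :error
    value
  end

  def shutdown
    @jobs << nil
    @worker.join
  end
end

executor = SingleThreadExecutor.new
results = 5.times.map { |i| Thread.new { executor.run { i * 2 } } }.map(&:value)
p results
executor.shutdown
```

Every caller blocks until the worker finishes its job, so work submitted from different request threads runs one job at a time.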
@mrkn would you be open to a thread-safety helper like this being included in PyCall.rb? I would be happy to make a PR that adds this, for example as PyCall::Thread, with any modifications you might suggest. Let me know if this would be helpful.
@snickell Is the queue-based approach what you really need? In this approach, each pycall-dependent thread is blocked by the others. If you want to use pycall from Puma threads, you need to use process-based multitasking to call Python via PyCall to avoid such thread blocking.
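As a rough illustration of process-based isolation, the sketch below runs a block in a forked child and sends the result back over a pipe. `run_in_child` is a hypothetical helper name, not part of PyCall. The sketch assumes a platform where `fork` is available (so not Windows) and a result that `Marshal` can serialize. A plain Ruby expression stands in for the Python call.

```ruby
# Hypothetical helper; not part of PyCall. Requires a platform with fork.
def run_in_child
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    # In a real setup, `require "pycall"` and the Python call would happen
    # here, so the interpreter lives only in this single-threaded child.
    writer.write(Marshal.dump(yield))
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  writer.close
  payload = reader.read # read until EOF before waiting on the child
  reader.close
  _, status = Process.wait2(pid)
  raise "child process failed" unless status.success?
  Marshal.load(payload)
end

puts run_in_child { 21 * 2 }
```

Forking for every call is expensive, which is why job systems that use one process per job are mentioned earlier in the thread as an alternative.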
@mrkn, thank you for your thoughts. Please let me know if I misunderstood. I will be using this on a large existing Ruby on Rails deployment (code.org / https://github.com/code-dot-org/code-dot-org) with more than 1000 Puma processes (and 5 threads per process). I wish it were only 1 thread per process, but because it is a large application with a 10-year history, I cannot change this configuration easily. Because our multitasking is mostly process-based (~1000 processes in parallel), it is acceptable (but not perfect) that Python threads will block each other inside each process. Our Python code will mostly be CPU-heavy, not IO-heavy. The GIL will block CPU-heavy Python threads, so I believe the difference will be small? I like perfect 😹: is there a better solution? I would like to understand if there is.
@snickell, could you please release these features as an external gem? I don't want to officially support multi-threading in PyCall. CPython developers are planning to drop the GIL in CPython. In the near future, we will not be able to handle multi-threading by the pair of
I explicitly described multi-threading in the README at 60c6656. I appreciate your cooperation.
It seems that when we do anything with PyCall on multiple threads, we get segmentation faults.
Happy to provide more context; I just wanted to check first whether this was designed to be used in a multi-threaded environment.