Speed up cv2-based motion function with multiprocessing #213
Comments
Although Numba could potentially add some speed improvements, I think it might not solve the multicore part of the issue; it would rather speed up the work that happens on a single core. A bit of research hinted that OpenCV does not always cooperate with Numba in obvious ways. So I put Numba aside (for now) and went ahead to implement the scalable motion function using |
- mg_motion_mp will now produce _exactly_ the same results with any number of processes (tested from 1 to 12).
- added num_processes parameter #213
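One way to get identical output for any number of processes is to split the frame range into chunks that overlap by one frame, so no frame-to-frame difference is lost at a chunk boundary. The sketch below illustrates that idea only. It is not the actual `mg_motion_mp` code: it uses plain NumPy frame differencing instead of the OpenCV pipeline, and the names `split_with_overlap`, `motion_chunk` and `motion_parallel` are hypothetical.

```python
from multiprocessing import Pool

import numpy as np


def split_with_overlap(num_frames, num_chunks):
    """Return (start, stop) frame ranges covering all consecutive-frame pairs.

    Every chunk starts one frame before its first difference, so the pair
    that straddles a chunk boundary is computed exactly once.
    """
    edges = np.linspace(1, num_frames, num_chunks + 1).astype(int)
    return [(int(a) - 1, int(b)) for a, b in zip(edges[:-1], edges[1:]) if b > a]


def motion_chunk(frames):
    """Absolute difference between consecutive frames (stand-in for the real motion step)."""
    signed = np.diff(frames.astype(np.int16), axis=0)  # avoid uint8 wrap-around
    return np.abs(signed).astype(np.uint8)


def motion_parallel(frames, num_processes=1):
    """Compute motion frames chunk by chunk; needs at least two input frames."""
    chunks = [frames[a:b] for a, b in split_with_overlap(len(frames), num_processes)]
    if num_processes == 1:
        parts = [motion_chunk(chunk) for chunk in chunks]
    else:
        with Pool(num_processes) as pool:
            parts = pool.map(motion_chunk, chunks)  # map preserves chunk order
    return np.concatenate(parts, axis=0)
```

On Windows, where new processes are spawned rather than forked, a call such as `motion_parallel(frames, num_processes=4)` has to sit under an `if __name__ == "__main__":` guard. Sending whole chunks through `pool.map` also copies the frame data to each worker, which is one possible source of overhead as the process count grows.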
OK. Multicore version of |
A quick (single-shot) benchmarking attempt on my 6-core 12-thread laptop:
It is a bit curious that the performance dropped with the maximum number of cores available in the end; maybe it is just a measurement error. However, it is also clear that (at least on Windows) spawning more and more processes leads to diminishing returns. The improvement is enormous going from a single core to two cores. It is interesting that the leap from 1 core to 2 cores is bigger than the improvement from 2 cores to 12 cores.
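The observation that 1 to 2 cores helps more than 2 to 12 is what a simple Amdahl's-law model predicts for absolute run time, even before process start-up overhead is considered. The sketch below uses an assumed parallel fraction of 0.9 purely for illustration. It is not a measurement from this benchmark, and `amdahl_time` is a hypothetical helper.

```python
def amdahl_time(parallel_fraction, num_processes):
    """Run time relative to the single-process run under Amdahl's law."""
    return (1.0 - parallel_fraction) + parallel_fraction / num_processes


p = 0.9  # assumed share of the work that parallelizes; not measured here

saved_1_to_2 = amdahl_time(p, 1) - amdahl_time(p, 2)    # p * (1 - 1/2)    = p / 2
saved_2_to_12 = amdahl_time(p, 2) - amdahl_time(p, 12)  # p * (1/2 - 1/12) = 5p / 12

print(f"time saved going from 1 to 2 processes:  {saved_1_to_2:.3f}")
print(f"time saved going from 2 to 12 processes: {saved_2_to_12:.3f}")
```

In this model the first step saves `p / 2` of the original run time and the remaining ten processes together save only `5p / 12`, whatever the value of `p`.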
The motion function (technically a method) is implemented in OpenCV (there is also an FFmpeg-based implementation in `_utils.py`, which however produces slightly different results), and since it does a lot of matrix operations in one big loop, it basically maxes out one core of the CPU. I recently started to study the Numba library, and I think this situation is well suited to it. It even supports CUDA, which could also be an item on our enhancement list, but for now I would be happy to see improved speeds with only the CPU. With most of the functions now based on FFmpeg, the slowness of the motion function sticks out a bit too much (especially considering that it is one of the most-used functions).

One thing that could simplify the implementation is that we already work mostly with NumPy arrays in the motion function, so there probably won't be too many changes necessary.
A +1: since librosa has Numba as a dependency, we wouldn't extend our dependencies by using it.