Speed up cv2-based motion function with multiprocessing #213

balintlaczko · 2021-04-08T09:19:00Z

The motion function (technically a method) is implemented with OpenCV (there is also an FFmpeg-based implementation in _utils.py, but it produces slightly different results). Since it does a lot of matrix operations in one big loop, it basically maxes out one CPU core. I recently started studying the Numba library, and I think it is well suited to this situation. It even supports CUDA, which could also be an item on our enhancement list, but for now I would be happy to see improved speeds on the CPU alone. With most of the functions now based on FFmpeg, the slowness of the motion function sticks out a bit too much (especially considering that it is one of the most-used functions).
One thing that could simplify the implementation is that we already work mostly with NumPy arrays in the motion function, so probably not too many changes will be necessary.
A +1: since librosa has Numba as a dependency, using it wouldn't extend our dependencies.

balintlaczko · 2021-06-26T19:04:57Z

Although Numba could potentially add some speed improvements, I think it might not solve the multicore part of the issue; it would rather speed up the work that happens on a single core. A bit of research also hinted that OpenCV does not always cooperate with Numba in obvious ways. So I put Numba aside (for now) and went ahead and implemented the motion function using multiprocessing. This will be much more scalable, since it can use all the available cores on the system (hopefully great for VDI). It is currently implemented as a separate method, but after successful platform testing I'll make multiprocessing (and then the number of cores to use) parameters.
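To illustrate the general idea (this is not the actual mg_motion_mp code), here is a minimal, dependency-free sketch of chunked frame differencing. All names (`split_ranges`, `motion_chunk`, `motion`, `motion_mp_sketch`) are hypothetical, frames are plain `bytes` objects standing in for NumPy arrays, and the "quantity of motion" is simply assumed to be the sum of absolute pixel differences between consecutive frames.

```python
from multiprocessing import Pool


def split_ranges(num_frames, num_processes):
    """Split the frame-difference indices 1..num_frames-1 into contiguous
    (start, stop) ranges, at most one range per process."""
    num_diffs = max(num_frames - 1, 0)
    base, extra = divmod(num_diffs, num_processes)
    ranges, start = [], 1
    for i in range(num_processes):
        size = base + (1 if i < extra else 0)
        if size:
            ranges.append((start, start + size))
            start += size
    return ranges


def motion_chunk(frames):
    """Quantity of motion for each consecutive pair of frames in a chunk.

    Each frame is a bytes object of grayscale pixel values (an assumed
    stand-in for a NumPy array); the result has len(frames) - 1 entries.
    """
    return [
        sum(abs(a - b) for a, b in zip(prev, cur))
        for prev, cur in zip(frames, frames[1:])
    ]


def motion(frames, num_processes=1, map_fn=map):
    """Compute motion chunk by chunk and stitch the results together.

    Every chunk starts one frame early, so the difference across a chunk
    boundary is computed exactly as in the single-process case.
    """
    chunks = [frames[start - 1:stop]
              for start, stop in split_ranges(len(frames), num_processes)]
    return [q for part in map_fn(motion_chunk, chunks) for q in part]


def motion_mp_sketch(frames, num_processes):
    """Same as motion(), but each chunk runs in its own worker process."""
    with Pool(num_processes) as pool:
        return motion(frames, num_processes, map_fn=pool.map)
```

The one-frame overlap between neighbouring chunks is what keeps the output independent of the number of processes. On platforms that spawn rather than fork worker processes (Windows, for example), `motion_mp_sketch` should be called from inside an `if __name__ == "__main__":` guard.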


balintlaczko · 2021-06-28T20:33:17Z

OK. The multicore version of motion is now thoroughly tested: it produces identical results regardless of the number of processes (checked the CSV line by line, the motiongrams pixel by pixel, and the videos frame by frame). Tested on Ubuntu, where it checks out (after the bugfixes). I still need to check on macOS and Colab before moving to the next step, which will be fully integrating it into the default mg_motion.
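The line-by-line CSV check could be done with something like the sketch below. The helper name `first_difference` and the file names in the usage note are hypothetical, and this is not necessarily how the comparison was actually carried out.

```python
from itertools import zip_longest


def first_difference(path_a, path_b):
    """Return the 1-based number of the first line that differs between
    two text files, or None if the files are identical."""
    with open(path_a, newline="") as fa, open(path_b, newline="") as fb:
        # zip_longest pads the shorter file with None, so a missing
        # line also counts as a difference.
        for lineno, (a, b) in enumerate(zip_longest(fa, fb), start=1):
            if a != b:
                return lineno
    return None
```

For example, `first_difference("motion_1proc.csv", "motion_12proc.csv")` returning `None` would mean the two outputs match exactly.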

balintlaczko · 2021-06-28T21:09:50Z

A quick (single-shot) benchmarking attempt on my 6-core 12-thread laptop:

With 2 cores it is 1.832377 times faster.
With 3 cores it is 2.367094 times faster.
With 4 cores it is 2.751237 times faster.
With 5 cores it is 3.021663 times faster.
With 6 cores it is 3.061283 times faster.
With 7 cores it is 3.138544 times faster.
With 8 cores it is 3.218801 times faster.
With 9 cores it is 3.183287 times faster.
With 10 cores it is 3.219616 times faster.
With 11 cores it is 3.270581 times faster.
With 12 cores it is 3.157296 times faster.
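Since each figure above comes from a single run, a repeated measurement would give more trustworthy numbers. A minimal sketch, assuming the workload can be wrapped in a zero-argument callable (`median_runtime` and `speedup` are hypothetical helper names):

```python
import statistics
import time


def median_runtime(fn, repeats=5):
    """Run fn() `repeats` times and return the median wall-clock time
    in seconds; the median is less sensitive to outliers than the mean."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def speedup(baseline_seconds, seconds):
    """How many times faster a run is compared to the baseline run."""
    return baseline_seconds / seconds
```

The speedup for a given process count would then be `speedup(t1, tn)`, where `t1` and `tn` are the median runtimes with one process and with n processes.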

It is a bit curious that performance dropped at the maximum of 12, but with a single-shot measurement that may just be measurement error. It is also clear that (at least on Windows) spawning more and more processes leads to diminishing returns: beyond the 6 physical cores the gains are marginal. The improvement is largest going from one core to two. Interestingly, that step alone (about 1.83x) is bigger than the whole improvement from 2 to 12 cores (about 1.72x, from 1.83x to 3.16x).
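The diminishing returns can be put into numbers with a short calculation on the figures listed above. The sketch assumes the listed speedups are relative to a single process, so the entry for 1 is 1.0 by definition; the function names are hypothetical.

```python
# Speedups copied from the benchmark; the entry for 1 process is the
# assumed baseline (1.0 by definition), not a measurement.
SPEEDUP = {
    1: 1.0, 2: 1.832377, 3: 2.367094, 4: 2.751237, 5: 3.021663,
    6: 3.061283, 7: 3.138544, 8: 3.218801, 9: 3.183287,
    10: 3.219616, 11: 3.270581, 12: 3.157296,
}


def efficiency(speedups):
    """Speedup divided by process count; 1.0 would be perfect scaling."""
    return {n: s / n for n, s in speedups.items()}


def relative_gain(speedups, n_from, n_to):
    """Factor by which runtime improves going from n_from to n_to processes."""
    return speedups[n_to] / speedups[n_from]


# Process count with the highest measured speedup.
best = max(SPEEDUP, key=SPEEDUP.get)

if __name__ == "__main__":
    for n, e in efficiency(SPEEDUP).items():
        print(f"{n:2d} processes: efficiency {e:.2f}")
    print("1 -> 2 gain: ", round(relative_gain(SPEEDUP, 1, 2), 2))
    print("2 -> 12 gain:", round(relative_gain(SPEEDUP, 2, 12), 2))
    print("best process count:", best)
```

Per-process efficiency falls steadily as processes are added, which is the same observation expressed as a ratio.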

balintlaczko added the enhancement New feature or request label Apr 8, 2021

balintlaczko self-assigned this Apr 8, 2021

balintlaczko changed the title ~~Speed up cv2-based motion function with Numba~~ Speed up cv2-based motion function with multiprocessing Jun 26, 2021

balintlaczko added a commit that referenced this issue Jun 26, 2021

added mg_motion_mp, experimental for the time being

32c4dee

#213

balintlaczko added a commit that referenced this issue Jun 28, 2021

fixed minor bugs when using different numbers of processes

b9089ff

- mg_motion_mp will now produce _exactly_ the same results with any number of processes (tested from 1 to 12).
- added num_processes parameter #213

alexarje added the video label Dec 15, 2021
