
thread process_stack(img, ref) #35

Open
rg314 opened this issue Apr 20, 2021 · 6 comments
Assignees
Labels
enhancement New feature or request

Comments

@rg314
Owner

rg314 commented Apr 20, 2021

Multithread the _process_stack method and add a num_workers arg, i.e. process_stack(img, ref, num_workers=x). Note that the log file (either h5 or defaultdict) will need to have a lock.

The part that needs threading is

#pytraction/core.py
    def _process_stack(self, img_stack, ref_stack, bead_channel=0, roi=False, frame=[], crop=False):
            ...
            file  # log file (h5 or defaultdict)
            for frame in range(nframes):
                res = do_somestuff()  # per-frame PIV / traction work
                file.add(res)

becomes

#pytraction/core.py
    def _process_stack(self, img_stack, ref_stack, bead_channel=0, roi=False, frame=[], crop=False):
            ...
            file = p.map(do_somestuff, range(nframes))
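A rough, self-contained sketch of what the num_workers idea could look like with multiprocessing.Pool (process_frame, nframes and process_stack_parallel below are placeholders, not the actual pytraction API); because Pool.map hands results back to the parent process, the log dict is only ever written there:

from multiprocessing import Pool

def process_frame(frame):
    # placeholder for the per-frame PIV / traction computation
    return {"frame": frame, "result": frame * 2}

def process_stack_parallel(nframes, num_workers=4):
    log = {}
    with Pool(processes=num_workers) as p:
        # results come back to the parent process, so the log dict
        # is only written here and doesn't need a lock in this pattern
        for res in p.map(process_frame, range(nframes)):
            log[res["frame"]] = res["result"]
    return log

if __name__ == "__main__":
    print(process_stack_parallel(nframes=8, num_workers=2))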

The following script should then be considerably faster.

from pytraction.core import TractionForce
from pytraction.utils import plot
import matplotlib.pyplot as plt

pix_per_mu = 1.3 # The number of pixels per micron 
E = 100 # Young's modulus in Pa

img_path = 'data/example1/e01_pos1_axon1.tif'
ref_path = 'data/example1/e01_pos1_axon1_ref.tif'

traction_obj = TractionForce(pix_per_mu, E=E)
img, ref, _ = traction_obj.load_data(img_path, ref_path)
log = traction_obj.process_stack(img, ref)

plot(log, frame=0)
plt.show()
@rg314 rg314 added the enhancement New feature or request label Apr 20, 2021
@rg314
Owner Author

rg314 commented Apr 20, 2021

@Phlair create a new branch as iss35 :)

@Phlair
Collaborator

Phlair commented Apr 20, 2021

Rewrote _process_stack() in a multithreaded manner using concurrent.futures; it looks like minimal (possibly zero) gains on runtime. Looking at the resource monitor during testing, my CPU is maxing out at ~13% regardless of whether it's being threaded or not (this number makes sense as 1 core out of 8). This makes me think there are possible performance gains to be had with multiprocessing instead... I'll look into that next; it will need slightly different thinking for the implementation.
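For context, a standalone toy comparison (not pytraction code) of why a thread pool stalls at roughly one core for CPU-bound work while a process pool should scale; heavy_frame_work just burns CPU:

import concurrent.futures

def heavy_frame_work(frame):
    # stand-in for a CPU-bound per-frame computation
    total = 0
    for i in range(2_000_000):
        total += (i * frame) % 7
    return frame, total

def run(executor_cls, nframes=8):
    results = {}
    with executor_cls(max_workers=4) as ex:
        futures = [ex.submit(heavy_frame_work, f) for f in range(nframes)]
        for future in concurrent.futures.as_completed(futures):
            frame, value = future.result()
            results[frame] = value
    return results

if __name__ == "__main__":
    # ThreadPoolExecutor stays on ~1 core because of the GIL;
    # ProcessPoolExecutor spreads the same work across cores
    run(concurrent.futures.ThreadPoolExecutor)
    run(concurrent.futures.ProcessPoolExecutor)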

@rg314
Owner Author

rg314 commented Apr 21, 2021

I've always tried the multiprocessing lib and it never works!

# we're yielding each complete frame (futures) back in the main process here so we're threadsafe
for future in concurrent.futures.as_completed(futures):
    # this is ugly but should be performant
    frame_dict = future.result()

How much harder is it to write directly to a locked dict? Or is the cost-benefit not worth it?
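For comparison, writing straight into a shared, locked dict from worker processes would look roughly like this (a sketch only; worker and the frame range are placeholders); the dict and the lock both have to come from a multiprocessing.Manager so the workers can see them:

from functools import partial
from multiprocessing import Manager, Pool

def worker(frame, shared_log, lock):
    res = frame * 2  # placeholder per-frame result
    with lock:
        shared_log[frame] = res

if __name__ == "__main__":
    with Manager() as manager:
        shared_log = manager.dict()
        lock = manager.Lock()
        with Pool(4) as p:
            p.map(partial(worker, shared_log=shared_log, lock=lock), range(8))
        print(dict(shared_log))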

@Phlair
Collaborator

Phlair commented Apr 22, 2021

Managed to partially implement multiprocessing and seeing expected performance gains 👍
Couple of things:

  • seeing much more verbose output from the processing when run in processes... no idea why that is? Looks like it's coming from PIV.
  • in order for the multiprocessing to execute successfully, the high-level code itself, as in this code:

    traction_obj = TractionForce(pix_per_mu, E=E)
    img, ref, _ = traction_obj.load_data(img_path, ref_path)
    log = traction_obj.process_stack(img, ref)

    must be run inside an if __name__ == "__main__": block. This has implications for how someone would use the library (see the sketch after this list).
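i.e. the user-facing script from the first comment would need to look something like this (same placeholder paths and constants as above):

from pytraction.core import TractionForce

if __name__ == "__main__":  # child processes re-import this module, so the guard is required
    pix_per_mu = 1.3
    E = 100
    traction_obj = TractionForce(pix_per_mu, E=E)
    img, ref, _ = traction_obj.load_data('data/example1/e01_pos1_axon1.tif',
                                         'data/example1/e01_pos1_axon1_ref.tif')
    log = traction_obj.process_stack(img, ref)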

@Phlair
Collaborator

Phlair commented Apr 22, 2021

Also, in response to your message: generally I would avoid locking unless you really need to lock. In this case I'm only populating that defaultdict ('log') back in the main thread/process with the responses from the processes as they complete, so there's no need to lock anything.

@rg314
Owner Author

rg314 commented Apr 23, 2021

Thanks for implementing this. The verbose output is coming from the PIV. I've been using a hacky workaround to suppress this output and need to look into a logging lib to handle it.
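For what it's worth, one possible cleaner way to swallow whatever the PIV step prints (a sketch, not the current workaround):

import contextlib
import io

def quiet(fn, *args, **kwargs):
    # run fn but discard anything it prints to stdout
    with contextlib.redirect_stdout(io.StringIO()):
        return fn(*args, **kwargs)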

Thanks for adding the part on multiprocessing :) The if __name__ == "__main__" part is super annoying for non-expert Python users. Do you think it's possible to call a script elsewhere with argparse so this is hidden from the user, i.e.

#core.py
def process_stack(pool=True):
    ...
    if pool:
        tmp = tempfile.NamedTemporaryFile()
        os.system('python pool_process_stack.py --tmp_dir --args')
        with open(tmp.name) as f:
            results = f.read()
    ...
#pool_process_stack.py

from functools import partial
from core import process_stack
import argparse

if __name__ == "__main__":
    args = argparse.ArgumentParser().parse_args()
    pfunc = partial(func, args)   # func = the per-frame work
    res = p.map(pfunc, items)     # p = a multiprocessing pool
    res.save(tmpfile)
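A slightly fleshed-out version of that idea, just to show it can run end to end; the file layout, the flags and do_somestuff are all hypothetical, and results go through a pickled temp file instead of f.read():

# core.py (sketch)
import os
import pickle
import subprocess
import sys
import tempfile

def process_stack(nframes, pool=True):
    if pool:
        # hand the heavy loop to a helper script so the __main__ guard
        # lives there instead of in the user's code
        with tempfile.NamedTemporaryFile(suffix=".pkl", delete=False) as tmp:
            tmp_path = tmp.name
        subprocess.run([sys.executable, "pool_process_stack.py",
                        "--out", tmp_path, "--nframes", str(nframes)], check=True)
        with open(tmp_path, "rb") as f:
            results = pickle.load(f)
        os.remove(tmp_path)
        return results

# pool_process_stack.py (sketch): the guard is hidden from the library user
import argparse
import pickle
from multiprocessing import Pool

def do_somestuff(frame):
    return frame * 2  # placeholder per-frame work

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--out", required=True)
    parser.add_argument("--nframes", type=int, required=True)
    args = parser.parse_args()
    with Pool() as p:
        res = p.map(do_somestuff, range(args.nframes))
    with open(args.out, "wb") as f:
        pickle.dump(res, f)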
