Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Kill Zombies from the Fractal Dimension #638

Merged
merged 2 commits into from
Aug 13, 2021
Merged

Conversation

mdoube
Copy link
Contributor

@mdoube mdoube commented Aug 13, 2021

Fixes #637
ExecutorService thread pools were being created but not cleanly shutdown(). Now they are properly terminated. Happy Friday 13th!

mdoube added 2 commits June 8, 2021 15:01
It still seems like instead of there being nProcessors number of
threads, about 300 threads are created per processor and quickly
terminated. Now peak threads are more like 3× number of processors, but
there are still approximately nProcessors-worth of zombie threads
surviving after the plugin finishes executing.
Happy Friday 13th
@mdoube
Copy link
Contributor Author

mdoube commented Aug 13, 2021

Closes bonej-org/BoneJ2#300

@mdoube mdoube requested review from ctrueden and imagejan August 13, 2021 09:44
@ctrueden ctrueden merged commit fa7f03e into imagej:master Aug 13, 2021
@ctrueden
Copy link
Member

ctrueden commented Aug 13, 2021

Wow, nice! Thanks. 🍻

I'm wondering why these ops use their own ExecutorServices rather than the one already baked into the ThreadService. But since they do, doing the cleanup as well is definitely a step forward. The new incarnation of ImageJ Ops will use ImgLib2's unified multithreading support.

@ctrueden
Copy link
Member

ctrueden commented Aug 13, 2021

I tried to cut a new release, but am seeing some test failures on my local system (but not on CI):

[INFO] [ERROR] Failures: 
[INFO] [ERROR]   OutlineTest.testEdgeSquare:177 Wrong number of foreground elements in interval expected:<7> but was:<5>
[INFO] [ERROR]   OutlineTest.testEdgeSquareExcludeEdgesFalse:206 Wrong number of foreground elements in interval expected:<8> but was:<6>
[INFO] [ERROR]   OutlineTest.testOutlineSquare:148 Wrong number of foreground elements in interval expected:<8> but was:<5>

I'll investigate when I can.

@mdoube
Copy link
Contributor Author

mdoube commented Aug 16, 2021

test failures on my local system (but not on CI)

This smells like something we've seen for a while. Test failures on some machines (usually bigger ones, e.g. workstations) but passing on CI and smaller machines (e.g. laptops).

bonej-org/BoneJ2#196 (comment)
bonej-org/BoneJ2#229

I wonder if there is a problem with the parallelisation model that leads to overwriting of pixel values depending on number of threads in use.

@mdoube
Copy link
Contributor Author

mdoube commented Aug 16, 2021

I tried to cut a new release, but am seeing some test failures on my local system (but not on CI):

[INFO] [ERROR] Failures: 
[INFO] [ERROR]   OutlineTest.testEdgeSquare:177 Wrong number of foreground elements in interval expected:<7> but was:<5>
[INFO] [ERROR]   OutlineTest.testEdgeSquareExcludeEdgesFalse:206 Wrong number of foreground elements in interval expected:<8> but was:<6>
[INFO] [ERROR]   OutlineTest.testOutlineSquare:148 Wrong number of foreground elements in interval expected:<8> but was:<5>

I'll investigate when I can.

This PR fixes it at the cost of running in a single thread, which is OK for now.
#639

@mdoube
Copy link
Contributor Author

mdoube commented Aug 16, 2021

why these ops use their own ExecutorServices rather than the one already baked into the ThreadService

There is another PR from last year that does exactly that, and which should be used instead of my hacks on Outline.
#624

but unfortunately that one is buggy and we should use #639 until we figure out a better multithreading strategy. Test and builds now pass and user operation works on my 40-core workstation.

@rimadoma
Copy link
Contributor

I chose to use my own ExecutorService instead of ThreadService because at the time using a FixedThreadPool ran faster with a certain N than what ThreadService used

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fractal Dimension creates thousands of zombie threads that crash ImageJ
3 participants