unnamed threads don't appear to be freed, at least not in a timely manner. #396
This may just be a documentation issue if ABT_thread_exit is required to free the ULT.
Hmm, if I add an ABT_thread_yield in the thread creation loop, it is much better. So I suppose it's not a leak, but it may indicate a peak memory usage issue.
@jolivier23 Just curious, does the memory consumption change if you set the ABT_MEM_MAX_NUM_STACKS environment variable? FWIW the default value of this parameter is 65536, but in Margo we set it to a much lower default value (8) to prevent Argobots from accumulating too many unused stacks for potential reuse. Now that I look back at it, though, I'm wondering whether that value has any other side effects. At any rate, I'm curious if that's a factor in the behavior here.
@carns For the example above, it makes no difference what I set that to. We can try in DAOS and see if it makes any difference. I'm trying to understand the behavior of this particular example. If I add ABT_thread_yield, the problem essentially goes away in that test case. My assumption is that the primary ULT creates the threads to execute on the other xstreams and is the party responsible for freeing them, but never gets a chance to do so until the ABT_thread_yield or some other yield point. If this is the case, perhaps we should at least check a condition on create and potentially clean up old threads before creating new ones? This may be a red herring for the DAOS problem, but it seems like an issue nonetheless.
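To make the hypothesis above concrete, here is a minimal toy model in C. It is not Argobots code and does not call any Argobots API; the function name `toy_peak_unreclaimed` and the `yield_every` parameter are hypothetical, and the model simply assumes that every ULT created so far has terminated, and becomes reclaimable, by the time the creator reaches a yield point.

```c
#include <stddef.h>

/*
 * Toy model only (an assumption, not Argobots internals): a creator loop
 * spawns n_ults unnamed ULTs. Finished ULTs are reclaimed only when the
 * creator reaches a yield point, which happens after every `yield_every`
 * creations. yield_every == 0 means the creator never yields in the loop.
 *
 * Returns the peak number of ULTs whose memory is still held.
 */
size_t toy_peak_unreclaimed(size_t n_ults, size_t yield_every)
{
    size_t unreclaimed = 0; /* created but not yet freed */
    size_t peak = 0;

    for (size_t i = 1; i <= n_ults; i++) {
        unreclaimed++; /* one more ULT holds a stack */
        if (unreclaimed > peak)
            peak = unreclaimed;

        /* Yield point: assume everything created so far gets freed. */
        if (yield_every != 0 && i % yield_every == 0)
            unreclaimed = 0;
    }
    return peak;
}
```

Under these assumptions, creating 1000 ULTs without yielding peaks at 1000 held stacks, while yielding after every creation peaks at 1. That matches the shape of the observation (high peak usage rather than a leak), though it says nothing about which component actually performs the free.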
@jolivier23 Found the code to free memory for unnamed threads automatically. No call to ABT_thread_exit() required. See argobots/src/include/abti_thread.h, line 143 in 6d216a9.
@carns Looking at the code, it looks like the max is 1024 now. 65536 could probably explain how much memory is used during rebuild in DAOS, but 1024 does not.
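A rough way to see why the cap matters is to bound the memory held by cached, idle stacks. The sketch below is plain arithmetic in C, not Argobots code: `toy_retained_bytes` is a hypothetical helper, the stack size passed to it is whatever the application configured, and whether the cap applies per execution stream or globally is not settled in this thread.

```c
#include <stdint.h>

/*
 * Toy estimate (an assumption, not Argobots internals): at most
 * `max_num_stacks` idle stacks are kept for reuse, each `stack_size`
 * bytes. Returns an upper bound on the bytes retained by the cache.
 */
uint64_t toy_retained_bytes(uint64_t idle_stacks,
                            uint64_t max_num_stacks,
                            uint64_t stack_size)
{
    /* The cache cannot hold more stacks than exist or than the cap allows. */
    uint64_t kept = idle_stacks < max_num_stacks ? idle_stacks : max_num_stacks;
    return kept * stack_size;
}
```

For example, with a hypothetical 16 KiB stack and 100000 idle stacks, a cap of 65536 bounds the cache at 1 GiB, a cap of 1024 at 16 MiB, and Margo's cap of 8 at 128 KiB.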
Ah, right. I forgot it was tuned down. Re: your other comment, I'm not sure who's responsible for the free (though I do know it eventually happens; we've had similar patterns with no runaway memory usage), so I'm not sure what's needed to encourage it to happen faster...
I'm using v1.1. To illustrate, I made this patch to the hello_world example. If I run with a large number of ULTs, the memory grows very fast. Since it is executing hello_world, I would expect a much smaller memory footprint. Note that I also changed the stack size; I'm not sure if that has anything to do with the problem, but it matches an issue we see with memory growth in DAOS.