
[INCIDENT] all memory somehow gets used #37

Closed
ddkohler opened this issue Oct 24, 2023 · 6 comments

Comments

@ddkohler
Contributor

Every two weeks or so, our lab computers will hit ~100% memory usage and become extremely laggy and buggy (applications crash or cannot open). I have observed this on the ps and fs tables, and I have heard that something similar happens on the Waldo system.

This happens without most services running (Docker, most daemons), but attune and attune-delay will be running. Task Manager shows a single Python process taking up a large chunk of memory (~0.5-4 GB). Still, the memory attributed to all processes in Task Manager falls well short of the total computer memory usage (~30 GB).

I am attributing this to attune, but I might well be off; I'm just documenting the incident for now. I can imagine this being a yaqd-control/nssm issue as well. I will try running the attune daemons in the foreground to gather more information.

In all cases, the issue is resolved by restarting.

@ksunden
Member

ksunden commented Oct 24, 2023

By restarting what?

@ddkohler
Contributor Author

Sorry, crucial detail: the problem is fixed by restarting the computer.

When the daemons are closed, I can verify in Task Manager that the Python process exits, but the computer's memory usage remains mostly unchanged and the computer remains laggy. Again, the reported memory usage greatly exceeds the sum of the contributions from individual processes.
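The gap described above can be measured directly. The sketch below (an editorial illustration, not code from this project; it assumes the third-party psutil package is installed) sums per-process resident memory and compares it to the OS-reported used memory:

```python
# Hedged sketch: quantify the gap between OS-reported used memory and the
# sum of per-process RSS. A large positive gap, as described in this issue,
# suggests memory held outside ordinary user processes (kernel pools,
# drivers, caches). Assumes psutil is installed.
import psutil

def memory_gap_mb() -> float:
    """Return system 'used' memory minus the sum of per-process RSS, in MB."""
    per_process_rss = sum(
        p.info["memory_info"].rss
        for p in psutil.process_iter(["memory_info"])
        if p.info["memory_info"] is not None  # skip access-denied processes
    )
    system_used = psutil.virtual_memory().used
    return (system_used - per_process_rss) / 1e6

if __name__ == "__main__":
    print(f"unaccounted memory: {memory_gap_mb():.0f} MB")
```

Note that shared pages are counted once per process in RSS, so the gap can even be negative on a healthy system; the interesting signal is a gap of tens of GB as reported here.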

@ksunden
Member

ksunden commented Oct 24, 2023

I am very skeptical that this is attributable to yaqd-attune. I'd be more inclined to think that the Docker stack is eating memory, and I would be interested to know whether just restarting Docker helps. (Docker may also partially hide its memory usage from Windows: it may not show up under individual user processes, but it may appear in the total.)

That is my gut instinct/first reaction, but certainly not proven.

@untzag
Member

untzag commented Nov 15, 2023

Hi all, was this fixed by yaq-project/yaq-python#77?

@ddkohler
Contributor Author

@untzag I am not certain yet, but there is evidence that this is the case. The fs computer just had a repeat incident, and the high memory usage was relieved by restarting the daemons. This behavior is consistent with a daemon memory leak. We have updated yaqd-core and will wait to see whether the problem recurs over the next week or two.

@ddkohler
Contributor Author

ddkohler commented Dec 6, 2023

I am not really observing memory issues anymore, so I am going to call this issue resolved by yaq-project/yaq-python#77 🥳

@ddkohler ddkohler closed this as completed Dec 6, 2023