KIP crashes from time to time #311

Open
Techstyleuk opened this issue Feb 25, 2024 · 68 comments
Labels
bug A bug report

Comments

@Techstyleuk

I have been running KIP on a new RPi 5 running Openplotter 4 Starting (12-29-2023), up to date. Signal K is version 2.5.0, Kip is 2.7. It is running in Chromium, and connected to the Signal K demo server.

I am running it 24/7 and each time I come back to it, perhaps every 6 hrs, Kip looks like this:

image
@godind
Collaborator

godind commented Feb 25, 2024

Hi,

Thanks for the report. Could you share the browser console log? That might help troubleshooting.

Also, can you share a screenshot of the screen you leave it on? It could be a memory leak issue with a specific widget.

I'll see if I can find a problem.

@godind godind added the bug A bug report label Feb 25, 2024
@Techstyleuk
Author

I think this is the console log entry associated:
image

I have not changed anything yet, so KIP is as per the demo:
image
this is page zero and was the one I was on

@godind
Collaborator

godind commented Feb 25, 2024

This warning image is just a meta key for Apple devices so you can save KIP as an App. It's needed even if Google complains about it!

I suggest leaving the console log open so you'll see what's logged if the browser runs out of memory. I'll test on my end and see if I can reproduce the problem but it could be related to your setup.

No one has reported a similar issue, yet!

@godind
Collaborator

godind commented Feb 27, 2024

Next time it crashes, can you check the Signal K logs for any errors, please? I ran for several hours with memory profiling but saw nothing unusual so far.

@godind
Collaborator

godind commented Feb 28, 2024

Can you also update Chromium and share the version please?

@Techstyleuk
Author

Techstyleuk commented Feb 29, 2024 via email

@godind
Collaborator

godind commented Mar 21, 2024

Is the issue still present? If so leave KIP running with the console log open and share the latest errors.

Thanks!

@Techstyleuk
Author

Sorry for my lack of feedback, I had a few other issues so last weekend I did a new install and I have been watching it, but not closely. I did just see it crash but did not have the console open, I am now continuing to run with the console open. I am also monitoring the memory usage, and it looks a little strange - this is all after the crash:
image
I was not doing anything that would cause the system to just increase memory, or to drop like that.
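
For anyone who wants to log this kind of memory curve without a graphical monitor, here is a minimal Python sketch. It assumes a Linux system (such as the Pi's Debian-based OS) where /proc/meminfo exposes the MemTotal and MemAvailable fields. The function names and the 60-second interval are illustrative choices, not part of KIP or OpenPlotter.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: value in kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def used_percent(meminfo):
    """Percentage of RAM in use, based on MemTotal and MemAvailable."""
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


def log_memory(interval_s=60, samples=None, path="/proc/meminfo"):
    """Print a timestamped memory-usage line every interval_s seconds.

    samples=None runs until interrupted; the interval is an arbitrary choice.
    """
    count = 0
    while samples is None or count < samples:
        with open(path) as f:
            pct = used_percent(parse_meminfo(f.read()))
        print(f"{time.strftime('%Y-%m-%d %H:%M:%S')} used={pct:.1f}%", flush=True)
        count += 1
        if samples is None or count < samples:
            time.sleep(interval_s)
```

Calling `log_memory()` from a terminal and redirecting the output to a file gives a plain-text record that survives a browser crash.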

@Techstyleuk
Author

I did reload KIP browser window when my memory was around 90%, as it wasn't responding and the Memory and CPU usage dropped dramatically:
image

@godind
Collaborator

godind commented Mar 25, 2024

Looks like there is a potential memory leak. I've run KIP overnight and not seen any leaks, yet!

Can you share the KIP version from your last upgrade?

Also upgrade Chromium to the latest version.

In Chromium, when it crashes, do you have other browser tabs or browser windows open?

@Techstyleuk
Author

KIP is at 2.9.0, I upgraded yesterday afternoon.

Today I came home from work and it looks like this:
image
Chromium has crashed but is still on the taskbar, and there is still a black window there, but no tabs or anything. I cannot get the log without restarting it. When restarted, it comes up normally. I am running 4 tabs: KIP, Gmail, Signal K and Pandora Music. Judging by the graph below, I'd say it crashed at about 8:00am this morning:
image
Chromium version is: Version 120.0.6099.102 (Official Build) Built on Debian , running on Debian 12 (64-bit).

I will monitor the memory curves for a couple of hours, then kill KIP and see if the curve flattens out.

@godind
Collaborator

godind commented Mar 25, 2024

This will be difficult to find but with some patience and collaboration I'll fix it.

Can you upload your config file here so I can reproduce a similar layout? You will find the file named 9.json under ~/.signalk/applicationData/users/"signed in user"/kip

@Techstyleuk
Author

my KIP layout is just the demo and using the demo server for data. I am running Openplotter 4, so Bookworm. The only Kip directory was at: ~/.signalk/node_modules/@mxtommy/kip and the only json file is packages.json

@godind
Collaborator

godind commented Mar 26, 2024

Ok. So you are using local browser storage and not signed in to the server.

Running some tests on my end.

@Techstyleuk
Author

I was signed into the server but am I signed into KIP? not sure:
image
I killed the KIP window just after 8pm and the memory no longer escalates, so it looks like it is definitely KIP.
image
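
One way to make "no longer escalates" less of an eyeball judgment is to fit a straight line to logged samples and look at the slope. The Python sketch below uses plain least squares. The sample data and the 0.5 %/hour threshold are made-up illustrations, not measurements from this thread.

```python
def growth_rate(samples):
    """Least-squares slope of (hours, percent_used) samples, in percent per hour."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    num = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one point in time")
    return num / den


def looks_like_leak(samples, threshold=0.5):
    """True when memory grows faster than threshold percent per hour (arbitrary cutoff)."""
    return growth_rate(samples) > threshold


# Hypothetical data: steady climb with KIP open, flat after closing it.
with_kip = [(0, 30.0), (1, 35.0), (2, 40.0), (3, 45.0)]
without_kip = [(0, 30.0), (1, 30.2), (2, 29.9), (3, 30.1)]
print(growth_rate(with_kip), looks_like_leak(with_kip))
print(looks_like_leak(without_kip))
```

A single slope hides step-like behaviour, so it is best read alongside the graph rather than instead of it.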

@Techstyleuk
Author

I noticed that when I went to KIP's Setting screen, the CPU usage and memory escalation stops:
image
I have since gone back to the earlier screen (page0) and the CPU and memory increase. I navigated to page 1 (a big dial with COGS) and the CPU drops off and the Memory flattens

@godind
Collaborator

godind commented Mar 26, 2024

Nice find! So could be related to a specific widget.

Charts take up cpu and memory but should stay flat.

@Techstyleuk
Author

I just got home and it had crashed. I had already deleted the Wind display and the historic True wind graph. After waiting to see if the memory continued to rise, I deleted the "Apparent wind speed" widget (ref: the first picture on this thread). after this was deleted, the memory stayed low, then I added back in the steering widget, and the memory continued to be flat:
image
Therefore it appears to be associated with the Apparent wind speed widget

@godind
Collaborator

godind commented Mar 27, 2024

Great find! I'll try it myself. I've tested a few core services (everything disabled, no Datasets and only one Blank widget) that, at worst, do not appear to leak.

I'll try with only the linear gauge widget.

@godind
Collaborator

godind commented Mar 27, 2024

Sorry, reading again I'm not sure about one point:

In the picture, per yellow line sections, I'm not sure what widget was present or removed? You talk about Wind Steering, Data Chart (historical data) and linear gauge (Apparent Wind), right?

@Techstyleuk
Author

I started removing stuff and the Historic data chart was removed first, then the Wind Steering dial and neither stopped the condition. I then removed the Linear Gauge (Apparent wind) and this flattened the memory usage:
image
after that, I started adding gauges back to make sure the issue didn't come back.

@godind
Collaborator

godind commented Mar 27, 2024

Ok. So you did not add back the Apparent Wind Speed widget, and you did not confirm that the leak comes back when it is present?

I will test.

@Techstyleuk
Author

I ran out of time last night, I will try that tonight.

@Techstyleuk
Author

Just got home and I am not sure it is fixed. My Kip screen is frozen, and I had to reload it. The graphs looked like this:
image
so it looks like it crashed, perhaps, at 8am.

@godind
Collaborator

godind commented Mar 28, 2024

Are you running KIP in full screen mode using the KIP fullscreen menu option?

@Techstyleuk
Author

No, I run KIP in a browser window, occupying the right half of the screen, with OCPN on the left half of the screen

@godind
Collaborator

godind commented Mar 28, 2024

Just checking! I made a version with a few cleanups for you to test. I am still not able to put my finger on the problem, but I still have a few leads to investigate.

Can you install a beta version on your system to test please? To install using a command shell:
cd ~/.signalk
npm install @mxtommy/kip@beta

Once completed, hit the Restart link in the Signal K Admin site. Once restarted, in the Signal K Admin site's Appstore / Installed menu, you should see KIP 2.9.1-beta1 listed.

Run KIP and see if it's any better.

Thank you!

@Techstyleuk
Author

it is running the Beta version as of now, I can run it overnight and then check it. is there a quick way to reload to the default set of widgets if we want to use that?

@godind
Collaborator

godind commented Mar 28, 2024

Yes. Go in Configuration / Settings, then Storage tab. At the bottom you will see a button labeled Load Demo

@Techstyleuk
Author

with Notifications turned off, KIP Crashed around 8am yesterday.
image
I'm deleting the linear gauge and then running it

@godind
Collaborator

godind commented Mar 30, 2024

I've been at it all week. Not easy to find as it takes many hours before crashing.

I'll try with only one blank widget and everything else turned off.

@godind
Collaborator

godind commented Mar 31, 2024

I've run KIP using Firefox for 48h with no problem, and memory stays flat at 50%. The problem is that Firefox has a bug with the radial gauge :(

I could not reproduce the problem with Safari either. That makes me think it might be related to how Chromium handles memory... I'll dig into this some more...

@Techstyleuk
Author

Similar results in Firefox. after running for 24hrs, memory is 45% and flat with all widgets in the default layout running

@Aitonos

Aitonos commented Apr 2, 2024

I have been using KIP for a long time and I have always had this issue.

In Openplotter 3, it remains solid as a rock if the browser is left minimized (data is still flowing, as the MIN/MAX keeps being updated). But if the browser or the app (both versions) is maximized or showing its window, it collapses after some time (hard to say or measure, with no clear pattern), regardless of what you do (I have tried it with even just 1 numerical widget).

I tried this with a clean installation of OP 3 and got the same issue.

I have tried with Mobile, tablet, Firefox, Opera, Chromium, etc and it always happens.

I learnt to live with this issue, as it's a minor trade-off compared to how good KIP is!

@Techstyleuk
Author

@Aitonos,
interesting, I have never seen KIP crash before in the 2 years I have been running Openplotter 2 and I always run the screen like this:
image
Never minimized, always half and half.
with Chromium, I was seeing crashes every 8 hours. I have been running for 48 hours (split screen) and the data for the last 24 is below:
image
there is some drift upwards of Memory, but much slower

@godind
Collaborator

godind commented Apr 3, 2024

Well that might be a good clue! Could be due to some threads going to sleep when browser is minimized, screen saver or power management comes on... worth looking into.

@Techstyleuk
Author

it finally crashed this evening after over 3 days of running on Firefox:
image

@Techstyleuk
Author

I tried again with Chromium and it had failed in less than 24 hours and the Memory profile is clearly different, with Firefox looking very step like and Chromium building constantly:
image

@godind
Collaborator

godind commented Apr 5, 2024

@Techstyleuk You don't have the radial gauge spinning issue with Firefox?

@Techstyleuk
Author

@godind if you are referring to #222 then no, I have never seen that in Chromium or Firefox

@godind
Collaborator

godind commented Apr 5, 2024

@Techstyleuk strange. I have it on my Debian 10 VM running Firefox

@Sparhawk76

I don't know if I am having the same issue or a different one. My use case is a bit different in that I run my OP3 install headless on a pi4-8gb. I access kip via Chrome on my Windows based mini-pc. I recently installed a second monitor connected to the mini-pc to have kip loaded all the time. My config is 4 measuring radial dials at the top monitoring my batteries and solar setup, below them an embedded grafana graph of my battery/solar info over time, then below that 3 more measuring radial dials and a text field (this row is for starlink stats), then on the bottom row a radial baseplate compass (rudder position), and beside that an embedded Pypilot Control tab. Every morning since I set this up, I have woken up to find the Chrome Aw Snap page. A bit of searching later, I found this bug report, which seemed current and similar to my issue.

@godind
Collaborator

godind commented Apr 18, 2024

Hey guys. SK 2.7.2 is out and fixes a leak that caused repeated connection termination on the server side. It takes a while before the bug starts. Like 6-8h. It did not crash the server, just killed all clients.

I'm not sure this will resolve the issue, but it's possible that unknowingly, after 6-8h, KIP's connection is being repeatedly terminated causing an auto reconnect in KIP in the background.
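
To illustrate why silent reconnects could matter, here is a toy Python model. It is not KIP's actual code, which is not shown in this thread, and all class and method names are hypothetical. It shows how a client that re-subscribes on every reconnect without dropping its old subscriptions accumulates handlers, while one that cleans up first stays flat.

```python
class ToyClient:
    """Hypothetical stand-in for a streaming client; not a real KIP or Signal K API."""

    def __init__(self, cleanup_on_reconnect):
        self.cleanup_on_reconnect = cleanup_on_reconnect
        self.handlers = []  # callbacks kept alive by the client

    def subscribe(self, handler):
        self.handlers.append(handler)

    def connect(self):
        # Each (re)connection registers a fresh handler for incoming data.
        if self.cleanup_on_reconnect:
            self.handlers.clear()  # drop subscriptions from the dead connection
        self.subscribe(lambda delta: None)


def handlers_after(reconnects, cleanup):
    """Number of live handlers after an initial connect plus N reconnects."""
    client = ToyClient(cleanup_on_reconnect=cleanup)
    client.connect()  # initial connection
    for _ in range(reconnects):
        client.connect()  # server dropped the client; auto reconnect
    return len(client.handlers)


print(handlers_after(1000, cleanup=False))  # grows with every reconnect
print(handlers_after(1000, cleanup=True))   # stays at one handler
```

Whether anything like this happens in KIP is exactly what testing against SK 2.7.2 would help rule in or out.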

Can someone test for me please? I don't have an RPi at hand.

Thanks

@godind
Collaborator

godind commented Apr 18, 2024

"...headless on a pi4-8gb. I access kip via Chrome on my Windows based mini-pc..."

That means you get the problem in Chrome on Windows, right? It's really hard to find this kind of problem! I'm rewriting lots of code and cleaning up all over. I have not found it yet. The next release will tell us more...

@Techstyleuk
Author

this is this morning, after update to latest KIP and SK and a reboot:
image
Chromium window did not crash

@godind
Collaborator

godind commented Apr 18, 2024

Thanks for the report!

Have we won a small fight? Let's see if we need a few more fights to win the battle.

In any case, this issue has triggered a lot of rewrite and refactoring of KIP. All positive.

@Techstyleuk
Author

when I got home from work today, KIP had crashed, looking at the curves, I cannot see the point where it failed, but it was working 12hrs ago.
image
image

@godind
Collaborator

godind commented Apr 19, 2024

So it doubles the run time. Let's see if the next release improves things.

@godind
Collaborator

godind commented May 6, 2024

@Techstyleuk still observing partial improvements?

I'm still working on publishing a refactored version that should improve resource usage and close a few "potential" leaks, although minor.

@Techstyleuk
Author

I have been busy getting the boat ready to go back in the water, but I have this pi running in my Basement. I looked at it and it was frozen, I reset it and then will check it in 6 hrs or so.

@Techstyleuk
Author

it lasted about 12 hrs

@godind
Collaborator

godind commented May 28, 2024

Hi all. The latest release, v2.10.0, has a few fixes. It runs better on my end. Can you report back when you find time? Thanks!

@Techstyleuk
Author

I upgraded last night, It was still going strong after 12hrs. I had to shut it down for another reason, but will run it from later this morning and let you know in a couple of days

@Techstyleuk
Author

It did just crash after 13hrs. the funny thing is, this is my test environment running in my basement - Pi5, Openplotter V4 (Bookworm), latest KIP and SignalK. I was on my boat all weekend (55hrs) and never turned the system off - it is a Pi4, Openplotter 2, an earlier version of SignalK and a 2 week old version of KIP, upgraded just before launch 2 wks ago

@godind
Collaborator

godind commented May 29, 2024

Thanks for the report. I can't pinpoint exactly where the problem is. I keep searching and plugging potential leaks. I've doubled the run time so far.

It has only been reported with Chromium so far, and not on all Chromium, OS and hardware versions. It's related to Chromium install settings or builds and/or the way it manages the JavaScript engine's memory cleanup and memory allocation.

@godind
Collaborator

godind commented Jun 16, 2024

@Techstyleuk is KIP still running fine on pi4 and crashing on pi5 after 12+ hours?

Just keeping track.

Thanks

@Techstyleuk
Author

I had to turn my basement one off, but I will set it back up and check again after updating to the latest

@Techstyleuk
Author

after about 12hrs last night the whole thing was unresponsive. I will do an update and start testing again

@Techstyleuk
Author

After updating both KIP and Signal K to the latest versions, KIP ran for 11 hrs before needing to be refreshed because it was unresponsive.
