-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
udp_queues_linux.go: Expose UDP drops via gauge analogous to queue sizes #2993
base: master
Are you sure you want to change the base?
Conversation
084bd27
to
b297b22
Compare
@discordianfish @SuperQ I'm assuming that the failing circleci tests are because I need to update the fixtures (via |
b297b22
to
271bcee
Compare
I'm not sure why |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, no, this seems incorrect. Drops appears to be a counter, so this needs to be a new metric, not a label on an existing metric.
Drat; you're right. Let me see if I can fix this quickly. |
Hello, |
ecf14b8
to
7d1e8c6
Compare
…zes. Signed-off-by: Chris Cleeland <[email protected]>
Update e2e test output to include the drops. Signed-off-by: Chris Cleeland <[email protected]>
Don't use a pointer without checking it first. Signed-off-by: Chris Cleeland <[email protected]>
I think it makes sense for "drops" to sit alongside the queue length gauges in prom stats because they are all neighbors in procfs (source of these stats). Moreover, in reading the commit log message for the original creating work for udp_queues, I think there may have been some misreading or confusion between the word "state" and the common short-form of "stats" to mean "statistics". The original author "chose the name 'udp_queue' instead of 'udpstat' as UDP has no state"; I believe that 'udpstat' might actually be the more appropriate name. Signed-off-by: Chris Cleeland <[email protected]>
7d1e8c6
to
1a67c62
Compare
Signed-off-by: Chris Cleeland <[email protected]>
9b5b3ce
to
146d5e8
Compare
What I thought would be a couple of easy commits/pushes is turning into a bunch of work (maybe because I'm running on macos and not on linux?) In any case, I have to set this aside for awhile. |
@cleeland can you summarize what you tried and what was the issue? |
1c9044e
to
e7b372f
Compare
The biggest impediment is that I can't run the makefiles on ventura 13.7.1. I thought I might still have a dockerfile or lima setup from when I initially did this PR, but I've not found it. I think once I get a chance to spin up lima with proper dev environment I'll be able to make progress. |
@nicolas-laduguie I think I finally understand the issue I'm having: changing To do this I think I'll have to change the creation in I was able to use lima to set up two linux machines (amd64 and arm64) to use as runtime environments. |
Signed-off-by: Chris Cleeland <[email protected]>
This separates the gauges related to udp_queues and the counter related to the total number of drops, but keeps them under the same namespace. Signed-off-by: Chris Cleeland <[email protected]>
e7b372f
to
13f019f
Compare
@nicolas-laduguie I got something that achieves my goal, though I'm not sure if it's necessarily the best choice. Feel free to critique and/or modify. Most importantly, things that are gauge-ish are gauges, and things that are counter-ish are counters, so @SuperQ 's original issue is addressed. |
@@ -158,7 +158,7 @@ thermal | Exposes thermal statistics like `pmset -g therm`. | Darwin | |||
thermal\_zone | Exposes thermal zone & cooling device statistics from `/sys/class/thermal`. | Linux | |||
time | Exposes the current system time. | _any_ | |||
timex | Exposes selected adjtimex(2) system call stats. | Linux | |||
udp_queues | Exposes UDP total lengths of the rx_queue and tx_queue from `/proc/net/udp` and `/proc/net/udp6`. | Linux | |||
udp | Exposes UDP statistics from `/proc/net/udp` and `/proc/net/udp6`. | Linux |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guess it makes sense to rename this but:
- We should rename the file as well
- Technically its a breaking change, e.g people might enable/disable the collector explicitly so when upgrading it will refused to start
@SuperQ wdyt? maybe we should give registerCollector a list of aliases and have it print a deprecation warning if any of them are used?
Or to just get this merged, maybe just not rename it for now..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I figured this wasn't an ideal solution, but I recognize that I'm too ignorant to know what sorts of API-ish guarantees are in place. I opted for "make sure that the metric names stay the same", although since I changed the namespace from "udp_queues" to "udp" it's possible I didn't even achieve that.
FWIW, I chose this after looking at some of the other metrics, and settled on something similar to the pattern used by cpu
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi guys, I contributed this PR based on your work @cleeland .
As @discordianfish said, the existing exporter name udp_queues
might be already used by people disabling that collector.
So it's worth to not change it otherwise it would be a breaking change and would need to wait a major release from node exporter.
If it's fine for you, let's work in my PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nicolas-laduguie thanks! I left my thoughts over on the new PR. Thanks for continuing to help shepherd this.
Since a common reason for monitoring UDP queue sizes can be in teasing out performance and QoS issues, it would be convenient to have drops available in the same frames of data. Since the underlying source of the data provides that information, this PR exposes drops modeled as a gauge just as queue sizes are.
@discordianfish @SuperQ