Monitor free disk percentage, not just absolute space #4518
Pull Request Overview
This PR adds disk space percentage monitoring alongside the existing absolute disk space tracking. The system now tracks both available disk bytes and available disk percentage, allowing for more flexible disk space health checks.
Key Changes:
- Modified `AvailableBytes()` to return both available bytes and percentage of free disk space
- Added new configuration option for percentage-based disk space warning threshold (default 97%)
- Updated health checks to monitor both absolute and percentage-based disk space thresholds
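As a rough illustration of the first change, the percentage can be derived from the same block counts that a `statfs`-style call reports. The sketch below is not the code from this PR: `availableSpace` and its parameters are hypothetical names, and the block counts are passed in directly so the example stays platform-independent.

```go
package main

import "fmt"

// availableSpace is a hypothetical helper (not the PR's AvailableBytes).
// It takes the number of blocks available to unprivileged users, the total
// number of blocks, and the block size in bytes, and returns the free space
// both in bytes and as a whole-number percentage of the volume.
func availableSpace(availBlocks, totalBlocks, blockSize uint64) (uint64, uint64) {
	availBytes := availBlocks * blockSize
	if totalBlocks == 0 {
		// Avoid dividing by zero on a volume that reports no blocks.
		return availBytes, 0
	}
	// Integer division rounds the percentage down.
	percent := availBlocks * 100 / totalBlocks
	return availBytes, percent
}

func main() {
	// Assumed values: 50 of 200 blocks free, 4096-byte blocks.
	bytes, percent := availableSpace(50, 200, 4096)
	fmt.Printf("available: %d bytes (%d%%)\n", bytes, percent)
}
```

Rounding down is a deliberate choice in this sketch: it makes the reported percentage slightly pessimistic, so a threshold check errs on the side of warning early.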
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| utils/storage/storage_unix.go | Modified to calculate and return percentage of available disk space |
| utils/resource/usage.go | Added percentage tracking fields and methods to resource manager |
| utils/resource/resourcemock/user.go | Added mock implementation for percentage method |
| utils/resource/no_usage.go | Updated no-op implementation return values |
| snow/networking/tracker/resource_tracker.go | Added percentage tracking to disk resource tracker |
| node/node.go | Updated health check to validate percentage threshold |
| config/node/config.go | Added percentage threshold configuration field |
| config/keys.go | Added configuration key for percentage threshold |
| config/flags.go | Added command-line flag for percentage threshold |
| config/config.go | Updated disk space config validation to handle percentage |
Signed-off-by: Yacov Manevich <[email protected]>
node/node.go (outdated)

    }

    if availableDiskPercentage < n.Config.WarningThresholdAvailableDiskSpacePercentage {
        err = fmt.Errorf("remaining available disk space percentage (%d%%) is below minimum required available space percentage (%d%%)", availableDiskPercentage, n.Config.WarningThresholdAvailableDiskSpacePercentage)
Suggested change:

    - err = fmt.Errorf("remaining available disk space percentage (%d%%) is below minimum required available space percentage (%d%%)", availableDiskPercentage, n.Config.WarningThresholdAvailableDiskSpacePercentage)
    + err = fmt.Errorf("remaining available disk space percentage (%d%%) is below the warning threshold of disk space percentage (%d%%)", availableDiskPercentage, n.Config.WarningThresholdAvailableDiskSpacePercentage)
Why this should be merged
We currently have metrics for available disk space and we fail the health check if we have less than a predefined amount of space left.
This commit adds metrics for available disk percentage, and fails the health check if we have less than a predefined percentage of space left.
The motivation is that if one wants to reuse the same validator configuration across several nodes with different volume sizes, an absolute amount of space is a worse fit than a percentage.
For example, a volume that started with all of its 2TB free and now has 200GB left is not in the same state as a volume that started with 400GB and has half of its space left, even though both have 200GB available.
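To make the example concrete, here is a minimal sketch of a percentage-based check applied to those two volumes. It is not the code from this PR: `checkDiskPercentage` is a hypothetical function, the 20% threshold is chosen only for illustration, and the error text follows the wording suggested in the review.

```go
package main

import (
	"errors"
	"fmt"
)

// gb is one gigabyte, used only to keep the example values readable.
const gb = uint64(1_000_000_000)

// checkDiskPercentage is a hypothetical stand-in for the health check.
// It returns an error when the free share of the volume, rounded down to a
// whole percent, is below thresholdPercent.
func checkDiskPercentage(availableBytes, totalBytes, thresholdPercent uint64) error {
	if totalBytes == 0 {
		return errors.New("total disk size is zero")
	}
	percent := availableBytes * 100 / totalBytes
	if percent < thresholdPercent {
		return fmt.Errorf(
			"remaining available disk space percentage (%d%%) is below the warning threshold of disk space percentage (%d%%)",
			percent, thresholdPercent,
		)
	}
	return nil
}

func main() {
	// Both volumes have 200GB free, but only the 2TB volume (10% free)
	// is below the illustrative 20% threshold; the 400GB volume is at 50%.
	fmt.Println("2TB volume:  ", checkDiskPercentage(200*gb, 2000*gb, 20))
	fmt.Println("400GB volume:", checkDiskPercentage(200*gb, 400*gb, 20))
}
```

An absolute threshold cannot tell these two volumes apart, since both report the same number of free bytes; the percentage check does.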
How this works
Adds a flag `--system-tracker-disk-warning-threshold-available-space-percentage` that defaults to 3% space left.
Changes `AvailableBytes` in `storage_unix.go` and `storage_openbsd.go` to also return the percentage, not only the absolute value.
How this was tested
Only tested on Linux; however, I looked at the `Statfs_t` struct of OpenBSD to make sure I picked the right fields.
I took a Fuji node I had, compiled the avalanchego binary from this branch, and re-launched the node. I then created a big file using `fallocate -l 400G bigfile` and observed the metrics:
Soon enough, the node's health check began failing:
I then deleted the file and observed the metrics:
And checked that the node became healthy once more:
Need to be documented in RELEASES.md?