-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kstat: allow multi-level module names #17142
Conversation
On FreeBSD
I haven't looked deep, but I wonder what happen when the original one is finally deleted? |
|
ac9f08f
to
3461c5d
Compare
I'm pretty sure this was failing to initialise If this round of tests pass I'll consider that bit good. |
Great, thanks for confirming that!
We remove the old name from sysctl ( |
The test failures now seem to be caused by #17145 . I'd appreciate some reviews. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I gave this a test with @robn's demo commit and it worked as advertised.
@robn this looks ready to merge - can you squash and rebase it? |
Module names are mapped directly to directory names in procfs, but nothing is done to create the intermediate directories, or remove them. This makes it impossible to sensibly present kstats about sub-objects. This commit loops through '/'-separated names in the full module name, creates a separate module for each, and hooks them up with a parent pointer and child counter, and then unrolls this on the other side when deleting a module. Sponsored-by: Klara, Inc. Sponsored-by: Syneto Signed-off-by: Rob Norris <[email protected]>
This extends the existing special-case for zfs/poolname to split and create any number of intermediate sysctl names, so that multi-level module names are possible. Sponsored-by: Klara, Inc. Sponsored-by: Syneto Signed-off-by: Rob Norris <[email protected]>
Rad, thanks! |
[Sponsors: Klara, Inc., Syneto]
Motivation and Context
In #16200 I added support for multi-level kstat nodes so that I could more easily add per-vdev queue stats. I intend to come back to that PR, but I have separately wanted this support in another project, so I've lifted it out.
In general, this gives us an easy way to create a richer tree of stats, by building on the simple one-level support we have for
zfs/<pool>
. With this, we could avoid the mess we have forobjset-0xXX
dataset stats.Description
Both Linux and FreeBSD implementations of
kstat_create()
have support for creating a single sub-level stat base, by splitting on a single/
char and creating an additional intermediate node. This PR extends that support by turning that into a loop, such that each/
creates a new intermediate node."Intermediate node" here means an extra
SYSCTL_ADD_NODE()
on FreeBSD, or an extraproc_mkdir()
on Linux.There's extra teardown code needed on Linux to remove the intermediate directories when the last concrete kstat object is removed. FreeBSD doesn't have this problem (I think), at least because the intermediate nodes don't seem to "exist" except as a hook point for other kstats to hang off. I can and will do more here if asked, but this is the same situation as the previous code has, so its kinda half each way.
There's one extra commit here. Previously, if you tried to create a second kstat with the same module & name as an existing one, it would be ignored. This changes it so the old one is removed and the new one added. This also came from #16200 because during import, we create a
vdev_t
for each drive (which create kstat nodes) and then as we transition to a trusted config, we create anothervdev_t
before destroying the old one, which creates another set of kstats to replace the first ones.That behaviour is of course specific to that PR, but it does raise the question of what should happen if we try to create a kstat a second time. "Keep the first" was mostly an accident of the implementation, and never happened in normal operation anyway. The other options are to take the second, or reject it until explicitly removed. This commit is making a choice, but I don't want to hold this PR up on it if there's strong opinions. I'm happy to drop it for now, or move it to another PR, or discuss it when I get back to #16200. I think it's pretty benign as-is, and obviously something we can change down the track if its not working for us.
How Has This Been Tested?
There's no actual stats changes here, so this should be an effective no-op change (only once round the loop is the same as an
if (...)
. That's not much fun though, so I've made a demo commit robn/zfs@e18264f that moves the existingzfs/<poolname>
stats tozfs/pool/<poolname>
andzfs/<poolname>/objset-<id>
tozfs/pool/<poolname>/dataset/<id>
.On Linux, that gives us:
and on FreeBSD:
Note the extra
misc
anddataset
nodes are part of the existing implementation; I'm not here to remake the world just yet.Meanwhile, my own ZTS run is progressing nicely. I'm not expecting any issues, though we never do so who knows 😅.
Types of changes
Checklist:
Signed-off-by
.