Skip to content

Conversation

gwenya
Copy link
Contributor

@gwenya gwenya commented May 12, 2025

This adds support for bpf token delegation (closes #2099).

TODOs:

  • better documentation, right now it's a bit barebones, but I'm not really sure what to add either
  • tests if the CI can support them
  • validation if possible: the delegation options are kernel strings and which ones are available depend on the kernel version, hardcoding the current allowed set seems like a bad idea; if validation is impossible then document this fact
  • improve failure mode, right now if the delegation options are wrong or the feature is not supported by the kernel the forkbpf hook fails and prevents instance startup; this is recorded in lxc.log but the specific error message only shows up when running incus with the --debug flag
  • kernel support check? token delegation is a pretty new feature, it's not supported yet on standard ubuntu installations, so maybe we should check if it's available and issue according error messages, and document the minimum kernel version and config
  • clean up the forkbpf code
  • restrict to projects with low_level access

@github-actions github-actions bot added Documentation Documentation needs updating API Changes to the REST API labels May 12, 2025
@gwenya gwenya force-pushed the bpf-tokens branch 6 times, most recently from 1078454 to e31bb93 Compare May 12, 2025 18:49
Copy link
Contributor

@irhndt irhndt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just took a look to learn and found a typo.

@gwenya
Copy link
Contributor Author

gwenya commented May 13, 2025

@stgraber would it be possible / make sense for lxc to capture stderr from hooks and log it at a higher level than debug?

}

func (c *cmdForkbpf) runChild(socket int, mountPath string) error {
fsfd, err := unix.Fsopen("bpf", 0)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if this call fails, and the parent has already reached the Recvmsg() call, it will hang forever.
I'm not quite sure how to best work around this.

Maybe the parent should call both waitpid and recvmsg in goroutines and send their results through channels so it can select on them?

Comment on lines 69 to 79
// Attach to the user namespace.
snprintf(path, sizeof(path), "/proc/%s/ns/user", pidstr);
if (dosetns_file(path) < 0) {
_exit(EXIT_FAILURE);
}

// Attach to the user namespace.
snprintf(path, sizeof(path), "/proc/%s/ns/mnt", pidstr);
if (dosetns_file(path) < 0) {
_exit(EXIT_FAILURE);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you probably want to use pidfd_nsfd+change_namespaces, as well as pidfds if possible (just like forkproxy does)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This runs as an lxc hook, so it gets the pid as LXC_PID environment variable.
Getting an fd for the pid just to then check if the pid is still valid right afterwards seems a bit pointless, I think it only makes sense if the pidfd gets passed into the process. If lxc hooks support that we can do it, but I couldn't find any information about that and the forkproxy which also runs as a hook does it like this as well.
I don't think it would be particularly useful either. From my experiments lxc seems to wait for the hook to finish before continuing container startup, so there should be no way for the pid to get recycled in the meantime.

@gwenya
Copy link
Contributor Author

gwenya commented Jun 7, 2025

@stgraber do you know if we can test this in the github runners? I have a feeling that they might not get enough privileges. Apart from that, I think this could be ready, the TODO points I listed would be nice to have of course but not mandatory I think, and I currently don't have the time to figure them out.

@gwenya gwenya marked this pull request as ready for review June 7, 2025 14:17
@gwenya gwenya requested a review from stgraber as a code owner June 7, 2025 14:17
@stgraber
Copy link
Member

stgraber commented Jun 9, 2025

@stgraber do you know if we can test this in the github runners? I have a feeling that they might not get enough privileges. Apart from that, I think this could be ready, the TODO points I listed would be nice to have of course but not mandatory I think, and I currently don't have the time to figure them out.

The tests run as root inside of the Github runner VMs, so that should be fine as far as permissions.

@gwenya
Copy link
Contributor Author

gwenya commented Jun 9, 2025

Alright then I'll try to come up with some tests for this.
Do they have unrestricted internet connectivity? I will probably need to use libbpf-bootstrap and download it from git within the runner or instance.

@stgraber
Copy link
Member

stgraber commented Jun 9, 2025

Yep, no internet restrictions from the runners, so you can download and build a test then push it into a test container easily enough.

@stgraber
Copy link
Member

@gwenya ping

@gwenya
Copy link
Contributor Author

gwenya commented Jun 21, 2025

Sadly didn’t get around to this yet, was more busy with work recently. I hope I’ll get around to it soon, but it feels like a somewhat tricky thing to figure out so it might actually take a while.
I think I should first figure out how to run the tests locally, they don’t work on my system but I should be able to do it in a VM.

@stgraber
Copy link
Member

@gwenya did you have some time to look at this one again?

@stgraber stgraber marked this pull request as draft July 28, 2025 18:12
@gwenya
Copy link
Contributor Author

gwenya commented Oct 4, 2025

Sorry for not responding in so long, I will give it a try now

@gwenya gwenya force-pushed the bpf-tokens branch 4 times, most recently from ca9ada7 to ead8313 Compare October 5, 2025 14:42
@stgraber
Copy link
Member

stgraber commented Oct 6, 2025

@gwenya great to have you looking into this again!

gwenya added 10 commits October 12, 2025 02:45
The forkbpf helper creates a socket pair and forks, the child process then enters the user and mount namespaces of the container.
The child process creates a filesystem configuration context and passes it to the parent process via the socketpair.
The parent process configures the token delegation options and creates a mount file descriptor and passes that back to the child.
The child process then attaches the mount to the desired path.

Signed-off-by: Gwendolyn <[email protected]>
The token delegation is automatically enabled if any of the delegate_* options are set. It works by configuring forkbpf as a start-host lxc hook.

The delegate_* options are not properly validated since they refer to internal kernel enums and therefore depend on the running kernel.

Signed-off-by: Gwendolyn <[email protected]>
Signed-off-by: Gwendolyn <[email protected]>
Signed-off-by: Gwendolyn <[email protected]>
@gwenya
Copy link
Contributor Author

gwenya commented Oct 12, 2025

@stgraber do you have an idea what might cause this test failure? I have never seen that error message ("ERROR: Output buffer space exceeded") before and google only finds me an unconfirmed bug report in the file utility...

I'm not sure if it even makes sense to build that tool in the tests, we could instead build it once and host it somewhere to download, or just wait until bpftool tags a new release that includes the bpftool token subcommands.

I also appreciate input on the other open points in the task list, I'm not coming up with any good solutions for them and am tempted to just document those shortcomings and leave them be for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

API Changes to the REST API Documentation Documentation needs updating

Development

Successfully merging this pull request may close these issues.

BPF Tokens support

4 participants