-
-
Notifications
You must be signed in to change notification settings - Fork 362
Add support for bpf token delegation #2109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
1078454
to
e31bb93
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just took a look to learn and found a typo.
@stgraber would it be possible / make sense for lxc to capture stderr from hooks and log it at a higher level than debug? |
} | ||
|
||
func (c *cmdForkbpf) runChild(socket int, mountPath string) error { | ||
fsfd, err := unix.Fsopen("bpf", 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if this call fails, and the parent has already reached the Recvmsg()
call, it will hang forever.
I'm not quite sure how to best work around this.
Maybe the parent should call both waitpid and recvmsg in goroutines and send their results through channels so it can select on them?
// Attach to the user namespace. | ||
snprintf(path, sizeof(path), "/proc/%s/ns/user", pidstr); | ||
if (dosetns_file(path) < 0) { | ||
_exit(EXIT_FAILURE); | ||
} | ||
|
||
// Attach to the user namespace. | ||
snprintf(path, sizeof(path), "/proc/%s/ns/mnt", pidstr); | ||
if (dosetns_file(path) < 0) { | ||
_exit(EXIT_FAILURE); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you probably want to use pidfd_nsfd
+change_namespaces
, as well as pidfds if possible (just like forkproxy
does)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This runs as an lxc hook, so it gets the pid as LXC_PID
environment variable.
Getting an fd for the pid just to then check if the pid is still valid right afterwards seems a bit pointless, I think it only makes sense if the pidfd gets passed into the process. If lxc hooks support that we can do it, but I couldn't find any information about that and the forkproxy which also runs as a hook does it like this as well.
I don't think it would be particularly useful either. From my experiments lxc seems to wait for the hook to finish before continuing container startup, so there should be no way for the pid to get recycled in the meantime.
@stgraber do you know if we can test this in the github runners? I have a feeling that they might not get enough privileges. Apart from that, I think this could be ready, the TODO points I listed would be nice to have of course but not mandatory I think, and I currently don't have the time to figure them out. |
The tests run as root inside of the Github runner VMs, so that should be fine as far as permissions. |
Alright then I'll try to come up with some tests for this. |
Yep, no internet restrictions from the runners, so you can download and build a test then push it into a test container easily enough. |
@gwenya ping |
Sadly didn’t get around to this yet, was more busy with work recently. I hope I’ll get around to it soon, but it feels like a somewhat tricky thing to figure out so it might actually take a while. |
@gwenya did you have some time to look at this one again? |
Sorry for not responding in so long, I will give it a try now |
ca9ada7
to
ead8313
Compare
@gwenya great to have you looking into this again! |
The forkbpf helper creates a socket pair and forks, the child process then enters the user and mount namespaces of the container. The child process creates a filesystem configuration context and passes it to the parent process via the socketpair. The parent process configures the token delegation options and creates a mount file descriptor and passes that back to the child. The child process then attaches the mount to the desired path. Signed-off-by: Gwendolyn <[email protected]>
The token delegation is automatically enabled if any of the delegate_* options are set. It works by configuring forkbpf as a start-host lxc hook. The delegate_* options are not properly validated since they refer to internal kernel enums and therefore depend on the running kernel. Signed-off-by: Gwendolyn <[email protected]>
Signed-off-by: Gwendolyn <[email protected]>
Signed-off-by: Gwendolyn <[email protected]>
Signed-off-by: Gwendolyn <[email protected]>
Signed-off-by: Gwendolyn <[email protected]>
Signed-off-by: Gwendolyn <[email protected]>
Signed-off-by: Gwendolyn <[email protected]>
Signed-off-by: Gwendolyn <[email protected]>
Signed-off-by: Gwendolyn <[email protected]>
@stgraber do you have an idea what might cause this test failure? I have never seen that error message ("ERROR: Output buffer space exceeded") before and google only finds me an unconfirmed bug report in the I'm not sure if it even makes sense to build that tool in the tests, we could instead build it once and host it somewhere to download, or just wait until bpftool tags a new release that includes the I also appreciate input on the other open points in the task list, I'm not coming up with any good solutions for them and am tempted to just document those shortcomings and leave them be for now. |
This adds support for bpf token delegation (closes #2099).
TODOs:
--debug
flaglow_level
access