fix(sandbox): two-phase Landlock to fix privilege ordering and add enforcement tests#810
fix(sandbox): two-phase Landlock to fix privilege ordering and add enforcement tests#810johntmyers wants to merge 7 commits intomainfrom
Conversation
Landlock status was only logged from inside the pre_exec child process where the tracing/OCSF pipeline is non-functional after fork. This made Landlock failures completely invisible in sandbox logs. Add a probe_availability() function that issues the raw landlock_create_ruleset syscall to check kernel support, and call it from the parent process before fork in all three spawn paths (entrypoint, SSH PTY, SSH pipe). Uses std::sync::Once to emit exactly once per sandbox lifetime. WIP - addresses logging gap from #803.
Verify Landlock availability logging and enforcement in e2e: - OCSF probe event appears in sandbox logs - Read-only paths block writes, allow reads - Read-write paths allow both - Paths outside policy are denied entirely - User-owned paths outside policy are still blocked (proves Landlock enforces independently of Unix DAC permissions) Requires Linux host with Landlock support (GitHub Actions runners, Docker Desktop linuxkit). Related to #803.
Two strict xfail tests that demonstrate the root cause of #803: - PathFd::new() runs as uid 998 after drop_privileges, so root-only paths (mode 700) silently fail and Landlock degrades - When ALL paths fail, best_effort silently drops Landlock entirely These tests will pass after the two-phase Landlock fix (open PathFds as root before drop_privileges, restrict_self after).
- Remove log-reading tests: the Landlock probe logs to the supervisor's container stdout, not the in-sandbox file appender at /var/log/openshell*.log* - Replace broken xfail test (checked for child-process log messages that never reach the file appender) with a stat()-based test that verifies /root is actually in the Landlock allowlist - Keep enforcement tests (all passing) and the root-only policy xfail test (correctly proves #803 bug)
Landlock doesn't restrict stat(), so the test passed unexpectedly. In the mixed policy case where /root is silently skipped, Landlock still applies (using other paths) and blocks /root even harder (not in allowlist = denied). The observable security degradation only occurs when ALL paths fail, which the existing xfail test already covers.
Split Landlock apply into prepare() and enforce(): - prepare() runs as root before drop_privileges: opens PathFds, creates ruleset, adds rules. Root-only paths (mode 700) now succeed instead of silently failing as uid 998. - enforce() runs after drop_privileges: calls restrict_self() which does not require root. This fixes the root cause of #803 where drop_privileges() ran before sandbox::apply(), causing PathFd::new() to fail on root-only paths. In best_effort mode this silently dropped all Landlock restrictions. The fix applies to all three spawn paths: entrypoint (process.rs), SSH PTY (ssh.rs), and SSH pipe exec (ssh.rs). Removes xfail marker from e2e test that now passes.
df272a6 to
2f8185f
Compare
- enforce() now respects best_effort: if restrict_self() fails and policy is best_effort, log and degrade instead of aborting startup - log_sandbox_readiness distinguishes best_effort (degraded) from hard_requirement (will fail) in OCSF messages
|
Local validation for PR #810:
I have not done a live supervisor swap yet, so this is code-path validation rather than deployment validation. |
|
Follow-up with live deployment validation:
So in this real deployment, I still do not see Landlock enforcement taking effect with the PR binary, even though the branch builds cleanly and the Landlock-focused Rust tests pass locally here. I rolled the cluster back to the stock supervisor immediately after the test. |
|
@mjamiv thanks for testing that. Are you able to build and deploy from the git branch directly on a host (i.e. |
|
Also |
|
Thanks for the correction — Method. Built PR #810 window (supervisor respawned ~
Stock v0.0.25 window (rolled back, new supervisor PIDs, same sandbox, probe ran at Grep returned nothing. No probe, no apply, no build. Stock is silent on Landlock — which lines up with #803. Conclusion. On a Linux 6.8 host with I didn't run Happy to grab a full sandbox log dump if useful. Rollback is clean, nothing left on the cluster from the test. |
Summary
Fix the Landlock privilege ordering bug reported in #803 and add comprehensive e2e tests for Landlock filesystem enforcement. The root cause was that
drop_privileges()ran beforesandbox::apply(), causingPathFd::new()to execute as uid 998 where root-only paths silently failed. Inbest_effortmode this could drop all Landlock restrictions entirely.Related Issue
Fixes #803
Changes
landlock.rs,mod.rs): Splitapply()intoprepare()(opens PathFds as root) andenforce()(callsrestrict_self()after privilege drop). Root-only paths (mode 700) now succeed instead of silently failing.landlock.rs,mod.rs): Addedprobe_availability()using rawlandlock_create_rulesetsyscall to check kernel support, with OCSF logging from the parent process where tracing works. Previously, Landlock status was only logged from thepre_execchild where the tracing pipeline is non-functional after fork.process.rs,ssh.rs): All three spawn paths (entrypoint, SSH PTY, SSH pipe) now use prepare-before-drop-enforce-after ordering.test_sandbox_landlock.py): 5 tests verifying read-only blocks writes, read-write allows both, paths outside policy are denied, and user-owned paths outside policy are still blocked (proves Landlock enforces independently of Unix DAC).Testing
cargo check -p openshell-sandboxpasses (no new warnings)cargo test -p openshell-sandbox— 416 tests passmise run e2e:python -- -k test_sandbox_landlock -v— 5 passedopenshell logsshowsCONFIG:PROBED [INFO] Landlock filesystem sandbox available [abi:v6]Checklist