WIP: feat: split SELinux policy into core and scenario rpms, create test for core policy #639

Draft
bfjelds wants to merge 59 commits into main from user/bfjelds/mjolnir/split-selinux-policy

Member

@bfjelds bfjelds commented May 11, 2026

Summary

Decomposes the monolithic trident SELinux policy into a core module plus scenario-specific additive modules, and adds a pipeline stage to validate the core policy in isolation.

SELinux Policy Decomposition

The single trident.pp module has been split into six independent modules that layer additively:

trident.pp (core, ships in trident-selinux RPM):

  • All permissions required by every trident deployment
  • Removed scenario-specific and test-only blocks (see below)
  • Removed audit2allow artifacts (man pages, xkb keyboard layouts)
  • Tightened permissions on optional binaries (chronyd, logrotate, sudo, kadmind, gpg-agent): removed unnecessary execute/execute_no_trans where only getattr/read/map is needed

trident-selinux-raid (new subpackage in trident.spec):

  • mdadm execution, unit files, and runtime directories
  • bootloader_t tmpfs access for RAID configurations

trident-selinux-encryption (new subpackage in trident.spec):

  • TPM device access (dev_rw_tpm)
  • systemd-pcrphase execution and tmpfs state
  • LVM/cryptsetup permissions (semaphores, keyring, tmp reads)
  • tcsd (TPM daemon) state directory management

trident-selinux-grub (new subpackage in trident.spec):

  • Bootloader execution (bootloader_exec)
  • /boot directory management and file relabeling
  • dracut/loadkeys support for initrd regeneration

trident-selinux-cloud-init (new subpackage in trident.spec):

  • Trident ↔ cloud-init coordination during provisioning
  • cloud-init management of unlabeled/usr content
  • udev access to cloud-init file descriptors

trident-test.pp (new trident-test-selinux RPM):

  • Steamboat CI transition (ci_unconfined_t -> trident_t)
  • Interactive unconfined_run_to for manual debugging
  • Must NOT be installed in production images

Systemd Update Service

New trident-update@.service template unit that runs trident update operations via a systemd domain transition (init_t -> trident_t) instead of direct SSH execution.

Pipeline: SELinux Update Validation Stage

New selinux-update-testing.yml stage added to the e2e pipeline that:

  1. Builds a UKI image with only the core trident-selinux policy (no trident-test-selinux)
  2. Deploys to a QEMU VM via netlaunch
  3. Runs an A/B update cycle through trident-update@.service
  4. Verifies the VM reboots successfully — proving the production SELinux path works without test policies

Build & Packaging Updates

  • Dockerfiles (azl3, full) updated to build and include trident-test-selinux RPM
  • Makefile dependencies updated for new policy directories
  • Test images (direct-streaming, installer, grub-verity, MOS) updated to install trident-test-selinux

@bfjelds
Member Author

bfjelds commented May 11, 2026

/azp run [GITHUB]-trident-pr-e2e

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@bfjelds bfjelds changed the title from "feat: split SELinux policy into public and test modules, invoke update via systemd" to "WIP: feat: split SELinux policy into public and test modules, invoke update via systemd" May 11, 2026
bfjelds and others added 3 commits May 11, 2026 12:41
Split the trident SELinux policy into two additive modules:

1. trident.pp (public) - ships in trident-selinux RPM, contains all
   production-required policies. Removed test-only blocks:
   - Steamboat CI transition (ci_unconfined_t -> trident_t)
   - Interactive unconfined_run_to transition

2. trident-test.pp (test) - new trident-test-selinux RPM, contains
   CI and interactive transitions. Must NOT be installed in production.

The test module uses SELinux 'require' blocks to reference types from
the base module, making it purely additive (no duplication).

Changes:
- New packaging/selinux-policy-trident-test/ with .te/.fc/.if files
- New packaging/rpm/trident-test-selinux.spec for standalone RPM build
- Updated Dockerfiles to build both RPMs in the same container
- Updated Makefile dependencies
- Updated test image definitions to install trident-test-selinux
- Added selinux-public-only test config placeholder

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add trident-update@.service template unit that runs trident update
operations (stage/finalize) via systemd domain transition (init_t ->
trident_t) instead of direct SSH execution (unconfined_t -> trident_t).

The service reads /var/lib/trident/update-env for the config path and
log level, and uses the instance name as the --allowed-operations value.

Update storm servicing tests to write the env file and start the
service instead of running 'sudo trident grpc-client update' directly.
This means the update tests now exercise the production SELinux domain
transition path.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The selinux/ directory may already exist when rpmbuild runs with
--build-in-place (e.g., from the test policy build running in the
same working directory).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@bfjelds bfjelds force-pushed the user/bfjelds/mjolnir/split-selinux-policy branch from 5550854 to 0daac44 Compare May 11, 2026 19:42
@bfjelds
Member Author

bfjelds commented May 11, 2026

/azp run [GITHUB]-trident-pr-e2e

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

bfjelds and others added 2 commits May 11, 2026 13:40
systemd EnvironmentFile doesn't handle multi-word values in ExecStart
correctly. Split TRIDENT_LOG_LEVEL='-v DEBUG' into separate
TRIDENT_VERBOSITY=DEBUG variable and hardcode '-v' in the service unit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add dedicated SELinux test images (tests/images/trident-selinux-testimage/)
that use SELinux enforcing mode with ONLY the public trident-selinux policy
(no trident-test-selinux). The base image includes trident-update@.service
for systemd-activated updates.

Revert the update.go changes that switched all servicing tests to use
systemd invocation — most tests have SELinux disabled so the change
added complexity without benefit. The systemd update path will be
validated through the dedicated SELinux test images instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@bfjelds bfjelds force-pushed the user/bfjelds/mjolnir/split-selinux-policy branch from c199319 to bac962c Compare May 11, 2026 20:57
bfjelds and others added 18 commits May 11, 2026 14:07
New testing_selinux/selinux-update-testing.yml stage that:
1. Builds SELinux-enforcing UKI images with public policy only
2. Installs base image on QEMU VM via netlaunch
3. Runs A/B update via trident-update@stage.service and trident-update@finalize.service (systemd)
4. Validates VM reboots and SELinux remains enforcing with no AVCs

Added to pr-e2e pipeline in e2e-template.yml.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This test config was a placeholder that's no longer needed — the
SELinux validation is handled by the dedicated pipeline stage, not
the e2e test matrix.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use the standard trident_images/build-image.yml template instead of
directly calling trident-testimg-template.yml. Split into a separate
BuildSELinuxTestImages stage for the image builds.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…load

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The SELinux test images are built in a separate stage and need to be
explicitly downloaded via DownloadPipelineArtifact@2 before use.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The SELinux test needs the trident-installer ISO for netlaunch.
Added DownloadPipelineArtifact and TridentTestImg_trident_installer
dependency.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace manual DownloadPipelineArtifact calls with the standard
download-test-images.yml template, which handles ISO, Go tools,
prepare-images, and chmod in one shot.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ages.yml

Only download the 4 artifacts actually needed: Go tools (netlaunch,
storm-trident, virtdeploy), installer ISO, SELinux base image, and
SELinux update COSI. No unnecessary testimage downloads or
prepare-images processing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
build-image.yml produces *_0.cosi and *_1.cosi by default, so a
separate update image yaml is unnecessary. Single build job now
produces both base and update COSIs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…config

The SELinux test uses tests/e2e_tests/trident_configurations/usr-verity/
trident-config.yaml which expects usrverity.cosi. Rename the build
artifacts (*_0.cosi -> usrverity.cosi, *_1.cosi -> usrverity_v2.cosi).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
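
The artifact rename described in the commit above could be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual step: the function name `rename_cosi_artifacts` and the idea of a single flat artifact directory are assumptions; only the `*_0.cosi -> usrverity.cosi` and `*_1.cosi -> usrverity_v2.cosi` mapping comes from the commit message.

```python
from pathlib import Path

# Suffix -> target name mapping taken from the commit message.
# Everything else here (function name, flat directory layout) is hypothetical.
RENAMES = {"_0.cosi": "usrverity.cosi", "_1.cosi": "usrverity_v2.cosi"}


def rename_cosi_artifacts(artifact_dir: Path) -> dict:
    """Rename the *_0.cosi and *_1.cosi files in artifact_dir.

    Returns a mapping of old file name -> new file name.
    """
    renamed = {}
    for suffix, target in RENAMES.items():
        matches = sorted(artifact_dir.glob(f"*{suffix}"))
        # Refuse to guess when the build produced zero or several candidates.
        if len(matches) != 1:
            raise FileNotFoundError(
                f"expected exactly one *{suffix} in {artifact_dir}, "
                f"found {len(matches)}"
            )
        matches[0].rename(artifact_dir / target)
        renamed[matches[0].name] = target
    return renamed
```

Failing loudly when a suffix matches zero or several files is a design choice of this sketch, so that a changed build output does not silently serve the wrong image.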
The SELinux test image uses BaseImage.QEMU_GUEST in testimages.py,
so the pipeline needs to download the matching qemu_guest base image.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ocal paths

Copy files and scripts from trident-vm-testimage into the SELinux test
image directory and update baseimg.yaml to use local paths instead of
relative references to sibling directories.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Rename directory structure:
  tests/images/trident-selinux-testimage/base/ ->
  tests/images/trident-testimage/selinux/

Update testimages.py config and ssh_key paths to match.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
bfjelds and others added 29 commits May 11, 2026 20:19
Use the standard build_image template which integrates with test-images
repo via @test-images resource. Removes the custom BuildSELinuxTestImages
stage in favor of the shared TridentTestImg_trident_selinux_testimage stage.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
No longer needed — the build_image/build-image.yml template in the
trident repo delegates to test-images which handles clones via its
own build-image.yml template with the clones parameter.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The systemctl start failure exits the step before journal capture.
Restructure to capture the exit code, always dump the journal and
audit log, then fail at the end. This ensures we can diagnose
SELinux denials when the service fails.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
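
The "capture the exit code, always dump diagnostics, then fail at the end" pattern from the commit above might look like this in outline. It is a sketch under assumptions: the helper name `run_then_always_collect` and the injectable `run` parameter are hypothetical, and the real pipeline step is a bash script that runs these commands over SSH.

```python
import subprocess
from typing import Callable, Sequence


def run_then_always_collect(
    main_cmd: Sequence[str],
    diagnostic_cmds: Sequence[Sequence[str]],
    run: Callable = subprocess.run,
) -> int:
    """Run main_cmd, then every diagnostic command, and return main_cmd's exit code."""
    # check=False: a non-zero exit must not raise before diagnostics run.
    main = run(list(main_cmd), check=False)
    for cmd in diagnostic_cmds:
        # Diagnostics are best-effort: a failing dump must not hide the real result.
        try:
            run(list(cmd), check=False)
        except OSError as exc:
            print(f"diagnostic {list(cmd)!r} could not run: {exc}")
    return main.returncode
```

A caller would pass the service start as `main_cmd`, the journal and audit-log dumps as `diagnostic_cmds`, and exit with the returned code only after all of them have run.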
The SELinux test image doesn't have prepare-update-config-verity.sh
in postCustomization (it's based on host.yaml which doesn't include
it). Write the update config via SSH before starting the service,
pointing at the COSI served by netlaunch on the gateway.

Also add ConnectTimeout to ausearch SSH call to prevent hangs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The VM needs to download the update COSI via HTTP. Start netlisten
in a separate step before the update steps, serving artifacts/test-image
on port 4000. The background process persists across pipeline steps.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
netlisten has phonehome/logstream overhead and may not start cleanly
as a background process. Use python3 -m http.server which is simple,
reliable, and already available on the pipeline agent.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use netlisten to align with other tests. Add a retry loop (10 attempts
x 2s) to wait for netlisten to start listening before proceeding.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
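
A readiness wait like the retry loop in the commit above (10 attempts, 2 seconds apart) could be sketched as below. The function name `wait_for_listener` is hypothetical, and probing with a plain TCP connect is an assumption; the actual step may check readiness differently.

```python
import socket
import time


def wait_for_listener(host: str, port: int, attempts: int = 10, delay_s: float = 2.0) -> bool:
    """Return True once a TCP connect to host:port succeeds, False if every attempt fails."""
    for attempt in range(1, attempts + 1):
        try:
            # A successful connect means something is accepting on the port.
            with socket.create_connection((host, port), timeout=delay_s):
                return True
        except OSError:
            # Connection refused or timed out; sleep unless this was the last try.
            if attempt < attempts:
                time.sleep(delay_s)
    return False
```

For the server on port 4000 mentioned earlier in the timeline, a caller would use `wait_for_listener("localhost", 4000)` and fail the step when it returns `False`.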
Background processes from & may not survive between pipeline steps.
Start netlisten in the same bash step as the update service invocation,
matching the pattern used by e2e-test-run.yml. Also start netlisten
in the finalize step for post-reboot commit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
netlisten was failing silently because stdout/stderr were redirected
to a file. Remove the redirect so output appears in the pipeline log.
Add kill -0 check to detect if netlisten crashed on startup. List
serve directory contents for debugging.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add hard failure if netlisten never becomes ready after 20 seconds.
Dump process list and socket status for debugging.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Keep netlisten output in log files but copy them to the artifact
output directory for post-run debugging. Also dump netlisten.log
in the fail-fast block.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add ConnectTimeout=10, ServerAliveInterval=5, and ServerAliveCountMax=3 to SSH
calls, plus timeoutInMinutes=3 on the step. Use set -x instead of set -eux
so partial failures don't prevent other log collection.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The ausearch command hangs for 10 minutes causing the step to timeout,
even though staging succeeded. Wrap with 'timeout 30' and add SSH
keepalive options.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
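
The effect of wrapping a command that may hang in a hard deadline, as the commit above does with `timeout 30`, can be illustrated like this. The helper name `run_with_deadline` is hypothetical, and the commit itself uses the shell `timeout` utility rather than Python.

```python
import subprocess
from typing import Optional, Sequence


def run_with_deadline(cmd: Sequence[str], deadline_s: float) -> Optional[int]:
    """Run cmd; return its exit code, or None if it exceeded deadline_s and was killed."""
    try:
        # subprocess.run kills the child and reaps it when the timeout expires.
        return subprocess.run(list(cmd), timeout=deadline_s, check=False).returncode
    except subprocess.TimeoutExpired:
        return None
```

Returning `None` on timeout lets the caller treat "audit search hung" as a diagnostic gap rather than a test failure, which matches the commit's observation that staging had already succeeded.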
After reboot, sshd needs time to start. Retry SSH connection up to
30 times (5s intervals, ~2.5 min max) before running validation
checks. Also add ConnectTimeout and timeout on ausearch.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
wait-for-login was matching the install boot's login prompt still in
the serial log, returning immediately before the VM actually rebooted.
Truncate the serial log before starting finalize so wait-for-login
only sees the post-reboot login prompt.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Multiple <log file=...> elements in the domain XML caused SERIAL_LOG
to be a multi-line string. Use head -1 to get only the first match.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
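
Picking only the first `<log file=...>` element, which the commit above does with `head -1`, could also be done by parsing the domain XML. This is a sketch: the function name `first_serial_log_path` and the sample paths in the usage note are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def first_serial_log_path(domain_xml: str) -> Optional[str]:
    """Return the file attribute of the first <log> element in document order, or None."""
    root = ET.fromstring(domain_xml)
    # iter() walks the tree in document order, so the first hit mirrors `head -1`.
    for log in root.iter("log"):
        path = log.get("file")
        if path:
            return path
    return None
```

Given a domain definition with `<log file="/tmp/serial0.log"/>` under `<serial>` and `<log file="/tmp/console0.log"/>` under `<console>`, the function returns the `<serial>` entry when that element comes first, so the result is always a single-line path.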
After reboot, the VM may get a new DHCP lease or the old IP may not
be routable yet. Query libvirt for the current VM IP instead of using
the stale IP from virt-deploy-metadata.json.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Host-side: VM domain state, interfaces, bridge config, ARP table,
DHCP leases, serial log tail on IP resolution failure.
VM-side: network info, SELinux modules, service statuses, recent journal.
Ping check before SSH to distinguish network vs SSH issues.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
wait-for-login returns when the serial console shows the login prompt,
but networking (systemd-networkd, DHCP) hasn't started yet. Poll with
ping (30 x 3s) after wait-for-login to ensure the VM is network-ready
before the validation step runs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Simpler invocation, trace-level logging for maximum diagnostics.
Removes TRIDENT_VERBOSITY env var since verbosity is now hardcoded.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of handwriting a minimal update config, copy the full install
trident-config.yaml (which includes os, storage, etc.) and update just
the image URL and internalParams. This ensures the update has all the
sections trident expects.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move RAID-only policies to a separate additive module:
- mdadm exec/unit/runtime permissions (optional_policy block)
- bootloader_t tmpfs access for RAID

New packaging/selinux-policy-trident-raid/ with .te/.fc/.if files.
New trident-selinux-raid subpackage in trident.spec that Requires
trident-selinux. Only install on systems using RAID storage.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…x-encryption

Move encryption/pcrlock-only policies to a separate additive module:
- trident_t: systemd_pcrphase_exec_t execute, dev_rw_tpm
- lvm_t: sem (encrypted volumes), key search (luksOpen), tpm write, tmp read
- systemd_pcrphase_t: tmpfs access for PCR measurements

New packaging/selinux-policy-trident-encryption/ with .te/.fc/.if files.
New trident-selinux-encryption subpackage in trident.spec.
Only install on systems using encryption or pcrlock features.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move GRUB+dracut-only policies to a separate additive module:
- bootloader_exec (GRUB tool execution)
- boot_t dir/file management (/boot)
- files_create_boot_dirs, files_manage_boot_files
- loadkeys_exec_t and loadkeys_t optional blocks

Not needed on UKI/systemd-boot systems where trident skips GRUB
config, initrd regeneration, and /boot management entirely.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace vague 'Policies below must be optional for Steamboat' with a
clear explanation that these policies exist because trident encounters
files/dirs owned by optional package types during OS provisioning and
relabeling (bluetooth, colord, dhcpd, etc.).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move all cloud-init interaction policies to a separate additive module:
- trident_t: getattr on cloud_init_exec_t/cloud_init_state_t, ps_process_pattern
- cloud_init_t: manage unlabeled/usr files trident creates
- udev_t: use cloud_init_t fd/fifo during device events

Install on any system using cloud-init (standard for Azure Linux).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Remove unnecessary 'execute execute_no_trans' from chronyd, gpg-agent,
  logrotate, sudo, and kadmind — trident never executes these, it only
  inspects them during filesystem relabeling. Narrowed to getattr/read/map.
- Remove miscfiles_read_man_pages — audit2allow artifact, trident does
  not read man pages.
- Remove xkb_var_lib_t — audit2allow artifact, trident does not
  configure keyboard layouts.
- Move tcsd_var_lib_t to trident-encryption module (TPM-related).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@bfjelds bfjelds changed the title from "WIP: feat: split SELinux policy into public and test modules, invoke update via systemd" to "WIP: feat: split SELinux policy into core and scenario rpms, create test for core policy" May 14, 2026
