docs: add use cases #4

Draft
wants to merge 2 commits into
base: main
5 changes: 2 additions & 3 deletions README.md
@@ -1,9 +1,8 @@
# Kinvolk Seccomp Agent

The Kinvolk Seccomp Agent receives seccomp file descriptors from container runtimes and handles system calls on behalf of the containers.
Its goal is to support different use cases:
- unprivileged container builds (procfs mounts with masked entries)
- support of safe mknod (e.g. /dev/null)

See the [different use cases](docs/usecases.md)

It is possible to write your own seccomp agent with a different behaviour by reusing the packages in the `pkg/` directory.
The Kinvolk Seccomp Agent is only about 100 lines of code. It relies on different packages:
79 changes: 79 additions & 0 deletions docs/usecases.md
@@ -0,0 +1,79 @@
---
title: Use cases
weight: 10
description: >
Use cases for the Seccomp Agent.
---

The Seccomp Agent supports several use cases.

## Mounting procfs in unprivileged containers

An unprivileged Kubernetes pod might want to use
[RootlessKit](https://github.com/rootless-containers/rootlesskit). One step is
difficult in this setup: [mounting procfs in an unprivileged user
namespace](https://kinvolk.io/blog/2018/04/towards-unprivileged-container-builds/#the-exception-of-procfs-and-sysfs).
This is because Kubernetes pods normally run with a masked procfs (see
[AllowedProcMountTypes](https://kubernetes.io/docs/concepts/policy/pod-security-policy/#allowedprocmounttypes)
in the Pod Security Policy documentation).

To avoid running a pod with `ProcMountType=UnmaskedProcMount` (which could be a
security issue), users can run a seccomp agent that captures the `mount` system
call and performs the procfs mount in the inner container on behalf of the
container. This allows users to use RootlessKit while keeping the security of
a masked procfs mount.
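
The decision the agent makes for each captured `mount` call can be sketched
as follows. This is a minimal illustration, not the agent's actual code: the
`mountRequest` type, the `action` values, and the `decide` function are
hypothetical names, and the sketch assumes that only requests with filesystem
type `proc` are handled in the agent while everything else is left to the
kernel.

```go
package main

import "fmt"

// mountRequest holds the mount(2) arguments relevant to the decision.
// Hypothetical type, for illustration only.
type mountRequest struct {
	source, target, fstype string
}

// action is what the agent does with a captured system call.
type action int

const (
	letKernelContinue action = iota // the kernel processes the call as usual
	mountInAgent                    // the agent performs the mount itself
)

// decide handles procfs mounts in the agent and leaves all other
// mount requests to the kernel.
func decide(req mountRequest) action {
	if req.fstype == "proc" {
		return mountInAgent
	}
	return letKernelContinue
}

func main() {
	fmt.Println(decide(mountRequest{"proc", "/proc", "proc"}) == mountInAgent)   // true
	fmt.Println(decide(mountRequest{"tmpfs", "/tmp", "tmpfs"}) == mountInAgent) // false
}
```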

## Support for a subset of device mknod

A VPN container might need `/dev/net/tun` but cannot create the device without
`CAP_MKNOD`. Giving this capability to the container could be risky: the
container would be able to abuse the mknod call to get access to disks such as
`/dev/sda`.

The alternative could be to keep the container without `CAP_MKNOD` and add a
seccomp filter on `mknod` to let the Seccomp Agent run `mknod()` on behalf of
the container, but only for a subset of devices such as `/dev/net/tun`.
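
One way to restrict `mknod` to a subset of devices is an allowlist keyed by
device type and major/minor numbers. The sketch below is an assumption about
how such a check could look, not the agent's implementation: the `device`
type, the `allowed` map, and `mknodAllowed` are hypothetical names. The
file-type constants mirror the Linux `S_IFMT`, `S_IFCHR`, and `S_IFBLK`
values, and the device numbers are the standard Linux ones (`/dev/null` is
character device 1:3, `/dev/net/tun` is 10:200, `/dev/sda` is block device
8:0).

```go
package main

import "fmt"

// File-type bits of the mknod mode argument (Linux values).
const (
	sIFMT  = 0o170000 // mask selecting the file-type bits
	sIFCHR = 0o020000 // character device
	sIFBLK = 0o060000 // block device
)

// device identifies a device node by type and major/minor numbers.
type device struct {
	fileType uint32
	major    uint32
	minor    uint32
}

// allowed is a hypothetical allowlist of devices the agent may create.
var allowed = map[device]string{
	{sIFCHR, 1, 3}:    "/dev/null",
	{sIFCHR, 10, 200}: "/dev/net/tun",
}

// mknodAllowed reports whether a mknod request for the given mode and
// device numbers matches an allowlisted device.
func mknodAllowed(mode, major, minor uint32) bool {
	_, ok := allowed[device{mode & sIFMT, major, minor}]
	return ok
}

func main() {
	fmt.Println(mknodAllowed(sIFCHR|0o666, 10, 200)) // /dev/net/tun: true
	fmt.Println(mknodAllowed(sIFBLK|0o660, 8, 0))    // /dev/sda: false
}
```

With a check like this, a request that does not match the allowlist would be
rejected (for example with `EPERM`) instead of being executed by the agent.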

## Rootless Containers without /etc/subuid (`subuidless`)

The goal of subuidless is to allow running containers without /etc/subuid,
which isn't a good fit for shared environments.

See:
https://github.com/rootless-containers/subuidless

## Accelerator for slirp4netns (`bypass4netns`)

When using slirp4netns as a networking solution for rootless containers, the
performance impact can be significant. However, by capturing the `connect` call
and handling it in the seccomp agent, we avoid that impact: the network
traffic is no longer routed through a userspace process.

See:
https://github.com/rootless-containers/bypass4netns

## Emulating privileged sysctl

TODO

## Detection and reporting of unusual behavior with system calls

TODO

## Error injections (Chaos Engineering)

The seccomp policy could include a scenario defining which system calls should
fail.

## Network Proxy/sniffing/load-balancing

In the future, the seccomp agent could be used to redirect connections to a
network proxy for debugging. Another option is to sniff the payload sent and
allow the target to continue afterwards.

On a similar note, if we notify on the `connect` syscall, we can do load
balancing with [quite good performance compared to envoy and haproxy][link] in
conjunction with `pidfd_getfd()`. As the linked article mentions, the agent can
also retrieve connection information, for example with `TCP_INFO`.
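
The backend selection step of such a load balancer could be as simple as a
round-robin picker, sketched below. This covers only the choice of a target
address for each captured `connect` call; retrieving the socket with
`pidfd_getfd()` and connecting it are not shown. The `roundRobin` type and
the backend addresses are hypothetical.

```go
package main

import "fmt"

// roundRobin cycles through a fixed list of backend addresses.
// It is not safe for concurrent use.
type roundRobin struct {
	backends []string
	next     int
}

// pick returns the backend for the next captured connect call.
// It returns false if no backend is configured.
func (r *roundRobin) pick() (string, bool) {
	if len(r.backends) == 0 {
		return "", false
	}
	b := r.backends[r.next%len(r.backends)]
	r.next++
	return b, true
}

func main() {
	// Hypothetical backend addresses.
	lb := &roundRobin{backends: []string{"10.0.0.1:8080", "10.0.0.2:8080"}}
	for i := 0; i < 3; i++ {
		addr, _ := lb.pick()
		fmt.Println(addr)
	}
}
```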

[link]: https://people.kernel.org/brauner/the-seccomp-notifier-new-frontiers-in-unprivileged-container-development