-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: add encrypted scratch space for sandbox #2212
base: main
Are you sure you want to change the base?
RFC: add encrypted scratch space for sandbox #2212
Conversation
a6c6fe4
to
c19898a
Compare
@@ -64,6 +64,8 @@ type ServerConfig struct { | |||
SecureCommsPpOutbounds string | |||
SecureCommsKbsAddress string | |||
PeerPodsLimitPerNode int | |||
UseEncryptedDisk bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
leftover
5fb77d7
to
ab7fccf
Compare
@@ -30,6 +30,7 @@ optionals+="" | |||
[[ "${SECURE_COMMS_PP_OUTBOUNDS}" ]] && optionals+="-secure-comms-pp-outbounds ${SECURE_COMMS_PP_OUTBOUNDS} " | |||
[[ "${SECURE_COMMS_KBS_ADDR}" ]] && optionals+="-secure-comms-kbs ${SECURE_COMMS_KBS_ADDR} " | |||
[[ "${PEERPODS_LIMIT_PER_NODE}" ]] && optionals+="-peerpods-limit-per-node ${PEERPODS_LIMIT_PER_NODE} " | |||
[[ "${DISK_SIZE}" ]] && optionals+="-disk-size ${DISK_SIZE} " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For AWS provider we use a param "ROOT_VOLUME_SIZE" as a way to expand the VM root volume size if requested explicitly. May be the same can be reused? Or we can settle on another name for all the providers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ROOT_VOLUME_SIZE would be good IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should it be made a generic param or copied to azure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Afaik it will not be straight forward to implement disk resize for all the providers. So I think copying to azure should be ok for now.
ab7fccf
to
0e961cf
Compare
src/cloud-api-adaptor/podvm-mkosi/mkosi.skeleton/usr/lib/repart.d/30-scratch.conf
Show resolved
Hide resolved
Would it also make sense to extend this approach to use add-on disk like available with some instance types ? Not needed for initial iteration, but is this something we should consider and lay the foundation now?
I think it would be better to also add an annotation (eg |
for future extensibility, I think we can freely iterate. We don't currently have a version contract between PodVM and CAA. We would have to look at the user-facing api maybe to make this possible.
I agree, I didn't spot any existing fitting annotation in kata at first glance, so we might have to add it. |
0e961cf
to
6655aa3
Compare
6655aa3
to
34a553c
Compare
Yes, a new annotation will need to be added. |
ea95283
to
a097ec7
Compare
a097ec7
to
ba8cc4c
Compare
This introduces a new CAA parameter that allows to specify the disk size. If we have a disk size that is larger than 0 a cloud provider can attempt to consider this size to create space for an encrypted scratch partition that can be mounted to /run/container. Signed-off-by: Magnus Kulke <[email protected]>
This adds the configuration for an encrypted scratch space on an mkosi image. At bootup a /dev/sda4 partition will be created and encrypted with LUKS using an ephemeral key. The partition will use the space available on the image volume. By default the qcow2 image has 100mb allocated for this space. This amount of space will only work for very small images, hence we do not mount the scratch space to `/run/kata-container` by default. If the kata-agent service units encounters a `/run/peerpod/mount-scratch` file it will mount the encrypted partition `/dev/sda4` to `/run/kata-containers`. This file is provisioned by `process-user-data`, configured by the CAA daemonset. Signed-off-by: Magnus Kulke <[email protected]>
The CI wasn't building debug images, due string/bool mismatch. Signed-off-by: Magnus Kulke <[email protected]>
4b05766
to
c0c9f4c
Compare
fixes #1921
This is a PoC to address the problem of secure scratch space in which we can download and unpack images to the sandbox, write files at runtime etc.
Currently this is either backed by storage that is not protected by the TEE (packer) or as tmpfs mounts that will consume ram during runtime (mkosi). For larger images like pytorch w/ cuda this implies in theory that we have to set aside 8GB of ram just to be able to download and unpack the image. tmpfs mounts are by default limited to a ratio of the ram, so we might end up having to overprovision the machine to be able to run a workload.
The proposes change introduces a CAA-wide parameter that allows to specify the disk size. A cloud provider can use this as an indicator of the size of the to-be-provisioned podvm disk. A zero value means that we stick to the default size of the image.
if a size has been given (the mkosi image is ~0.5gb) it will extend the image to that size. At launch the podvm will claim that space and create an ad-hoc encrypted volume on it. Depending on the presence of a marker file in user-data the agent unit will mount that to
/run/kata-containers
prior to launching kata-agent.I'm not super happy about the API (I would prefer to specify it per pod) and the implementation (the marker file) but I wanted to push early to get feedback. (also, it's only implemented for azure at the moment)
FWIW it does what it's supposed to do, see some numbers on #1958