[OBSOLETE] Add an OCI Image Layout command to the ociutil tool, and a corresponding Bazel rule. #58

captainreality · 2024-05-22T17:22:48Z

The spec at https://github.com/opencontainers/image-spec/blob/main/image-layout.md describes an OCI Image Layout directory structure. This PR updates the ociutil tool with a new command to produce such directories based on an input oci_image_layout. It adds an oci_image_format Bazel rule that can be used to produce such directories.

The spec at https://github.com/opencontainers/image-spec/blob/main/image-layout.md describes an OCI Image Layout directory structure. This commit updates the ociutil tool with a new command to produce such directories based on an input layout.

This commit creates a bazel rule that produces an OCI Image Format directory based on a provided OCI Image index. The OCI Image Layout is a standardized OCI format described in https://github.com/opencontainers/image-spec/blob/main/image-layout.md.

guyboltonking · 2024-06-14T11:03:00Z

oci/oci_image_layout.bzl

+        In order for this rule to work correctly in its current state, the
+        following flags must be provided to bazel:
+        --incompatible_strict_action_env=false
+        --spawn_strategy=local
+
+        The incompatible_strict_action_env flag is required because in order to
+        access the registry, a credential helper executable (named
+        docker-credential-<SOMETHING>) must be available so that ocitool can
+        execute it. The incompatible_strict_action_env flag makes the system
+        PATH available to bazel rules.
+
+        The spawn_strategy flag must be set to local because currently,
+        oci_image_index is only declaring the new JSON files it creates as
+        outputs; it's not declaring any manifests or layers from the images as
+        outputs. By default, Bazel only permits rules to access specifically
+        declared outputs of the rule's direct dependencies. In order for this
+        rule to access the transitive set of outputs of all dependencies, we
+        must disable bazel's sandboxing by setting spawn_strategy=local.


These requirements are fairly big impediments to using this rule, since users would either have to invoke it in a separate bazel instance or we'd have to investigate using bazel transitions to switch these flags off for those specific targets.

I completely understand why you want oci_image_layout to pull as part of bzl build, since you want to use it as part of a bazel test rule that scans the local image. However, downloading from the network during the bazel build phase is considered bad practice: this is meant to happen in repository rules, and that's why you're having to jump through hoops when you do it in a build rule. Having a repository rule (say oci_pull_image_layout) would make sense from that perspective, but would make usage fairly awkward, and you would not be able to depend on oci_push targets.

The more I look at this, the more I think that it's appropriate to model this as something that is executed not as part of a bzl test dependency, but as a bzl run target i.e. you want something that:

can refer to the output of the oci_push rule, so that can be fed to the...

...call of the oci_image_layout tool, which can use it to prime the download of the entire image (which you could point at a temporary location, and make it smart enough to avoid downloading things if they were already there), which is then fed to...

...the analysis tools.

This means that instead of saying bzl test <target-that-depends-on-oci-image-layout> you'd have to use a special target that you invoked with bzl run; this is quite a specialized usage, so that should be okay? Certainly, in its current form, a test that uses the rule as it stands will fail unless flags are specified that violate hermeticity, which means you can't run it as part of a wildcard bzl test //....

I'm sorry not to be able to approve this immediately; does any of the above sound reasonable, or are these solutions you've already considered and dismissed?

Further thoughts:

The requirement here is to make the entire built image available locally for security scanning

The built image consists of:

the base image (which oci_pull currently elides, replacing it with a reference)

the layers on top of the base image (which are built locally by rules_oci)

Thus, to create a complete local image, we need a variant of the oci_pull repository rule that pulls all the remote layers; this could then be referred to in the provider output of the oci_image rule, and used by oci_image_layout.

This moves all network access back into repository rules, without too much awkwardness (since you already have an oci_pull for the base image).

Thank you very much for the thorough review; I really appreciate it. Since I'm doing this to create a test rule that depends on it, I'd very much like the test target to be hermetically executable with bzl test. I'd also like to avoid pushing the image to a registry first; the idea being that images should be able to pass this test before we push them.

Your second idea seems very doable to me; create a variant of oci_pull that pulls down all necessary base blobs. The oci_image_layout would then be able to access the output of the oci_pull. It shouldn't be too difficult to make providers in oci_image_layout that access the oci_pull outputs. This seems a very clean way of doing it that matches the way bezel repository rules are intended to be used.

One question that arises is, with this approach, should I create a new rule (e.g. oci_pull_all_local or some similar name), or modify the attributes passed to oci_pull to parameterize it. e.g. have a boolean that describes whether to pull all the blobs locally. I'd be interested to hear your thoughts on this.

I'm also not completely clear yet how an oci_image_format rule that consumed an oci_image_index could declare the necessary oci_pull targets, because the base image details are part of oci_image, but don't seem to be outputs of oci_image_index. I'm thinking we might need to expand the outputs of oci_image_index in this case so that it contains details of the base images.

One question that arises is, with this approach, should I create a new rule (e.g. oci_pull_all_local or some similar name), or modify the attributes passed to oci_pull to parameterize it. e.g. have a boolean that describes whether to pull all the blobs locally. I'd be interested to hear your thoughts on this.

I think we should have a separate repository_rule, oci_pull_all. That should ensure that the expense of downloading all the layers is only borne by the test; building/testing other targets should not trigger the download. You can DRY things up by having an oci_pull macro with a flag that determines whether to invoke oci_pull_all in addition to the oci_pull rule (although that's probably icing on the cake).

Circling back on this, if we create an oci_pull_all rule (or have a parameterized macro with a boolean that specifies to do pull_all as well as pull), we would need a way for oci_image (and transitively, oci_image_index) to tell it to pull all. I'm thinking that we would need to add a new (optional) param to oci_image and oci_image index that would be passed down the dependency tree and eventually end up being passed to oci_pull (or used to decide to call oci_pull_all). Does this sound right to you?

It would also mean that any oci_image_index targets that we want to run my new test rule on would need to have this new parameter set. Ideally, the test would somehow pass it in, but other uses of oci_image_index would not do so. Unfortunately, I can't think off the top of my head how to achieve that. Any suggestions welcome :).

If what I've suggested sounds sane to you, I'll look at implementing it this week. Thanks again for your help with this.

TL;DR: I think the only place you need an extra parameter is in oci_image; everything else can be handled by passing the extra information through via providers.

Could we:

make oci_image take a new, optional pulled_base attr, that would be used to pass in the target created by oci_pull_all

pass that information out in a new provider, say OCIPulledImageInfo, that would only be present if pulled_base was set

oci_image_index (and downstream rules) could then look for that provider on its manifests inputs and pass it on (OCIPulledImageInfo would have to be more than just a simple pulled_base to handle the multiple pulled_bases on its input; you'd need to consider what information is needed by your new oci_image_layout rule, and make it trivially available in the new provider)

I thought about it, and that sounds like a good approach. It will mean that oci_image_layout won't work unless all the base images that it transitively depends upon have pulled_base set... but I can't think of any way around that with the way bazel works.

Interestingly, Google's oci_pull (https://github.com/bazel-contrib/rules_oci/blob/main/oci/private/pull.bzl) seems to do want we want for this; it even keeps it all in OCI Image Layout format. They have a todo in there to avoid eager download of layers, but the current code downloads everything. I wonder if we could reuse it. We can discuss that in the meeting I've organized.

captainreality force-pushed the kim.mason/imagelayout branch 6 times, most recently from 0333db3 to 1f2aaaf Compare May 23, 2024 23:21

captainreality force-pushed the kim.mason/imagelayout branch 2 times, most recently from fcda066 to 0f02cd2 Compare June 3, 2024 22:03

Create oci_image_layout bazel rule.

df23a10

This commit creates a bazel rule that produces an OCI Image Format directory based on a provided OCI Image index. The OCI Image Layout is a standardized OCI format described in https://github.com/opencontainers/image-spec/blob/main/image-layout.md.

captainreality force-pushed the kim.mason/imagelayout branch from 0f02cd2 to df23a10 Compare June 3, 2024 22:09

captainreality marked this pull request as ready for review June 3, 2024 22:19

captainreality changed the title ~~Add an OCI Image Layout command to the ociutil tool.~~ Add an OCI Image Layout command to the ociutil tool, and a corresponding Bazel rule. Jun 3, 2024

captainreality requested review from guyboltonking and a team June 7, 2024 19:39

captainreality marked this pull request as draft June 10, 2024 22:13

captainreality marked this pull request as ready for review June 10, 2024 22:13

captainreality requested review from a team and removed request for a team June 11, 2024 15:46

guyboltonking reviewed Jun 14, 2024

View reviewed changes

captainreality marked this pull request as draft September 18, 2024 21:33

captainreality changed the title ~~Add an OCI Image Layout command to the ociutil tool, and a corresponding Bazel rule.~~ [OBSOLETE] Add an OCI Image Layout command to the ociutil tool, and a corresponding Bazel rule. Sep 18, 2024

captainreality closed this Sep 25, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[OBSOLETE] Add an OCI Image Layout command to the ociutil tool, and a corresponding Bazel rule. #58

[OBSOLETE] Add an OCI Image Layout command to the ociutil tool, and a corresponding Bazel rule. #58

captainreality commented May 22, 2024 •

edited

Loading

guyboltonking Jun 14, 2024 •

edited

Loading

guyboltonking Jun 17, 2024

captainreality Jun 18, 2024 •

edited

Loading

guyboltonking Jun 19, 2024 •

edited

Loading

captainreality Jul 10, 2024

guyboltonking Jul 10, 2024

captainreality Jul 10, 2024

[OBSOLETE] Add an OCI Image Layout command to the ociutil tool, and a corresponding Bazel rule. #58

[OBSOLETE] Add an OCI Image Layout command to the ociutil tool, and a corresponding Bazel rule. #58

Conversation

captainreality commented May 22, 2024 • edited Loading

guyboltonking Jun 14, 2024 • edited Loading

Choose a reason for hiding this comment

guyboltonking Jun 17, 2024

Choose a reason for hiding this comment

captainreality Jun 18, 2024 • edited Loading

Choose a reason for hiding this comment

guyboltonking Jun 19, 2024 • edited Loading

Choose a reason for hiding this comment

captainreality Jul 10, 2024

Choose a reason for hiding this comment

guyboltonking Jul 10, 2024

Choose a reason for hiding this comment

captainreality Jul 10, 2024

Choose a reason for hiding this comment

captainreality commented May 22, 2024 •

edited

Loading

guyboltonking Jun 14, 2024 •

edited

Loading

captainreality Jun 18, 2024 •

edited

Loading

guyboltonking Jun 19, 2024 •

edited

Loading