diff --git a/semantics/README.md b/semantics/README.md
new file mode 100644
index 0000000..61ac86d
--- /dev/null
+++ b/semantics/README.md
@@ -0,0 +1,38 @@
+# SPDX 2.x Semantics Reference Archive
+
+This directory holds a number of semantic structure designs representing
+different types of repositories, artifacts and other commonly used software.
+
+The goal of these designs is to act as a source of reference for tool makers to
+ensure a unified structure in documents produced by SPDX 2.x tools.
+
+## Semantic Reference Designs
+
+The following types of software are represented in this directory:
+
+* OCI Container Images
+  * [MultiArch Index](oci-multiarch-index.md)
+  * [Container Image](oci-image.md)
+  * [Container Layer with Operating System Packages](oci-layer.md)
+* Software Repository
+  * [Universal Model for Code Repository](code-repository.md)
+* Operating System Package
+  * [Universal Model for OS Package](os-package.md)
+  * [rpm Package](os-rpm.md)
+  * [deb Package](os-deb.md)
+  * [apk Package](os-apk.md)
+
+
+## Design Considerations and Objectives
+
+Each of the designs contained here attempts to abstract the object as an SPDX
+package which can be referenced by itself, or moved to its own or to another
+SBOM, while still allowing for flexible details.
+
+For example, the SBOM of a container image can be referenced by itself to fully
+describe a container. But that same package can be moved to an SBOM describing
+a multi-arch index while preserving its structure.
+
+Another example: An RPM package can provide its own SBOM and the package in it can
+be repurposed by a container layer SBOM to describe all software installed via the
+OS package manager.
\ No newline at end of file
diff --git a/semantics/oci-image.md b/semantics/oci-image.md
new file mode 100644
index 0000000..a5e9af6
--- /dev/null
+++ b/semantics/oci-image.md
@@ -0,0 +1,117 @@
+# Containers: Container Image
+
+## Design Goals
+
+* Separation of layers to ensure OS dependencies and app data can be
+appended to the appropriate layer.
+* Separation of layers to be able to reuse them, for example when
+describing images sharing the same base image.
+* No registry or repository information in `PackageName` to ensure the SBOM
+is portable as the image is copied across registries.
+* Layer identification metadata encapsulated in the purl external reference[^1]
+* Ensure tools can differentiate which packages represent the image layers
+from the packages representing both the source code of the image and the
+base images.
+[^1]: This design uses an `os` field in the `oci` purl type which
+[has been proposed](https://github.com/package-url/purl-spec/pull/179) but
+is still waiting to be merged.
+
+## Structure Diagram
+
+```mermaid
+classDiagram
+direction LR
+class BaseImage {
+    PackageName: sha256:2256f59767967e5bf0a404b7
+    ExternalRef: PACKAGE_MANAGER purl pkg:oci/alpine@sha256:225[...]?arch=amd64&os=linux
+    PackageChecksum: SHA256 2256f59767967e5bf0a404b7
+}
+
+class Image {
+    PackageName: sha256:2def8ff3690355a
+    ExternalRef: PACKAGE_MANAGER purl pkg:oci/app@sha256:2de[...]?arch=amd64&os=linux
+    PackageChecksum: SHA256 2def8ff3690355a
+}
+class Layer1{
+    PackageName: sha256:2256f59767967e5bf0a404b7
+    ExternalRef: PACKAGE_MANAGER purl pkg:oci/layer@sha256:225[...]
+    PackageChecksum: SHA256 2256f59767967e5bf0a404b7
+}
+class Layer2 {
+    PackageName: sha256:c0aa059390ade47c068d
+    ExternalRef: PACKAGE_MANAGER purl pkg:oci/layer@sha256:c0a[...]
+    PackageChecksum: SHA256 c0aa059390ade47c068d
+}
+
+class Layer3 {
+    PackageName: sha256:a6f30b3a81ddc4ea1d
+    ExternalRef: PACKAGE_MANAGER purl pkg:oci/layer@sha256:a6f[...]
+    PackageChecksum: SHA256 a6f30b3a81ddc4ea1d
+}
+
+class SourceCode {
+    PackageName: github.com/organization/repo.git
+    PackageDownloadLocation: "git+ssh://github.com/organization/repo.git@5fbbc211"
+    ExternalRef: PACKAGE_MANAGER purl pkg:github/organization/repo@5fbbc211[...]
+    PackageChecksum: SHA1 5fbbc211
+}
+
+    Image --> BaseImage: DESCENDANT_OF
+    Image --> SourceCode: GENERATED_FROM
+    Image --> Layer1: CONTAINS
+    Image --> Layer2: CONTAINS
+    Image --> Layer3: CONTAINS
+
+```
+
+## Design Specification
+
+The goal of this design is to allow maximum flexibility when adding metadata
+to the image components.
+
+### Package Structure
+
+The top level package represents the container image and references its manifest.
+The name of the package should be the digest of the image manifest, preceded as
+usual by the algorithm, e.g. `sha256:923784e51e709f...`.
+
+The image package can contain three types of packages:
+
+#### Container Layer
+
+Each of the image layers should be represented by a package. Ideally, the image
+SBOM should describe the whole container image structure. Just like the image,
+layers should also be named using their digests. The layer package can contain a
+purl of type `oci` to reference it.
+
+#### Base Image
+
+Images are often derived from other images. When the described image uses another
+image as its base, the SBOM can reference it using a package. In order to ensure
+that tools can discern the base image from other OCI artifacts, the base image
+package should be related to the image via a `DESCENDANT_OF` relationship.
+
+#### Source Package
+
+The SBOM describing the image should also reference the VCS URL where the source
+to build the image lives. To describe the build code, the image SBOM should have
+a package pointing to the source repository. This package should be related to the
+image package using a `GENERATED_FROM` relationship.
+
+### Software Identifiers
+
+The image package, the layers and the base image should contain external references
+to the OCI artifacts using purls. The purl should be of type [`oci`](https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#oci) and does not require adding data about the registry,
+repository or tags (but it may). When including os/arch metadata in the purl, only
+the image should have it.
+
+The source code package should include a pointer to the repository containing the
+code that generated the image.
+
+**Note:** This is not a reference to the application source code, it is a reference
+to the code that was used to build the image (e.g. its Dockerfile).
+
+The purl referencing the source repository should be of type [github](https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#github) if the code lives there. Check for other
+suitable types if the source is not hosted on GitHub.
+
+
diff --git a/semantics/oci-layer.md b/semantics/oci-layer.md
new file mode 100644
index 0000000..2523eba
--- /dev/null
+++ b/semantics/oci-layer.md
@@ -0,0 +1,61 @@
+# Containers: Layer With Operating System Packages
+
+## Design Goals
+
+* Ensuring a layer is packaged by itself to allow reuse.
+* Separation from other images to enable adding data about
+OS packages and other loose files.
+
+## Structure Diagram
+
+
+```mermaid
+classDiagram
+direction LR
+
+class Layer{
+    PackageName: sha256:2256f59767967e5bf0a404b7
+    ExternalRef: PACKAGE_MANAGER purl pkg:oci/layer@sha256:225[...]
+    PackageChecksum: SHA256 2256f59767967e5bf0a404b7
+}
+
+class busybox {
+    PackageName: BusyBox
+    PackageVersion: 1.35.0-r22
+    ExternalRef: PACKAGE_MANAGER purl pkg:apk/alpine/busybox@1.35.0-r22?arch=x86
+}
+
+class cacertificatesbundle {
+    PackageName: "CA Certificates"
+    PackageVersion: 20220614-r1
+    ExternalRef: PACKAGE_MANAGER purl pkg:apk/alpine/ca-certificates-bundle@20220614-r1
+}
+
+Layer --> busybox: CONTAINS
+
+Layer --> cacertificatesbundle: CONTAINS
+```
+## Design Specification
+
+The goal of this design is to allow maximum flexibility when adding metadata
+to the layer package. Generally, a layer abstracts a filesystem so it can
+potentially contain anything. This is why the layer abstraction should have
+room for current and unexpected uses.
+
+### Package Structure
+
+The package in this design represents a layer in a container image. Separating the
+layers into their own packages ensures that data about packages can be added
+to the appropriate section of the SBOM. For example, one layer can express a file added
+via a `curl` pull while another can add child packages detailing the installed OS
+dependencies.
+
+Things inside of images should be added as SPDX Packages and Files and related using a
+`CONTAINS` relationship.
+
+### Software Identifiers
+
+The package representing a layer should have a purl of type `oci` referencing the
+layer in a registry-neutral way. No os/arch data needs to be added to the purl as
+the layer will most likely live under an image SBOM, which should include the platform
+info.
diff --git a/semantics/oci-multiarch-index.md b/semantics/oci-multiarch-index.md
new file mode 100644
index 0000000..744565c
--- /dev/null
+++ b/semantics/oci-multiarch-index.md
@@ -0,0 +1,58 @@
+# Containers: MultiArch Image Index
+
+## Design Goals
+
+* Having a design through which an SBOM can represent the
+structure of a container image index.
+* Separation of single-arch images to allow referencing SBOMs
+for each or adding a detailed structure for them in the packages
+representing each image.
+* No registry or repository information to ensure the SBOM is
+portable across registries when images are copied around.
+* Metadata about the single-arch images is packaged
+in the purl external reference.
+
+## Structure Diagram
+
+```mermaid
+classDiagram
+    direction LR
+class ImageIndex{
+    PackageName: sha256:923784e51e709f
+    ExternalRef: PACKAGE_MANAGER purl pkg:oci/alpine@sha256:923[...]
+    PackageChecksum: SHA256 923784e51e709f
+}
+class Image{
+    PackageName: sha256:5fbbc2112ee51e709f374c9c01e
+    ExternalRef: PACKAGE_MANAGER purl pkg:oci/alpine@sha256:5fb[...]?arch=amd64&os=linux
+    PackageChecksum: SHA256 5fbbc2112ee51e709f374c9c01e
+}
+class Image2 {
+    PackageName: sha256:c3e3b1394f8b8fa1e8768
+    ExternalRef: PACKAGE_MANAGER purl pkg:oci/alpine@sha256:c3e[...]?arch=arm64&os=darwin
+    PackageChecksum: SHA256 c3e3b1394f8b8fa1e8768
+}
+
+    ImageIndex --> Image: VARIANT_OF
+    ImageIndex --> Image2: VARIANT_OF
+```
+
+## Design Specification
+
+### Package Structure
+
+The top level package represents the image index. The name of the package should be
+the digest of its manifest, preceded as usual by the algorithm,
+e.g. `sha256:923784e51e709f...`.
+
+Each container image fronted by the index must be represented by another package, also
+named after its digest. Each of these packages should be related to the index using a
+`VARIANT_OF` SPDX relationship.
+
+### Software Identifiers
+
+Each of the packages in this design should contain a reference to the OCI object using
+a [purl of type `oci`](https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#oci).
+Packages representing the single-arch images must contain the os/arch metadata and can
+optionally add tag and repository metadata.
+
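Reviewer sketch, not part of the patch: to make the multi-arch index design in `oci-multiarch-index.md` concrete, the following Python snippet assembles the index package, one package per single-arch image, and the `VARIANT_OF` relationships. It is a minimal illustration under stated assumptions. The helper names (`oci_purl`, `oci_package`, `index_sbom_fragment`), the `SPDXRef-Package-...` identifier scheme and the placeholder digests are invented for this example. Field names loosely follow the SPDX 2.x JSON serialization, the relationship direction follows the arrows in the diagram, and the purl keeps the unencoded `sha256:` notation used in the diagrams.

```python
from urllib.parse import urlencode


def oci_purl(name, digest, arch=None, os_name=None):
    """Build an oci purl in the notation used by the diagrams (hypothetical helper)."""
    qualifiers = {}
    if arch:
        qualifiers["arch"] = arch
    if os_name:
        qualifiers["os"] = os_name
    purl = f"pkg:oci/{name}@{digest}"
    if qualifiers:
        # Qualifiers are emitted in sorted order: arch before os.
        purl += "?" + urlencode(sorted(qualifiers.items()))
    return purl


def oci_package(name, digest, arch=None, os_name=None):
    """Describe one OCI object as a package named after its digest."""
    algorithm, _, hex_value = digest.partition(":")
    return {
        # Assumed ID scheme; any valid SPDX identifier would do.
        "SPDXID": f"SPDXRef-Package-{algorithm}-{hex_value}",
        "name": digest,
        "checksums": [
            {"algorithm": algorithm.upper(), "checksumValue": hex_value}
        ],
        "externalRefs": [
            {
                "referenceCategory": "PACKAGE_MANAGER",
                "referenceType": "purl",
                "referenceLocator": oci_purl(name, digest, arch, os_name),
            }
        ],
    }


def index_sbom_fragment(name, index_digest, images):
    """Return the packages and relationships for an image index.

    `images` is a list of (digest, arch, os) tuples, one per single-arch image.
    """
    index = oci_package(name, index_digest)  # the index carries no os/arch data
    packages = [index]
    relationships = []
    for digest, arch, os_name in images:
        image = oci_package(name, digest, arch, os_name)
        packages.append(image)
        relationships.append(
            {
                "spdxElementId": index["SPDXID"],
                "relationshipType": "VARIANT_OF",
                "relatedSpdxElement": image["SPDXID"],
            }
        )
    return {"packages": packages, "relationships": relationships}


# Placeholder digests, not real image digests.
INDEX_DIGEST = "sha256:" + "0" * 64
AMD64_DIGEST = "sha256:" + "1" * 64
ARM64_DIGEST = "sha256:" + "2" * 64

fragment = index_sbom_fragment(
    "alpine",
    INDEX_DIGEST,
    [(AMD64_DIGEST, "amd64", "linux"), (ARM64_DIGEST, "arm64", "linux")],
)
```

Because no registry, repository or tag data is stored in the package names or purls, the same fragment stays valid when the images are copied to another registry, which is the portability goal stated in the design.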