Merge pull request spdx#46 from puerco/semantics
Reference semantic architectures
goneall authored Aug 6, 2023
2 parents fc94728 + 2969ed4 commit 26fa358
Showing 4 changed files with 274 additions and 0 deletions.
38 changes: 38 additions & 0 deletions semantics/README.md
@@ -0,0 +1,38 @@
# SPDX 2.x Semantics Reference Archive

This directory holds a number of semantic structure designs representing
different types of repositories, artifacts and other commonly used software.

The goal of these designs is to act as a source of reference for tool makers to
ensure a unified structure in documents produced by SPDX 2.x tools.

## Semantic Reference Designs

The following types of software are represented in this directory:

* OCI Container Images
  * [MultiArch Index](oci-multiarch-index.md)
  * [Container Image](oci-image.md)
  * [Container Layer with Operating System Packages](oci-layer.md)
* Software Repository
  * [Universal Model for Code Repository](code-repository.md)
* Operating System Package
  * [Universal Model for OS Package](os-package.md)
  * [rpm Package](os-rpm.md)
  * [deb Package](os-deb.md)
  * [apk Package](os-apk.md)


## Design Considerations and Objectives

Each of the designs contained here attempts to abstract the object as an SPDX
package which can be referenced by itself or moved to its own or to another SBOM,
while leaving room for flexible details.

For example, the SBOM of a container image can be referenced by itself to fully
describe a container. But that same package can be moved to an SBOM describing
a multi-arch index while preserving its structure.

Another example: An RPM package can provide its own SBOM and the package in it can
be repurposed by a container layer SBOM to describe all software installed via the
OS package manager.
117 changes: 117 additions & 0 deletions semantics/oci-image.md
@@ -0,0 +1,117 @@
# Containers: Container Image

## Design Goals

* Separation of layers to ensure OS dependencies and app data can be
  appended to the appropriate layer.
* Separation of layers to be able to reuse them, for example when
  describing images sharing the same base image.
* No registry or repository information in `PackageName` to ensure the SBOM
  is portable as the image is copied across registries.
* Layer identification metadata encapsulated in purl external reference[^1]
* Ensure tools can differentiate which packages represent the image layers
  from the packages representing both the source code of the image and the
  base images.

[^1]: This design uses an `os` field in the `oci` purl type which
    [has been proposed](https://github.com/package-url/purl-spec/pull/179) but
    is still waiting to be merged.

## Structure Diagram

```mermaid
classDiagram
direction LR
class BaseImage {
PackageName: sha256:2256f59767967e5bf0a404b7
ExternalRef: PACKAGE_MANAGER purl pkg:oci/alpine@sha256:225[...]?arch=amd64&os=linux
PackageChecksum: SHA256 2256f59767967e5bf0a404b7
}
class Image {
PackageName: sha256:2def8ff3690355a
ExternalRef: PACKAGE_MANAGER purl pkg:oci/app@sha256:2de[...]?arch=amd64&os=linux
PackageChecksum: SHA256 2def8ff3690355a
}
class Layer1{
PackageName: sha256:2256f59767967e5bf0a404b7
ExternalRef: PACKAGE_MANAGER purl pkg:oci/layer@sha256:2225[...]
PackageChecksum: SHA256 2256f59767967e5bf0a404b7
}
class Layer2 {
PackageName: sha256:c0aa059390ade47c068d
ExternalRef: PACKAGE_MANAGER purl pkg:oci/layer@sha256:c0a[...]
PackageChecksum: SHA256 c0aa059390ade47c068d
}
class Layer3 {
PackageName: sha256:a6f30b3a81ddc4ea1d
ExternalRef: PACKAGE_MANAGER purl pkg:oci/layer@sha256:a6f[...]
PackageChecksum: SHA256 a6f30b3a81ddc4ea1d
}
class SourceCode {
PackageName: "name": "github.com/organization/repo.git"
PackageDownloadLocation: "git+ssh://github.com/organization/repo.git@5fbbc211"
ExternalRef: PACKAGE_MANAGER purl pkg:github/organization/repo@5fbbc211[...]
Checksum: SHA1 5fbbc211
}
Image --> BaseImage: DESCENDANT_OF
Image --> SourceCode: GENERATED_FROM
Image --> Layer1: CONTAINS
Image --> Layer2: CONTAINS
Image --> Layer3: CONTAINS
```

## Design Specification

The goal of this design is to allow maximum flexibility when adding metadata
to the image components.

### Package Structure

The top level package represents the container image and references its manifest.
The name of the package should be the digest of the image manifest, preceded as
usual by the algorithm, eg `sha256:923784e51e709f...`.
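
As a minimal sketch of this naming rule, the package name can be derived by
hashing the raw manifest bytes. The function name and the sample manifest below
are illustrative assumptions, and SHA-256 is assumed as the digest algorithm:

```python
import hashlib


def package_name_from_manifest(manifest_bytes: bytes) -> str:
    """Return '<algorithm>:<hex digest>' for the raw manifest bytes."""
    # The digest has to be computed over the exact bytes stored in the
    # registry; re-serializing the JSON would change the hash.
    return "sha256:" + hashlib.sha256(manifest_bytes).hexdigest()


# Placeholder manifest content, for illustration only.
example_manifest = b'{"schemaVersion": 2}'
print(package_name_from_manifest(example_manifest))
```

The same string can then be split at the colon to fill in `PackageChecksum`.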

The image package can contain three types of packages:

#### Container Layer

Each of the image layers should be represented in a package. Ideally, the image
SBOM should describe the whole container image structure. Just like the image,
layers should also be named using their digests. The layer package can contain a
purl of type `oci` to reference it.

#### Base Image

Images are often derived from other images. When the described image is built on
another image as its base, the SBOM can reference it using a package. In order to ensure
that tools can discern the base image from other OCI artifacts, the base image
package should be related to the image via a `DESCENDANT_OF` relationship.
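
From the consumer side, a tool could pick out the base image by following that
relationship. The sketch below assumes the document has been loaded from the
SPDX 2.x JSON serialization into a plain dictionary; the function name, the
SPDX identifiers and the trimmed sample document are hypothetical:

```python
def find_base_images(document: dict, image_id: str) -> list:
    """Return the packages the image is related to via DESCENDANT_OF."""
    packages = {p["SPDXID"]: p for p in document.get("packages", [])}
    return [
        packages[rel["relatedSpdxElement"]]
        for rel in document.get("relationships", [])
        if rel["spdxElementId"] == image_id
        and rel["relationshipType"] == "DESCENDANT_OF"
        and rel["relatedSpdxElement"] in packages
    ]


# Hypothetical, heavily trimmed document. The digests are the truncated
# placeholders used in the structure diagram.
example = {
    "packages": [
        {"SPDXID": "SPDXRef-image", "name": "sha256:2def8ff3690355a"},
        {"SPDXID": "SPDXRef-base", "name": "sha256:2256f59767967e5bf0a404b7"},
        {"SPDXID": "SPDXRef-layer-2", "name": "sha256:c0aa059390ade47c068d"},
    ],
    "relationships": [
        {"spdxElementId": "SPDXRef-image",
         "relationshipType": "DESCENDANT_OF",
         "relatedSpdxElement": "SPDXRef-base"},
        {"spdxElementId": "SPDXRef-image",
         "relationshipType": "CONTAINS",
         "relatedSpdxElement": "SPDXRef-layer-2"},
    ],
}

base_names = [p["name"] for p in find_base_images(example, "SPDXRef-image")]
print(base_names)
```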

#### Source Package

The SBOM describing the image should also reference the VCS URL where the source
to build the image lives. To describe the build code, the image SBOM should have
a package pointing to the source repository. This package should be related to the
image package using a `GENERATED_FROM` relationship.
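
The package structure described in this section could be assembled roughly as
follows. This is only a sketch: the field names follow the SPDX 2.3 JSON
serialization as we understand it, the helper names and SPDX identifiers are
made up for illustration, and the digests and commit hash are the truncated
placeholders from the diagram rather than real values:

```python
def oci_package(spdx_id: str, digest: str, purl: str) -> dict:
    """Build a package named after its digest, with a purl external reference."""
    algorithm, _, value = digest.partition(":")
    return {
        "SPDXID": spdx_id,
        "name": digest,
        "downloadLocation": "NOASSERTION",
        "checksums": [{"algorithm": algorithm.upper(), "checksumValue": value}],
        "externalRefs": [{
            # Assumption: hyphenated category spelling, as in SPDX 2.3.
            "referenceCategory": "PACKAGE-MANAGER",
            "referenceType": "purl",
            "referenceLocator": purl,
        }],
    }


def relate(source: str, kind: str, target: str) -> dict:
    return {"spdxElementId": source, "relationshipType": kind,
            "relatedSpdxElement": target}


def image_fragment(image: dict, base: dict, layers: list, source: dict) -> dict:
    """Relate the image package to its base image, source and layers."""
    relationships = [
        relate(image["SPDXID"], "DESCENDANT_OF", base["SPDXID"]),
        relate(image["SPDXID"], "GENERATED_FROM", source["SPDXID"]),
    ]
    relationships += [
        relate(image["SPDXID"], "CONTAINS", layer["SPDXID"]) for layer in layers
    ]
    return {"packages": [image, base, *layers, source],
            "relationships": relationships}


image = oci_package("SPDXRef-image", "sha256:2def8ff3690355a",
                    "pkg:oci/app@sha256:2def8ff3690355a?arch=amd64&os=linux")
base = oci_package("SPDXRef-base", "sha256:2256f59767967e5bf0a404b7",
                   "pkg:oci/alpine@sha256:2256f59767967e5bf0a404b7?arch=amd64&os=linux")
layer = oci_package("SPDXRef-layer-2", "sha256:c0aa059390ade47c068d",
                    "pkg:oci/layer@sha256:c0aa059390ade47c068d")
source = {
    "SPDXID": "SPDXRef-source",
    "name": "github.com/organization/repo.git",
    "downloadLocation": "git+ssh://github.com/organization/repo.git@5fbbc211",
}
fragment = image_fragment(image, base, [layer], source)
```

Keeping the relationships separate from the packages is what lets a layer or
base image package be lifted into another SBOM without rewriting it.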

### Software Identifiers

The image package, the layers and the base image should contain external references
to the OCI artifacts using purls. The purl should be of type
[`oci`](https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#oci)
and does not require adding data about the registry, repository or tags (but it
may). When including os/arch metadata in the purl, only the image should have it.
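
A small helper can make the shape of these purls concrete. This is a sketch
under a few assumptions: the function name is hypothetical, the `os` qualifier
is the proposed one mentioned in the design goals, and the digest is written
with a literal colon as in the diagrams (some purl tooling percent-encodes it):

```python
def oci_purl(name: str, digest: str, arch: str = "", os_name: str = "") -> str:
    """Build a registry-neutral purl of type oci for an image or layer."""
    qualifiers = {"arch": arch, "os": os_name}
    # purl qualifiers are ordered by key; empty values are left out.
    query = "&".join(f"{key}={value}"
                     for key, value in sorted(qualifiers.items()) if value)
    purl = f"pkg:oci/{name}@{digest}"
    return f"{purl}?{query}" if query else purl


# Truncated placeholder digests, as used in the structure diagram.
image_purl = oci_purl("app", "sha256:2def8ff3690355a", "amd64", "linux")
layer_purl = oci_purl("layer", "sha256:c0aa059390ade47c068d")
print(image_purl)
print(layer_purl)
```

Only the image call passes `arch` and `os_name`; the layer purl stays free of
platform data.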

The source code package should include a pointer to the repository containing the
code that generated the image.

**Note:** This is not a reference to the application source code, it is a reference
to the code that was used to build the image (eg its Dockerfile).

The purl referencing the source repository should be of type
[github](https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#github)
if the code lives there. Check for other suitable types if the source is not
hosted on GitHub.


61 changes: 61 additions & 0 deletions semantics/oci-layer.md
@@ -0,0 +1,61 @@
# Containers: Layer With Operating System Packages

## Design Goals

* Ensuring a layer is packaged by itself to allow reuse.
* Separation from other images to enable adding data about
  OS packages and other loose files.

## Structure Diagram


```mermaid
classDiagram
direction LR
class Layer{
PackageName: sha256:2256f59767967e5bf0a404b7
ExternalRef: PACKAGE_MANAGER purl pkg:oci/layer@sha256:2225[...]
PackageChecksum: SHA256 2256f59767967e5bf0a404b7
}
class busybox {
PackageName: BusyBox
PackageVersion: 1.35.0-r22
ExternalRef: PACKAGE_MANAGER purl pkg:apk/alpine/busybox@1.35.0-r22?arch=x86
}
class cacertificatesbundle {
PackageName: "CA Certificates"
PackageVersion: 20220614-r1
ExternalRef: PACKAGE_MANAGER purl pkg:apk/alpine/ca-certificates-bundle@20220614-r1
}
Layer --> busybox: CONTAINS
Layer --> cacertificatesbundle: CONTAINS
```

## Design Specification

The goal of this design is to allow maximum flexibility when adding metadata
to the layer package. Generally, a layer abstracts a filesystem so it can
potentially contain anything. This is why the layer abstraction should have
room for current and unexpected uses.

### Package Structure

The package in this design represents a layer in a container image. Separating the
layers into their own packages ensures that data about packages can be added
to the appropriate section of the SBOM. For example, one layer can express a file
added via a `curl` pull while another can add child packages detailing the
installed OS dependencies.

Things inside of images should be added as SPDX Packages and Files and related using a
CONTAINS relationship.
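
As an illustration, the OS packages found in a layer could be turned into SPDX
packages and `CONTAINS` relationships along these lines. The helper names and
the identifier scheme are assumptions made for this sketch, and the field names
follow the SPDX 2.3 JSON serialization as we understand it:

```python
import re


def apk_package(name: str, version: str, distro: str = "alpine") -> dict:
    """Describe an installed apk package as an SPDX package."""
    # SPDX identifiers only allow letters, digits, '.' and '-'.
    spdx_id = "SPDXRef-" + re.sub(r"[^A-Za-z0-9.-]", "-", f"{name}-{version}")
    return {
        "SPDXID": spdx_id,
        "name": name,
        "versionInfo": version,
        "downloadLocation": "NOASSERTION",
        "externalRefs": [{
            "referenceCategory": "PACKAGE-MANAGER",
            "referenceType": "purl",
            "referenceLocator": f"pkg:apk/{distro}/{name}@{version}",
        }],
    }


def layer_contents(layer_id: str, packages: list) -> list:
    """Relate every package to the layer package that holds it."""
    return [
        {"spdxElementId": layer_id,
         "relationshipType": "CONTAINS",
         "relatedSpdxElement": package["SPDXID"]}
        for package in packages
    ]


installed = [
    apk_package("busybox", "1.35.0-r22"),
    apk_package("ca-certificates-bundle", "20220614-r1"),
]
relationships = layer_contents("SPDXRef-layer", installed)
```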

### Software Identifiers

The package representing a layer should have a purl of type `oci` referencing the
layer in a registry-neutral way. No os/arch data needs to be added to the purl as
the layer will most likely live under an image SBOM which should include the platform
info.
58 changes: 58 additions & 0 deletions semantics/oci-multiarch-index.md
@@ -0,0 +1,58 @@
# Containers: MultiArch Image Index

## Design Goals

* Having a design through which an SBOM can represent the
structure of a container image index.
* Separation of single-arch images to allow referencing SBOMs
for each or adding a detailed structure for them in the packages
representing each image.
* No registry or repository information to ensure the SBOM is
portable across registries when images are copied around.
* Metadata about the single-arch images is packaged
in the purl external reference.

## Structure Diagram

```mermaid
classDiagram
direction LR
class ImageIndex{
PackageName: sha256:923784e51e709f
ExternalRef: PACKAGE_MANAGER purl pkg:oci/alpine@sha256:923[...]
PackageChecksum: SHA256 923784e51e709f
}
class Image{
PackageName: sha256:5fbbc2112ee51e709f374c9c01e
ExternalRef: PACKAGE_MANAGER purl pkg:oci/alpine@sha256:5fb[...]?arch=amd64&os=linux
PackageChecksum: SHA256 5fbbc2112ee51e709f374c9c01e
}
class Image2 {
PackageName: sha256:c3e3b1394f8b8fa1e8768
ExternalRef: PACKAGE_MANAGER purl pkg:oci/alpine@sha256:c3e[...]?arch=arm64&os=darwin
PackageChecksum: SHA256 c3e3b1394f8b8fa1e8768
}
ImageIndex --> Image: VARIANT_OF
ImageIndex --> Image2: VARIANT_OF
```

## Design Specification

### Package Structure

The top level package represents the image index. The name of the package should be
the digest of its manifest, preceded as usual by the algorithm,
eg `sha256:923784e51e709f...`.

Each container image fronted by the index must be represented by another package, also
named after its digest. Each of these packages should be related to the index using a
`VARIANT_OF` SPDX relationship.

### Software Identifiers

Each of the packages in this design should contain a reference to the OCI object using
a [purl of type `oci`](https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#oci).
Packages representing the single-arch images must contain the os/arch metadata and can
optionally add tag and repository metadata.
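
The two requirements above lend themselves to a simple check. The following
sketch uses hypothetical helper names and the truncated digests from the
diagram; the relationship direction mirrors the arrows in the structure diagram
(index to image), which is an interpretation rather than something the text
states outright:

```python
from urllib.parse import parse_qsl


def purl_qualifiers(purl: str) -> dict:
    """Return the qualifiers of a purl as a dictionary."""
    _, _, tail = purl.partition("?")
    query = tail.partition("#")[0]  # drop an optional subpath
    return dict(parse_qsl(query))


def has_platform(purl: str) -> bool:
    """True when the purl carries both the arch and os qualifiers."""
    qualifiers = purl_qualifiers(purl)
    return "arch" in qualifiers and "os" in qualifiers


def index_relationships(index_id: str, image_ids: list) -> list:
    """Relate the index package to each single-arch image package."""
    return [
        {"spdxElementId": index_id,
         "relationshipType": "VARIANT_OF",
         "relatedSpdxElement": image_id}
        for image_id in image_ids
    ]


image_purls = {
    "SPDXRef-image-amd64":
        "pkg:oci/alpine@sha256:5fbbc2112ee51e709f374c9c01e?arch=amd64&os=linux",
    "SPDXRef-image-arm64":
        "pkg:oci/alpine@sha256:c3e3b1394f8b8fa1e8768?arch=arm64&os=darwin",
}
missing = [spdx_id for spdx_id, purl in image_purls.items()
           if not has_platform(purl)]
relationships = index_relationships("SPDXRef-index", list(image_purls))
```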
