src: experimental http client support #133

Merged on May 26, 2025 (1 commit)

allisonkarlitskaya
Collaborator

It turns out that the information splitstreams contain to assist with garbage collection (i.e. the list of things that we mustn't discard) is exactly the information required for downloading (i.e. the list of things that we must acquire).

Use this fact to add support for fetching repository content from HTTP servers. We only download the objects that are actually required, so incremental pulls are very fast.
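
As a rough illustration of why incremental pulls stay cheap: if a splitstream yields the list of object IDs an image references, a pull only has to fetch the IDs that are not already in the local repository. The sketch below is Python, not the Rust code in this PR. The helper names `object_path` and `missing_objects`, and the `objects/<two hex digits>/<rest>` layout, are assumptions made for the example.

```python
from pathlib import Path
from typing import Iterable, List


def object_path(repo: Path, object_id: str) -> Path:
    # Assumed layout for this sketch: objects/<first two hex digits>/<remainder>.
    return repo / "objects" / object_id[:2] / object_id[2:]


def missing_objects(repo: Path, required: Iterable[str]) -> List[str]:
    """Return the required object IDs that are not present locally.

    Order is preserved and duplicates are reported only once, so the
    result can be used directly as a download list.
    """
    seen = set()
    missing = []
    for object_id in required:
        if object_id in seen:
            continue
        seen.add(object_id)
        if not object_path(repo, object_id).exists():
            missing.append(object_id)
    return missing
```

On a second pull of a mostly unchanged image, `missing_objects` would return only the handful of new IDs, which is the whole download.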

This works with just about any HTTP server, so you can run something like:

python -m http.server -d ~/.var/lib/composefs

and download from that. With a fast enough web server on localhost, pulling a complete image into an empty repository takes about as long as pulling an oci: directory via skopeo with cfsctl oci pull.

In practice, this is intended for use with a web server that supports static compression, serving pre-compressed objects stored on the server. For this reason, zstd support is enabled in the reqwest crate, and it works with a setup like:

find repo/objects/ -type f -name '*[0-9a-f]' -exec zstd -19 -v '{}' +
static-web-server -p 8888 --compression-static -d repo
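
For readers unfamiliar with static compression, here is a simplified model of what the server does once each object has a `.zst` sibling next to it: if the client advertises zstd in `Accept-Encoding` and the pre-compressed file exists, that file is served with `Content-Encoding: zstd`; otherwise the plain file is served. This is a Python sketch under those assumptions, not static-web-server's actual implementation; `pick_variant` is a hypothetical helper, and it ignores `q=0` weights for brevity.

```python
from pathlib import Path
from typing import Optional, Tuple


def pick_variant(root: Path, url_path: str, accept_encoding: str) -> Tuple[Path, Optional[str]]:
    """Pick the file to serve and the Content-Encoding to send with it.

    Returns (path, "zstd") when the client accepts zstd and a .zst sibling
    exists, and (path, None) for the uncompressed file otherwise.
    """
    plain = root / url_path.lstrip("/")
    # Reduce "gzip, zstd;q=0.9" to {"gzip", "zstd"}; quality values are ignored here.
    accepted = {token.split(";")[0].strip().lower() for token in accept_encoding.split(",")}
    compressed = plain.with_name(plain.name + ".zst")
    if "zstd" in accepted and compressed.is_file():
        return compressed, "zstd"
    return plain, None
```

Because `zstd` keeps the original file by default, both variants sit side by side, and the `'*[0-9a-f]'` pattern in the `find` command keeps already-compressed `.zst` files from being compressed again.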

There's also an included s3-uploader.py in the examples/ directory which will upload a repository to an S3 bucket, with zstd compression.

Signed-off-by: Allison Karlitskaya <[email protected]>
@allisonkarlitskaya allisonkarlitskaya marked this pull request as ready for review May 20, 2025 08:01
@jeckersb
Collaborator

Super cool, I'll try to set aside some time to play with this soon :)

@cgwalters
Collaborator

Makes sense as an experiment, but heavily overlaps with existing solutions like split-reproducible blobs and zstd:chunked that are more natively supported by OCI.

@allisonkarlitskaya allisonkarlitskaya merged commit 74f65a7 into containers:main May 26, 2025
12 checks passed