
fix: remove ~48 transitive dependencies from asset packaging #1654

Merged
aws-cdk-automation merged 2 commits into main from mrgrain/refactor/private-tools/replace-archiver-with-yazl on Jun 22, 2026

Conversation

@mrgrain

@mrgrain mrgrain commented Jun 19, 2026

Contributor

Whenever the CDK Toolkit packages file assets — Lambda function code, S3 file assets and the like — it builds a ZIP archive. That work was previously done by archiver, a dependency that brings roughly fifty transitive packages along with it into every install of the CDK CLI and @aws-cdk/toolkit-lib. This pull request replaces it with yazl, a focused, write-only ZIP library, which brings the footprint for this functionality down to just two packages (yazl and buffer-crc32).

The reason for the change is customer experience. A smaller dependency graph means a faster npm install and a lighter footprint on disk and in CI for everyone who installs the CLI or builds on the toolkit library. It also shrinks the set of third-party code that has to be audited and patched when security advisories are published, which is a recurring cost for our users. Because the CDK only ever creates archives and never parses untrusted ones, a write-only library is all we need, so this also drops a large amount of reader and format-handling code that was never exercised in our use case.

Asset packaging gets a little faster as a side effect. Both libraries hand compression to Node's native zlib, and in local benchmarks across representative asset trees yazl produced archives 8–11% faster than archiver at comparable memory use. The existing streaming behaviour is kept, so large assets are still written to disk without buffering the whole archive in memory.

The change is otherwise invisible to users. Archives stay byte-for-byte deterministic — entry timestamps remain pinned to the 1980 epoch, so identical content continues to produce an identical hash — Unix file modes such as the executable bit are still preserved, and symbolic links are still followed. The existing unit tests for the zip tool pass unchanged.

A full benchmark and security report is attached as a separate comment on this PR.

Checklist

  • This change contains a major version upgrade for a dependency and I confirm all breaking changes are addressed
    • Release notes for the new version:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

The zip tool used to package CDK file assets relied on `archiver`, which
pulls ~50 transitive packages into the CLI and toolkit libraries. Swap it
for `yazl`, a focused write-only ZIP library, reducing that to 2 packages
(`yazl` + `buffer-crc32`).

The change is behaviour-preserving: archives remain byte-for-byte
deterministic (entry dates pinned to the 1980 epoch), Unix file modes are
preserved, symlinks are followed, and output is still streamed to disk.
Both libraries delegate compression to Node's native zlib, so asset
packaging is slightly faster (8-11% in local benchmarks) at comparable
memory. Existing zip unit tests pass unchanged.
@mrgrain

mrgrain commented Jun 19, 2026

Contributor Author

Benchmark Report: Replacing archiver in private-tools/zip

Goal

Replace the archiver dependency used by @aws-cdk/private-tools's zip tool
(lib/zip/index.ts) with a more modern library that has fewer dependencies,
while keeping functionality and performance unchanged.

TL;DR

Recommendation: yazl.

  • Cuts the dependency tree from ~50 transitive packages → 2 (yazl + buffer-crc32).
  • Faster than the current archiver (1.09–1.13× across fixtures) — performance is improved, not just unchanged.
  • Keeps the existing streaming, low-memory model (uses Node's native zlib).
  • Zero known security advisories (ever) for yazl or buffer-crc32.
  • fflate (zero-dep) was evaluated thoroughly and rejected: as a pure-JS
    compressor it is slower on Node and uses more memory than native-zlib
    libraries, in every mode tested.

What the tool requires

From lib/zip/index.ts, any replacement must support:

  1. Writing a DEFLATE zip — zipDirectory() streams to a file, zipString() returns a Buffer.
  2. Deterministic output — all entry dates pinned to 1980-01-01T00:00:00Z so equal content yields an equal hash.
  3. Unix mode preservation — e.g. the executable bit.
  4. Serial, ordered entry append; streaming to disk to bound memory on large assets.
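
Requirements 2 and 3 come down to a few fixed-width fields in each ZIP entry header. The sketch below is not taken from lib/zip/index.ts; it only illustrates which values end up stored, assuming the standard MS-DOS date/time layout and the common Unix convention of keeping the file mode in the upper 16 bits of the external attributes. All function names are hypothetical, and UTC components are used for simplicity (a real writer may use local time).

```typescript
// Hypothetical helpers, for illustration only (not the tool's actual code).

/** The pinned timestamp named in requirement 2. */
const EPOCH = new Date('1980-01-01T00:00:00Z');

/**
 * Encode a date into the 16-bit MS-DOS date and time fields used by ZIP headers.
 * Assumption: UTC components are used; real writers may use local time instead.
 */
function toDosDateTime(d: Date): { date: number; time: number } {
  const date =
    ((d.getUTCFullYear() - 1980) << 9) | // years since 1980 (7 bits)
    ((d.getUTCMonth() + 1) << 5) |       // month 1-12 (4 bits)
    d.getUTCDate();                      // day of month 1-31 (5 bits)
  const time =
    (d.getUTCHours() << 11) |            // hours (5 bits)
    (d.getUTCMinutes() << 5) |           // minutes (6 bits)
    (d.getUTCSeconds() >> 1);            // seconds divided by two (5 bits)
  return { date, time };
}

/**
 * Place a Unix file mode in the upper 16 bits of the external file attributes.
 * `>>> 0` keeps the result an unsigned 32-bit number.
 */
function toExternalAttributes(mode: number): number {
  return ((mode & 0xffff) << 16) >>> 0;
}

/** True if any of the owner/group/other execute bits is set. */
function isExecutable(mode: number): boolean {
  return (mode & 0o111) !== 0;
}
```

For the pinned epoch this gives a DOS date of 0x0021 and a DOS time of 0, the earliest timestamp the format can represent, which is why it is a convenient constant for reproducible archives.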

Candidates

| Library | Transitive deps | Compressor | Streaming | Mode support | Notes |
|---|---|---|---|---|---|
| archiver (current) | ~50 | native zlib | | mode option | general-purpose, heavy dep tree |
| yazl | 2 (buffer-crc32) | native zlib | ✅ (outputStream) | native mode option | write-only, mature/stable, MIT |
| fflate | 1 (zero deps) | pure-JS | ✅ (streaming API) | attrs: mode<<16 + os:3 | tiny, modern, MIT, but JS compressor |
| jszip | several | pure-JS (pako) | partial | unixPermissions | already a devDep; heavier |

Methodology

A self-contained harness (packages/@aws-cdk/private-tools/bench/, isolated
node_modules, not part of the build) measures each implementation through an
identical code path: fast-glob the directory, read each file, append serially
with the epoch date + file mode, and write the zip.

  • Fixtures (both ~24 MB, deterministic seeded content via gen-fixture.mjs):
    • mixed — 2009 files: 2000 small source-like files + 8 × 2 MB incompressible blobs + 1 executable.
    • compressible — 6001 small source-like files (resembles a JS/Lambda bundle).
  • Timing: hyperfine --warmup 3 --runs 12 (wall-clock, end-to-end node process).
  • Memory: /usr/bin/time -l peak resident set size (RSS).
  • Correctness: verify.mjs loads each zip with jszip and asserts the
    same guarantees as the unit test: byte-identical content, a single unique
    date of 1980-01-01T00:00:00.000Z, byte-identical output across two runs
    (determinism), and a preserved executable bit. All implementations passed.
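
The correctness checks listed above can be pictured as three small predicates. This is a minimal sketch and not the contents of verify.mjs: the `EntryInfo` shape and all function names are assumptions made for illustration, and reading the entries back from the archive (done with jszip in the harness) is left out.

```typescript
// Hypothetical shape of the per-entry metadata a verifier reads back from an archive.
interface EntryInfo {
  name: string;
  mtime: Date;
  mode: number; // Unix file mode
}

const PINNED_DATE = '1980-01-01T00:00:00.000Z';

/** Every entry carries the one pinned date. */
function hasSinglePinnedDate(entries: EntryInfo[]): boolean {
  return entries.length > 0 && entries.every((e) => e.mtime.toISOString() === PINNED_DATE);
}

/** Two runs produced exactly the same bytes (determinism). */
function sameBytes(a: Uint8Array, b: Uint8Array): boolean {
  return a.length === b.length && a.every((byte, i) => byte === b[i]);
}

/** The named entry kept at least one execute bit. */
function keptExecutableBit(entries: EntryInfo[], name: string): boolean {
  return entries.some((e) => e.name === name && (e.mode & 0o111) !== 0);
}
```

Comparing raw bytes rather than only the unpacked contents matters here, because the asset hash is computed over the archive itself.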

Environment

OS macOS 26.5.1 (arm64)
CPU Apple M4 Pro, 14 cores
Node v24.14.1
hyperfine 1.19.0

Results

Wall-clock — mixed fixture (24 MB, 2009 files)

| Implementation | Mean | vs archiver |
|---|---|---|
| yazl | 722 ms | 0.92× (8% faster) |
| archiver (current) | 789 ms | 1.00× |
| fflate (async / workers) | 825 ms ± 227 ms | 1.05× (high variance) |
| fflate (zipSync, in-memory) | 904 ms | 1.15× |
| fflate (streaming) | 1.028 s | 1.30× |

Wall-clock — compressible fixture (24 MB, 6001 files)

| Implementation | Mean | vs archiver |
|---|---|---|
| yazl | 1.722 s | 0.89× (11% faster) |
| fflate (zipSync, in-memory) | 1.912 s | 0.98× |
| fflate (async / workers) | 1.934 s | 1.00× |
| archiver (current) | 1.942 s | 1.00× |
| fflate (streaming) | 2.136 s | 1.10× |
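
The "vs archiver" column (0.92×, 0.89×) and the 1.09–1.13× speedup quoted in the TL;DR are two views of the same means. The short sketch below reproduces both from the yazl and archiver rows of the two wall-clock tables; the helper names are illustrative only.

```typescript
/** Ratio of a candidate's mean wall-clock time to the baseline's (below 1 means faster). */
function relativeTime(candidateMs: number, baselineMs: number): number {
  return candidateMs / baselineMs;
}

/** Inverse view: how many times faster the candidate is than the baseline. */
function speedup(candidateMs: number, baselineMs: number): number {
  return baselineMs / candidateMs;
}

/** Time saved relative to the baseline, as a whole percentage. */
function percentFaster(candidateMs: number, baselineMs: number): number {
  return Math.round((1 - candidateMs / baselineMs) * 100);
}

// Mean times in milliseconds, copied from the tables above.
const mixed = { yazl: 722, archiver: 789 };
const compressible = { yazl: 1722, archiver: 1942 };

const mixedRelative = relativeTime(mixed.yazl, mixed.archiver).toFixed(2);                      // "0.92"
const compressibleRelative = relativeTime(compressible.yazl, compressible.archiver).toFixed(2); // "0.89"
const mixedSpeedup = speedup(mixed.yazl, mixed.archiver).toFixed(2);                            // "1.09"
const compressibleSpeedup = speedup(compressible.yazl, compressible.archiver).toFixed(2);       // "1.13"
const mixedPercent = percentFaster(mixed.yazl, mixed.archiver);                                 // 8
const compressiblePercent = percentFaster(compressible.yazl, compressible.archiver);            // 11
```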

Peak memory (RSS) — mixed fixture

| Implementation | Peak RSS | Streaming? |
|---|---|---|
| archiver (current) | 133 MB | |
| yazl | 146 MB | |
| fflate (zipSync) | 162 MB | ❌ (buffers whole archive) |
| fflate (streaming) | 164 MB | |
| fflate (async / workers) | 312 MB | ❌ (workers + buffering) |

Analysis

Why yazl wins

yazl and archiver both delegate compression to Node's native zlib,
which runs on the libuv threadpool and overlaps compression with file I/O.
yazl is a focused, write-only library with far less pipeline overhead than the
general-purpose archiver, so it is consistently a little faster while using a
comparable amount of memory — and it carries 48 fewer transitive packages.

Why fflate was rejected (despite zero deps)

fflate is an excellent pure-JavaScript compressor, but that is exactly the
problem on Node:

  • Single-threaded JS deflate can't overlap with I/O the way native zlib
    does, so wall-clock is slower in every mode.
  • The streaming mode — the one we'd actually need to keep memory bounded —
    is the slowest variant and saves no memory (164 MB vs archiver's 133 MB).
  • zipSync is faster but buffers the entire archive (and all inputs) in RAM.
  • The async / multi-threaded mode (fflate's headline feature) spawns a
    worker per entry; on many small files (typical CDK assets) the overhead
    dominates, producing unstable wall-clock (±227 ms) and roughly 2.3× the
    memory of archiver (312 MB vs 133 MB). It only pays off for a handful of
    very large files.

fflate's real strengths — tiny bundle size and browser support — don't apply
to a Node-only CLI tool that already has native zlib available.

Implementation note: in fflate's streaming API, the per-entry `mtime`,
`attrs`, and `os` must be assigned as properties on the `ZipDeflate`
instance, not passed via the constructor options (the constructor options
only reach the deflater). Passing them as options is silently ignored and the
archive falls back to `Date.now()`, breaking determinism. This was found and
fixed during testing.

Maintenance

| Package | Latest | Published | Weekly downloads |
|---|---|---|---|
| yazl | 3.3.1 | 2024-11-23 | ~3.0M |
| fflate | 0.8.3 | 2026-05-16 | ~52M |

yazl has a slower release cadence, but it is a small, feature-complete ZIP
writer; the ZIP format is stable, so "no recent release" reflects maturity
rather than abandonment. (The @indutny/yazl fork is also stale — 2024-02 — and
still depends on buffer-crc32, so it offers no advantage.)

Security

Authoritative advisory databases (OSV.dev and npm's GitHub Advisory DB) report
zero advisories, for any version, of either yazl or buffer-crc32.

A note on relevance: the well-known ZIP vulnerability classes (zip bombs,
malformed-header DoS, path traversal) live in readers/parsers that consume
untrusted archives — e.g. yauzl (the separate un-zip companion library) has
a DoS advisory. yazl is a writer fed our own files, so its attack surface
is structurally smaller. Dropping ~48 transitive packages also shrinks the
overall surface relative to archiver.

Decision

Adopt yazl. It satisfies every requirement and improves on the status quo:

  • ✅ Fewer dependencies: ~50 → 2.
  • ✅ Performance unchanged — in fact 8–11% faster wall-clock.
  • ✅ Streaming / low-memory model preserved (comparable RSS).
  • ✅ Functionally identical: deterministic, epoch dates, mode preserved.
  • ✅ Clean security history; smaller attack surface.

Zero-dependency alternative (not adopted): a hand-rolled ~150-line ZIP
writer over native zlib would match yazl's performance with no runtime
dependency at all (a ~15-line CRC-32 is needed because zlib.crc32 is only
available from Node 20.15 / 22.2 and the repo supports Node ≥ 18). It was
deemed not worth owning the ZIP-format code when yazl provides it in two
well-audited packages.
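
For a sense of scale, a table-driven CRC-32 does fit in roughly that many lines. The sketch below is a generic textbook implementation using the standard reflected polynomial 0xEDB88320; it is not code from this PR, and the names are illustrative.

```typescript
// Precompute the 256-entry lookup table for the reflected polynomial 0xEDB88320.
const CRC_TABLE: number[] = [];
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  }
  CRC_TABLE.push(c >>> 0);
}

/** Compute the CRC-32 (the variant ZIP uses) of a byte array as an unsigned 32-bit integer. */
function crc32(data: Uint8Array): number {
  let crc = 0xffffffff;
  for (let i = 0; i < data.length; i++) {
    crc = CRC_TABLE[(crc ^ data[i]) & 0xff] ^ (crc >>> 8);
  }
  return (crc ^ 0xffffffff) >>> 0;
}
```

The usual sanity check is the ASCII string "123456789", whose CRC-32 is 0xCBF43926. The checksum itself is the small part; the central directory, ZIP64 and data-descriptor handling are what make a hand-rolled writer costly to own.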

@github-actions github-actions Bot added the p2 label Jun 19, 2026
@aws-cdk-automation aws-cdk-automation requested a review from a team June 19, 2026 20:38
@github-actions

github-actions Bot commented Jun 19, 2026

Contributor

Dependency Review

The following issues were found:
  • ✅ 0 vulnerable package(s)
  • ✅ 0 package(s) with incompatible licenses
  • ✅ 0 package(s) with invalid SPDX license definitions
  • ⚠️ 4 package(s) with unknown licenses.
  • ⚠️ 1 packages with OpenSSF Scorecard issues.
See the Details below.

License Issues

packages/@aws-cdk/cdk-assets-lib/package.json

| Package | Version | License | Issue Type |
|---|---|---|---|
| yazl | ^3.3.1 | Null | Unknown License |

packages/@aws-cdk/private-tools/package.json

| Package | Version | License | Issue Type |
|---|---|---|---|
| yazl | ^3.3.1 | Null | Unknown License |

packages/@aws-cdk/toolkit-lib/package.json

| Package | Version | License | Issue Type |
|---|---|---|---|
| yazl | ^3.3.1 | Null | Unknown License |

packages/aws-cdk/package.json

| Package | Version | License | Issue Type |
|---|---|---|---|
| yazl | ^3.3.1 | Null | Unknown License |

OpenSSF Scorecard

| Package | Version | Score | Details |
|---|---|---|---|
| npm/yazl | ^3.3.1 | Unknown | Unknown |
| npm/@types/yazl | ^3.3.1 | Unknown | Unknown |
| npm/yazl | ^3.3.1 | Unknown | Unknown |
| npm/yazl | ^3.3.1 | Unknown | Unknown |
| npm/@types/yazl | ^3.3.1 | Unknown | Unknown |
| npm/yazl | ^3.3.1 | Unknown | Unknown |

npm/@types/yazl 3.3.1 🟢 6.5
Details
| Check | Score | Reason |
|---|---|---|
| Maintained | 🟢 10 | 30 commit(s) and 4 issue activity found in the last 90 days -- score normalized to 10 |
| Code-Review | 🟢 8 | Found 25/28 approved changesets -- score normalized to 8 |
| Packaging | ⚠️ -1 | packaging workflow not detected |
| CII-Best-Practices | ⚠️ 0 | no effort to earn an OpenSSF best practices badge detected |
| Dangerous-Workflow | 🟢 10 | no dangerous workflow patterns detected |
| Token-Permissions | ⚠️ 0 | detected GitHub workflow tokens with excessive permissions |
| Security-Policy | 🟢 10 | security policy file detected |
| License | 🟢 9 | license file detected |
| Branch-Protection | ⚠️ -1 | internal error: error during branchesHandler.setup: internal error: some github tokens can't read classic branch protection rules: https://github.com/ossf/scorecard-action/blob/main/docs/authentication/fine-grained-auth-token.md |
| Signed-Releases | ⚠️ -1 | no releases found |
| SAST | ⚠️ 0 | SAST tool is not run on all commits -- score normalized to 0 |
| Pinned-Dependencies | 🟢 8 | dependency not pinned by hash detected -- score normalized to 8 |
| Binary-Artifacts | 🟢 10 | no binaries found in the repo |
| Fuzzing | ⚠️ 0 | project is not fuzzed |

npm/yazl 3.3.1 ⚠️ 2
Details
| Check | Score | Reason |
|---|---|---|
| Packaging | ⚠️ -1 | packaging workflow not detected |
| Token-Permissions | ⚠️ -1 | No tokens found |
| Code-Review | ⚠️ 0 | Found 2/24 approved changesets -- score normalized to 0 |
| Maintained | ⚠️ 0 | 0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0 |
| Dangerous-Workflow | ⚠️ -1 | no workflows found |
| Binary-Artifacts | 🟢 10 | no binaries found in the repo |
| Pinned-Dependencies | ⚠️ -1 | no dependencies found |
| CII-Best-Practices | ⚠️ 0 | no effort to earn an OpenSSF best practices badge detected |
| Security-Policy | ⚠️ 0 | security policy file not detected |
| Fuzzing | ⚠️ 0 | project is not fuzzed |
| License | 🟢 10 | license file detected |
| Signed-Releases | ⚠️ -1 | no releases found |
| Branch-Protection | ⚠️ 0 | branch protection not enabled on development/release branches |
| SAST | ⚠️ 0 | SAST tool is not run on all commits -- score normalized to 0 |

Scanned Files

  • packages/@aws-cdk/cdk-assets-lib/package.json
  • packages/@aws-cdk/private-tools/package.json
  • packages/@aws-cdk/toolkit-lib/package.json
  • packages/aws-cdk/package.json
  • yarn.lock

@mrgrain mrgrain changed the title from "refactor(cli): replace archiver with yazl in the zip tool" to "fix: remove ~48 transitive dependencies from asset packaging" Jun 19, 2026
@github-actions

Contributor

Total lines changed 9790 is greater than 1000. Please consider breaking this PR down.

@mrgrain mrgrain added the pr/exempt-size-check Skips PR size check label Jun 19, 2026
@codecov-commenter


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.80%. Comparing base (b516d60) to head (8fe7c15).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1654      +/-   ##
==========================================
+ Coverage   88.73%   88.80%   +0.06%     
==========================================
  Files          77       77              
  Lines       11365    11354      -11     
  Branches     1588     1584       -4     
==========================================
- Hits        10085    10083       -2     
+ Misses       1250     1241       -9     
  Partials       30       30              

| Flag | Coverage Δ |
|---|---|
| suite.unit | 88.80% <ø> (+0.06%) ⬆️ |


@aws-cdk-automation aws-cdk-automation added this pull request to the merge queue Jun 22, 2026
Merged via the queue into main with commit f979da7 Jun 22, 2026
50 of 51 checks passed

Labels

p2 pr/exempt-size-check Skips PR size check


4 participants