perf research: crypto.subtle + getRandomValues #6
Draft
crowlbot wants to merge 1 commit into
Headline: crypto.subtle.digest of a small message is 7.8x slower than Bun (37 us vs 4.8 us) and roughly the same as Node 22. Cause is op_crypto_subtle_digest in ext/crypto/lib.rs calling spawn_blocking on every call. For an 11-byte SHA-256 the actual work is <100 ns but thread-pool dispatch adds ~30 us. Fix is a sync small-input fast path. Strong positive: getRandomValues(16) is 43x faster than Node LTS. Worth preserving. Other ratios for HMAC, AES-GCM, PBKDF2 are competitive with both Node and Bun. Graduating only the digest finding to upstream.
TL;DR
`crypto.subtle.digest("SHA-256", smallBuf)` is 7.8× slower than Bun (37 μs vs 4.8 μs). Cause: `op_crypto_subtle_digest` in `ext/crypto/lib.rs` calls `spawn_blocking` for every input, regardless of size. For an 11-byte message, the actual SHA-256 work is <100 ns but the thread-pool dispatch adds ~30 μs. Graduating to an upstream fix that runs the digest synchronously on the calling thread for small inputs.

Strong positive (worth preserving): `crypto.getRandomValues(buf16)` is 43× faster than Node 22 LTS. Deno's sync getrandom path is excellent.

Headline ratios
Times in ns/op (microsecond × 1000). Full data in `profiles/crypto_all.log`.

Where the cost lives: digest
`ext/crypto/lib.rs:881-894`:
```rust
#[op2]
pub async fn op_crypto_subtle_digest(
  #[serde] algorithm: CryptoHash,
  #[buffer] data: JsBuffer,
) -> Result<Uint8Array, CryptoError> {
  let output = spawn_blocking(move || {
    digest::digest(algorithm.into(), &data)
      .as_ref()
      .to_vec()
      .into()
  })
  .await?;
  Ok(output)
}
```
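The small-input fast path proposed in the TL;DR can be sketched with the standard library alone. This is an illustration of size-based dispatch, not the upstream patch: `SYNC_DIGEST_MAX_LEN`, `toy_digest`, `Path`, and `digest_with_fast_path` are hypothetical names, `std::thread::spawn` stands in for tokio's `spawn_blocking`, and a non-cryptographic FNV-1a hash stands in for `digest::digest`. The 64 KB cutoff is the figure from the ranked hypotheses. The sketch does not model how the real op would fit `op2`'s sync and async signatures.

```rust
use std::thread;

/// Hypothetical cutoff, taken from the "≤ 64 KB" figure in the ranked hypotheses.
const SYNC_DIGEST_MAX_LEN: usize = 64 * 1024;

/// Which path handled the request; returned only so the sketch is observable.
#[derive(Debug, PartialEq)]
enum Path {
    Inline,
    Offloaded,
}

/// Placeholder for `digest::digest`: 64-bit FNV-1a, NOT a cryptographic hash.
fn toy_digest(data: &[u8]) -> Vec<u8> {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a offset basis
    for &b in data {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3); // FNV-1a prime
    }
    h.to_be_bytes().to_vec()
}

/// Hash small inputs on the calling thread; hand larger ones to another thread.
/// `thread::spawn` is an assumption standing in for a blocking-pool dispatch.
fn digest_with_fast_path(data: Vec<u8>) -> (Vec<u8>, Path) {
    if data.len() <= SYNC_DIGEST_MAX_LEN {
        // Fast path: no cross-thread handoff at all.
        return (toy_digest(&data), Path::Inline);
    }
    // Slow path: the input is large enough that the handoff cost is amortized.
    let handle = thread::spawn(move || toy_digest(&data));
    (handle.join().expect("digest thread panicked"), Path::Offloaded)
}

fn main() {
    let (small, small_path) = digest_with_fast_path(b"hello world".to_vec());
    let (large, large_path) = digest_with_fast_path(vec![0u8; SYNC_DIGEST_MAX_LEN + 1]);
    println!(
        "{small_path:?}: {} bytes, {large_path:?}: {} bytes",
        small.len(),
        large.len()
    );
}
```

The only decision in the sketch is the length check; both paths produce the same bytes, so a cutoff like this would change latency but not results.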
`spawn_blocking` dispatches to tokio's blocking-thread pool. For inputs large enough to actually consume CPU time the dispatch is appropriate. For small inputs the trade-off inverts: dispatch costs ~30 μs while a single SHA-256 block on hardware-accelerated x86 takes well under 100 ns.

V8 prof attribution (200k iters of `digest("SHA-256", "hello world")`)
- `spawn_blocking` + digest). Native flamegraphs blocked on this host (`perf_event_paranoid=4`, no sudo).
- `AsyncFunctionAwaitResolveClosure`, `ResumeGeneratorTrampoline`).

Ranked hypotheses
- `op_crypto_subtle_digest` uses `spawn_blocking` for every input. Short-circuiting for small inputs (≤ 64 KB) recovers ~30 μs per call.
- `perf/crypto-digest-sync-small-inputs`.
- `importKey HMAC raw` is 4.7× slower than Bun.
- `digest SHA-1 short` is the same 7.9× story.
- `getRandomValues(16)` 43× faster than Node

What's not here
- `kernel.perf_event_paranoid = 4`, no sudo on host; same cap as PRs #1 (perf research: fetch) through #5 (perf research: structuredClone) on this fork).

Layout
```
tools/perf_research/crypto/
README.md full report
micro/crypto_micro.js 13 ops covering getrandom, digest, hmac, aes-gcm, pbkdf2
profiles/crypto_all.log raw bench output per runtime
profiles/digest.prof.txt V8 --prof for the digest hot path
profiles/versions.txt runtime versions + host caps
```