perf research: structuredClone #5
Draft
crowlbot wants to merge 1 commit into
Macro finding: structuredClone(<primitive>) is 3x slower than Node and ~10x slower than Bun. Cause is V8 failing to inline the structuredClone function body in ext/web/13_message_port.js (too large), so every call pays the function-boundary cost. Verified by wrapping with an external fast-path identical to the one inside the function: 2857 ns -> 25 ns in Deno (112x faster), 1038 ns -> 31 ns in Node, 294 ns -> 28 ns in Bun. Graduating to an upstream perf fix.
TL;DR
`structuredClone(<primitive>)` is ~3× slower than Node 22 LTS / Node 23 and ~10× slower than Bun. The cause is V8 failing to inline the `structuredClone` JS function body (in `ext/web/13_message_port.js`) because of its size, so every primitive call pays the function-call-boundary cost.
Wrapping the function with a tiny external fast-path identical to the one inside `structuredClone` collapses the cost from ~2,890 ns → ~25 ns (~115× faster) in Deno. Node and Bun see the same effect (33× and 10× respectively); Deno just has the largest gap because Node and Bun's internal serializers handle primitives more cheaply behind the function-call boundary.
Graduating to an upstream perf fix (separate PR): split into `structuredCloneSlow` + a small inlinable wrapper. The benchmark and fix live there.
Headline ratios
Primitives — the finding
Wrapper test — proves the inlining hypothesis
`micro/sc_wrapper_test.js` wraps `structuredClone` with an external fast-path identical to the one inside the function:
The user-space wrapper has the SAME fast-path logic — it just lives in a small function V8 can inline.
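A minimal sketch of what such a user-space wrapper could look like. This is not the contents of `micro/sc_wrapper_test.js`; the names `isCloneablePrimitive` and `structuredCloneFast` are hypothetical, and the exact primitive check used inside `structuredClone` is assumed rather than copied from `ext/web/13_message_port.js`.

```javascript
// Hypothetical fast-path predicate. The real check inside structuredClone
// may differ; this one treats null, undefined, booleans, numbers, strings
// and bigints as values that clone to themselves.
function isCloneablePrimitive(value) {
  if (value === null) return true;
  const t = typeof value;
  return (
    t === "undefined" ||
    t === "boolean" ||
    t === "number" ||
    t === "string" ||
    t === "bigint"
  );
}

// Small wrapper: return primitives directly, defer everything else
// (objects, symbols, calls with options) to the real structuredClone.
function structuredCloneFast(value, options) {
  if (options === undefined && isCloneablePrimitive(value)) {
    return value;
  }
  return structuredClone(value, options);
}

// Primitives come back unchanged; objects still get a real deep copy.
const original = { nested: [1, 2, 3] };
const copy = structuredCloneFast(original);
console.log(structuredCloneFast(42)); // 42
console.log(copy !== original && copy.nested !== original.nested); // true
```

Symbols are deliberately left out of the predicate so they still reach `structuredClone` and raise its usual error instead of being returned silently.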
Full microbench
15 patterns covering primitives, objects, arrays, maps, TypedArrays, ArrayBuffers, DataViews, mixed-shapes. Raw: `profiles/sc_micro_all.log`. Notable extras:
Where the cost lives
V8 prof — primitive hot path
Profile: `profiles/sc_prim.prof.txt`. 5M calls of `structuredClone(i)`, 3,042 ns/op.
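For orientation, a loop of the kind that produces an ns/op figure like the one above might look as follows. This is a hedged sketch, not the script used for the profile: `measureNsPerOp`, the warm-up count and the reduced iteration count are assumptions (the profile itself used 5M calls), and the script would be run under the runtime's V8 `--prof` option to collect samples.

```javascript
// Times `iterations` calls of fn(i) and reports nanoseconds per call.
// The results are summed into `sink` so the calls cannot be discarded
// as dead code.
function measureNsPerOp(fn, iterations) {
  // Warm-up so the timed loop runs mostly optimized code (assumed count).
  for (let i = 0; i < 10_000; i++) fn(i);

  let sink = 0;
  const start = performance.now();
  for (let i = 0; i < iterations; i++) {
    sink += fn(i);
  }
  const elapsedMs = performance.now() - start;

  return { nsPerOp: (elapsedMs * 1e6) / iterations, sink };
}

// Smaller than the 5M calls in the profile, to keep the sketch quick to run.
const ITERATIONS = 200_000;
const result = measureNsPerOp((i) => structuredClone(i), ITERATIONS);
console.log(`structuredClone(i): ${result.nsPerOp.toFixed(0)} ns/op`);
```

The absolute number depends on the runtime and host, so it is only meaningful when compared across runs on the same machine.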
Top builtin hits:
V8 reaches the dictionary converter and `ObjectAssign` in the bottom-up profile — meaning the fast-path return inside `structuredClone` isn't preventing V8 from emitting code for the rest of the body. Splitting the function lets V8 prove the fast path is the only reachable path for a primitive argument.
Ranked hypotheses
What's not here
Layout
```
tools/perf_research/structuredClone/
  README.md                  full report
  micro/sc_micro.js          15 ops
  micro/sc_wrapper_test.js   the wrapper-vs-real comparison
  profiles/sc_micro_all.log  raw bench output per runtime
  profiles/sc_prim.prof.txt  V8 --prof for the primitive hot path
  profiles/versions.txt      runtime versions + host caps
```