perf research: structuredClone by crowlbot · Pull Request #5 · crowlbot/deno

crowlbot · 2026-05-18T01:09:45Z

TL;DR

structuredClone(<primitive>) is ~3× slower than Node 22 LTS / Node 23 and ~10× slower than Bun. The cause is V8 failing to inline the structuredClone JS function body (in ext/web/13_message_port.js) because of its size, so every primitive call pays the function-call-boundary cost.

Wrapping the function with a tiny external fast-path identical to the one inside structuredClone collapses the cost from ~2,857 ns to ~25.5 ns (~112× faster) in Deno. Node and Bun see the same effect (33× and 10× respectively); Deno has the largest gap because Node's and Bun's internal serializers handle primitives more cheaply behind the function-call boundary.

Graduating to an upstream perf fix (separate PR): split into structuredCloneSlow + a small inlinable wrapper. The benchmark and fix live there.

Headline ratios

Primitives — the finding

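| Bench | Deno (ns) | Node 22 (ns) | Node 23 (ns) | Bun (ns) | Deno vs Node 22 | Deno vs Bun |
| --- | --- | --- | --- | --- | --- | --- |
| clone_number | 3,289 | 1,079 | 1,052 | 281 | 3.05× slower | 11.7× slower |
| clone_short_string | 3,551 | 1,171 | 1,213 | 574 | 3.03× slower | 6.19× slower |
| clone_boolean | 3,009 | 1,050 | 1,095 | 281 | 2.87× slower | 10.7× slower |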
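The ratio columns follow directly from the timings. The snippet below is a small arithmetic check that recomputes them from the table values; the `results` object and the `slowdown` helper are illustrative names, not part of the benchmark harness.

```javascript
// Timings copied from the primitives table above (nanoseconds per call).
const results = {
  clone_number: { deno: 3289, node22: 1079, node23: 1052, bun: 281 },
  clone_short_string: { deno: 3551, node22: 1171, node23: 1213, bun: 574 },
  clone_boolean: { deno: 3009, node22: 1050, node23: 1095, bun: 281 },
};

// How many times slower `deno` is than `other` (illustrative helper).
function slowdown(deno, other) {
  return deno / other;
}

for (const [name, r] of Object.entries(results)) {
  const vsNode = slowdown(r.deno, r.node22).toFixed(2);
  const vsBun = slowdown(r.deno, r.bun).toFixed(2);
  console.log(`${name}: ${vsNode}x slower than Node 22, ${vsBun}x slower than Bun`);
}
```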

Wrapper test — proves the inlining hypothesis

`micro/sc_wrapper_test.js` wraps `structuredClone` with an external fast-path identical to the one inside the function:

| Runtime | `realSC(42)` | `wrappedSC(42)` | Speedup |
| --- | --- | --- | --- |
| Deno 2.7.14 | 2,857 ns | 25.5 ns | 112× |
| Node 22.13.1 | 1,038 ns | 30.7 ns | 33× |
| Bun 1.1.43 | 294 ns | 28.0 ns | 10× |

The user-space wrapper has the SAME fast-path logic — it just lives in a small function V8 can inline.
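For readers who want to reproduce the comparison, here is a minimal sketch of what such a wrapper and timing loop could look like. It is not the contents of `micro/sc_wrapper_test.js`: the primitive check, the `nsPerOp` helper, and the iteration counts are assumptions made for illustration. Symbols and functions are deliberately sent to the real implementation, since `structuredClone` must throw for them.

```javascript
const realSC = globalThis.structuredClone;

// Assumed fast path: primitives that the structured clone algorithm
// returns unchanged. Symbols are excluded because cloning them must throw.
function isCloneablePrimitive(value) {
  if (value === null) return true;
  const t = typeof value;
  return t === "undefined" || t === "boolean" || t === "number" ||
    t === "string" || t === "bigint";
}

// Small enough that an optimizing compiler has a chance to inline it.
function wrappedSC(value, options) {
  if (options === undefined && isCloneablePrimitive(value)) return value;
  return realSC(value, options);
}

let sink; // keeps results observable so the loop body is not trivially dead

// Hypothetical timing helper: average nanoseconds per call of fn(i).
function nsPerOp(fn, iterations) {
  for (let i = 0; i < 10_000; i++) sink = fn(i); // warm-up
  const start = performance.now();
  for (let i = 0; i < iterations; i++) sink = fn(i);
  return ((performance.now() - start) * 1e6) / iterations;
}

console.log("realSC    ns/op:", nsPerOp(realSC, 100_000).toFixed(1));
console.log("wrappedSC ns/op:", nsPerOp(wrappedSC, 100_000).toFixed(1));
```

Absolute numbers from a loop like this depend heavily on the runtime version and machine, so only the ratio between the two lines is meaningful.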

Full microbench

15 patterns covering primitives, objects, arrays, maps, TypedArrays, ArrayBuffers, DataViews, and mixed shapes. Raw: `profiles/sc_micro_all.log`. Notable extras:

- `clone_u8_64k`: Deno 65,898 ns vs Node 22 31,715 ns and Bun 27,239 ns, roughly 2.1× and 2.4× slower respectively. Path: `02_structured_clone.js:79-134` (ArrayBufferPrototypeSlice + Symbol.toStringTag switch + dead-code WeakMap allocation). Not graduating: it needs a native flamegraph to attribute precisely, and could be folded into a separate cleanup once the dead WeakMap is removed.
- `clone_array_1000_strs`: Deno is 1.5× faster than Bun here. Strong showing.
- `clone_u8_1m`: Deno is 1.17× faster than Node 22.

Where the cost lives

- `ext/web/13_message_port.js:614-690`: `structuredClone(value, options)`. The body is ~75 lines (webidl converter setup, kNotSerializable check, transfer-list dispatch). V8 won't inline this. Every call pays the boundary cost even when the existing internal fast path (`if (arguments.length >= 1 && options === undefined) { ... return value; }`) is taken.
- `ext/web/02_structured_clone.js:46`: `const objectCloneMemo = new SafeWeakMap()`. Written via `WeakMapPrototypeSet` on every `ArrayBuffer` clone but never read anywhere in the codebase (grepped). Dead allocation per clone.
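The write-only memo pattern described in the second item can be illustrated in isolation. The sketch below is not the Deno source: it uses a plain `WeakMap` in place of `SafeWeakMap`, and the choice of key (original buffer) and value (clone) is an assumption. It only shows why removing such a write is behavior-preserving when nothing ever reads the map.

```javascript
// Stand-in for the memo: entries are written on every clone, never read.
const objectCloneMemo = new WeakMap();

// Pattern as described: clone, then record an entry nobody consults.
function cloneArrayBufferWithMemo(buffer) {
  const clone = buffer.slice(0);
  objectCloneMemo.set(buffer, clone); // extra work with no observable effect
  return clone;
}

// Same observable behavior without the dead write.
function cloneArrayBufferNoMemo(buffer) {
  return buffer.slice(0);
}

const source = new Uint8Array([1, 2, 3, 4]).buffer;
const a = new Uint8Array(cloneArrayBufferWithMemo(source));
const b = new Uint8Array(cloneArrayBufferNoMemo(source));
console.log(a.join(","), b.join(","));
```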

V8 prof — primitive hot path

Profile: `profiles/sc_prim.prof.txt`. 5M calls of `structuredClone(i)`, 3,042 ns/op.

Top builtin hits:

- 583 ticks `CreateShallowObjectLiteral`
- 333 ticks `LoadIC`
- 267 ticks `webidl 00_webidl.js:755` (dictionary converter inner)
- 113 ticks `ObjectAssign`
- 65 ticks `ArrayIteratorPrototypeNext`

V8 reaches the dictionary converter and `ObjectAssign` in the bottom-up profile — meaning the fast-path return inside `structuredClone` isn't preventing V8 from emitting code for the rest of the body. Splitting the function lets V8 prove the fast path is the only reachable path for a primitive argument.
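One possible shape for the split, as a hedged sketch rather than the upstream patch: `structuredCloneSlow` here simply delegates to the runtime's built-in `structuredClone` as a stand-in for the existing large body, and the primitive check inside the wrapper is an assumption about what the elided fast path tests. The `arguments.length >= 1` guard mirrors the condition quoted earlier, so that a call with no arguments still reaches the slow path and throws.

```javascript
const builtinStructuredClone = globalThis.structuredClone;

// Stand-in for the existing ~75-line body (converter setup, transfer list, ...).
function structuredCloneSlow(...args) {
  return Reflect.apply(builtinStructuredClone, undefined, args);
}

// Tiny wrapper: only the fast-path test and a tail call to the slow path.
function structuredCloneSplit(value, options) {
  if (arguments.length >= 1 && options === undefined) {
    const t = typeof value;
    if (
      value === null || t === "undefined" || t === "boolean" ||
      t === "number" || t === "string" || t === "bigint"
    ) {
      return value; // primitives are returned as-is
    }
  }
  // Forward the original argument list so arity-based errors are preserved.
  return Reflect.apply(structuredCloneSlow, undefined, arguments);
}

console.log(structuredCloneSplit(42), structuredCloneSplit({ nested: [1, 2] }));
```

Keeping the wrapper free of anything but the type test is the point of the design: the smaller the function, the more likely the optimizer treats it as an inlining candidate at call sites.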

Ranked hypotheses

| Rank | Hypothesis | Impact × Confidence | Notes |
| --- | --- | --- | --- |
| H1 | V8 cannot inline `structuredClone` because the function body is too large; splitting into `structuredCloneSlow` + a tiny inlinable wrapper recovers the fast path. | HIGH × HIGH | Proven by `sc_wrapper_test.js`: 2857 ns → 25 ns (112× faster). Graduating to upstream PR `perf/structuredClone-primitive-fastpath`. |
| H2 | `objectCloneMemo` in `02_structured_clone.js` is dead code (filled, never read). | low × high | Confirmed by grep. Each ArrayBuffer clone allocates an entry that's never consulted. Not graduating on its own (task forbids drive-by changes). |
| H3 | `clone_u8_64k` is 2× slower than Node/Bun. | medium × medium | Path goes through the giant Symbol.toStringTag switch + WeakMap allocation. Not attributed to architectural cost yet; native flamegraph required. Leaving unranked-for-action. |

What's not here

Native flamegraph attribution. `kernel.perf_event_paranoid = 4` and `sudo` is unavailable; this is the same cap as in PRs #1 (perf research: fetch) through #4 (perf research: streams) on this fork.

Layout

```
tools/perf_research/structuredClone/
  README.md                    full report
  micro/sc_micro.js            15 ops
  micro/sc_wrapper_test.js     the wrapper-vs-real comparison
  profiles/sc_micro_all.log    raw bench output per runtime
  profiles/sc_prim.prof.txt    V8 --prof for the primitive hot path
  profiles/versions.txt        runtime versions + host caps
```

Macro finding: structuredClone(<primitive>) is 3x slower than Node and ~10x slower than Bun. Cause is V8 failing to inline the structuredClone function body in ext/web/13_message_port.js (too large), so every call pays the function-boundary cost. Verified by wrapping with an external fast-path identical to the one inside the function: 2857 ns -> 25 ns in Deno (112x faster), 1038 ns -> 31 ns in Node, 294 ns -> 28 ns in Bun. Graduating to an upstream perf fix.

crowlbot mentioned this pull request May 18, 2026

perf research: crypto.subtle + getRandomValues #6

Draft

crowlbot wants to merge 1 commit into main from perf-research/structuredClone
