perf research: WebSocket #7

Draft
crowlbot wants to merge 1 commit into main from perf-research/websocket

@crowlbot
Owner

TL;DR

Honest finding: no high-impact architectural slowdown in Deno's WebSocket.

  • Client side: Deno beats Node 22 by 6.5× at 64 KB binary frames and beats Bun by 2.2× at the same size. Small text (11 B) is 1.36× faster than Node 22, but Bun is 1.38× faster than Deno there (Deno is ~27% slower).
  • Server side: 6–23% below Bun's server throughput (Bun is 1.06–1.29× faster), with the biggest gap at 4 KB binary. Not graduating: the gap sits in unattributed native ticks (deno binary + libc, ~65% of total), and native flamegraphs are blocked on this host.

Headline ratios

Client side (msgs/s, higher is better), Deno reference server

| Workload | Deno | Node 22 | Node 23 | Bun | Deno vs Node 22 | Deno vs Bun |
|---|---:|---:|---:|---:|---|---|
| client_text_11 (11 B text) | 111,128 | 81,646 | 66,595 | 153,162 | 1.36× faster | 1.38× slower |
| client_bin_32 (32 B bin) | 101,448 | 70,825 | 97,242 | 110,911 | 1.43× faster | 1.09× slower |
| client_bin_4k (4 KB bin) | 48,887 | 11,238 | 11,892 | 39,576 | 4.35× faster | 1.23× faster |
| client_bin_64k (64 KB bin) | 6,927 | 1,067 | 1,071 | 3,138 | 6.49× faster | 2.21× faster |
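
The ratio columns follow directly from the msgs/s figures. A minimal sketch of that arithmetic, assuming each ratio is the larger throughput divided by the smaller one, rounded to two decimals (the helper name `compare` is illustrative and not part of the bench scripts):

```javascript
// Hypothetical helper: expresses runtime A's throughput relative to runtime B.
// Assumes the ratio is larger/smaller of the two msgs/s figures, 2 decimals.
function compare(a, b) {
  const ratio = a >= b ? a / b : b / a;
  const direction = a >= b ? "faster" : "slower";
  return `${ratio.toFixed(2)}× ${direction}`;
}

// msgs/s copied from two rows of the client-side table.
const rows = {
  client_text_11: { deno: 111128, node22: 81646, bun: 153162 },
  client_bin_64k: { deno: 6927, node22: 1067, bun: 3138 },
};

for (const [name, r] of Object.entries(rows)) {
  console.log(name, compare(r.deno, r.node22), "|", compare(r.deno, r.bun));
}
```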

Server side (msgs/s, higher is better), Deno client against a Deno or Bun server

| Workload | Deno server | Bun server | Bun vs Deno |
|---|---:|---:|---|
| client_text_11 | 111,128 | 136,218 | Bun 1.23× faster |
| client_bin_32 | 101,448 | 123,858 | Bun 1.22× faster |
| client_bin_4k | 48,887 | 63,068 | Bun 1.29× faster |
| client_bin_64k | 6,927 | 7,368 | Bun 1.06× faster |
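
A "×  faster" ratio and a percentage gap describe the same measurement from different baselines, which is easy to misread. The sketch below uses only the table figures (the function names are illustrative) and converts each server-side pair both ways:

```javascript
// Throughput (msgs/s) from the server-side table: Deno client against each server.
const serverRuns = [
  { workload: "client_text_11", deno: 111128, bun: 136218 },
  { workload: "client_bin_32", deno: 101448, bun: 123858 },
  { workload: "client_bin_4k", deno: 48887, bun: 63068 },
  { workload: "client_bin_64k", deno: 6927, bun: 7368 },
];

// How many percent below `fast` is `slow`? (baseline: the faster runtime)
function percentBelow(slow, fast) {
  return (1 - slow / fast) * 100;
}

// How many percent above `slow` is `fast`? (baseline: the slower runtime)
function percentAbove(fast, slow) {
  return (fast / slow - 1) * 100;
}

for (const { workload, deno, bun } of serverRuns) {
  console.log(
    workload,
    `Deno is ${percentBelow(deno, bun).toFixed(1)}% below Bun;`,
    `Bun is ${percentAbove(bun, deno).toFixed(1)}% above Deno`
  );
}
```

Stating the baseline next to each figure avoids ambiguity when ratios and percentages appear side by side.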

Where the time goes — V8 prof on Deno client

ticks  total  nonlib   name
 229   42.2%          /deno   (WS framing + mio/tokio)
 127   23.4%          libc    (recv() syscalls)
  11    2.0%   5.9%   Builtin: LoadIC
   7    1.3%   3.7%   Builtin: ArrayPrototypePush
   ... (no single JS hot path)

65% of ticks in native code. JS-side overhead is uneventful — no obvious hot path to attack.
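
The 65% figure is the sum of the two native rows in the profile excerpt. A small sketch of that arithmetic, using only the numbers shown above (the total tick count is not printed in the excerpt, so it is estimated here from the `/deno` row and should be read as approximate):

```javascript
// Tick counts and percentages copied from the V8 profile excerpt above.
const nativeEntries = [
  { name: "/deno", ticks: 229, pct: 42.2 },
  { name: "libc", ticks: 127, pct: 23.4 },
];

const nativeTicks = nativeEntries.reduce((sum, e) => sum + e.ticks, 0);
const nativePct = nativeEntries.reduce((sum, e) => sum + e.pct, 0);

// Estimate of the total sample count implied by the /deno row (ticks / share).
const approxTotalTicks = Math.round(229 / 0.422);

console.log(
  `native: ${nativeTicks} ticks, ${nativePct.toFixed(1)}% of ~${approxTotalTicks} total`
);
```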

Hypotheses considered

| # | Hypothesis | Verdict |
|---|---|---|
| H1 | Per-frame allocation dominates | Rejected: 64 KB frames are 6.5× faster than Node |
| H2 | EventTarget dispatch overhead per message | Unranked: wrappedHandler at 0.4% of ticks |
| H3 | Server-side small-message gap | Unranked: native attribution blocked |

What's not here

  • Node-as-server (would require ws npm package).
  • Native flamegraph (perf_event_paranoid = 4, no sudo on host).
  • Many-connection scaling.

Final ranking

none: no high-impact slowdown to graduate. The server gap (18% below Bun on small text, 23% at 4 KB binary) is real but currently unattributed.

Layout

```
tools/perf_research/websocket/
  README.md                     full report
  micro/ws_server.js            Deno echo server (reference)
  micro/ws_server_bun.js        Bun echo server (cross-check)
  micro/ws_client.js            universal WS client bench
  profiles/ws_results.log       raw bench output
  profiles/ws_client.prof.txt   V8 --prof for Deno client
  profiles/versions.txt         runtime versions + host caps
```
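
The bench scripts are not reproduced in this description, so the following is only a sketch of one plausible shape for an echo-throughput measurement, not the contents of `micro/ws_client.js`. It assumes strict ping-pong (one message in flight at a time) and a browser-style socket object; the function name `runEchoBench` and the injectable `now` clock are hypothetical.

```javascript
// Hypothetical ping-pong bench: send one message, wait for its echo, repeat
// until `durationMs` has elapsed. `ws` is any object with the browser-style
// WebSocket surface (send, onmessage, onerror); `now` is injectable for tests.
function runEchoBench(ws, payload, durationMs, now = () => performance.now()) {
  return new Promise((resolve, reject) => {
    let count = 0;
    const start = now();
    ws.onerror = (err) => reject(err);
    ws.onmessage = () => {
      count++;
      const elapsedMs = now() - start;
      if (elapsedMs >= durationMs) {
        // Report echoed messages per second over the measured window.
        resolve({ count, msgsPerSec: (count * 1000) / elapsedMs });
      } else {
        ws.send(payload);
      }
    };
    ws.send(payload); // kick off the first round trip
  });
}
```

Against a real echo server this would be called from the socket's `onopen` handler, for example with `new WebSocket("ws://127.0.0.1:8000")` and a `Uint8Array(32)` payload; the address, port, and duration are placeholders. A pipelined client (several messages in flight) would report higher absolute numbers, so figures from a sketch like this are not directly comparable to the tables above.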

Honest finding: no high-impact architectural slowdown in Deno's
WebSocket. Client beats Node 22 by 6.5x on 64KB binary frames and Bun
by 2.2x. Small text is 1.36x faster than Node and 27% slower than Bun.

Server-side echo throughput is 6-23% below Bun depending on message
size, with the biggest gap at 4 KB binary. The gap is in
unattributed native ticks (mio/tokio/WS framing in deno binary, ~65%
of total ticks) and native flamegraphs are blocked on this host
(perf_event_paranoid=4, no sudo). Recording as unranked rather than
speculating.

No graduated upstream fix.