perf research: url #2

Draft
crowlbot wants to merge 1 commit into main from perf-research/url

perf research: url

Macro performance research on Deno's implementation of URL and
URLSearchParams.

This PR contains only benchmark scripts and committed V8 prof artifacts — no
production code changes. The report below is the deliverable.

Methodology

  • Ratios over absolute numbers. Host is Docker on Proxmox; absolute
    ns/op is unreliable, so the headline is same-host ratios vs Node 22 LTS
    and Bun.
  • Flamegraph attribution. V8 --prof in-process. perf / samply
    need kernel.perf_event_paranoid<=1 but the container is locked at 3
    and sysctl is denied — so all attribution below is JS-side. Native
    servo/url crate time is captured under the "deno binary" bucket but
    cannot be broken down further without CAP_PERFMON.
  • Microbench mix. Designed around the URL surface that real workloads
    hit: construct (with and without base), canParse, getters
    (href/pathname/search), setters (pathname/search),
    searchParams.get, plus URLSearchParams construct (string + object),
    get, and toString.
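
As a rough illustration of the microbench shape described above, the sketch below times a handful of the listed ops and reports ns/op. It is not the committed url_micro.js: `bench`, `WARMUP`, `ITERS`, and `sink` are hypothetical names, and the iteration counts are arbitrary. It relies only on globals, so the same file should run under Deno, Node, and Bun.

```javascript
// Minimal ns/op harness sketch. All names here are illustrative assumptions,
// not taken from tools/perf_research/url/micro/url_micro.js.
const WARMUP = 5_000;
const ITERS = 100_000;

let sink = 0; // accumulates a result from every call so the work is not optimized away

function bench(name, fn) {
  for (let i = 0; i < WARMUP; i++) sink += fn(); // give the JIT time to tier up
  const start = performance.now();
  for (let i = 0; i < ITERS; i++) sink += fn();
  const elapsedMs = performance.now() - start;
  return { name, nsPerOp: (elapsedMs * 1e6) / ITERS };
}

const base = "https://example.com/";
const url = new URL("https://example.com/path?a=1&b=2#y");
const usp = new URLSearchParams("a=1&b=2&c=3");

const results = [
  bench("new URL(abs)", () => new URL("https://example.com/path?x=1#y").href.length),
  bench("new URL(rel, base)", () => new URL("/p?x=1", base).href.length),
  bench("URL.canParse", () => (URL.canParse("https://example.com/path") ? 1 : 0)),
  bench("url.pathname (get)", () => url.pathname.length),
  bench("url.pathname (set)", () => { url.pathname = "/new/path"; return 1; }),
  bench("usp.get", () => usp.get("c").length),
  bench("usp.toString", () => usp.toString().length),
];

for (const r of results) console.log(`${r.name}: ${r.nsPerOp.toFixed(1)} ns/op`);
```

Because absolute ns/op is unreliable on this host, numbers from a harness like this are only meaningful as ratios between runtimes measured on the same machine.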

Pinned: deno 2.7.14 (v8 14.7.173.20-rusty), node v22.22.2, bun 1.3.14.

Headline ratios — microbench (ns/op; lower is better)

URL

| op | Deno | Node | Bun | Deno/Node | Deno/Bun |
|---|---|---|---|---|---|
| new URL("https://example.com/path?x=1#y") | 436 | 385 | 500 | 1.13 | 0.87 |
| new URL("/p?x=1", base) | 770 | 620 | 847 | 1.24 | 0.91 |
| new URL("…/api?a=1&b=2&c=3&d=4&e=5") | 618 | 411 | 616 | 1.50 | 1.00 |
| URL.canParse("https://example.com/path") | 324 | 145 | 243 | 2.25 | 1.33 |
| url.href | 8.9 | 8.1 | 17.1 | 1.10 | 0.52 |
| url.pathname | 12.6 | 11.6 | 53.2 | 1.09 | 0.24 |
| url.search | 13.8 | 16.7 | 57.6 | 0.83 | 0.24 |
| url.pathname = "/new/path" | 769 | 694 | 840 | 1.11 | 0.92 |
| url.search = "?new=query" | 741 | 657 | 839 | 1.13 | 0.88 |
| url.searchParams.get("a") | 36.1 | 16.5 | 20.1 | 2.19 | 1.80 |

URLSearchParams

| op | Deno | Node | Bun | Deno/Node | Deno/Bun |
|---|---|---|---|---|---|
| new URLSearchParams("a=1&b=2&…&h=8") (8 pairs) | 1 483 | 219 | 1 758 | 6.79 | 0.84 |
| new URLSearchParams(obj) (6 pairs) | 1 001 | 946 | 693 | 1.06 | 1.44 |
| usp.get("c") | 41.8 | 17.1 | 22.6 | 2.45 | 1.85 |
| usp.toString() (8 pairs) | 1 990 | 351 | 518 | 5.68 | 3.85 |

Reads:

  • URL parsing itself is competitive. new URL(...) takes 1.13× to 1.50×
    Node's time (the gap is widest on the five-parameter query) and is on
    par with or slightly faster than Bun. Servo's url crate is doing fine.
  • Getters are excellent. url.pathname is ~13 ns vs Bun's 53 ns,
    because the offset-into-serialization design avoids reparsing on read.
  • Setters cost a full reparse. url.pathname = "/new/path" is 769 ns,
    competitive with Node (694) but still ~60× more expensive than the
    corresponding getter. Each call op-dispatches into Rust and re-parses
    the URL from its serialized string.
  • URLSearchParams.toString() is 5.7× slower than Node and 3.9× slower
    than Bun. The op-dispatch overhead dwarfs the actual serialization for
    small param sets (8 pairs).
  • URLSearchParams.get() is 2.5× slower than Node and 1.8× slower than
    Bun.
    Storage is [[name, value], …] walked with === on every
    call — no index, no hash.
  • new URLSearchParams("a=1&b=2&…") is 6.8× slower than Node. Node
    parses in pure JS without an op call; Deno op-dispatches into Rust's
    form_urlencoded::parse. For a 31-byte query string (8 pairs), the
    dispatch dominates the parse.
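
For readers who want to re-derive the ratios quoted in these reads, here is a small sketch that recomputes them from the ns/op columns of the URLSearchParams table. The values are copied from the table as published (already rounded), so the last digit can differ slightly from the table's ratio columns. The `ratios` helper and the object layout are illustrative assumptions and do not reflect the format of micro_results.jsonl.

```javascript
// ns/op values copied from the URLSearchParams table above (rounded as published).
const nsPerOp = {
  "new URLSearchParams(string)": { deno: 1483, node: 219, bun: 1758 },
  "new URLSearchParams(obj)": { deno: 1001, node: 946, bun: 693 },
  "usp.get": { deno: 41.8, node: 17.1, bun: 22.6 },
  "usp.toString": { deno: 1990, node: 351, bun: 518 },
};

// A ratio above 1 means Deno is slower than the other runtime on the same host.
function ratios(table) {
  return Object.entries(table).map(([op, t]) => ({
    op,
    denoOverNode: t.deno / t.node,
    denoOverBun: t.deno / t.bun,
  }));
}

const rows = ratios(nsPerOp);
for (const r of rows) {
  console.log(`${r.op}: Deno/Node ${r.denoOverNode.toFixed(2)}, Deno/Bun ${r.denoOverBun.toFixed(2)}`);
}
```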

Flamegraph attribution (V8 --prof)

Full profile committed at
tools/perf_research/url/profiles/url_micro.prof.txt
(raw log: url_micro.v8.log.gz).

Top of Statistical profiling result (1 559 ticks):

   ticks  total  nonlib   name
   1209   77.5%          /var/agent-loop/repo/target/release/deno
    111    7.1%          /usr/lib/x86_64-linux-gnu/libc.so.6

   ticks  total  nonlib   name
     35    2.2%   14.6%   Builtin: LoadIC
     16    1.0%    6.7%   JS: *<anonymous> ext:deno_webidl/00_webidl.js:1101:10   // record<USVString,USVString>
     11    0.7%    4.6%   Builtin: StringPrototypeToWellFormed                    // USVString conv
     11    0.7%    4.6%   Builtin: ObjectPrototypeHasOwnProperty                  // record converter loop
      8    0.5%    3.3%   Builtin: KeyedLoadIC_Megamorphic
      8    0.5%    3.3%   Builtin: ArrayMap
      5    0.3%    2.1%   JS: *<anonymous> ext:deno_web/00_url.js:159:9           // USP record-init mapper

77.5 % of total ticks land inside the Deno binary — i.e. servo's url
crate and the op-dispatch boundary. JS-side time is dominated by the
record<USVString, USVString> converter (USP {...} init) and the
USVString well-formed pass.

Where the cost lives

| Finding | File:line | Notes |
|---|---|---|
| Every URL setter re-parses the serialized URL from scratch in Rust | ext/web/url.rs:127-191 | op_url_reparse calls Url::options().parse(&href) before applying the setter, even though the JS side could pass through component offsets for in-place mutation. |
| URLSearchParams.toString() op-dispatches to Rust for serialization | ext/web/00_url.js:332-335 → ext/web/url.rs:213-221 | One op call per toString(). For 8-pair input the op dispatch is ≥80 % of the cost; Node does this in pure JS. |
| new URLSearchParams("…") op-dispatches to form_urlencoded::parse | ext/web/00_url.js:142 → ext/web/url.rs:195-211 | Same op-dispatch shape. Node's pure-JS parser wins for short queries (≤8 pairs) where dispatch dominates. |
| URLSearchParams stored as [[name, value], …] with linear-scan .get/.has/.set | ext/web/00_url.js:243-256 (get), :263-276 (has), :282-318 (set) | Same pattern as Headers: O(n) scan, === compare. Acceptable for the typical small N but explains the .get 2.5× gap vs Node. |
| op_url_parse + op_url_get_serialization is a two-op sequence on the "needs serialization" path | ext/web/url.rs:38-40, ext/web/00_url.js:100-110 | The Rust side stuffs the serialized String into op state, then the JS calls a second op to take it. Avoids returning a String on the fast (status=0) path, but doubles the boundary crossings whenever serialization differs from input, i.e. on every URL where casing, percent-encoding, or default-port normalization fires. |
| op_url_parse_search_params eagerly collects into Vec<(String, String)> | ext/web/url.rs:198-210 | Each call allocates 2N Strings for an N-pair query, even when the caller only wants searchParams.get("first"). |
| URL.canParse does a full parse rather than a validation pass | ext/web/00_url.js:446-455 → op_url_parse | 2.25× slower than Node. Servo's url crate doesn't expose a validation-only entry point, so canParse pays the full parse + components-buf write. |

Ranked architectural hypotheses

H1 — Every URL setter (pathname, search, hash, host, …) re-parses the entire URL string in Rust on every call (HIGH × HIGH)

  • Evidence. url.pathname = "/x" is 769 ns vs url.pathname getter at 13 ns — a 60× spread.
    Every setter in ext/web/00_url.js:526-882 calls
    opUrlReparse(this.#serialization, SET_*, value), which in
    ext/web/url.rs:127-191 does
    Url::options().parse(&href) from scratch before applying the setter.
  • Architectural root. The JS side has the parsed component offsets
    cached in private fields, but the Rust side doesn't keep a Url around
    per URL instance — every setter rebuilds the parse state from the
    serialized string. The component buffer is updated, but the Rust state
    is discarded.
  • Estimated impact if fixed. A Url-resource model (op returns an
    rid, setters operate on the rid) would amortize the parse cost across
    the lifetime of the URL. For multi-setter code paths
    (url.pathname = …; url.search = …; url.hash = …) this is a 3× reduction;
    for single-setter use it shaves the ~700 ns reparse to ~200 ns
    (apply + reserialize). Most server-side routing libraries set
    pathname/search per request — this is a hot path.
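
The difference between the two shapes can be made concrete with a toy model that simply counts parses. This is only a sketch: `parse`, `ReparsingUrl`, `ResidentUrl`, and `countParses` are hypothetical names, and the built-in URL stands in for the Rust-side parser; nothing here mirrors the actual op signatures.

```javascript
// Toy model: count how many full parses each design performs.
let parseCount = 0;
function parse(href) {
  parseCount++;
  return new URL(href); // stand-in for the Rust-side parse
}

// Current shape: only the serialization survives between calls, so every
// setter must parse it again before it can apply the change.
class ReparsingUrl {
  #href;
  constructor(href) { this.#href = parse(href).href; }
  #apply(key, value) {
    const u = parse(this.#href);
    u[key] = value;
    this.#href = u.href;
  }
  set pathname(v) { this.#apply("pathname", v); }
  set search(v) { this.#apply("search", v); }
  set hash(v) { this.#apply("hash", v); }
  get href() { return this.#href; }
}

// Resource-style shape: the parsed state stays alive (what an rid would point
// to), and setters mutate it in place without reparsing.
class ResidentUrl {
  #state;
  constructor(href) { this.#state = parse(href); }
  set pathname(v) { this.#state.pathname = v; }
  set search(v) { this.#state.search = v; }
  set hash(v) { this.#state.hash = v; }
  get href() { return this.#state.href; }
}

function countParses(Ctor) {
  parseCount = 0;
  const u = new Ctor("https://example.com/a?x=1#y");
  u.pathname = "/b";
  u.search = "?q=2";
  u.hash = "#z";
  return { href: u.href, parses: parseCount };
}

console.log(countParses(ReparsingUrl)); // parses: 4 (1 construct + 3 setters)
console.log(countParses(ResidentUrl)); // parses: 1 (construct only)
```

Both variants end at the same serialization; only the number of full parses differs, which is the cost the hypothesis attributes to the setters.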

H2: URLSearchParams.toString() op-dispatches to Rust for what can be a short pure-JS loop (MEDIUM × HIGH)

  • Evidence. usp.toString() (8 pairs) is 1 990 ns in Deno vs 351 ns
    in Node and 518 ns in Bun — 5.7× and 3.9× slower respectively.
    The op call (ext/web/url.rs:213-221)
    takes the full Vec<(String, String)>, runs
    form_urlencoded::Serializer::new(...).extend_pairs(...).finish(),
    and returns the resulting String — a full crossing of the V8/Rust
    boundary plus a Vec<(String, String)> allocation in the marshaling
    layer.
  • Architectural root. For small N (≤16 pairs, which covers virtually
    every real use), the op-dispatch and Vec<(String,String)> allocation
    dominate the actual serialization. A pure-JS encoder (Node's path)
    ends up faster.
  • Estimated impact if fixed. Move serialization to JS for the
    common case (escape table + simple loop, ~30 lines). Brings toString()
    in line with Node (~350 ns), a 5× win on this op. Affects any
    code that calls usp.toString() for redirect URLs, query rebuilding,
    fetch URL construction, etc.
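
A minimal sketch of what such a pure-JS serializer could look like is shown below, following the application/x-www-form-urlencoded rules (ASCII alphanumerics and `*`, `-`, `.`, `_` pass through, space becomes `+`, every other UTF-8 byte is percent-encoded). `serializePairs`, `encodeComponent`, and `isUnreserved` are hypothetical names. The sketch uses TextEncoder for brevity; a real fast path inside Deno would probably special-case ASCII input to avoid any further boundary crossing, which is an assumption and not something measured here.

```javascript
// Sketch of a pure-JS application/x-www-form-urlencoded serializer.
const encoder = new TextEncoder();
const HEX = "0123456789ABCDEF";

// Bytes emitted unchanged: 0-9, A-Z, a-z, "*", "-", ".", "_".
function isUnreserved(b) {
  return (b >= 0x30 && b <= 0x39) || (b >= 0x41 && b <= 0x5a) ||
    (b >= 0x61 && b <= 0x7a) ||
    b === 0x2a || b === 0x2d || b === 0x2e || b === 0x5f;
}

function encodeComponent(s) {
  let out = "";
  for (const b of encoder.encode(s)) {
    if (isUnreserved(b)) out += String.fromCharCode(b);
    else if (b === 0x20) out += "+"; // space is written as "+"
    else out += "%" + HEX[b >> 4] + HEX[b & 0xf];
  }
  return out;
}

// pairs is [[name, value], ...] in insertion order.
function serializePairs(pairs) {
  let out = "";
  for (let i = 0; i < pairs.length; i++) {
    if (i > 0) out += "&";
    out += encodeComponent(pairs[i][0]) + "=" + encodeComponent(pairs[i][1]);
  }
  return out;
}

console.log(serializePairs([["q", "a b"], ["redirect", "/x?y=1&z=2"]]));
```

Comparing its output against the built-in `new URLSearchParams(pairs).toString()` is a cheap way to check such an encoder before benchmarking it.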

H3 — new URLSearchParams("…") op-dispatches to form_urlencoded::parse even for short queries (MEDIUM × HIGH)

  • Evidence. new URLSearchParams("a=1&b=2&…&h=8") (8 pairs, 31 bytes)
    is 1 483 ns in Deno vs 219 ns in Node, 6.8× slower.
  • Architectural root. Symmetric to H2.
    ext/web/00_url.js:142 calls
    op_url_parse_search_params(init); the op eagerly collects into
    Vec<(String, String)> (url.rs:198-210)
    and returns it. The marshaling cost outweighs the actual parsing for
    short inputs.
  • Estimated impact if fixed. Pure-JS parser for inputs below some
    threshold (e.g. ≤256 bytes). Brings construct in line with Node
    (~220 ns), a 6× win. Note that this is also the path used by
    url.searchParams = ... and by the searchParams getter on first
    access, so it compounds.
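
As a sketch of the pure-JS parser this hypothesis proposes, the code below follows the application/x-www-form-urlencoded parsing steps: split on `&`, skip empty segments, split each segment on the first `=`, turn `+` into a space, percent-decode to bytes, and decode the bytes as UTF-8. `parseQuery`, `decodeComponent`, and `hexVal` are hypothetical names, and the sketch does not strip a leading `?` (the URLSearchParams constructor does that before parsing). The size threshold for choosing this path over the op is left out.

```javascript
// Sketch of a pure-JS application/x-www-form-urlencoded parser.
const enc = new TextEncoder();
// ignoreBOM: true keeps a leading U+FEFF in the output instead of stripping it.
const dec = new TextDecoder("utf-8", { ignoreBOM: true });

// Returns the numeric value of an ASCII hex digit, or -1 if it is not one.
function hexVal(c) {
  if (c >= 0x30 && c <= 0x39) return c - 0x30;
  if (c >= 0x41 && c <= 0x46) return c - 0x41 + 10;
  if (c >= 0x61 && c <= 0x66) return c - 0x61 + 10;
  return -1;
}

function decodeComponent(s) {
  const input = enc.encode(s.replaceAll("+", " "));
  const out = new Uint8Array(input.length);
  let n = 0;
  for (let i = 0; i < input.length; i++) {
    const b = input[i];
    if (b === 0x25 && i + 2 < input.length) {
      const hi = hexVal(input[i + 1]);
      const lo = hexVal(input[i + 2]);
      if (hi >= 0 && lo >= 0) {
        out[n++] = (hi << 4) | lo;
        i += 2;
        continue;
      }
    }
    out[n++] = b; // literal byte, including a "%" not followed by two hex digits
  }
  return dec.decode(out.subarray(0, n));
}

// Returns [[name, value], ...] in input order.
function parseQuery(query) {
  const pairs = [];
  for (const part of query.split("&")) {
    if (part === "") continue;
    const eq = part.indexOf("=");
    const name = eq === -1 ? part : part.slice(0, eq);
    const value = eq === -1 ? "" : part.slice(eq + 1);
    pairs.push([decodeComponent(name), decodeComponent(value)]);
  }
  return pairs;
}

console.log(parseQuery("a=1&b=2&q=a+b%20c"));
```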

H4 — op_url_parse plus op_url_get_serialization is a two-op call on the "needs serialization" path (MEDIUM × MEDIUM)

  • Evidence. The status=1 path runs on every URL where the
    serialization differs from the input, i.e. any URL with casing
    normalization (HTTPS://X → https://x), percent-encoding, default
    port removal (:443/:80), or trailing-slash insertion. Real-world
    URLs hit this constantly.
  • Architectural root. ext/web/url.rs:96-102
    stashes the serialized String into OpState and returns status=1;
    the JS then calls op_url_get_serialization() as a second op
    (ext/web/00_url.js:100-110) to take
    it out. This avoids returning a String on the status=0 fast path, but
    doubles the boundary crossings on the slow path.
  • Estimated impact if fixed. Return the serialization directly via
    Option<String> (or a sentinel value in a #[buffer] arg) on the slow
    path — one op call instead of two. Estimated 80-150 ns savings on
    the slow path per URL, which fires on the majority of real-world
    inputs.

H5 — URL parsing is a hot path for fetch (URL is parsed at least twice per request) (MEDIUM × MEDIUM)

  • Evidence. ext/fetch/lib.rs:436 parses
    the URL string again inside op_fetch, even though the JS side has
    already constructed a URL via the user's new URL(...) call (or
    passed a string that fetch internally wraps).
  • Architectural root. The JS-side URL object is opaque to the
    Rust op (only its .href string crosses the boundary). The fetch
    op accepts a String and does Url::parse(&url) from scratch.
  • Estimated impact if fixed. If URL objects carried an internal
    Rust Url rid (per H1), fetch could consume it directly without a
    second parse. ~400 ns savings per fetch(url) call, compounding with
    the H1 setter savings for fetch(new URL(...)) patterns. Listed
    separately from H1 because it spans ext/web and ext/fetch.

H6 — URLSearchParams linear-scan .get/.has/.set (same pattern as Headers) (LOW × HIGH)

  • Evidence. usp.get("c") is 2.5× slower than Node, 1.8× slower
    than Bun. Storage is [[name, value], …] walked with ===
    per entry. Typical real-world N is small (≤8), so this is
    individually small, but it's present at every searchParams access.
  • Architectural root. ext/web/00_url.js:243-256.
    No index, no hash. Spec compliance requires preserving insertion
    order; an additional Map<name, indexInList> index would speed up
    lookups while keeping the list as the source of truth.
  • Estimated impact if fixed. For N ≤ 8 the constant factor matters
    more than the asymptotic complexity. Single-digit-percent for typical
    N; included here only because the same pattern is the leading finding
    in perf-research/fetch for Headers, and a single shared
    implementation strategy would address both.
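
The "list plus index" idea can be sketched as follows. `IndexedParams` is a hypothetical class, not Deno's implementation: the pair list remains the source of truth for ordering and iteration, and a Map from name to the index of its first entry serves get and has. For simplicity the sketch rebuilds the index on delete; whether the extra bookkeeping pays off at N ≤ 8 is exactly the open question raised above.

```javascript
// Sketch: insertion-ordered pair list with a first-occurrence index.
class IndexedParams {
  #list = []; // [[name, value], ...] in insertion order (source of truth)
  #first = new Map(); // name -> index of its first entry in #list

  constructor(pairs = []) {
    for (const [name, value] of pairs) this.append(name, value);
  }

  append(name, value) {
    if (!this.#first.has(name)) this.#first.set(name, this.#list.length);
    this.#list.push([name, value]);
  }

  // Returns the first value for name, or null, like URLSearchParams.get.
  get(name) {
    const i = this.#first.get(name);
    return i === undefined ? null : this.#list[i][1];
  }

  has(name) {
    return this.#first.has(name);
  }

  delete(name) {
    if (!this.#first.has(name)) return;
    this.#list = this.#list.filter(([n]) => n !== name);
    this.#reindex(); // indices shift after removal, so rebuild the map
  }

  #reindex() {
    this.#first.clear();
    this.#list.forEach(([n], i) => {
      if (!this.#first.has(n)) this.#first.set(n, i);
    });
  }

  entries() {
    return this.#list.map(([n, v]) => [n, v]);
  }
}

const params = new IndexedParams([["a", "1"], ["b", "2"], ["a", "3"]]);
console.log(params.get("a"), params.has("c"));
```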

Reproduction

See tools/perf_research/url/README.md.

cargo build --release --bin deno
./target/release/deno run -A --no-prompt tools/perf_research/url/micro/url_micro.js
node tools/perf_research/url/micro/url_micro.js
bun  tools/perf_research/url/micro/url_micro.js

- micro/url_micro.js: 14 ops covering construct/canParse/setters/searchParams
  on URL, and construct/get/has/toString on URLSearchParams.
- micro_results.jsonl: per-runtime ns/op from a single sweep across Deno
  2.7.14, Node v22.22.2, Bun 1.3.14.
- profiles/url_micro.prof.txt + .v8.log.gz: V8 --prof in-process profile
  of the bench (perf/samply blocked by container caps; --prof always works).