Skip to content

refactor: use camino Utf8Path for internal path representation#253

Draft
stormslowly wants to merge 5 commits into
mainfrom
refactor/camino-utf8-internal-paths
Draft

refactor: use camino Utf8Path for internal path representation#253
stormslowly wants to merge 5 commits into
mainfrom
refactor/camino-utf8-internal-paths

Conversation

@stormslowly
Copy link
Copy Markdown
Collaborator

Why

The resolver already assumed UTF-8 paths implicitly — path_to_str's .expect("path should be UTF-8"), plus the same assumption in tsconfig and the wasm canonicalize. This makes that assumption explicit at the type level: internal path storage and computation now use camino's Utf8Path/Utf8PathBuf, removing scattered OsStr/&str juggling on the hot path and preventing accidental silent non-UTF-8 handling.

The public API is unchanged — every input and output stays std Path/PathBuf.

What

"UTF-8 core, std boundary":

  • Internal → Utf8Path/Utf8PathBuf: CachedPath, the cache's tsconfig map key, Resolution.path (pub(crate)), all of TsConfig's internals, the PathUtil trait, and the lib.rs resolution logic.
  • Boundary stays std (the in/out params): resolve(&Path), the FileSystem trait, ResolveOptions, Resolution's accessors, PackageJson.path/realpath, ResolveError variants, and ResolverPath (as_arc/into_arc still return Arc<Path>).
  • Non-UTF-8 paths entering at the boundary panic via .expect("path should be UTF-8"), consistent with the pre-existing path_to_str behavior.

Adds camino 1.2.2 (MSRV 1.61 ≤ repo's 1.70).

Verification

  • cargo build / --all-features / --all-targets: clean
  • cargo clippy --all-features --all-targets: no new warnings in changed files
  • cargo test: 141 passed; the 6 tests::pnp failures are pre-existing on main (local Yarn PnP fixtures not installed), unrelated to this change
  • Downstream napi crate compiles unchanged

Convert the resolver's internal path types to camino's Utf8Path/Utf8PathBuf
while keeping every public input and output as std Path/PathBuf.

Why: the codebase already assumed UTF-8 paths implicitly (path_to_str's
`expect`), so making it explicit at the type level removes scattered
OsStr/&str juggling on the hot path and prevents accidental non-UTF-8
handling. Public API is unchanged; conversions happen only at the boundary.
Drop a redundant `as_std_path()` before `starts_with` (camino's `starts_with`
takes `AsRef<Path>`) and use `.into()` for the unambiguous Utf8PathBuf ->
PathBuf boundary conversions (error variants, parse args). No behavior change.
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 29, 2026

Merging this PR will not alter performance

✅ 10 untouched benchmarks


Comparing refactor/camino-utf8-internal-paths (6f70056) with main (c84afdc)

Open in CodSpeed

Resolution.path is now Utf8PathBuf, which has no `to_str`. The windows test
reads the pub(crate) field directly; switch it to `as_str()`. This module is
`#[cfg(windows)]`, so the breakage only surfaced on the Windows CI runner.
…wise

camino's `Utf8Path` `PartialEq` walks path components with no fast path,
while std `Path` (and `&str`) equality is a single `memcmp` (std even
comments its fast path is "for hashmap lookups"). The `DashSet<CachedPath>`
cache lookup is the hottest equality on the resolve path (~2.6M comparisons),
so the component-wise compare regressed the benches by ~40% instructions.

Compare `as_str()` bytes instead. Correct because cache entries are already
byte-keyed via the precomputed `hash_path`, so a hash hit implies identical
bytes; byte equality matches the existing hashing contract.
…correct

The prior `as_str()` byte comparison broke path dedup on Windows: there
`hash_path` hashes component-wise (`Path::hash`), so two paths differing only
by separator/case (`pack1/foo` vs `pack1\foo`) hash equal but compared
unequal as raw strings — corrupting the cache.

Compare via std `Path` (`as_std_path()`): component-correct on every platform
(matches `hash_path`'s per-platform scheme, exactly like the pre-camino
baseline) and still hits std's byte `memcmp` fast path, which is what removes
the Linux benchmark regression.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant