refactor: use camino Utf8Path for internal path representation#253
Draft
stormslowly wants to merge 5 commits into
Draft
refactor: use camino Utf8Path for internal path representation#253stormslowly wants to merge 5 commits into
stormslowly wants to merge 5 commits into
Conversation
Convert the resolver's internal path types to camino's Utf8Path/Utf8PathBuf while keeping every public input and output as std Path/PathBuf. Why: the codebase already assumed UTF-8 paths implicitly (path_to_str's `expect`), so making it explicit at the type level removes scattered OsStr/&str juggling on the hot path and prevents accidental non-UTF-8 handling. Public API is unchanged; conversions happen only at the boundary.
Drop a redundant `as_std_path()` before `starts_with` (camino's `starts_with` takes `AsRef<Path>`) and use `.into()` for the unambiguous Utf8PathBuf -> PathBuf boundary conversions (error variants, parse args). No behavior change.
Resolution.path is now Utf8PathBuf, which has no `to_str`. The windows test reads the pub(crate) field directly; switch it to `as_str()`. This module is `#[cfg(windows)]`, so the breakage only surfaced on the Windows CI runner.
…wise camino's `Utf8Path` `PartialEq` walks path components with no fast path, while std `Path` (and `&str`) equality is a single `memcmp` (std even comments its fast path is "for hashmap lookups"). The `DashSet<CachedPath>` cache lookup is the hottest equality on the resolve path (~2.6M comparisons), so the component-wise compare regressed the benches by ~40% instructions. Compare `as_str()` bytes instead. Correct because cache entries are already byte-keyed via the precomputed `hash_path`, so a hash hit implies identical bytes; byte equality matches the existing hashing contract.
…correct The prior `as_str()` byte comparison broke path dedup on Windows: there `hash_path` hashes component-wise (`Path::hash`), so two paths differing only by separator/case (`pack1/foo` vs `pack1\foo`) hash equal but compared unequal as raw strings — corrupting the cache. Compare via std `Path` (`as_std_path()`): component-correct on every platform (matches `hash_path`'s per-platform scheme, exactly like the pre-camino baseline) and still hits std's byte `memcmp` fast path, which is what removes the Linux benchmark regression.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
The resolver already assumed UTF-8 paths implicitly —
path_to_str's.expect("path should be UTF-8"), plus the same assumption intsconfigand the wasmcanonicalize. This makes that assumption explicit at the type level: internal path storage and computation now usecamino'sUtf8Path/Utf8PathBuf, removing scatteredOsStr/&strjuggling on the hot path and preventing accidental silent non-UTF-8 handling.The public API is unchanged — every input and output stays std
Path/PathBuf.What
"UTF-8 core, std boundary":
Utf8Path/Utf8PathBuf:CachedPath, the cache'stsconfigmap key,Resolution.path(pub(crate)), all ofTsConfig's internals, thePathUtiltrait, and thelib.rsresolution logic.resolve(&Path), theFileSystemtrait,ResolveOptions,Resolution's accessors,PackageJson.path/realpath,ResolveErrorvariants, andResolverPath(as_arc/into_arcstill returnArc<Path>)..expect("path should be UTF-8"), consistent with the pre-existingpath_to_strbehavior.Adds
camino1.2.2 (MSRV 1.61 ≤ repo's 1.70).Verification
cargo build/--all-features/--all-targets: cleancargo clippy --all-features --all-targets: no new warnings in changed filescargo test: 141 passed; the 6tests::pnpfailures are pre-existing onmain(local Yarn PnP fixtures not installed), unrelated to this changenapicrate compiles unchanged