Skip to content

add rust backed node buffer implementation#6360

Draft
anonrig wants to merge 1 commit intomainfrom
yagiz/add-buffer-rust
Draft

add rust backed node buffer implementation#6360
anonrig wants to merge 1 commit intomainfrom
yagiz/add-buffer-rust

Conversation

@anonrig
Copy link
Copy Markdown
Member

@anonrig anonrig commented Mar 19, 2026

Built on top of #6272 branch. Not yet ready to review & merge.

@anonrig anonrig requested review from a team as code owners March 19, 2026 13:59
@anonrig anonrig marked this pull request as draft March 19, 2026 14:00
Copy link
Copy Markdown
Contributor

@ask-bonk ask-bonk bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm Bonk, and I've done a quick review of your PR.

This PR adds a Rust-backed BufferUtil implementation behind the RUST_BACKED_NODE_BUFFER autogate, using the simdutf Rust crate linked against workerd's vendored simdutf C++ library. The autogate-gated approach is the right pattern for this kind of migration.

Issues (ranked by severity)

  1. [High] to_string: panics on start > end — if start.value() exceeds end.value() (after clamping), &data[s..e] panics at runtime. Untrusted JS can trigger this.

  2. [High] fill_impl: panics on start > end or start >= buffer.len()data[start..end].fill(0) panics if start > end, and data[i] panics if start >= buffer.len() when end is also unclamped for start.

  3. [Medium] index_of_offset: length as i32 wraps for buffers > 2 GiBlet len = length as i32 silently wraps to negative for buffers with length > i32::MAX. While V8 TypedArray limits make this unlikely in practice, it contradicts the #![expect(clippy::cast_sign_loss)] rationale that "all casts are checked or bounded by the caller".

  4. [Medium] Massive code duplication between to_string and to_string_for_encoding — every encoding branch is copy-pasted between these two functions. A single function with a slice parameter would eliminate ~70 lines of duplicated unsafe code.

  5. [Low] decode_string for Base64/Base64url: duplicated branches — the two branches are identical except for Base64Options::Default vs Base64Options::Url. Could be unified.

  6. [Low] byte_length ignores encoding — the C++ byteLength takes a JsString (with encoding semantics). The Rust version accepts a String and returns str.len() (UTF-8 byte length), which is only correct for UTF-8 encoding. Other encodings would produce wrong results.

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk bot commented Mar 19, 2026

I'm Bonk, and I've done a quick review of your PR.

This PR adds a Rust-backed BufferUtil implementation behind the RUST_BACKED_NODE_BUFFER autogate, using the simdutf Rust crate linked against workerd's vendored simdutf C++ library. The autogate-gated approach is correct for this kind of migration.

I posted a review with 6 inline comments on PR #6360. The key findings ranked by severity:

  1. [High] to_string panics on start > end -- untrusted JS can trigger a runtime panic by passing start > end
  2. [High] fill_impl panics on start > end or out-of-bounds start -- same class of issue
  3. [Medium] index_of_offset silently wraps length via as i32 for buffers > 2 GiB
  4. [Medium] ~70 lines of duplicated unsafe simdutf code between to_string and to_string_for_encoding
  5. [Low] Duplicated Base64/Base64url branches in decode_string
  6. [Low] byte_length ignores encoding -- returns UTF-8 byte length regardless of encoding

Each comment includes a concrete suggestion where applicable.

github run

@anonrig anonrig force-pushed the yagiz/add-buffer-rust branch from 00d2f75 to 29a1e01 Compare March 19, 2026 14:50
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq bot commented Mar 19, 2026

Merging this PR will not alter performance

✅ 70 untouched benchmarks
⏩ 129 skipped benchmarks1


Comparing yagiz/add-buffer-rust (1d5c469) with main (b4fc0eb)

Open in CodSpeed

Footnotes

  1. 129 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@anonrig anonrig force-pushed the yagiz/add-buffer-rust branch 2 times, most recently from ab8b578 to faea404 Compare March 19, 2026 17:01
@jasnell
Copy link
Copy Markdown
Collaborator

jasnell commented Mar 19, 2026

/bigbonk review carefully specifically looking for definite or possible breaking behavioral changes. identify anything that could potentially break user assumptions

Copy link
Copy Markdown
Contributor

@ask-bonk ask-bonk bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up review after deeper analysis. These are new findings beyond my initial review.

  • [Medium] index_of uses naive O(n*m) search vs C++ SearchString (Boyer-Moore-Horspool) — likely source of CodSpeed regression
  • [Medium] decode_string ASCII/Latin1 combined branch has wrong ASCII semantics
  • [Medium] nbytes::base64_decode doesn't distinguish Base64 from Base64url encoding
  • [Low] decode UTF-16LE surrogate detection bitmask is too narrow (faithfully ported from C++)
  • [Low] nbytes FFI does element-by-element copy instead of bulk memcpy

This review was generated by an AI code review assistant and may contain inaccuracies.

Copy link
Copy Markdown
Contributor

@ask-bonk ask-bonk bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This review was generated by an AI code review assistant and may contain inaccuracies.

Copy link
Copy Markdown
Contributor

@ask-bonk ask-bonk bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up deep review with additional findings beyond the initial review.

  • [Medium] index_of uses naive O(n*m) search vs C++ Boyer-Moore-Horspool — likely CodSpeed regression source
  • [Medium] decode_string ASCII branch has wrong semantics (should mask to 7 bits)
  • [Medium] decode_string does not distinguish Base64 from Base64url
  • [Low] decode UTF-16LE surrogate detection bitmask is too narrow (faithfully ported from C++)
  • [Low] nbytes FFI does element-by-element copy instead of bulk memcpy

This review was generated by an AI code review assistant and may contain inaccuracies.

#[expect(clippy::cast_sign_loss, reason = "opt_offset verified non-negative")]
let start = opt_offset as usize;
if is_forward {
for i in start..=data.len().saturating_sub(needle.len()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naive O(n*m) search — likely perf regression source: The C++ indexOf uses SearchString (Boyer-Moore-Horspool from buffer-string-search.h) for efficient substring search. This Rust version uses a byte-by-byte comparison loop, which is O(n*m) worst case. For large buffers with long needles, this is significantly slower.

This is likely the root cause of the CodSpeed simpleStringBody[Response] regression (-26.77%). Even if this specific benchmark does not call indexOf directly, the naive pattern matching approach suggests other hot paths may have similar regressions.

Consider:

  • memchr crate (SIMD-accelerated) for single-byte needles
  • data.windows(needle.len()).position(|w| w == needle) for a compiler-optimized multi-byte fallback
  • Porting SearchString (Boyer-Moore-Horspool) for parity with the C++ version

Ok(match enc {
// ASCII and Latin1 both take the low byte of each Unicode code point.
// We convert from UTF-8 chars (not raw bytes) to get the correct code points.
Encoding::Ascii | Encoding::Latin1 => string.chars().map(|c| c as u8).collect(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ASCII/Latin1 branches produce identical output, but ASCII should mask to 7 bits: The C++ Encoding::ASCII handling masks each byte with & 0x7f (see toStringImpl in buffer.c++). But this Rust decode_string combines both into a single branch that just truncates to u8 via c as u8. For Latin1, this is correct (low byte of Unicode code point). For ASCII, this is wrong — Node.js Buffer.from("\u00e9", "ascii") should produce 0x69 (= 0xe9 & 0x7f), not 0xe9.

Suggested change
Encoding::Ascii | Encoding::Latin1 => string.chars().map(|c| c as u8).collect(),
Encoding::Ascii => string.chars().map(|c| (c as u8) & 0x7f).collect(),
Encoding::Latin1 => string.chars().map(|c| c as u8).collect(),

// Node.js uses nbytes::Base64Decode which is very lenient — it
// ignores invalid characters and decodes whatever valid base64
// chars are present.
nbytes::base64_decode(src)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Base64 and Base64url use the same decode path: nbytes::base64_decode(src) delegates to C++ nbytes::Base64Decode, which uses standard Base64 alphabet (+, /). For Base64url, the alphabet should be - and _ instead. Node.js Buffer.from(data, "base64url") should decode URL-safe base64.

The C++ decodeStringImpl also uses nbytes::Base64Decode for both, which happens to be lenient enough to accept both alphabets. If that leniency is intentional and tested, this is fine — but it would be good to add a comment explaining the intentional fallthrough, and ideally a test for URL-safe characters (- and _).

::rust::Vec<uint8_t> result;
result.reserve(actual);
auto* data = reinterpret_cast<const uint8_t*>(buf.get());
std::copy(data, data + actual, std::back_inserter(result));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Element-by-element copy: std::back_inserter copies bytes one at a time. Since you already have a contiguous buffer, consider using a bulk copy:

Suggested change
std::copy(data, data + actual, std::back_inserter(result));
::rust::Vec<uint8_t> result;
result.reserve(actual);
auto* data = reinterpret_cast<const uint8_t*>(buf.get());
std::copy_n(data, actual, std::back_inserter(result));

Or even better, if rust::Vec supports it, construct from the range directly. The current code is equivalent to std::copy with iterators, which the compiler may optimize to memcpy — but it is not guaranteed.

};

let data = buffer.as_slice();
let offset = byte_offset.value() as i32;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

byte_offset f64→i32 truncation: byte_offset.value() as i32 silently truncates values outside the i32 range. JavaScript numbers can represent offsets larger than i32::MAX (e.g., 2147483648). The as i32 cast wraps to negative, changing search semantics. Consider:

let offset = byte_offset.value() as i64;
// then use i64 throughout the offset computation

This is related to my earlier comment about length as i32, but byte_offset is also affected and is more likely to actually receive large values from user code.

@ask-bonk
Copy link
Copy Markdown
Contributor

ask-bonk bot commented Mar 19, 2026

Review posted successfully on PR #6360. Here's a summary of what was done:

Initial review (already posted): 6 inline comments covering panics on untrusted input, silent i32 truncation, code duplication, and byteLength encoding mismatch.

Follow-up deep review (just posted): 6 additional inline comments with new findings:

Severity Finding Location
Medium index_of uses naive O(n*m) search vs C++ Boyer-Moore-Horspool — likely CodSpeed regression source buffer.rs:517
Medium decode_string ASCII branch doesn't mask to 7 bits (combined with Latin1, but should differ) buffer.rs:371
Medium decode_string Base64/Base64url use same nbytes::base64_decode without distinguishing alphabets buffer.rs:393
Low byte_offset f64→i32 truncation in index_of buffer.rs:473
Low nbytes FFI element-by-element copy instead of bulk memcpy nbytes/ffi.c++:23
Info UTF-16LE surrogate bitmask self-corrected (initially thought wrong, verified correct) buffer.rs:819

The ASCII semantics bug (buffer.rs:371) includes a suggested fix.

Limerick incoming!

A Buffer in Rust was porting along,
Its search loop was naively wrong,
With O(n times m),
It slowed every gem —
Boyer-Moore would have made it more strong!

github run

@anonrig anonrig force-pushed the yagiz/add-buffer-rust branch 2 times, most recently from 24b7e96 to a4f1655 Compare March 19, 2026 18:51

// Thin wrapper around workerd::util::Autogate::isEnabled().
// The key is a string matching the kebab-case name (e.g. "rust-backed-node-buffer").
inline bool is_enabled(::rust::String key) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

autogate checks are supposed to be super cheap, this function is not super cheap nor a thing wrapper

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is not ideal and will be changed before merging. Just did a quick hack to see the test suite results.


/// Decodes hex, truncating at the first invalid hex pair (Node.js behavior).
/// In strict mode, returns an error on invalid input instead of truncating.
fn hex_decode(src: &[u8], strict: bool) -> Result<Vec<u8>, jsg::Error> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we have something other than bool args?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I know the existing c++ code has bool args but that's just because I haven't gotten around to updating everything to use WD_STRONG_BOOL yet

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need this clippy rule to be enabled: fn_params_excessive_bools https://rust-lang.github.io/rust-clippy/master/index.html#fn_params_excessive_bools

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how that addresses my ask? It's not about excessive bools, it's about bool args period. Is it not possible to have something like WD_STRONG_BOOL in rust?

#[jsg_static_constant]
pub const BASE64URL: u8 = 5;
#[jsg_static_constant]
pub const HEX: u8 = 6;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit ... a comment indicating that these need to be kept in sync with the enum would be helpful.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we should just use enum values rather than constant values here.

pub const HEX: u8 = 6;

#[jsg_method]
pub fn is_ascii(&self, bytes: jsg::v8::Local<jsg::v8::Uint8Array>) -> bool {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally I think long term we'd want to have an equivalent to jsg::Js* wrapper types in Rust like we do in C++. the jsg::v8::Local<...> pattern here is unfortunate.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we just don't need it because they're actually v8::XXX. v8::String stores a pointer to v8::ffi::String and there is no blocker to make v8::String act like jsg::JsString.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following what you're saying here... specifically, I don't know what you mean by, "there is no blocker to make v8::String act like jsg::JsString".

I should not have to think about the jsg::v8::Local<...> part of this at all as it just makes things more verbose and isn't a detail I really should have to worry about. A jsg::v8::JsUint8Array or jsg::v8::JsString, etc is less verbose and is just easier to work with.

I'm firmly -1 on the jsg::v8::Local<T> pattern unless there's an essential reason for keeping it, which I'm not seeing.

Copy link
Copy Markdown
Member Author

@anonrig anonrig Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My recommendation is to keep this as is. Move the v8::* bindings to a different crate, and implement jsg::JsUint8Array or any other class using the v8 bindings and abstract the Local handle logic.

@anonrig anonrig force-pushed the yagiz/rust-gc-v2 branch 5 times, most recently from 677011e to 3e59d17 Compare March 20, 2026 22:47
Base automatically changed from yagiz/rust-gc-v2 to main March 20, 2026 23:31
@anonrig anonrig force-pushed the yagiz/add-buffer-rust branch from a4f1655 to 0656546 Compare March 23, 2026 15:42
byte_offset: jsg::Number,
encoding: jsg::Number,
is_forward: jsg::NonCoercible<bool>,
) -> Result<jsg::Number, jsg::Error> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The performance of this algorithm is going to be O(n*m) at the best. That going to be too much of a performance hit for this.

Copy link
Copy Markdown
Collaborator

@jasnell jasnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to need a bit more work. The index_of performance issues are significant, and I believe your unwrap_string impl looks like it is potentially swallowing errors. Also, there are multiple failing tests that will need to be looked at.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants