Fix browser compatibility (implementing compatible close capsule) #176
ai-and-i wants to merge 18 commits
Conversation
Add a new unpublished workspace crate `web-transport-browser-tests` that provides infrastructure for running WebTransport tests against headless Chromium. The crate launches a shared browser singleton, creates isolated incognito contexts per test, generates short-lived self-signed certs, and wraps user JS snippets with a `connectWebTransport()` helper. Includes a smoke test proving end-to-end browser-to-server connectivity. Browser tests are excluded from the normal `just check`/`just test` workflows and run separately via `just browser-test`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Expand the browser test infrastructure to support handler panic propagation, expected handler counts, pre-accept request handling, and robust Chrome lifecycle management (dedicated runtime, keepalive page, cleanup watchdog). Add 43 integration tests covering bidirectional/unidirectional streams, datagrams, connection lifecycle, stream error codes, concurrency, and large data transfers against a real Chromium instance. Co-Authored-By: Claude Code <noreply@anthropic.com>
Now that CloseWebTransportSession capsule is implemented, update the browser test suite to reflect proper close behavior: remove #[ignore] annotations from passing tests, await session.closed() to ensure capsule delivery, keep stream references to prevent early cancellation, and improve JS error reporting with stack traces. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Assert WebTransportError::Closed(_, _) directly in all browser tests instead of using the is_session_closed() helper, ensuring the close capsule info is accurately surfaced through every error path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…handling

Tests cover half-close semantics, stream priorities, rapid stream creation, concurrent bidi/uni streams, session stats, datagram edge cases (max size, oversized, empty, high water marks), and post-close/reset/stop error behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Walkthrough
This pull request introduces a new browser-based testing infrastructure for WebTransport. A new workspace member …

🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Wrap Browser in tokio::sync::Mutex and ChildStdin in std::sync::Mutex so SharedBrowser derives Sync naturally without unsafe. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace `let _ = task.await` with `.expect()` so that panics inside the spawned accept loop are propagated instead of silently swallowed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The test was sorting messages before comparing, which only verified set membership rather than actual priority/arrival ordering. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pass chrome_pid and data_dir as shell positional arguments ($0, $1) instead of interpolating them into the command string with format!(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ailable

Remove unwrap_or(0) fallback that caused `kill -9 0` to target the whole process group. Instead, preserve the Option<u32> and only spawn the cleanup watchdog when a Chrome PID is available.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
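A minimal sketch combining the two watchdog commits above: positional shell arguments instead of `format!()` interpolation, and no spawn when the PID is unknown. The `kill`/`rm` command text is illustrative, not the repo's exact script.

```rust
use std::process::Command;

fn spawn_cleanup_watchdog(chrome_pid: Option<u32>, data_dir: &str) {
    // Only spawn when a PID is known: falling back to 0 would make
    // `kill -9 0` target the whole process group.
    let Some(pid) = chrome_pid else { return };
    let _ = Command::new("sh")
        .arg("-c")
        // With `sh -c`, the first trailing argument binds to $0 and the
        // next to $1, so no string interpolation is needed.
        .arg(r#"kill -9 "$0" 2>/dev/null; rm -rf "$1""#)
        .arg(pid.to_string()) // $0
        .arg(data_dir) // $1
        .spawn();
}
```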
Replace expect-based tx.send(()) with let _ = tx.send(()) so that a dropped receiver does not mask the real accept-loop error surfaced by task.await.expect(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
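A compact sketch of the resulting shutdown sequence from the two commits above, with generic handle types:

```rust
use tokio::{sync::oneshot, task::JoinHandle};

// The oneshot send may fail harmlessly (the receiver can already be
// dropped), while the join is expect()ed so a panic inside the spawned
// accept loop propagates and fails the test.
async fn shutdown(tx: oneshot::Sender<()>, task: JoinHandle<()>) {
    let _ = tx.send(()); // a dropped receiver must not mask the real error
    task.await.expect("accept loop panicked");
}
```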
…ream_error tests

Tests were unreliable because they used wt.close()/wt.closed for synchronization, which races with the stream error signals. Instead, use server-to-client writes and second-stream reads as synchronization points to ensure the server has observed the error before the connection tears down. Also increase immediate_close_handler delay to 1000ms.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Spawning multiple tasks that each call accept_bi() or accept_uni() on the same session causes all but one caller to hang indefinitely. The unfold stream only stores one waker, so concurrent pollers get their wakers overwritten and are never woken again. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…compatibility

Write a CloseWebTransportSession capsule inside an HTTP/3 DATA frame on the CONNECT stream (RFC 9297), enabling browsers to receive close codes and reasons via the W3C WebTransport API's `WebTransport.closed` promise. Adds Http3CapsuleReader to handle capsules split across multiple DATA frames, and reworks Session::close() to deliver the capsule before closing the QUIC connection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
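For orientation, a hedged sketch of what the capsule bytes look like on the wire, assuming the standard QUIC varint encoding and the draft's layout of a 32-bit application error code followed by the reason string; both helpers are illustrative, not this crate's API:

```rust
fn encode_close_capsule(code: u32, reason: &str) -> Vec<u8> {
    let mut buf = Vec::new();
    write_varint(&mut buf, 0x2843); // CLOSE_WEBTRANSPORT_SESSION capsule type
    write_varint(&mut buf, (4 + reason.len()) as u64); // capsule length
    buf.extend_from_slice(&code.to_be_bytes()); // application error code
    buf.extend_from_slice(reason.as_bytes()); // application error message
    buf // these bytes are then carried inside an HTTP/3 DATA frame
}

// QUIC variable-length integer encoding (RFC 9000 §16).
fn write_varint(buf: &mut Vec<u8>, v: u64) {
    match v {
        0..=0x3f => buf.push(v as u8),
        0x40..=0x3fff => buf.extend_from_slice(&((v as u16 | 0x4000).to_be_bytes())),
        0x4000..=0x3fff_ffff => buf.extend_from_slice(&((v as u32 | 0x8000_0000).to_be_bytes())),
        _ => buf.extend_from_slice(&((v | 0xc000_0000_0000_0000).to_be_bytes())),
    }
}
```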
When a peer sends a CloseWebTransportSession capsule, the background task stores the (code, reason) and calls conn.close(), causing all active Quinn operations to fail with LocallyClosed. Previously only Session::closed() checked the capsule—now every error path maps LocallyClosed to WebTransportError::Closed(code, reason).

- Add CloseCapsule type alias and map_conn_error/map_session_error/map_send_datagram_error helpers on Session and SessionAccept
- Thread close_capsule into RecvStream and SendStream; add map_read_error/map_write_error helpers
- Store local close info in close() so locally-initiated closes also surface WebTransportError::Closed consistently

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
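A minimal sketch of the mapping idea, with a hypothetical simplified error enum (the crate's real types differ):

```rust
use std::sync::OnceLock;

#[derive(Debug)]
enum WebTransportError {
    Closed(u32, String),
    Quic(quinn::ConnectionError),
}

// Once a close (local or via capsule) has been recorded, every
// LocallyClosed surfaced by quinn is rewritten to carry the close info.
fn map_conn_error(
    err: quinn::ConnectionError,
    close: &OnceLock<(u32, String)>,
) -> WebTransportError {
    match (&err, close.get()) {
        (quinn::ConnectionError::LocallyClosed, Some((code, reason))) => {
            WebTransportError::Closed(*code, reason.clone())
        }
        _ => WebTransportError::Quic(err),
    }
}
```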
Use OnceLock for set-once, first-writer-wins semantics between close() and the background capsule reader task. This eliminates the Mutex<Option<(u32, String)>> in favor of lock-free reads on every operation's error path.

- close() uses error.set() as the sole guard — if it wins, it owns the capsule write and connection teardown; if the background task already set a remote close error, close() returns early
- Background task symmetrically bails out if close() won the race
- Consolidated three near-identical error mappers (map_conn_error, map_session_error, map_send_datagram_error) into a single map_error(impl Into<SessionError>)
- closed() always awaits conn.closed() to ensure the capsule is delivered before returning, unlike quinn's early-return pattern
- Removed connect_send reference from background task since the OnceLock guard makes it unnecessary

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
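A small sketch of the first-writer-wins race, using a hypothetical `CloseInfo` payload in place of the crate's `SessionError`:

```rust
use std::sync::{Arc, OnceLock};

#[derive(Debug)]
enum CloseInfo {
    Local(u32, String),
    Remote(u32, String),
}

fn local_close(error: &Arc<OnceLock<CloseInfo>>, code: u32, reason: &str) {
    // set() succeeds only for the first writer: winning the race means
    // this side owns the capsule write and the QUIC teardown.
    if error.set(CloseInfo::Local(code, reason.into())).is_ok() {
        // ... write the CloseWebTransportSession capsule, then close the conn
    }
    // Otherwise the background reader already stored a Remote close:
    // return early and let that task finish the teardown.
}

fn on_any_error_path(error: &Arc<OnceLock<CloseInfo>>) {
    // Every subsequent error path reads the stored info lock-free.
    if let Some(info) = error.get() {
        println!("session closed: {info:?}");
    }
}
```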
SessionAccept used a single unfold stream for accepting connections, which only stores one waker. When multiple tasks call accept_bi() or accept_uni() concurrently, only the last caller's waker is retained and earlier callers hang forever. Add per-method waker vecs (bi_wakers, uni_wakers) to SessionAccept. On Pending, push the caller's waker; on Ready, drain and wake all stored wakers so other callers retry. Replace the ready!() macro with explicit match to ensure the waker push isn't bypassed by its hidden early return on Poll::Pending. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
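A self-contained sketch of the poll pattern this commit describes, with simplified stand-in types (`BidiStream` and `AcceptState` are illustrative, not the crate's real session internals):

```rust
use std::collections::VecDeque;
use std::task::{Context, Poll, Waker};

struct BidiStream;

struct AcceptState {
    ready: VecDeque<BidiStream>, // streams the connection has delivered
    bi_wakers: Vec<Waker>,       // every task currently parked in accept_bi()
}

impl AcceptState {
    // Explicit match instead of ready!() so the Pending arm can never be
    // skipped: the caller's waker is always registered before returning.
    fn poll_accept_bi(&mut self, cx: &mut Context<'_>) -> Poll<BidiStream> {
        match self.ready.pop_front() {
            Some(stream) => {
                // Wake all other waiters so they re-poll for the next item
                // instead of hanging on an overwritten waker.
                for waker in self.bi_wakers.drain(..) {
                    waker.wake();
                }
                Poll::Ready(stream)
            }
            None => {
                self.bi_wakers.push(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}
```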
dd0f579 to 7aee027
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
web-transport-quinn/src/session.rs (1)
516-554: ⚠️ Potential issue | 🟡 Minor — Unbounded waker accumulation may cause memory growth.
The `bi_wakers` and `uni_wakers` vectors grow without bound when multiple concurrent callers poll `accept_bi`/`accept_uni` repeatedly. Each `Poll::Pending` pushes the current waker, but the same waker can be pushed multiple times if the same task re-polls. Consider deduplicating wakers or clearing and re-registering on each poll cycle.
♻️ Suggested improvement: deduplicate wakers
```diff
 Poll::Ready(None) | Poll::Pending => {
+    // Only add if not already registered (by Waker identity)
+    if !self.bi_wakers.iter().any(|w| w.will_wake(cx.waker())) {
         self.bi_wakers.push(cx.waker().clone());
+    }
     return Poll::Pending;
 }
```

Apply the same pattern to `uni_wakers` in `poll_accept_uni`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web-transport-quinn/src/session.rs` around lines 516 - 554, bi_wakers and uni_wakers accumulate duplicate Waker entries across repeated Poll::Pending calls (causing unbounded growth); in the poll_accept_bi and poll_accept_uni paths deduplicate the current task's Waker before pushing (or clear and re-register the wakers vector each poll cycle) so the same Waker cannot be added multiple times—use Waker::will_wake eq checks to find/remove existing entries or replace the vector contents with a single registered waker per task, updating references to bi_wakers and uni_wakers in SessionAccept::new and the poll methods accordingly.
🧹 Nitpick comments (6)
web-transport-browser-tests/src/server.rs (3)
187-201: Consider documenting or reducing the 1-second sleep.
The 1-second sleep is a timing workaround. While the comment explains why it's needed, this adds latency to every close test. Consider:
- Reducing to 100-200ms if that's sufficient
- Adding a TODO to investigate a more deterministic approach
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web-transport-browser-tests/src/server.rs` around lines 187 - 201, The immediate_close_handler uses a hardcoded 1s sleep (tokio::time::sleep(Duration::from_millis(1000))). Change this to a smaller configurable delay (e.g., 100–200 ms) and add a TODO noting this is a timing workaround that should be replaced by a deterministic signal; update the sleep call inside immediate_close_handler and keep the existing session.close(...) and session.closed().await logic intact so behavior is preserved while reducing test latency.
226-268: Code duplication with the `start()` function.
`start_with_request_handler` duplicates most of `start()`. Consider extracting common setup logic into a shared helper to reduce maintenance burden.
♻️ Sketch of refactored approach
```rust
async fn start_server_internal<F, Fut>(cert: &TestCert, accept_loop: F) -> Result<TestServer>
where
    F: FnOnce(web_transport_quinn::Server, TaskHandles, oneshot::Receiver<()>) -> Fut + Send + 'static,
    Fut: Future<Output = ()> + Send,
{
    let addr: SocketAddr = "127.0.0.1:0".parse().unwrap();
    let server = web_transport_quinn::ServerBuilder::new()
        .with_addr(addr)
        .with_certificate(cert.chain.clone(), cert.key.clone_key())?;
    let actual_addr = server.local_addr()?;
    let url = format!("https://localhost:{}", actual_addr.port());
    let (shutdown_tx, shutdown_rx) = oneshot::channel();
    let handler_tasks: TaskHandles = Arc::new(Mutex::new(Vec::new()));
    let task = tokio::spawn(accept_loop(server, handler_tasks.clone(), shutdown_rx));
    Ok(TestServer {
        addr: actual_addr,
        url,
        shutdown_tx: Some(shutdown_tx),
        task: Some(task),
        handler_tasks,
    })
}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web-transport-browser-tests/src/server.rs` around lines 226 - 268, start_with_request_handler duplicates most of start(); extract the shared server initialization and TestServer construction into a helper like start_server_internal(cert: &TestCert, accept_loop: ...) -> Result<TestServer> that creates the ServerBuilder, binds the addr, builds the url, creates the oneshot shutdown channel and handler_tasks, and spawns the accept loop task; then implement start() and start_with_request_handler() by calling start_server_internal and passing different accept_loop closures (one that runs the default accept loop and one that invokes the provided RequestHandler), using the helper to construct and return the TestServer so duplicated setup code is removed.
73-83: Potential address mismatch between server bind and URL.
The server binds to `[::1]:0` (IPv6 localhost) but the URL uses `localhost`. If the browser resolves `localhost` to `127.0.0.1` (IPv4), the connection may fail. Consider binding to `127.0.0.1:0` for more predictable behavior, or use `[::1]` in the URL.
♻️ Proposed fix to align address and URL
```diff
 pub async fn start(cert: &TestCert, handler: ServerHandler) -> Result<TestServer> {
-    let addr: SocketAddr = "[::1]:0".parse().unwrap();
+    let addr: SocketAddr = "127.0.0.1:0".parse().unwrap();
     let server = web_transport_quinn::ServerBuilder::new()
         .with_addr(addr)
         .with_certificate(cert.chain.clone(), cert.key.clone_key())?;
     let actual_addr = server.local_addr()?;
     let url = format!("https://localhost:{}", actual_addr.port());
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web-transport-browser-tests/src/server.rs` around lines 73 - 83, The test server binds to IPv6 loopback "[::1]:0" in start() but always builds a URL using "localhost", which can resolve to IPv4 and cause connection failures; update start() to either bind to IPv4 loopback (use "127.0.0.1:0" for SocketAddr) so the existing url = format!("https://localhost:{}", actual_addr.port()) is reliable, or keep the IPv6 bind and build the URL from actual_addr (using actual_addr.ip() and bracketed IPv6 formatting when necessary) so the ServerBuilder::with_addr, actual_addr, and url generation are consistent.
web-transport-browser-tests/src/harness.rs (1)
19-29: Consolidate duplicated harness bootstrap logic.
`setup` and `setup_with_request_handler` duplicate cert/context/server assembly. A shared helper reduces drift risk.
♻️ Proposed refactor
```diff
+async fn build_harness(server: TestServer, cert: TestCert) -> Result<TestHarness> {
+    let context = TestContext::new().await?;
+    Ok(TestHarness { server, context, cert })
+}
+
 pub async fn setup(handler: ServerHandler) -> Result<TestHarness> {
     let cert = cert::generate();
     let server = server::start(&cert, handler).await?;
-    let context = TestContext::new().await?;
-
-    Ok(TestHarness {
-        server,
-        context,
-        cert,
-    })
+    build_harness(server, cert).await
 }
@@
 pub async fn setup_with_request_handler(handler: RequestHandler) -> Result<TestHarness> {
     let cert = cert::generate();
     let server = server::start_with_request_handler(&cert, handler).await?;
-    let context = TestContext::new().await?;
-
-    Ok(TestHarness {
-        server,
-        context,
-        cert,
-    })
+    build_harness(server, cert).await
 }
```

Also applies to: 65-75
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web-transport-browser-tests/src/harness.rs` around lines 19 - 29, The setup and setup_with_request_handler functions duplicate the same cert/context/server assembly; factor that shared logic into a single private helper (e.g., build_harness or init_harness) that returns the TestHarness components (cert, server, context) and have both setup and setup_with_request_handler call that helper, keeping only the request-specific wiring in setup_with_request_handler (reference symbols: setup, setup_with_request_handler, TestHarness, ServerHandler, TestContext, cert::generate, server::start). Ensure the helper is async, returns a Result<TestHarness> (or the raw parts if you prefer) and propagate errors unchanged.
web-transport-browser-tests/tests/concurrent.rs (1)
326-327: Sleep-based synchronization is timing-sensitive.
These fixed delays can make CI flaky. Prefer explicit server acknowledgements (like your uni "ok" signal pattern in `concurrent_accept.rs`) before closing the session.
Also applies to: 541-543
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web-transport-browser-tests/tests/concurrent.rs` around lines 326 - 327, Replace the timing-sensitive sleep+close pattern (the await new Promise(...); wt.close();) with explicit server acknowledgement handling: instead of sleeping, wait for the server’s "ok" signal or a specific confirmation message/event (the same uni "ok" pattern used in concurrent_accept.rs) before calling wt.close(); update the code locations where setTimeout is used (including the occurrences around wt.close() at the shown diff and lines ~541-543) to listen for and await that acknowledgement from the server (e.g., a read/recv or promise resolved by the "ok" frame) and then call wt.close() only after the confirmation is received.web-transport-browser-tests/src/browser.rs (1)
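A hedged server-side sketch of that acknowledgement pattern: the server writes an explicit "ok" once it has observed all expected data, so the browser client can await it and only then call `wt.close()`. Method names mirror the quinn-style API shown elsewhere in this PR, but treat them as illustrative.

```rust
async fn send_ack(session: &web_transport_quinn::Session) -> anyhow::Result<()> {
    // Open a uni stream purely as a synchronization signal.
    let mut send = session.open_uni().await?;
    send.write_all(b"ok").await?;
    send.finish()?; // FIN the stream so the client's read completes
    Ok(())
}
```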
139-152: Potential unbounded connection handling in blank page server.
The HTTP server spawns a new task for every accepted connection without any limit. While this is unlikely to be exploited in a test environment, consider adding a connection limit or using a simpler single-request-at-a-time pattern since this server only serves the keepalive page and test pages.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@web-transport-browser-tests/src/browser.rs` around lines 139 - 152, The connection loop that calls listener.accept().await and tokio::spawn for every connection (inside the outer tokio::spawn and inner tokio::spawn) is unbounded; replace it with a bounded concurrency strategy—either serialize handling (handle one request at a time in the loop) or limit concurrent handlers with a tokio::sync::Semaphore/Permit or a tokio::task::JoinSet with a max size—so that accepts acquire a permit before spawning/processing and release it when the stream handling (read/write on stream) completes; update the code around listener.accept(), the spawned stream handler, and any spawn calls to use that permit/JoinSet to cap concurrent connection tasks.
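A sketch of one bounded-accept approach using `tokio::sync::Semaphore`; the cap of 16 and the handler body are placeholders, not the test server's real values:

```rust
use std::sync::Arc;
use tokio::net::{TcpListener, TcpStream};
use tokio::sync::Semaphore;

async fn serve_bounded(listener: TcpListener) -> std::io::Result<()> {
    let permits = Arc::new(Semaphore::new(16)); // cap concurrent handlers
    loop {
        let (stream, _addr) = listener.accept().await?;
        // Acquire before spawning so accepts back-pressure once saturated.
        let permit = permits.clone().acquire_owned().await.expect("semaphore closed");
        tokio::spawn(async move {
            let _permit = permit; // held for the life of this handler
            let _ = handle(stream).await;
        });
    }
}

async fn handle(_stream: TcpStream) -> std::io::Result<()> {
    Ok(()) // serve the keepalive/test page here
}
```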
📒 Files selected for processing (23)

- Cargo.toml
- justfile
- web-transport-browser-tests/Cargo.toml
- web-transport-browser-tests/src/browser.rs
- web-transport-browser-tests/src/cert.rs
- web-transport-browser-tests/src/harness.rs
- web-transport-browser-tests/src/js.rs
- web-transport-browser-tests/src/lib.rs
- web-transport-browser-tests/src/server.rs
- web-transport-browser-tests/tests/bidi_stream.rs
- web-transport-browser-tests/tests/common/mod.rs
- web-transport-browser-tests/tests/concurrent.rs
- web-transport-browser-tests/tests/concurrent_accept.rs
- web-transport-browser-tests/tests/connection.rs
- web-transport-browser-tests/tests/datagram.rs
- web-transport-browser-tests/tests/smoke.rs
- web-transport-browser-tests/tests/stream_error.rs
- web-transport-browser-tests/tests/uni_stream.rs
- web-transport-proto/src/capsule.rs
- web-transport-quinn/Cargo.toml
- web-transport-quinn/src/recv.rs
- web-transport-quinn/src/send.rs
- web-transport-quinn/src/session.rs
```rust
impl TestContext {
    /// Create a new isolated browser context.
    pub async fn new() -> Result<Self> {
        let shared = get_browser();

        let context_id = shared
            .browser
            .lock()
            .await
            .create_browser_context(CreateBrowserContextParams::default())
            .await
            .context("failed to create browser context")?;

        // Navigate to a localhost HTTP page so the JS context has a secure
        // origin with the WebTransport API available.
        let page = shared
            .browser
            .lock()
            .await
            .new_page(
                CreateTargetParams::builder()
                    .url(&shared.page_url)
                    .browser_context_id(context_id.clone())
                    .build()
                    .unwrap(),
            )
            .await
            .context("failed to create page")?;

        Ok(Self {
            page,
            context_id: Some(context_id),
        })
    }
```
Minor: Browser context may leak if page creation fails.
If new_page() fails after create_browser_context() succeeds (line 225-237), the created browser context won't be disposed since Self is never constructed and dispose() won't be called.
🛠️ Proposed fix to clean up on partial failure
```diff
 pub async fn new() -> Result<Self> {
     let shared = get_browser();
     let context_id = shared
         .browser
         .lock()
         .await
         .create_browser_context(CreateBrowserContextParams::default())
         .await
         .context("failed to create browser context")?;
     // Navigate to a localhost HTTP page so the JS context has a secure
     // origin with the WebTransport API available.
-    let page = shared
+    let page = match shared
         .browser
         .lock()
         .await
         .new_page(
             CreateTargetParams::builder()
                 .url(&shared.page_url)
                 .browser_context_id(context_id.clone())
                 .build()
                 .unwrap(),
         )
         .await
-        .context("failed to create page")?;
+    {
+        Ok(page) => page,
+        Err(e) => {
+            // Clean up the browser context we just created
+            let _ = shared.browser.lock().await.dispose_browser_context(context_id).await;
+            return Err(e).context("failed to create page");
+        }
+    };
     Ok(Self {
         page,
         context_id: Some(context_id),
     })
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web-transport-browser-tests/src/browser.rs` around lines 210 - 243, The
TestContext::new path can leak a browser context if create_browser_context
succeeds but new_page fails; after create_browser_context (context_id) and
before returning, ensure you clean up on error by deleting the created context
(call the existing dispose/cleanup method or invoke the browser's
delete_browser_context / equivalent) inside the error branch for new_page;
update TestContext::new to catch failures from new_page, delete the created
context_id (using the same shared.browser.lock().await API used to create it),
and then propagate the error so no context is left orphaned.
```rust
tokio::spawn(async move {
    let buf = recv.read_to_end(1024 * 1024).await.expect("echo: read_to_end failed");
    send.write_all(&buf).await.expect("echo: write_all failed");
    send.finish().expect("echo: finish failed");
});
}
stream = session.accept_uni() => {
    let Ok(mut recv) = stream else { break };
    tokio::spawn(async move {
        let data = recv.read_to_end(1024 * 1024).await.expect("read_to_end failed");
        assert_eq!(
            String::from_utf8_lossy(&data),
            "uni",
            "server should receive uni stream data"
        );
    });
```
🧩 Analysis chain
🏁 Scripts executed:

```bash
#!/bin/bash
cd web-transport-browser-tests
rg -n -B5 -A10 'multiple_streams_mixed_types' tests/concurrent.rs | head -100

# Get the exact code at lines 257-272
sed -n '240,290p' tests/concurrent.rs

# Check if JoinSet is used anywhere in the file
rg -n 'JoinSet|join_next' tests/concurrent.rs

# Verify the full test function context
sed -n '200,350p' tests/concurrent.rs
```
Detached spawned task failures can be lost, causing false-positive test passes.
The server-side spawned tasks at lines 257 and 264 are not awaited; assertion failures inside them (e.g., the assert_eq! at line 270) may fail silently without failing the test. The test only validates the JavaScript client side (line 316), not server-side assertions, and relies on a fragile 200ms fixed sleep for synchronization.
🔧 Proposed fix
```diff
 let handler: ServerHandler = Box::new(|session| {
     Box::pin(async move {
+        let mut tasks = tokio::task::JoinSet::new();
         loop {
             tokio::select! {
                 stream = session.accept_bi() => {
                     let Ok((mut send, mut recv)) = stream else { break };
-                    tokio::spawn(async move {
+                    tasks.spawn(async move {
                         let buf = recv.read_to_end(1024 * 1024).await.expect("echo: read_to_end failed");
                         send.write_all(&buf).await.expect("echo: write_all failed");
                         send.finish().expect("echo: finish failed");
                     });
                 }
                 stream = session.accept_uni() => {
                     let Ok(mut recv) = stream else { break };
-                    tokio::spawn(async move {
+                    tasks.spawn(async move {
                         let data = recv.read_to_end(1024 * 1024).await.expect("read_to_end failed");
                         assert_eq!(
                             String::from_utf8_lossy(&data),
                             "uni",
                             "server should receive uni stream data"
                         );
                     });
                 }
                 datagram = session.read_datagram() => {
                     let Ok(data) = datagram else { break };
                     session.send_datagram(data).expect("echo: send_datagram failed");
                 }
             }
         }
+
+        while let Some(result) = tasks.join_next().await {
+            if let Err(e) = result {
+                if e.is_panic() {
+                    std::panic::resume_unwind(e.into_panic());
+                }
+            }
+        }
     })
 });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web-transport-browser-tests/tests/concurrent.rs` around lines 257 - 272, The
spawned server tasks (the tokio::spawn closures that call recv.read_to_end,
send.write_all, send.finish and the one containing assert_eq!) must not be
detached — collect their JoinHandles (or use a tokio::task::JoinSet) and await
them before finishing the test so panics/assertion failures propagate and fail
the test; modify the accept loop where you call tokio::spawn for the bidi and
uni handlers to push each JoinHandle into a vector (or spawn into a JoinSet) and
after client-side work, await/join all handles instead of relying on a fixed
sleep, ensuring any errors from read_to_end/write_all/finish or the assert_eq
are returned/propagated.
```rust
Err(ReadError::SessionError(SessionError::ConnectionError(
    quinn::ConnectionError::LocallyClosed,
))) => {}
other => panic!(
    "expected ReadError::SessionError(ConnectionError::LocallyClosed), got {other:?}"
),
}
```
🧩 Analysis chain
🏁 Scripts executed:

```bash
#!/bin/bash
# First, get context around the specified line ranges
echo "=== Lines 988-994 context ==="
sed -n '980,1000p' web-transport-browser-tests/tests/stream_error.rs | cat -n
echo "=== Lines 1032-1037 context ==="
sed -n '1025,1045p' web-transport-browser-tests/tests/stream_error.rs | cat -n

# Check error type definitions and the WebTransportError enum
echo "=== Searching for WebTransportError definition ==="
rg -n 'enum WebTransportError|struct WebTransportError' --type rust -A 10
echo "=== Searching for SessionError definition ==="
rg -n 'enum SessionError|struct SessionError' --type rust -A 10
echo "=== Searching for close-related error handling patterns ==="
rg -n 'WebTransportError::Closed|ConnectionError::LocallyClosed' --type rust

# Check the test function names to understand context
echo "=== Test function definitions ==="
rg -n 'fn server_read_on_stream_after_session_close|fn server_write_on_stream_after_session_close' web-transport-browser-tests/tests/stream_error.rs -A 2
```
Assertions must expect WebTransportError::Closed to test close-code propagation.
Both tests call session.close(7, b"done") but assert ConnectionError::LocallyClosed, missing the close code entirely. Per the error handling in session.rs:108, a provided close code should yield WebTransportError::Closed(code, reason). These assertions break the test contract for this PR's close-capsule feature and diverge from the pattern used elsewhere in the test suite.
Replace both assertions:
Fix
```diff
-Err(ReadError::SessionError(SessionError::ConnectionError(
-    quinn::ConnectionError::LocallyClosed,
-))) => {}
+Err(ReadError::SessionError(SessionError::WebTransportError(
+    WebTransportError::Closed(7, _),
+))) => {}
```

(Same pattern at lines 1032–1037 for WriteError.)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web-transport-browser-tests/tests/stream_error.rs` around lines 988 - 994,
The test currently expects
ReadError::SessionError(ConnectionError::LocallyClosed) after calling
session.close(7, b"done") but should assert that the error reflects the provided
close code and reason; update the assertions to match
WebTransportError::Closed(7, b"done".to_vec() or equivalent) instead of
ConnectionError::LocallyClosed for both the ReadError branch (matching
ReadError::SessionError(SessionError::WebTransportError::Closed(...))) and the
analogous WriteError assertion (lines referenced for WriteError), ensuring the
pattern matches other tests that assert WebTransportError::Closed(code, reason)
so the close-code propagation is properly validated.
```rust
let len = VarInt::read(&mut self.stream)
    .await
    .map_err(|_| CapsuleError::UnexpectedEnd)?
    .into_inner() as usize;

if len > MAX_FRAME_SIZE as usize {
    return Err(CapsuleError::MessageTooLong);
}
```
Check MAX_FRAME_SIZE before casting u64 to usize to avoid truncation on 32-bit systems.
The length is cast to usize on line 289 before the size validation on line 291. On 32-bit platforms, a malicious or malformed frame could declare a length > 4GB, which would silently truncate when cast to usize, potentially passing the MAX_FRAME_SIZE check with an incorrect value.
🔧 Proposed fix
```diff
-let len = VarInt::read(&mut self.stream)
+let len_u64 = VarInt::read(&mut self.stream)
     .await
     .map_err(|_| CapsuleError::UnexpectedEnd)?
-    .into_inner() as usize;
+    .into_inner();

-if len > MAX_FRAME_SIZE as usize {
+if len_u64 > MAX_FRAME_SIZE {
     return Err(CapsuleError::MessageTooLong);
 }
+
+let len = len_u64 as usize;
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@web-transport-proto/src/capsule.rs` around lines 286 - 293, Read the VarInt
into a u64 and perform the MAX_FRAME_SIZE comparison while still a u64 (e.g. let
len_u64 = VarInt::read(...).into_inner()), then return
CapsuleError::MessageTooLong if len_u64 > MAX_FRAME_SIZE as u64; only after that
safely convert len_u64 to usize (using TryFrom/usize::try_from or a checked
cast) and return an error if the conversion would overflow usize. Update the
logic around VarInt::read, the len variable, and the MAX_FRAME_SIZE check so the
cast to usize happens after bounds validation to avoid truncation on 32-bit
targets.
Could you split the browser test stuff out of this PR?

I'm doing this now.

Sorry, missed this yesterday; thanks for doing the split! I'm going to close this as it's superseded by #179.
Summary
This is a follow-up to #175.
Fixes browser compatibility issues identified by the test suite in #175. Browsers expect the server to send a `CloseWebTransportSession` capsule (type `0x2843`) inside an HTTP/3 DATA frame on the CONNECT stream per RFC 9297. Without this, Chrome's `WebTransport.closed` promise never resolves with the correct close code and reason, causing the browser tests to fail.

This PR adds capsule encoding/decoding support in the proto crate, reworks session close to write the capsule before tearing down the QUIC connection, and ensures all error paths consistently surface close info (code, reason) instead of raw Quinn errors.
Key changes

- `web-transport-proto/src/capsule.rs`: Add `CloseWebTransportSession` capsule encode/decode and `Http3CapsuleReader` for capsules split across multiple DATA frames (a simplified sketch follows after this list)
- `web-transport-quinn/src/session.rs`: `close()` now writes the capsule in a DATA frame, FINs the CONNECT stream, waits for the peer to close, then force-closes; a background task reads peer capsules and stores close info via `Arc<OnceLock<SessionError>>` with first-writer-wins semantics
- `recv.rs`, `send.rs`: Thread close capsule info into `RecvStream` and `SendStream` so `LocallyClosed` errors map to `WebTransportError::Closed(code, reason)` consistently across stream reads/writes, `accept()`, and datagram sends

Depends on #175.
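To make the split-capsule handling concrete, here is a simplified, hypothetical accumulator over raw DATA-frame payloads; the real `Http3CapsuleReader` works over the crate's stream types rather than an in-memory buffer:

```rust
struct CapsuleAccumulator {
    buf: Vec<u8>,
}

impl CapsuleAccumulator {
    /// Feed one DATA frame's payload; returns a complete (type, value)
    /// capsule once enough bytes have arrived, else None to await more.
    fn push_data_frame(&mut self, payload: &[u8]) -> Option<(u64, Vec<u8>)> {
        self.buf.extend_from_slice(payload);
        let (typ, n1) = read_varint(&self.buf)?;
        let (len, n2) = read_varint(&self.buf[n1..])?;
        let start = n1 + n2;
        // Check bounds while still wide enough to avoid 32-bit truncation.
        let end = start.checked_add(usize::try_from(len).ok()?)?;
        if self.buf.len() < end {
            return None; // capsule continues in a later DATA frame
        }
        let value = self.buf[start..end].to_vec();
        self.buf.drain(..end);
        Some((typ, value))
    }
}

/// Decode a QUIC varint (RFC 9000 §16); returns (value, bytes consumed).
fn read_varint(b: &[u8]) -> Option<(u64, usize)> {
    let first = *b.first()?;
    let len = 1usize << (first >> 6); // 2-bit prefix selects 1/2/4/8 bytes
    if b.len() < len {
        return None;
    }
    let mut v = (first & 0x3f) as u64;
    for byte in &b[1..len] {
        v = (v << 8) | *byte as u64;
    }
    Some((v, len))
}
```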
Test plan

- `cargo test -p web-transport-browser-tests --test connection` to verify browser close handshake
- `cargo test -p web-transport-browser-tests --test stream_error` to verify stream error propagation
- `cargo test -p web-transport-proto` to verify capsule encode/decode

🤖 Generated with Claude Code