Add socketioxide (Rust Socket.io) as a comparison target#1
Open
irinanazarova wants to merge 10 commits into
Library author asked on X if we'd benchmark their crate alongside Node Socket.io, uWS, and AnyCable. socketioxide speaks the Socket.io wire protocol, so the bench-runner's existing socket.io-client driver works against it unchanged. The work is server-side.

New `socketioxide/` directory carries the Rust Cargo project, the Dockerfile, and a railway.toml. The server mirrors the shape of backend/src/socketio/server.ts: /health, /stats, /_broadcast, /publish-local, plus a Connection State Recovery toggle via SOCKETIO_CSR=1 (FIXME'd in main.rs because the crate's CSR API has shifted across versions).

Manifest entries land under each rubric (latency 1K/10K, jitter, idle, avalanche escalation), targeting the existing bench-jitter-socketio, bench-idle-socketio, and bench-avalanche-socketio endpoints via ?serverUrl=. Baselines are empty until first run.

docs/socketioxide-comparison.md tracks status, open questions for the library author, and the eventual results. README links to it under 'Additional target on request'.

Build + tests green. Rust code is unverified by compile from this end; the GitHub issue will tag the library author for review.
Seven advisories rolled into one high-severity npm-audit row in undici 8.0.0-8.4.1 (cert validation bypass in SOCKS5 ProxyAgent, header injection via Set-Cookie, WS DoS via fragment-count and cumulative-fragment bypasses, HTTP response queue poisoning, Set-Cookie SameSite attribute downgrade, cross-user info disclosure via shared cache whitespace bypass). undici is driver-side (used by the long-timeout Agent in bench scripts), not bench-runner-side. Fix re-resolves within the existing ^8.2.0 caret range; no package.json change.

npm audit after the fix: found 0 vulnerabilities
Maintainer + API check turned up two scaffold bugs:

1. Wrong version pin. Cargo.toml had socketioxide = "0.16"; latest stable is 0.18.3 (Apr 2026, same author actively shipping engineio hardening fixes on the day this branch landed). Bumped to ^0.18, features ['v4', 'tracing'] for the Socket.io v4 wire protocol and structured logs.
2. The 'state-recovery' feature I specified does not exist. socketioxide 0.18.3 feature list is v4 / msgpack / tracing / extensions / state / __test_harness. The 'state' feature is for application-shared state (with_state), not session resume. CSR is not documented in the README, examples, or feature flags.

Dropped:
- socketioxideCsr TARGETS entry
- All -csr manifest entries (latency-csr, jitter-csr)
- The two FIXME blocks in main.rs that pretended to enable CSR
- The recovered counter and /stats-csr endpoint
- The SOCKETIO_CSR env var

What's left is honest at-most-once socketioxide: tested with the same disruption shape as default Socket.io and uWS. The architectural prediction is that it lands in the at-most-once band (~85% delivery under jitter, in-process WS dies with the app on deploy), which would confirm the page's claim that those properties are about deployment topology rather than runtime language.

CSR is now an open question to the library author in docs/socketioxide-comparison.md (does the crate ship CSR? Is it on the roadmap?). If yes, we add the variant back. If no, the comparison is what it is.
…yCable
The scaffold now compiles and runs. Fixing it against the released crate
turned up three API mismatches from my first guess:
- Connect handlers must be async (io.ns('/', on_connect) where
on_connect is an async fn), not sync closures.
- State (the connection counter) goes through .with_state(Arc<AtomicU64>)
+ the State extractor, gated behind the 'state' feature flag, which
I'd left out. Handlers read it as IoState<ConnCounter>.
- emit takes &data and is .await-ed; room handlers use Data<T>
extractors, not positional Value.
Pinned 0.18 resolves to 0.18.4; release build is clean.
First real numbers, local head-to-head with anycable-go 1.6.14 as a
same-window control (its normal shape confirms the environment was
sound). 200 clients, per-message HTTP publish for both:
Latency (jitter off): socketioxide 100% / p99 18ms,
AnyCable 100% / p99 22ms. Comparable.
Jitter (TCP drop /15s): socketioxide 91.6% delivered (at-most-once,
no replay, fast on what it sends),
AnyCable 100% (replay, multi-second tail).
socketioxide lands in the at-most-once band with default Socket.io and
uWS: the delivery gap is the protocol (replay vs none), not the runtime
language. Confirms the page's architectural claim across a fourth impl.
Results table + reproducer in docs/socketioxide-comparison.md; raw JSON
force-added at backend/results/socketioxide-local-2026-06-23.json.
Comparison page untouched, as requested. Railway-scale rows (10K, idle
1M, avalanche) still pending a deploy; FILTER=socketioxide,anycable
runs the new rows with AnyCable as the canary.
Three fixes found deploying to Railway:
- Base image rust:1.83 was below socketioxide 0.18.4's MSRV (1.94) and too old for a transitive dep needing edition2024. Use rust:1-slim.
- Commit Cargo.lock + build --locked so the image uses the exact deps resolved and tested locally (dropped the strip step; binutils absent in slim).
- Bind [::] (IPv6 dual-stack), not 0.0.0.0. Railway's private network (*.railway.internal) routes over IPv6; 0.0.0.0 is unreachable internally.

Verified: 20/20 clients connect via the bench-runner, 100% delivery, 42ms p99. PORT is pinned to 3000 on the service to match the manifest target.
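The bind fix can be illustrated with a minimal std-only sketch. It is not the code in main.rs: the helper name `listen_addr` and the fallback to 3000 when PORT is missing or malformed are assumptions made for illustration. Whether a `[::]` listener also accepts IPv4 depends on the platform's IPV6_V6ONLY default (dual-stack is the usual default on Linux).

```rust
use std::net::{Ipv6Addr, SocketAddr};

/// Build the listen address from an optional PORT value.
/// Hypothetical helper: falls back to 3000 when PORT is unset or is not
/// a valid u16 (an assumption for this sketch, not the repo's behaviour).
fn listen_addr(port_env: Option<&str>) -> SocketAddr {
    let port = port_env
        .and_then(|p| p.parse::<u16>().ok())
        .unwrap_or(3000);
    // `[::]` is the IPv6 unspecified address. Binding `0.0.0.0` instead
    // would listen on IPv4 only, which an IPv6-only private network
    // cannot reach.
    SocketAddr::from((Ipv6Addr::UNSPECIFIED, port))
}

fn main() {
    let port = std::env::var("PORT").ok();
    let addr = listen_addr(port.as_deref());
    // A real server would pass `addr` to its listener; this sketch only
    // prints it so it runs without claiming a port.
    println!("would bind {addr}");
}
```

Keeping the address construction in a pure function makes the IPv6 choice easy to check without opening a socket.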
Phase 1 on real infra. Deployed socketioxide-server to the bench project, woke anycable-go OSS as the same-window canary, drove both from the Railway bench-runner over the internal network.

Latency (jitter off): comparable to AnyCable at 1K and 10K, both 100% delivery (socketioxide 289/972ms p50/p99 at 10K vs AnyCable 232/731ms).

Jitter delivery, bracketed across scale:
- 200 local: 91.6%
- 1K Railway: 89.4%
- 10K Railway: 40.6% then 32.7%

socketioxide is at-most-once: it sits in the band with default Socket.io (~85%) and uWS (~87%) up to 1K, then collapses under the 10K reconnect storm. Two independent 10K runs (41%, 33%) confirm it; not a crash (0 connect failures, 10K/10K connect), not Railway noise (AnyCable held 100% in the same windows). The Rust runtime does not rescue the in-process at-most-once architecture at scale; AnyCable holds 100% because the WS layer is a separate process that the deploy/storm never restarts, and replay recovers the offline-window gap.

Comparison page untouched. Idle 1M + avalanche deferred to phase 2 (needs the 50-shard fleet). All phase-1 services torn back down to offline after the run. Report: docs/socketioxide-comparison.md. Raw: backend/results/.
Idle test was harness-limited at ~600K (both targets hit the identical ~600,090 ceiling because the bench-runner shards capped ~12K clients each, not because either server saturated; both sized 32GB peaked at ~21-22GB). At 600K held: socketioxide ~37 KB/conn (1.8% CPU), anycable-go ~39 KB/conn (9% CPU). Comparable per-connection memory, both well under Node Socket.io's ~52 KB. The notable finding: socketioxide held 600K+, ~5x past Node Socket.io's ~120K single-event-loop ceiling. tokio's multi-threading clears the wall that caps Node. So Rust fixes Socket.io's capacity limit (runtime concurrency) but not its at-most-once delivery limit (protocol). Set the idle-socketioxide targetServiceId in the manifest for metrics. Avalanche rows still running.
Avalanche escalation on Railway: 5K recovers 100% in 2.9s, 10K is 96% in 67s (411 never back), 20K collapses to 0% recovered. Tracks Node Socket.io's cliff almost exactly (Socket.io 10K ~65s/96%). The in-process WS layer dies with the app deploy regardless of runtime language; the reconnect storm overwhelms the restarted single instance at scale. AnyCable is 0s by construction (separate process). Raw under backend/results/railway-phase2/.
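The 10K row's percentage follows from its "never back" count. A small sketch of that arithmetic, assuming the run used exactly 10,000 clients (the commit says "10K") and using a hypothetical helper name:

```rust
/// Share of clients that reconnected after the restart, as a percentage.
/// Hypothetical helper for illustration; returns 0.0 for an empty run.
fn recovered_pct(total: u64, never_back: u64) -> f64 {
    if total == 0 {
        return 0.0;
    }
    // saturating_sub guards against a never_back count larger than total.
    let recovered = total.saturating_sub(never_back);
    recovered as f64 / total as f64 * 100.0
}

fn main() {
    // Assumed inputs: 10,000 clients, 411 of which never reconnected.
    let pct = recovered_pct(10_000, 411);
    println!("10K row: {pct:.2}% recovered");
}
```

With those inputs the result is about 95.9%, which rounds to the 96% reported for the 10K row.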
Traced the 600K idle cap to a hard ~12,002 connections/shard (ephemeral-port exhaustion to one host:port from one source IP; not memory, containers report nofile=122880). Grew the fleet 50 -> 85 shards (~1.02M theoretical) and re-ran: 49 shards delivered a clean 12,000 each (588,000, 0 failures), 36 errored/timed out under the coordinator fan-out. Harness-limited, not server-limited; socketioxide accepted every connection the surviving shards threw with memory headroom.

Established: socketioxide holds at least ~600K idle socket.io connections on a 32GB box, ~5x past Node Socket.io's ~120K event-loop ceiling, RAM/conn comparable to AnyCable. True ceiling unmeasured (needs more source IPs / a lighter idle client than socket.io-client).

Phase-2 teardown: all 87 services stopped (85 shards + anycable-go + socketioxide-server), both 32GB targets downsized to 0.5GB/1vCPU, verified offline. Comparison page untouched throughout.
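The fleet sizing above is plain multiplication against the per-shard cap. A minimal sketch, using the rounded 12,000 connections/shard figure from this commit; the constant and function names are hypothetical and not part of the bench-runner:

```rust
/// Per-shard connection cap observed in this run (rounded from ~12,002).
const CONNS_PER_SHARD: u64 = 12_000;

/// Theoretical connection capacity of a fleet with `shards` shards.
fn fleet_capacity(shards: u64) -> u64 {
    shards * CONNS_PER_SHARD
}

/// Minimum number of shards needed to reach `target` connections.
fn shards_needed(target: u64) -> u64 {
    target.div_ceil(CONNS_PER_SHARD)
}

fn main() {
    println!("50 shards: {}", fleet_capacity(50)); // original fleet
    println!("85 shards: {}", fleet_capacity(85)); // grown fleet
    println!("49 shards: {}", fleet_capacity(49)); // shards that completed
    println!("shards for 1M: {}", shards_needed(1_000_000));
}
```

Under that cap, 85 shards give 1,020,000 theoretical connections and the 49 surviving shards account for the 588,000 held, so reaching 1M idle needs at least 84 shards to complete rather than a larger server.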
Promote the socketioxide head-to-head from a one-line pointer to a full results section in the README: latency, jitter, avalanche, and idle tables vs AnyCable, with the takeaway that Rust fixes Socket.io's capacity ceiling but not its at-most-once delivery or in-process deploy fragility. Deep dive stays in docs/socketioxide-comparison.md.
socketioxide's author asked on X if we'd benchmark their Rust Socket.io server alongside the existing targets. This adds it and reports a full head-to-head with AnyCable on the same Railway hardware.
What's here
- `socketioxide/`: a Rust bench server (Axum + socketioxide 0.18.4) that speaks the Socket.io wire protocol, so the existing `socket.io-client` bench-runner drives it unchanged. Dockerfile, `Cargo.lock`, Railway config included.
- `tests-manifest.ts` entries for socketioxide latency / jitter / idle / avalanche.
- Report: `docs/socketioxide-comparison.md`.
- Raw results: `backend/results/`.
- `npm audit fix` (undici 8.2.0 → 8.5.0).

No new bench-runner endpoints: socketioxide reuses the Socket.io ones via `?serverUrl=`.

Results (Railway, head-to-head with AnyCable)
The takeaway: Rust fixes Socket.io's capacity ceiling (the single-event-loop wall), but not its at-most-once delivery or in-process deploy fragility. Those are protocol and topology properties, so socketioxide collapses under jitter and deploy storms at scale the same way Node Socket.io does.
Notes
`[::]` bind (Railway private net is IPv6), pinned `PORT=3000`.