A high-performance, secure, and protocol-aware WebSocket gateway designed to handle thousands of concurrent connections, whether MQTT-over-WebSocket or raw WebSocket, in a unified, observable, and production-ready manner.
Modern real-time systems demand scalable gateways to ingest connections from heterogeneous clients: robots, IoT devices, mobile apps, telemetry dashboards, and more. These clients often use:
- MQTT-over-WebSocket (for sensor data, telemetry, robotics)
- Raw WebSocket (for JSON RPC, custom protocols, command channels)
While WebSocket is a great transport for bidirectional communication, many WebSocket backends (like RabbitMQ, EMQX, or internal services) do not provide native authentication or authorization. This project introduces a secure middle-layer proxy that provides:
- JWT-based authentication on every incoming connection
- Protocol detection and dispatching
- Observability and metrics
- Deployment simplicity behind ingress gateways like Traefik or NGINX
Why JWT?
- Because it's interoperable with any OIDC-compliant identity provider (like Keycloak, Auth0, AWS Cognito, Azure AD)
- JWTs can be embedded in MQTT CONNECT packets, used as initial messages, sent in headers, or attached as an HTTP-only session cookie
- This allows the gateway to be a proper security enforcement boundary, even when the backend lacks native identity controls
This project is that gateway.
Written in Go for performance and concurrency, it uses:
- `gorilla/websocket` for low-level WebSocket control
- `golang-jwt/jwt/v5` for standards-compliant JWT verification
- Prometheus metrics for real-time observability
- ⚡ Minimal latency
- 🧵 Two goroutines per connection (lightweight concurrency, no OS thread per connection)
- 🔐 JWT-secured access
- 🔁 Full-duplex binary & text stream support
- 🧩 Multi-protocol support (MQTT/WS, raw WS, future protocols)
- 📊 Observability-first with labeled Prometheus metrics
- 🚦 Graceful shutdown for container orchestration
- ☁️ Cloud-native by default (Kubernetes, Traefik, Docker)
This shows how a client connects directly to the go-ws-gateway-proxy, which validates JWTs and forwards traffic to the backend.
This shows how the proxy can sit behind a public L4/L7 ingress (e.g., Traefik) and route traffic securely within Kubernetes or a VM-based infrastructure.
Single endpoint (`/ws`) supports both:
- MQTT-over-WebSocket backends:
  - RabbitMQ w/ MQTT-WS plugin
  - EMQX
  - Mosquitto with WebSocket bridge
- Raw WebSocket clients (e.g., control channels, JSON-RPC, custom frames)
The gateway uses a lightweight parser to autodetect the protocol:
```go
protocol, username, token, err := ParseConnect(packet)
```

- JWTs extracted from MQTT password, raw WS payload, headers, or HTTP-only session cookie
- Validated using JWKS from any OIDC-compliant provider
- Claims parsed: `sub`, `preferred_username`, `iat`, `exp`, `jti`
This creates a security envelope on top of the WebSocket protocol, even when the downstream WebSocket service lacks native identity enforcement.
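The shape of that autodetection can be sketched as follows. `detectProtocol` is a simplified, hypothetical stand-in for the gateway's `ParseConnect`: it keys off the MQTT CONNECT control byte (`0x10`), the variable-length "remaining length" field, and the length-prefixed protocol name `MQTT` used by MQTT 3.1.1/5.0.

```go
package main

import (
	"bytes"
	"fmt"
)

// detectProtocol inspects the first WebSocket message to decide routing.
// An MQTT CONNECT packet starts with control byte 0x10, a 1-4 byte
// "remaining length" varint, then the protocol name "MQTT" as a
// length-prefixed UTF-8 string. Anything else is treated as raw WS.
func detectProtocol(p []byte) string {
	if len(p) < 2 || p[0] != 0x10 {
		return "raw"
	}
	i := 1
	for ; i < len(p) && i <= 4; i++ { // skip the remaining-length varint
		if p[i]&0x80 == 0 {
			i++
			break
		}
	}
	name := []byte{0x00, 0x04, 'M', 'Q', 'T', 'T'}
	if i+len(name) <= len(p) && bytes.Equal(p[i:i+len(name)], name) {
		return "mqtt"
	}
	return "raw"
}

func main() {
	connect := []byte{0x10, 0x0c, 0x00, 0x04, 'M', 'Q', 'T', 'T', 0x04, 0x02, 0x00, 0x3c}
	fmt.Println(detectProtocol(connect))                 // mqtt
	fmt.Println(detectProtocol([]byte(`{"op":"ping"}`))) // raw
}
```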
The gateway supports token caching to avoid repeatedly verifying JWT signatures and to prepare for features like token revocation and replay detection.
- Supported cache backends:
  - `memory` (default, per-pod, ephemeral)
  - `redis` (shared across replicas)
  - `postgres` (persistent, optionally shared)
- Configuration:
  - `JWT_CACHE_STORE`: `memory` | `redis` | `postgres`
  - `JWT_CACHE_PREFIX`: Optional prefix to namespace cache keys (default: `jwt_cached_token_`)
  - `REDIS_URL`: Required when using the Redis backend
  - `POSTGRES_DSN`: Required when using the Postgres backend
- Behavior:
  - Valid JWTs (based on `sub`, `jti`, `exp`) are cached
  - Prevents repeated re-verification of the same token
  - Prepares groundwork for support of:
    - Token revocation lists (via `jti`)
    - Replay detection across IPs/devices
    - Real-time token invalidation
Fallback to in-memory cache occurs automatically if misconfiguration is detected or external stores are unavailable.
The gateway now includes early-stage support for distributed token revocation and eviction. This is useful for:
- Forcing logout or session termination across connected clients
- Coordinating global revocation across pods using Redis Pub/Sub
- Clearing `jti` entries from the in-memory or external token cache (Redis/Postgres)

`POST /admin/revoke`: Revoke a specific token using its `sub` and `jti` claims.

- Revocation removes the token from the active cache backend (in-memory, Redis, or Postgres).
- Revocation also publishes a "poison pill" via Redis to notify all proxy pods.
- Any active WebSocket connection using that token will be closed with a `1008` close frame.
- Replay protection is still best-effort: only sessions that respect `sub:jti` cache keys are affected.
- Clients using the same token across connections may still connect briefly before the revocation is received.
- A background job or expiration TTL is still needed for Postgres cache cleanup (see TODO).
- Intended to run behind a service mesh, reverse proxy, or internal firewall; the route itself carries no authentication.
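The in-pod half of the poison-pill flow might look like the following sketch: each session registers a channel under its token's `jti`, and a revocation (issued locally via `/admin/revoke` or relayed from another pod over Redis Pub/Sub) closes those channels so each read loop can send its `1008` close frame. The `sessionRegistry` type and its methods are illustrative.

```go
package main

import (
	"fmt"
	"sync"
)

// sessionRegistry tracks live sessions by jti so a revocation can fan out
// to every connection using that token. (Sketch; names are illustrative.)
type sessionRegistry struct {
	mu    sync.Mutex
	byJTI map[string][]chan struct{}
}

func newSessionRegistry() *sessionRegistry {
	return &sessionRegistry{byJTI: make(map[string][]chan struct{})}
}

// Register returns a channel that is closed when the jti is revoked.
func (r *sessionRegistry) Register(jti string) <-chan struct{} {
	ch := make(chan struct{})
	r.mu.Lock()
	r.byJTI[jti] = append(r.byJTI[jti], ch)
	r.mu.Unlock()
	return ch
}

// Revoke closes all sessions bound to jti and reports how many were hit.
func (r *sessionRegistry) Revoke(jti string) int {
	r.mu.Lock()
	chans := r.byJTI[jti]
	delete(r.byJTI, jti)
	r.mu.Unlock()
	for _, ch := range chans {
		close(ch)
	}
	return len(chans)
}

func main() {
	reg := newSessionRegistry()
	done := reg.Register("abcde-uuid-123")
	fmt.Println(reg.Revoke("abcde-uuid-123")) // 1
	<-done // closed: the session loop would now send its 1008 frame
}
```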
```shell
curl -X POST http://<gateway-host>/admin/revoke \
  -H "Content-Type: application/json" \
  -d '{ "sub": "user123", "jti": "abcde-uuid-123" }'
```

- Zero-copy WebSocket forwarding
- No payload parsing or transformation
- Binary or text preserved
- Keeps native framing (e.g., MQTT control packets)
The gateway supports optional message-level rate limiting on all active WebSocket connections. This protects against abusive clients, ensures system fairness, and enforces resource boundaries.
- Configuration:
  - `RATE_LIMIT_PER_SECOND`: Maximum sustained message rate (refill rate of the token bucket)
  - `RATE_LIMIT_BURST`: Maximum burst capacity (number of messages allowed in a short burst)
- Behavior:
  - Enforced using Go's token bucket implementation (`golang.org/x/time/rate`)
  - On violation, the gateway sends a standards-compliant control frame: WebSocket Close Code `1008` (Policy Violation), Reason: `Rate limit exceeded`
  - The connection is then closed gracefully
- Identity-aware enforcement:
  - For authenticated clients, rate limits are enforced per JWT subject (`sub`)
  - For anonymous clients, limiting falls back to the client IP address
  - Ensures shared rate control across all concurrent sessions from the same user or client
- Distributed enforcement via Redis:
  - When `REDIS_URL` is set, rate limits are enforced across instances using a Redis-backed counter (with 1-second fixed windows)
  - Lua scripting ensures atomicity and correct TTL behavior under load
  - Falls back to in-memory limits if Redis is unavailable
| Metric | Description |
|---|---|
| `ws_connections_total` | Total number of WebSocket connections attempted |
| `ws_auth_failures_total` | Count of invalid, expired, or malformed JWTs |
| `ws_auth_success_total{sub,preferred_username}` | Successful JWT authentications labeled by subject and username |
| `ws_active_sessions` | Currently active WebSocket proxy sessions |
| `ws_connection_duration_seconds` | Histogram of WebSocket session lifetimes (seconds) |
| `ws_revocations_total` | Total number of manually revoked sessions |
| `ws_protocol_connections_total{protocol}` | Total connection count by protocol type (mqtt, raw, etc.) |
| `ws_proxy_errors_total{protocol}` | Total proxy-level failures during streaming |
| `ws_proxy_retries_total{protocol}` | Count of upstream reconnect attempts during proxying |
| `ws_circuit_open_total{protocol}` | Count of circuit breaker open events per protocol |
| `ws_upstream_health_success_total` | Successful upstream health check responses |
| `ws_upstream_health_failure_total` | Failed upstream health check responses |
| `ws_rate_limit_violations_total` | Number of rate limit violations |
- TLS termination handled by ingress
- Supports `/healthz` endpoint for liveness checks
- Clean shutdown on `SIGTERM`
- Configurable timeouts (e.g. `WS_IDLE_TIMEOUT_SECONDS`)
- Future support for GraphQL-over-WS, STOMP, or custom protocol adapters
- Upstreams configurable by protocol (env-driven)
```
.
├── Dockerfile
├── .env.example
├── scripts/
│   └── docker_run.sh
├── cmd/
│   └── main.go       # Entry point
├── internal/
│   ├── mqtt/         # Protocol detection & parsing
│   ├── auth/         # JWT verification via JWKS
│   └── proxy/        # Transparent duplex forwarding
├── k8s/              # Kubernetes manifest files
└── README.md
```

Copy the `.env.example` file and customize it:

```shell
cp .env.example .env
```

Set values like:

```shell
JWKS_URL=https://keycloak.example.com/realms/myrealm/protocol/openid-connect/certs
WS_IDLE_TIMEOUT_SECONDS=90
RAW_WS_BACKEND_URL=ws://backend-service:8081/raw
```

First time only:

```shell
chmod +x scripts/docker_run.sh
```

Use the provided script:

```shell
./scripts/docker_run.sh
```

Which executes:

```shell
docker run --rm \
  -p 8080:8080 \
  --env-file .env \
  go-ws-gateway-proxy
```

Note: The container looks for `/etc/ws-gw/routes.yaml`.
When running standalone Docker, mount your own:

```shell
docker run -p 8080:8080 \
  -v $(pwd)/routes.yaml:/etc/ws-gw/routes.yaml:ro \
  --env JWKS_URL=... \
  go-ws-gateway-proxy:latest
```

- Metrics: `http://localhost:8080/metrics`
- Health: `http://localhost:8080/healthz`
| Env Var | Purpose | Default |
|---|---|---|
| `INGRESS_HOST` | Public hostname used by the Ingress or Traefik proxy | — |
| `WS_ALLOWED_ORIGINS` | Comma-separated list or wildcard of allowed WebSocket origins | value of `Origin` header |
| `JWT_COOKIE_NAME` | Cookie name to extract JWT from if not provided in CONNECT/header | `session` |
| `GRACEFUL_DRAIN_TIMEOUT_SECONDS` | Max time to wait for active connections to finish on SIGTERM | `30` |
| File | When |
|---|---|
| `k8s/base/configmap-routes.yaml` | Kustomize base – edit or patch to add new upstreams |
| `values.yaml` → `gatewayRoutes:` | Helm chart – override per environment |
The proxy loads the YAML, matches the first prefix that fits `req.URL.Path`, and dials the upstream with optional sub-protocol copying and query-string preservation.
| Env Var | Purpose | Default |
|---|---|---|
| `JWKS_URL` | OIDC-compliant JWKS endpoint for token signature verification | — |
| `JWKS_TTL_MINUTES` | Duration to cache the JWKS before refetching | `10` |
| `JWT_EXPECTED_ISSUER` | Issuer claim expected in JWTs | — |
| `JWT_EXPECTED_AUDIENCE` | Comma-separated list of valid audience values | — |
| `JWT_VALID_ALGORITHMS` | Comma-separated list of allowed JWT algorithms | `RS256` |
| Env Var | Purpose | Default |
|---|---|---|
| `WS_IDLE_TIMEOUT_SECONDS` | Max time a connection can remain idle before closure | `60` |
| `WS_WRITE_TIMEOUT_SECONDS` | Max duration for outbound WebSocket writes | `10` |
| `WS_CONTROL_TIMEOUT_SECONDS` | Timeout for sending control frames (e.g. close frames) | `1` |
| Env Var | Purpose | Default |
|---|---|---|
| `CB_FAILURE_THRESHOLD` | Consecutive failures before opening the circuit | `5` |
| `CB_OPEN_TIMEOUT_SECONDS` | Time the circuit remains open before retry is attempted | `60` |
| `CB_HEALTHCHECK_INTERVAL_SECONDS` | Interval to probe upstream and possibly close circuit | `30` |
| Env Var | Purpose | Default |
|---|---|---|
| `ROUTE_FILE` | Absolute path to `routes.yaml` (overrides `--route-file` flag) | `/etc/ws-gw/routes.yaml` |
| `WS_DIAL_RETRY_MAX` | Number of times to retry dialing the upstream on failure | `3` |
| `WS_DIAL_RETRY_INTERVAL_SECONDS` | Delay between retry attempts | `2` |
| Env Var | Purpose | Default |
|---|---|---|
| `RATE_LIMIT_PER_SECOND` | Allowed number of messages per second (token bucket refill rate) | `10` |
| `RATE_LIMIT_BURST` | Allowed burst size before throttling | `30` |
| Env Var | Purpose | Default |
|---|---|---|
| `JWT_CACHE_STORE` | Cache backend: `memory`, `redis`, or `postgres` | `memory` |
| `JWT_CACHE_PREFIX` | Prefix for keys in Redis/Postgres cache | `jwt_cached_token_` |
| `REDIS_URL` | Redis connection URL (e.g. `redis://localhost:6379/0`) | — |
| `POSTGRES_DSN` | DSN for Postgres cache backend | — |
Routes are now declared by ConfigMap (Kustomize) or `gatewayRoutes` in Helm values. Each entry maps a URL prefix to an upstream WebSocket endpoint:
```yaml
gatewayRoutes:
  - prefix: /livekit
    upstream: ws://livekit-signal.media.svc.cluster.local:7880/signal
    copySubProtocol: true
    preserveQuery: true
  - prefix: /mqtt
    upstream: ws://rabbitmq-mqtt.messaging.svc.cluster.local:15675/ws
    copySubProtocol: false
    preserveQuery: true
```

- Queue-aware outbound flow control
- Global rate limiting using Redis across all replicas
- Detect and reject reuse of the same JWT across multiple clients (replay)
- Store `jti` (JWT ID) in Redis or Postgres to:
  - Track which tokens have already been used
  - Enforce one-time-use policies for high-sensitivity operations
  - Support short-lived tokens with strict replay protection windows
- Optional expiration strategy (TTL-based or revocation list)
- Enable real-time invalidation of compromised tokens
- Integrate with token revocation events from upstream identity providers
- Fallback to OAuth2 `introspection_endpoint`:
  - For short-lived, reference-based tokens
  - Allows real-time token validity checks without relying on signature verification alone
Pull requests, issues, and discussions are welcome. Contributions should maintain zero-dependency, high-performance Go idioms.
MIT License © 2024 — go-ws-gateway-proxy mortdiggiddy
| Feature | Value |
|---|---|
| Protocol-aware WS proxy | ✅ |
| Secure JWT auth w/ Keycloak | ✅ |
| Fast, transparent forwarding | ✅ |
| Metrics, health checks | ✅ |
| Deployable to K8s, Docker | ✅ |
| Extensible architecture | ✅ |
This gateway is built for real-time systems that care about security, observability, and scale. Whether you’re running a robotics fleet, an IoT mesh, or a trading interface — this is the entry point you can rely on.

