Failed to decode signal responses with LiveKit Server v1.10.0 (protocol=1) #86

Description

@ccanizaresv

The ESP32 SDK v0.3.6 fails to establish peer connections when connecting to LiveKit Server v1.10.0. The signaling WebSocket connects successfully and the JoinResponse is decoded (participant info is visible), but subsequent SignalResponse messages fail to decode via nanopb, so no peer connection is ever established.

Environment

  • ESP32 SDK: v0.3.6 (latest release)
  • LiveKit Server: v1.10.0 (livekit/livekit-server:latest Docker image)
  • Hardware: ESP32-S3-BOX-3B
  • ESP-IDF: v5.4
  • nanopb: 0.4.9
  • Protocol version: 1 (hardcoded in core/url.c:30)

Error logs

The connection follows this pattern on every attempt (5 reconnects, then timeout):

I lk_session: Participant: satellite-box3 (?) state=0 kind=0
I lk_session: Participant: agent-AJ_xGT8vtR6crHz (?) state=2 kind=4
E livekit_protocol: Failed to decode signal res: type=16, error=end-of-stream
I DTLS: Init SRTP OK
E livekit_protocol: Failed to decode signal res: type=9, error=wrong wire type
E livekit_peer.sub: Failed to open peer
E livekit_engine: Failed to establish peer connections

The failing field numbers in SignalResponse and their errors:

| Proto field # | Field name                | nanopb option        | Decode error            |
|---------------|---------------------------|----------------------|-------------------------|
| 5             | update (ParticipantUpdate)| decoded (FT_POINTER) | parent stream too short |
| 9             | mute (MuteTrackRequest)   | FT_IGNORE            | wrong wire type         |
| 13            | stream_state_update       | FT_IGNORE            | invalid wire_type       |
| 14            | subscribed_quality_update | FT_IGNORE            | invalid wire_type       |
| 16            | refresh_token (string)    | FT_IGNORE            | end-of-stream           |

Analysis

  1. The JoinResponse decodes correctly: participant info for both the local client and the remote agent is visible.
  2. Fields marked as FT_IGNORE in livekit_rtc.options are failing to be skipped by nanopb during decode, rather than being silently ignored. This suggests either the server's protobuf encoding has changed in a way nanopb can't skip, or there's a nanopb issue with FT_IGNORE inside oneof fields.
  3. Field 5 (ParticipantUpdate), which IS being decoded (not ignored), also fails with parent stream too short, suggesting the server sends a message structure that exceeds what the current proto definitions + nanopb options can handle.
  4. Because these decode failures cause the entire SignalResponse to be dropped (in signaling.c:198), critical messages like SDP offers/answers may also be lost, preventing peer connection establishment.
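To help tell a skip problem apart from a malformed or truncated payload, the raw bytes of a failing message can be walked field by field without any generated nanopb code. The following is a minimal sketch of the standard protobuf wire-format skip rules, not the SDK's or nanopb's implementation; `walk_fields`, `read_varint`, and the `walk_status_t` names are hypothetical and only loosely mirror the error strings in the logs above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes for this sketch (not nanopb's). */
typedef enum {
    WALK_OK = 0,
    WALK_END_OF_STREAM,  /* buffer ended in the middle of a field */
    WALK_BAD_WIRE_TYPE   /* wire type is not 0, 1, 2 or 5 */
} walk_status_t;

/* Read one base-128 varint; returns false if the buffer ends first. */
static bool read_varint(const uint8_t *buf, size_t len, size_t *pos, uint64_t *out)
{
    uint64_t value = 0;
    for (unsigned shift = 0; shift < 64; shift += 7) {
        if (*pos >= len) return false;
        uint8_t byte = buf[(*pos)++];
        value |= (uint64_t)(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) { *out = value; return true; }
    }
    return false; /* varint longer than 10 bytes */
}

/* Skip every top-level field in buf. On return, *last_field holds the
 * number of the last field whose tag was read, which is the failing
 * field when the status is not WALK_OK. */
static walk_status_t walk_fields(const uint8_t *buf, size_t len,
                                 size_t *fields_seen, uint32_t *last_field)
{
    assert(buf != NULL && fields_seen != NULL && last_field != NULL);
    size_t pos = 0;
    *fields_seen = 0;
    *last_field = 0;
    while (pos < len) {
        uint64_t tag = 0, n = 0;
        if (!read_varint(buf, len, &pos, &tag)) return WALK_END_OF_STREAM;
        *last_field = (uint32_t)(tag >> 3);
        switch (tag & 7) {
        case 0: /* varint value: consume it, no further bytes to skip */
            if (!read_varint(buf, len, &pos, &n)) return WALK_END_OF_STREAM;
            n = 0;
            break;
        case 1: n = 8; break; /* fixed 64-bit */
        case 2: /* length-delimited: the declared length follows the tag */
            if (!read_varint(buf, len, &pos, &n)) return WALK_END_OF_STREAM;
            break;
        case 5: n = 4; break; /* fixed 32-bit */
        default: return WALK_BAD_WIRE_TYPE;
        }
        if (n > len - pos) return WALK_END_OF_STREAM; /* field overruns buffer */
        pos += (size_t)n;
        (*fields_seen)++;
    }
    return WALK_OK;
}
```

If a walk like this succeeds on a payload that nanopb rejects, the bytes themselves are well-formed and the problem is more likely in the generated descriptors or options; if it fails on the same field, the payload reaching the decoder is probably incomplete or corrupted.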

Steps to reproduce

  1. Run LiveKit Server v1.10.0
  2. Connect with ESP32 SDK v0.3.6 using audio publish/subscribe
  3. Observe repeated decode failures and peer connection failures after 5 reconnect attempts

Possible causes

  • Server v1.10.0 may encode SignalResponse fields with newer protobuf extensions that nanopb's FT_IGNORE/skip logic can't handle inside oneof unions.
  • The protocol definitions (from livekit/protocol v1.42.2) may not match what server v1.10.0 actually sends.
  • The WebSocket buffer size (20KB in signaling.c) may truncate large messages, causing the end-of-stream / parent stream too short errors.
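The truncation hypothesis can be checked directly by comparing the number of bytes actually received with the size the message declares for itself. The sketch below assumes a failing frame carries exactly one top-level length-delimited field (which is the usual shape for a oneof of sub-messages, but is an assumption here); `expected_frame_size` is a hypothetical helper, not part of the SDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the total size a frame should have from its first field header:
 * tag varint + length varint + declared payload length.
 * Returns false if the header cannot be read or the first field is not
 * length-delimited (wire type 2). */
static bool expected_frame_size(const uint8_t *buf, size_t received, size_t *expected)
{
    assert(buf != NULL && expected != NULL);
    size_t pos = 0;
    uint64_t header[2] = {0, 0}; /* [0] = tag, [1] = declared length */
    for (int i = 0; i < 2; i++) {
        unsigned shift = 0;
        for (;;) {
            if (pos >= received || shift >= 64) return false;
            uint8_t byte = buf[pos++];
            header[i] |= (uint64_t)(byte & 0x7F) << shift;
            shift += 7;
            if ((byte & 0x80) == 0) break;
        }
        if (i == 0 && (header[0] & 7) != 2) return false;
    }
    if (header[1] > SIZE_MAX - pos) return false; /* would overflow size_t */
    *expected = pos + (size_t)header[1];
    return true;
}
```

If the expected size is larger than the received byte count, the frame was cut short before decoding, and an expected size above the receive buffer capacity would point at the buffer itself. If the two sizes match, truncation can probably be ruled out for that message.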

Question

What is the recommended/tested LiveKit Server version range for SDK v0.3.6? It would be very helpful to document server version compatibility.
