Context
Applications using long string-based protocol IDs can benefit from automatic compression/abbreviation during stream opening.
One concrete example is Ethereum's Req/Resp domain. Protocol strings like the following create inefficiencies when opening per-request streams.
/eth2/beacon_chain/req/beacon_blocks_by_range/2 [47 bytes]
Combined with the fixed /multistream/1.0.0 preamble and the length-prefixing varints, each request incurs ~70 bytes of overhead, which can be reduced to single digits.
Goals
Design a dictionary-based compression mechanism for protocol IDs during stream opening that is simple, symmetric, dynamic yet deterministic, and easily debuggable.
Proposed solution
multistream 2 definition
We introduce the /multistream/2 protocol, with compact multicodec 0x41. We reserve the 0x40-0x4f range for multistream versioning, where 0x40 aliases /multistream/v1.0.0 (currently unused).
/multistream/v1.0.0 continues using the fixed byte string <varint encoding of 19> || /multistream/v1.0.0 for itself. This byte string is:
0x132f6d756c746973747265616d2f76312e302e30
When the first byte on a stream is 0x41, the protocol selection is processed by /multistream/2.
Identify
We add a new Protobuf field in the payload of identify and identify push:
// Maximum version of multiselect supported by the peer.
// Allowable range: [1-16], encoded as a single byte.
optional uint32 max_multiselect_version = 9;
Absence of this field indicates v1-only support. Presence indicates that the peer supports v1 through the specified version.
New multistream protocols can only be used after receiving the peer's identify payload. Both ends compute the effective version as min(self.max_multiselect_version, peer.max_multiselect_version), which produces identical results on each side.
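The version computation above can be sketched as follows. This is a minimal illustration, not part of the proposal: the function name `effective_version` is hypothetical, and a missing field is modelled as `None`.

```python
from typing import Optional


def effective_version(own_max: int, peer_max: Optional[int]) -> int:
    """Return the multistream version both peers will use (hypothetical helper)."""
    # Absence of max_multiselect_version in the peer's identify payload
    # means the peer only supports v1.
    if peer_max is None:
        peer_max = 1
    if not (1 <= own_max <= 16 and 1 <= peer_max <= 16):
        raise ValueError("version outside the allowable range [1-16]")
    # Both sides evaluate the same min(), so they agree without extra round trips.
    return min(own_max, peer_max)
```

Because `min` is symmetric, `effective_version(2, 3)` and `effective_version(3, 2)` both yield 2.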
Peers MUST wait to start new streams until receiving their peer's Identify. Most applications already follow this behaviour; go-multistream even includes explicit waiting logic.
For peers supporting multistream > v1, v1 usage is effectively limited to the first identify protocol stream during connection bootstrap.
Protocol abbreviation with multistream 2
Identify enumerates all protocol strings supported by the peer. Abbreviations are calculated by:
- Hashing each string with SHA-256/BLAKE3-256 (TBD).
- Populating the abbreviation table by using the first byte of each digest. On collisions, expand ambiguous values by selecting additional bytes one by one until no ambiguity exists.
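The two steps above can be sketched as a shortest-unique-prefix computation. This is only an illustration under assumptions: the hash function is still TBD in the proposal, so SHA-256 is used here because it is in the Python standard library, and the function name `abbreviation_table` is hypothetical.

```python
import hashlib
from typing import Dict, Iterable


def abbreviation_table(protocols: Iterable[str]) -> Dict[str, bytes]:
    """Map each protocol string to the shortest unique prefix of its digest."""
    # Assumption: SHA-256 stands in for the hash function that is still TBD.
    digests = {p: hashlib.sha256(p.encode("utf-8")).digest() for p in set(protocols)}
    table = {}
    for proto, digest in digests.items():
        length = 1
        # Extend the prefix one byte at a time while any other protocol's
        # digest shares it. Every protocol involved in a collision is expanded.
        while length < len(digest) and any(
            other[:length] == digest[:length]
            for name, other in digests.items()
            if name != proto
        ):
            length += 1
        table[proto] = digest[:length]
    return table
```

The result depends only on the set of protocol strings, not on their order, so both peers derive the same table from the same identify payload. The quadratic scan is chosen for clarity; a trie or a sorted list of digests would scale better.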
When peer A announces new protocol strings via identify push, new collisions may arise with existing protocols, requiring their abbreviations to be expanded. Existing streams remain undisturbed, and new streams use the newly extended selector. However, peer A MUST retain the old mappings for affected protocols to avoid races if peer P is opening streams while the update is in transit.
Example:
Old state (Peer A)
protocol/a => 0x01
protocol/b => 0xa9
protocol/c => 0x7d
New state (Peer A)
protocol/a => 0x01
protocol/b => 0xa9
protocol/c => 0x7daf (expanded because prefix conflicts)
protocol/d => 0x7d88 (expanded because prefix conflicts)
protocol/c => 0x7d (retained for session with P)
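One way peer A might keep the old mappings resolvable is sketched below, using the values from the example. The class `AbbreviationResolver` and its methods are hypothetical, and the sketch covers a single session with one remote peer; when retained entries can be dropped is not specified by the proposal and is left out.

```python
from typing import Dict, Optional


class AbbreviationResolver:
    """Resolves incoming abbreviations, keeping superseded ones (hypothetical)."""

    def __init__(self, table: Dict[str, bytes]) -> None:
        self.current = {abbr: proto for proto, abbr in table.items()}
        self.retained: Dict[bytes, str] = {}

    def update(self, new_table: Dict[str, bytes]) -> None:
        new_current = {abbr: proto for proto, abbr in new_table.items()}
        for abbr, proto in self.current.items():
            # Keep an old abbreviation only if its protocol is still supported
            # and the abbreviation itself is no longer in the new table.
            if abbr not in new_current and proto in new_table:
                self.retained[abbr] = proto
        # A current mapping always takes precedence over a retained one.
        for abbr in new_current:
            self.retained.pop(abbr, None)
        self.current = new_current

    def resolve(self, abbr: bytes) -> Optional[str]:
        return self.current.get(abbr, self.retained.get(abbr))


resolver = AbbreviationResolver({
    "protocol/a": bytes.fromhex("01"),
    "protocol/b": bytes.fromhex("a9"),
    "protocol/c": bytes.fromhex("7d"),
})
resolver.update({
    "protocol/a": bytes.fromhex("01"),
    "protocol/b": bytes.fromhex("a9"),
    "protocol/c": bytes.fromhex("7daf"),
    "protocol/d": bytes.fromhex("7d88"),
})
```

After the update, both `0x7d` and `0x7daf` resolve to `protocol/c`, so a stream that P opened with the old selector while the identify push was in flight still succeeds.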
Optimization: Peer P will only select protocols it actually supports, so remembering mappings for protocols it can never open is unnecessary. Applications may supply all possible outbound protocol strings to libp2p ahead of time (and subsequently update this knowledge), enabling efficient data structures and lookups by tracking only relevant mappings.
Protocol selection with multistream 2
Protocols are now selected by sending:
varint length || 0x41 (multicodec) || varint length || protocol abbreviation
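A minimal encoder for this layout might look as follows. The function names are hypothetical, and the varint is assumed to be the unsigned LEB128-style varint used elsewhere in multiformats.

```python
MULTISTREAM_2_CODEC = 0x41  # compact multicodec proposed for /multistream/2


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("varint values must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_selection(abbreviation: bytes) -> bytes:
    """Build: varint length || 0x41 || varint length || protocol abbreviation."""
    codec = bytes([MULTISTREAM_2_CODEC])
    return (
        encode_varint(len(codec)) + codec
        + encode_varint(len(abbreviation)) + abbreviation
    )
```

For a one-byte abbreviation `0x0c`, `encode_selection(bytes.fromhex("0c"))` produces the four bytes `0x0141010c`.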
Example. To select /eth2/beacon_chain/req/beacon_blocks_by_range/2:
With v1:
0x132f6d756c746973747265616d2f76312e302e302f2f657468322f626561636f6e5f636861696e2f7265712f626561636f6e5f626c6f636b735f62795f72616e67652f32
Breakdown:
- varint(19) + "/multistream/v1.0.0" = 20 bytes
- varint(47) + "/eth2/beacon_chain/req/beacon_blocks_by_range/2" = 48 bytes
- total: 68 bytes
With v2:
0x0141010c
Breakdown:
- BLAKE3-256 hash: 0x0c210f5e4f3bd7f720e887684c77c48f9d65b8d2065a3f2b83b0d47496e5bb29
- varint(1) + 0x41 = 2 bytes
- varint(1) + first byte of BLAKE3-256 hash = 2 bytes [assuming no collisions]
- total: 4 bytes (94% saving)
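The byte counts in both breakdowns can be reproduced with a short script. This is a sketch that assumes no collisions and single-byte varints, which holds because every payload here is shorter than 128 bytes; the helper name `length_prefixed` is hypothetical.

```python
def length_prefixed(payload: bytes) -> bytes:
    """Prefix payload with its length; valid as a varint only below 128 bytes."""
    if len(payload) >= 0x80:
        raise ValueError("payload too long for a single-byte varint")
    return bytes([len(payload)]) + payload


# v1: full multistream header followed by the full protocol string.
v1 = length_prefixed(b"/multistream/v1.0.0") + length_prefixed(
    b"/eth2/beacon_chain/req/beacon_blocks_by_range/2"
)
# v2: compact multicodec 0x41 followed by the one-byte abbreviation 0x0c.
v2 = length_prefixed(b"\x41") + length_prefixed(b"\x0c")

saving = 1 - len(v2) / len(v1)
print(len(v1), len(v2), f"{saving:.0%}")  # prints: 68 4 94%
```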