This outlines my current approach to the syncing algorithm design. (still WIP, it's a bit messy)
Requirements
modular - needs to support multiple sync algorithms in the future, ie. full, light, and warp sync
isolated from the networking layer - the network package should only deal with sending and receiving messages (and potentially peer scoring, TBD)
needs to handle requests for block headers and bodies, as well as justifications
needs to handle two different modes of sync - bootstrap and near-head
smoldot (alternative client for substrate chains)
smoldot is essentially a "lighter" version of substrate.
from the implementer:
In case that helps: in smoldot I split the syncing strategy in two, with the initial syncing and head-of-chain syncing in different modules. The initial syncing simply downloads chunks of 128 blocks at a time (without keeping track of forks, since we're supposed to be on the finalized chain) and verifies them, while the head of chain syncing downloads all forks.
The head of chain syncing is basically a data structure containing:
Blocks ready to be verified (whose header and body are known, for example)
Blocks that we know exist but aren't ready to be verified, for example because only their header is known or because their parent hash isn't a known block. When we connect to a peer and they say in their handshake that their best block is block hash H, we also insert H in there.
On-going network requests that target not-ready-to-be-verified yet blocks.
When a network request finishes, we update the data structure and potentially turn not-ready-to-be-verified into ready-to-be-verified blocks
The syncing algorithm also needs to track which blocks are known to which peers. In smoldot this is simply a glorified BTreeSet<(PeerId, BlockHash)>.
This code is definitely one of the hardest challenges when implementing a Polkadot client though
Blocks that we know exist but aren't ready to be verified
This list needs to be bounded, ideally, to avoid attacks where someone sends millions of block announces that we can't verify
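Taken together, the per-peer block view (the "glorified BTreeSet<(PeerId, BlockHash)>") and the bound on announced blocks could look roughly like this in Go. The PeerID and Hash types here are simplified stand-ins for the real libp2p peer.ID and common.Hash types, and all names are illustrative, not a final API:

```go
package main

import "fmt"

// PeerID and Hash are stand-ins for the real libp2p peer.ID and
// common.Hash types.
type PeerID string
type Hash string

// peerBlocks tracks which blocks each peer has announced, with a
// per-peer cap so a malicious peer cannot grow the set unboundedly.
type peerBlocks struct {
	known  map[PeerID]map[Hash]struct{}
	maxPer int
}

func newPeerBlocks(maxPerPeer int) *peerBlocks {
	return &peerBlocks{known: make(map[PeerID]map[Hash]struct{}), maxPer: maxPerPeer}
}

// addKnown records that peer p knows block h; it returns false if the
// per-peer bound would be exceeded (the caller may downscore the peer).
func (pb *peerBlocks) addKnown(p PeerID, h Hash) bool {
	set, ok := pb.known[p]
	if !ok {
		set = make(map[Hash]struct{})
		pb.known[p] = set
	}
	if _, have := set[h]; have {
		return true
	}
	if len(set) >= pb.maxPer {
		return false
	}
	set[h] = struct{}{}
	return true
}

// peersWith returns the peers known to have block h, useful when
// picking whom to send a block request to.
func (pb *peerBlocks) peersWith(h Hash) []PeerID {
	var out []PeerID
	for p, set := range pb.known {
		if _, ok := set[h]; ok {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	pb := newPeerBlocks(2)
	pb.addKnown("alice", "0xaa")
	pb.addKnown("bob", "0xaa")
	fmt.Println(len(pb.peersWith("0xaa"))) // 2
}
```

A sorted structure only matters if we need range queries over (peer, hash) pairs; a plain nested map keeps the lookup and the bound check O(1).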
As well, ancestor search is used for at-the-head fork syncing.
Thus, gossamer should implement two different modes of syncing - bootstrap and "idle" (or near-head).
interface for syncer within network package
```go
type Syncer interface {
	// HandleBlockAnnounceHandshake updates our view of the peer's state by
	// potentially updating their best block number and hash
	HandleBlockAnnounceHandshake(from peer.ID, msg *BlockAnnounceHandshake) error
	// HandleBlockAnnounce handles a block announcement from the given peer,
	// returning whether the message should be propagated to other peers
	HandleBlockAnnounce(from peer.ID, msg *BlockAnnounceMessage) (propagate bool, err error)
}
```
Note that `HandleBlockAnnounceHandshake` is of type `network.HandshakeValidator` and `HandleBlockAnnounce` is of type `network.NotificationsMessageHandler`.
interface for network within syncer package
```go
type Network interface {
	// DoBlockRequest sends a request to the given peer. If a response is
	// received within a certain time period, it is returned; otherwise an
	// error is returned.
	DoBlockRequest(to peer.ID, req *BlockRequestMessage) (*BlockResponseMessage, error)
	// BanPeer disconnects the given peer and blocks it from reconnecting.
	// Should only be used in extreme circumstances.
	BanPeer(p peer.ID) // TODO: do we want to add banning now, or wait until scoring? probably wait ;)
}
```
syncer design
we will have a chainSync module, which keeps track of the current sync workers and handles their success or failure.
a worker will consist of:
start hash/number to begin request from
target hash/number
ID
data to request
direction of request
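The worker fields listed above might map to a struct like the following sketch; the field names and the requestData bitfield are assumptions, not the final API:

```go
package main

import "fmt"

// Hash is a stand-in for the real common.Hash type.
type Hash string

// direction of a block request; names are illustrative.
type direction int

const (
	ascending direction = iota
	descending
)

// worker describes one in-flight sync request, per the fields listed
// above: where to start, where to stop, what data to ask for, and
// which way to walk the chain.
type worker struct {
	id           uint64
	startHash    Hash
	startNumber  uint64
	targetHash   Hash
	targetNumber uint64
	requestData  byte // bitfield of header/body/justification
	direction    direction
}

func main() {
	w := worker{id: 1, startNumber: 0, targetNumber: 12345, direction: ascending}
	fmt.Println(w.targetNumber - w.startNumber) // 12345
}
```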
Similar to both substrate and smoldot, we will keep track of the latest known state of each peer, ie. their best block hash and number. This is obtained via the BlockAnnounceHandshake message; we can use this info to determine whom to request blocks from.
we will also have a chainProcessor module which is similar to the current syncer.Service in that it processes the blocks once they are ready. this module should have some queue of blocks that the chainSync writes to, which the chainProcessor continually reads from and processes.
the chainSync module should make sure the blocks are ordered and have no gaps, ie. that no blocks are missing from a response. During bootstrap, the worker requests in ordered 128-block chunks, so it just needs to receive a full response before moving on to the next chunk. During idle mode, the module should make sure the parent block is known before placing a block in the queue.
bootstrap mode
Bootstrap mode simply requests blocks in chunks of 128 and ensures they are ordered for processing (ie. in ascending order, without gaps). Since the chain at this point is already finalized and we receive justifications along with the blocks, we don't need to track forks during bootstrap.
beginning bootstrap:
should be started once we connect to some number of peers (5?) and receive their best blocks
once we have their best blocks, we pick the highest as our target
dispatch a worker with start=0 and target=highest seen
TODO: should we have parallel workers, or just 1 for bootstrap?
worker will request from peers who report that they have the blocks we are currently looking for
on response, pre-validate blocks and push the blocks into the chainProcessor queue if validated
if the blocks aren't validated, then return an error and downscore the peer
on failure, dispatch another worker with updated start and retry
updated start should be our best block (or should it be the end of the validation queue?)
probably go with best block for now; duplicate blocks are better than potentially missing ones
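The target-selection step above (wait for some minimum number of peers, then take the highest reported best block) could be sketched as follows. The peerState type and the minimum-peer threshold are illustrative:

```go
package main

import "fmt"

// peerState is the latest view of a peer, obtained from its
// BlockAnnounceHandshake.
type peerState struct {
	bestNumber uint64
}

// highestTarget picks the greatest best-block number reported by our
// peers, which bootstrap uses as the worker's target. Requiring a
// minimum peer count (eg. 5, per the note above) guards against
// syncing toward a single lying peer.
func highestTarget(peers map[string]peerState, minPeers int) (uint64, bool) {
	if len(peers) < minPeers {
		return 0, false
	}
	var max uint64
	for _, ps := range peers {
		if ps.bestNumber > max {
			max = ps.bestNumber
		}
	}
	return max, true
}

func main() {
	peers := map[string]peerState{"a": {100}, "b": {105}, "c": {99}}
	if target, ok := highestTarget(peers, 3); ok {
		fmt.Println(target) // 105
	}
}
```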
response validation:
chainSync should perform some response pre-validation before pushing to the processor
it should check that:
the response is not empty
the response contains all the expected fields (header and body for bootstrap)
each block has the correct parent, ie the response constitutes a valid chain
once this is done, then the blocks can be pushed for processing
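A minimal sketch of that pre-validation, using simplified stand-in types for the real header and block data:

```go
package main

import (
	"errors"
	"fmt"
)

type Hash string

// header and blockData are simplified stand-ins for the real types.
type header struct {
	parentHash Hash
	number     uint64
}

type blockData struct {
	hash   Hash
	header *header
	body   []byte
}

var (
	errEmptyResponse = errors.New("response is empty")
	errMissingFields = errors.New("block missing header or body")
	errNotAChain     = errors.New("blocks do not form a chain")
)

// preValidate performs the checks listed above before the response is
// handed to the chainProcessor: non-empty, all expected fields present,
// and each block's parent hash pointing at the previous block.
func preValidate(resp []*blockData) error {
	if len(resp) == 0 {
		return errEmptyResponse
	}
	for i, bd := range resp {
		if bd.header == nil || bd.body == nil {
			return errMissingFields
		}
		if i > 0 && bd.header.parentHash != resp[i-1].hash {
			return errNotAChain
		}
	}
	return nil
}

func main() {
	resp := []*blockData{
		{hash: "0x01", header: &header{parentHash: "0x00", number: 1}, body: []byte{}},
		{hash: "0x02", header: &header{parentHash: "0x01", number: 2}, body: []byte{}},
	}
	fmt.Println(preValidate(resp)) // <nil>
}
```

On error, chainSync would downscore the responding peer and retry with another worker, per the bootstrap flow above.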
block processing:
essentially the same as now
only difference is the queue
should it be a channel, or a full data structure?
depends on whether we need to know what's in the queue
idle mode might need to know what's in the queue, so we can determine whether we have the parent block yet
or, we can just put the block in the queue only if we already imported the parent (might actually be nicer)
note: substrate/smoldot tracks all requests that have been sent out, but response has not yet been handled, do we need this or will tracking workers be sufficient?
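If we only enqueue blocks whose parent has already been imported (the second option above), a plain buffered channel is enough, since the chainProcessor never needs to inspect the queue's contents. A sketch, with a simplified block type:

```go
package main

import "fmt"

type block struct {
	number uint64
}

// process drains the queue of ready blocks; with a channel, chainSync
// just sends and chainProcessor just receives, with no shared state.
// done reports the last block number handled, for demonstration only.
func process(readyBlocks <-chan *block, done chan<- uint64) {
	var last uint64
	for b := range readyBlocks {
		// real processing would verify and import the block here
		last = b.number
	}
	done <- last
}

func main() {
	ready := make(chan *block, 16)
	done := make(chan uint64)
	go process(ready, done)
	for n := uint64(1); n <= 5; n++ {
		ready <- &block{number: n}
	}
	close(ready)
	fmt.Println(<-done) // 5
}
```

If idle mode turns out to need to query what is in flight, this would have to become a real queue structure instead.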
idle mode
Idle mode (or near-head mode) requires tracking of forks.
We need to download every fork we see. This may require "ancestor search" to find the root of the fork. (We could also use some naive approach where we just request the blocks we think might be on that fork, perhaps in descending order.)
Need to track:
blocks ready to be verified (ie. we have header and body and parent block)
blocks we know of but can't be verified yet (ie. we only have the header, we only have the hash from a handshake, or we haven't imported the parent block yet)
what data structure to use for this?
smoldot uses "disjoint blocks" data structure which is a hash map of (number, hash) -> block + additional data
maybe we can use a tree structure for this, or hashmap as well
workers will be dispatched to target the blocks we know of but can't yet be verified
when we get the data needed, we move the blocks over to the ready queue
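A bounded disjoint block set along smoldot's lines, keyed by (number, hash), might look like this; the names and the limit are illustrative:

```go
package main

import "fmt"

type Hash string

// pendingBlock is a block we know of but cannot verify yet: we may
// have only its header, or only its hash from an announce/handshake.
type pendingBlock struct {
	hash   Hash
	number uint64
	header bool
	body   bool
}

// disjointBlockSet mirrors smoldot's structure: a map keyed by
// (number, hash), bounded to resist block-announce spam.
type disjointBlockSet struct {
	blocks map[uint64]map[Hash]*pendingBlock
	size   int
	limit  int
}

func newDisjointBlockSet(limit int) *disjointBlockSet {
	return &disjointBlockSet{blocks: make(map[uint64]map[Hash]*pendingBlock), limit: limit}
}

// add inserts (or returns an existing) pending block, refusing new
// entries once the bound is hit.
func (s *disjointBlockSet) add(number uint64, hash Hash) *pendingBlock {
	if m, ok := s.blocks[number]; ok {
		if pb, ok := m[hash]; ok {
			return pb
		}
	}
	if s.size >= s.limit {
		return nil // bounded, per the note above
	}
	if s.blocks[number] == nil {
		s.blocks[number] = make(map[Hash]*pendingBlock)
	}
	pb := &pendingBlock{hash: hash, number: number}
	s.blocks[number][hash] = pb
	s.size++
	return pb
}

// ready reports whether a pending block now has everything needed to
// move to the ready-to-verify queue.
func (pb *pendingBlock) ready() bool { return pb.header && pb.body }

func main() {
	s := newDisjointBlockSet(512)
	pb := s.add(101, "0xef")
	pb.header = true
	pb.body = true
	fmt.Println(pb.ready()) // true
}
```

A tree would additionally encode parent links between pending blocks; whether that is worth the complexity over a flat map is exactly the open question below.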
process for syncing a fork:
eg. our best block is number 100 with hash 0xab..cd. A peer reports a block 100 with hash 0xef..12. The peer is on a fork, so we need to request blocks in descending order starting from 0xef..12.
when we receive the response, hopefully the fork is not >128 blocks long, in which case we can traverse the results to find the common ancestor of both our chains
if the fork is >128 blocks long, we will need to dispatch another worker with an updated startHash, 128 blocks below the previous one
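The steps above can be sketched as a scan over a descending response for the first hash we already know; haveBlock stands in for a real block-state lookup:

```go
package main

import "fmt"

type Hash string

// findCommonAncestor scans a descending block response (fork tip
// first) for the first hash we already have locally; that block is the
// common ancestor, and everything before it in the response is the
// fork body to import. If nothing matches, the fork is longer than the
// response and another descending request is needed.
func findCommonAncestor(descending []Hash, haveBlock func(Hash) bool) (Hash, int, bool) {
	for i, h := range descending {
		if haveBlock(h) {
			return h, i, true
		}
	}
	return "", 0, false
}

func main() {
	ours := map[Hash]bool{"0x97": true, "0x98": true}
	resp := []Hash{"0xf100", "0xf99", "0x98"} // fork tip down to a shared block
	anc, idx, found := findCommonAncestor(resp, func(h Hash) bool { return ours[h] })
	fmt.Println(anc, idx, found) // 0x98 2 true
}
```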
bootstrap/idle modules
WIP: I suggest potentially switching between two modules for bootstrap and idle modes, as there are quite a few differences between the two, eg.:
bootstrap can potentially be done by a single worker, while idle mode will need at least one worker per fork
bootstrap workers always have the same direction and number of blocks to request; idle mode workers are more complex and may change direction and count based on the current pending blocks
during idle mode, we want to regularly check the pending blocks set and create requests for the blocks in it. During bootstrap we can essentially ignore the pending blocks set, as the only blocks in it will likely be ones received from block announces/handshakes.
justification request process
TODO
for bootstrap mode, we can just request justifications along with the blocks and optimistically hope our peer has them all, since the older blocks should all be finalized already.
for idle mode - ?
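For reference, if the block request carries a requested-data bitfield as in the Substrate sync protocol, a bootstrap request that includes justifications could be built like this; the bit values follow that protocol, but treat them as assumptions here:

```go
package main

import "fmt"

// Requested-data bits as in the Substrate block request protocol;
// the values are assumptions for this sketch.
const (
	requestHeader        byte = 1 << 0
	requestBody          byte = 1 << 1
	requestReceipt       byte = 1 << 2
	requestMessageQueue  byte = 1 << 3
	requestJustification byte = 1 << 4
)

// bootstrapRequestData asks for header, body, and justification in one
// request, optimistically hoping the peer has justifications for the
// already-finalized blocks.
func bootstrapRequestData() byte {
	return requestHeader | requestBody | requestJustification
}

func main() {
	fmt.Printf("%05b\n", bootstrapRequestData()) // 10011
}
```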
Open questions
disjoint block set that tracks blocks we know of but that aren't ready to be processed - what data structure should we use for this? substrate uses a simple hashmap; is there something more optimal (like a disjoint blocktree or something similar)?
queue of ready blocks - is a channel sufficient, or do we need a full queue structure? ie. will we need to read from the ready queue at some point? I think a channel is ok, but this may need to change once we get to implementation
Gossamer sync algorithm design
WIP document: https://hackmd.io/@nZ-twauPRISEa6G9zg3XRw/r1wy2hfWF