
Transition block lookup sync to range sync #6122

Merged
merged 5 commits into sigp:unstable from dapplion:lookup-to-range on Oct 8, 2024

Conversation

dapplion
Collaborator

Issue Addressed

For context of the issue 👇

Closes #6099

Proposed Changes

This solution links range sync and lookup sync by:

  • Trigger lookup sync from status messages that include an unknown root. Currently range sync "swallows" peer statuses that are too close to head but on a potentially unknown fork. Range sync is not suitable for handling short forks, so instead defer to lookup sync to retrieve that unknown root.
  • If lookup sync reaches the max chain length, trigger range sync on that set of peers. The idea is that most peers in that group should have this chain as canonical, so we can request blocks_by_range from them. In the future we may use Add BeaconBlocksByRange v3 ethereum/consensus-specs#3845 to be sure which fork we are syncing from.
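The two triggers above can be sketched roughly as follows. This is a hypothetical, heavily simplified illustration, not Lighthouse's actual sync types: `SyncAction`, `MAX_LOOKUP_CHAIN_LEN`, and the use of `u64` as a peer id are all made up for the sketch.

```rust
// Hypothetical sketch of the dispatch logic described above. Names are
// illustrative, not Lighthouse's real types.

/// Made-up cap on how long a parent-lookup chain may grow.
const MAX_LOOKUP_CHAIN_LEN: usize = 8;

#[derive(Debug, PartialEq)]
enum SyncAction {
    /// Short unknown fork: fetch the missing root via lookup sync.
    LookupUnknownRoot([u8; 32]),
    /// Lookup chain grew too long: hand its peers to range sync.
    RangeSyncWithPeers(Vec<u64>),
}

/// A peer's status advertises a head root we don't know: instead of
/// "swallowing" the status, defer to lookup sync for that root.
fn on_unknown_status_root(head_root: [u8; 32]) -> SyncAction {
    SyncAction::LookupUnknownRoot(head_root)
}

/// Called as a parent-lookup chain grows. Once it exceeds the cap, the
/// fork is long enough that blocks_by_range from the chain's peers is
/// the cheaper way to retrieve it.
fn on_lookup_chain_grew(chain_len: usize, peers: Vec<u64>) -> Option<SyncAction> {
    (chain_len > MAX_LOOKUP_CHAIN_LEN).then(|| SyncAction::RangeSyncWithPeers(peers))
}

fn main() {
    // A short chain stays in lookup sync; a long one is handed to range sync.
    assert_eq!(on_lookup_chain_grew(3, vec![1, 2]), None);
    assert_eq!(
        on_lookup_chain_grew(9, vec![1, 2]),
        Some(SyncAction::RangeSyncWithPeers(vec![1, 2]))
    );
    assert_eq!(
        on_unknown_status_root([0u8; 32]),
        SyncAction::LookupUnknownRoot([0u8; 32])
    );
}
```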

I believe this change is the minimal solution to cover the corner case of #6099. It's not maximally efficient, as it will download a section of blocks twice. However, this situation is rare, and we just need some way to recover, even if it's at the expense of some extra bandwidth.

Testing

There are plans with EF devops to test nasty long non-finality situations, but not now. For now I can try to write representative unit sync tests covering some sample scenarios. This change should not impact healthy network codepaths, so it should have no downsides. Range sync will only be triggered when we need to discover a long unknown fork, which has only happened in LH on testnets deliberately crafted to trigger lookup sync bugs.

@dapplion
Collaborator Author

dapplion commented Aug 1, 2024

Updated tests to assert that a syncing chain is created, and removed down-scoring of the peers for a long chain:

// Do not downscore peers here. Because we can't distinguish a valid chain from
// a malicious one, we may penalize honest peers for attempting to discover a
// valid chain for us. Until blocks_by_range allows specifying a tip, for example
// with https://github.com/ethereum/consensus-specs/pull/3845, we will have poor
// attributability. A peer can send us garbage blocks over blocks_by_root, and
// then correct blocks via blocks_by_range.

As a final item, we could check that when a syncing chain reaches its end, the final block root matches the root the chain was syncing to.
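That final check could look something like this minimal sketch. The function name and error shape are made up; the real check would live wherever the syncing chain completes.

```rust
// Hypothetical sketch of the suggested assertion: when a syncing chain
// completes, the root of the last downloaded block should equal the root
// the chain was created to sync to. Names are illustrative.

fn check_chain_tip(last_block_root: [u8; 32], target_root: [u8; 32]) -> Result<(), String> {
    if last_block_root == target_root {
        Ok(())
    } else {
        // Print only a root prefix to keep the message short.
        Err(format!(
            "syncing chain tip mismatch: got {:02x?}, expected {:02x?}",
            &last_block_root[..4],
            &target_root[..4]
        ))
    }
}

fn main() {
    assert!(check_chain_tip([1; 32], [1; 32]).is_ok());
    assert!(check_chain_tip([1; 32], [2; 32]).is_err());
}
```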

@dapplion dapplion added the ready-for-review The code is ready for review label Sep 2, 2024
}
self.drop_lookup_and_children(*lookup_id);
}
None => (lookup.block_root(), lookup.peek_downloaded_block_slot()),
Member


I ran out of time looking through to find out the scenario when this occurs.

It's when we don't have any single block lookups that match the current parent tip. In this case we start a range sync based on the parent's ancestor, rather than the parent_chain_tip?

I didn't fully parse this.

Overall I like the idea!

Collaborator Author


This should never happen. I can log an error and return, but I used this fallback value for simplicity. Happy to change it if it confuses more than it helps.

Member


Oh yeah, I think that might be helpful. This way we can catch any wild states rather than doing a range sync. Also (at least for me) it will make it easier to follow.
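The shape the reviewers agreed on can be sketched as below: surface the impossible state with a logged error and bail, rather than silently range-syncing from a fallback value. `Tip` and `resolve_tip` are illustrative stand-ins for the `(block_root, slot)` pair in the diff above, not the real code.

```rust
// Illustrative sketch of preferring an explicit error over a silent
// fallback. `Tip` stands in for the (block_root, slot) pair under review.

#[derive(Debug, PartialEq)]
struct Tip {
    block_root: u64,
    slot: Option<u64>,
}

/// If no lookup matches the parent chain tip (a state that should never
/// happen), log an error and return None so the caller aborts, instead of
/// starting a range sync from a guessed fallback.
fn resolve_tip(lookup: Option<Tip>) -> Option<Tip> {
    match lookup {
        Some(tip) => Some(tip),
        None => {
            eprintln!("error: no lookup matches the parent chain tip; aborting");
            None
        }
    }
}

fn main() {
    let expected = Tip { block_root: 42, slot: Some(7) };
    assert_eq!(resolve_tip(Some(Tip { block_root: 42, slot: Some(7) })), Some(expected));
    assert_eq!(resolve_tip(None), None);
}
```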

Member

@AgeManning left a comment


Yeah I think we should test this out. Logic seems reasonable to me.

Member

@realbigsean left a comment


LGTM

@realbigsean
Member

@mergify queue


mergify bot commented Oct 8, 2024

queue

🛑 The pull request has been removed from the queue default

The merge conditions cannot be satisfied due to failing checks.

You can take a look at Queue: Embarked in merge queue check runs for more details.

In case of a failure due to a flaky test, you should first retrigger the CI.
Then, re-embark the pull request into the merge queue by posting the comment
@mergifyio refresh on the pull request.

@realbigsean
Member

@mergify refresh


mergify bot commented Oct 8, 2024

refresh

✅ Pull request refreshed

@realbigsean
Member

@mergify queue


mergify bot commented Oct 8, 2024

queue

✅ The pull request has been merged automatically

The pull request has been merged automatically at 71c5388

@mergify mergify bot merged commit 71c5388 into sigp:unstable Oct 8, 2024
28 checks passed
@dapplion dapplion deleted the lookup-to-range branch October 8, 2024 22:24
Labels
ready-for-review The code is ready for review

4 participants