Skip to content

Conversation

pratikspatil024
Copy link
Member

@pratikspatil024 pratikspatil024 commented Sep 15, 2025

Description

Implemented peer consensus based witness verification

  1. Warning Threshold System
    • Triggers verification when a peer reports # pages > allowed (calculated dynamically)
    • Peers under threshold are assumed honest
  2. Peer Consensus Verification
    • Query 2 random peers for verification
    • Use collective peer knowledge to detect malicious behaviour
  3. Peer Management
    • Drop peers immediately
    • Cache verification results to avoid redundant checks
CheckWitnessPageCount() - Entry point for verification
verifyWitnessPageCount() - Core consensus logic
RequestWitnessesWithVerification() - Verification-enabled witness requests
getAllPeers() - Random peer selection support

//configs
const (
    witnessPageWarningThreshold = 10               // Trigger verification threshold
    witnessVerificationPeers    = 2                // Number of peers to query
    witnessVerificationTimeout  = 5 * time.Second  // Query timeout
    witnessVerificationCacheTTL = 10 * time.Minute // Cache duration
)

Verification Flow:


1. Node A requests witness 0xABC... from Peer B.
2. Downloads Page 0, learns there are 50 total pages.
3. Locks TotalPages = 50.
4. Calculates threshold → ceil(45MB / 15MB) = 3 pages.
5. Since 50 > 3, pauses to verify.
6. Downloads Pages 1, 2, and 3 (up to threshold).Bandwidth used so far: 48MB.
7. Cache check → MISS.
8. Selects two random peers → Peer C and Peer D.
9. Requests metadata (WIT2) from both peers.
10. Both report: TotalPages = 50, Size = 800MB, Block = 12345.
11. Bandwidth for metadata: 180 bytes.
12. Calculates consensus → majority agrees on 50 pages.
13. Peer B deemed honest.
14. Caches result {0xABC...: 50 pages, TTL = 10 min}.
15. Resumes download from Peer B (Pages 4–49).
16. Validates each page (total count, page index, limits).
17. Completes full download (50 pages = 800MB).
18. Reconstructs witness and imports it.

Changes

  • Bugfix (non-breaking change that solves an issue)
  • Hotfix (change that solves an urgent issue, and requires immediate attention)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (change that is not backwards-compatible and/or changes current functionality)
  • Changes only for a subset of nodes

Breaking changes

Please complete this section if any breaking changes have been made, otherwise delete it

Nodes audience

In case this PR includes changes that must be applied only to a subset of nodes, please specify how you handled it (e.g. by adding a flag with a default value...)

Checklist

  • I have added at least 2 reviewer or the whole pos-v1 team
  • I have added sufficient documentation in code
  • I will be resolving comments - if any - by pushing each fix in a separate commit and linking the commit hash in the comment reply
  • Created a task in Jira and informed the team for implementation in Erigon client (if applicable)
  • Includes RPC methods changes, and the Notion documentation has been updated

Cross repository changes

  • This PR requires changes to heimdall
    • In case link the PR here:
  • This PR requires changes to matic-cli
    • In case link the PR here:

Testing

  • I have added unit tests
  • I have added tests to CI
  • I have tested this code manually on local environment
  • I have tested this code manually on remote devnet using express-cli
  • I have tested this code manually on amoy
  • I have created new e2e tests into express-cli

Manual tests

Please complete this section with the steps you performed if you ran manual tests for this functionality, otherwise delete it

Additional comments

Please post additional comments in this section if you have them, otherwise delete it

Comment on lines 176 to 178
if witPacket, ok := res.Res.(*wit.WitnessPacketRLPPacket); ok && len(witPacket.WitnessPacketResponse) > 0 {
return witPacket.WitnessPacketResponse[0].TotalPages, nil
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The res from resCh is not *wit.WitnessPacketRLPPacket but actually []*stateless.Witness.

Reference:

bor/eth/peer.go

Lines 256 to 276 in 8c7fa04

var witnesses []*stateless.Witness
var responseHashes []common.Hash
for hash, wit := range reconstructedWitness {
witnesses = append(witnesses, wit)
responseHashes = append(responseHashes, hash)
}
if len(witnesses) != len(hashes) {
p.witPeer.Peer.Log().Error("Not able to fetch all requests witnesses", "peer", p.ID(), "requestedHashes", hashes, "responseHashes", responseHashes)
}
doneCh := make(chan error)
go func() {
<-doneCh
}()
// Adapt wit.Response[] to eth.Response.
// We can only copy exported fields. The unexported fields (id, recv, code)
// will have zero values in the ethRes sent to the caller.
// Correlation still works via the Req field.
ethRes := &eth.Response{
Req: wrapperReq.Request, // Point back to the embedded shim request.
Res: witnesses,

A few things we should consider:

  • By requesting that way, we are downloading the whole witness, not a single page. (In this case >160MB)
  • Even requesting a single page, still a big amount of data (16MB for each peer)
  • We should consider implementing a new method to just retrieve the page count. (may be useful for future scenarios)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Lucca, makes sense

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this makes sense?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, this does not add a separate wit protocol method, as it would need a protocol upgrade.
Do you think I should do that?
@lucca30 @cffls

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @pratikspatil024 , good question. Introducing a new method allows us reducing from 16MB in this packet to basically 0. So it seems like an interesting upgrade. We can maybe deliver this first version that way but keep the upgrade on backlog.

@cffls
Copy link
Contributor

cffls commented Sep 15, 2025

Since this isn't a necessary feature in the initial veblop release, let's use develop as the base branch.

func TestWitnessVerificationConstants(t *testing.T) {
// These constants should match the ones defined in eth/fetcher/witness_manager.go
const (
witnessPageWarningThreshold = 10
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to make this a dynamic parameter based on the block gas limit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@pratikspatil024 pratikspatil024 changed the base branch from v2.3.0-candidate to develop September 25, 2025 10:20
Copy link
Contributor

@cffls cffls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this been tested in a devnet? It would be great if we can create some test scenarios for this new logic.

// Request witnesses using the wit peer
return ethPeer.RequestWitnesses([]common.Hash{hash}, sink)
// Create verification callback for page count verification (returns true if honest)
verifyPageCount := func(hash common.Hash, pageCount uint64, peer string) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be moved to a standalone function instead of being an embedded function?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved, thanks!

eth/peer.go Outdated
mapsMu.Lock()
downloadPaused[page.Hash] = true
mapsMu.Unlock()
p.witPeer.Peer.Log().Warn("Peer failed verification, pausing witness download", "peer", p.ID(), "hash", page.Hash, "totalPages", page.TotalPages)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the node drop a peer here if the peer is dishonest? The block hash is added to downloadPaused but no further action is taken it seems.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch! Fixed, thanks!

}

// Run synchronous verification and return result
return h.blockFetcher.GetWitnessManager().CheckWitnessPageCount(hash, pageCount, peer, getRandomPeers, getWitnessPageCount)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we call CheckWitnessPageCount first before sampling from peers? Because it would avoid sending unnecessary metadata request to peers in most cases.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is what's happening right now, the getRandomPeers function is used at a later point. Added comments.

Let me know if this is not what you meant, thanks!

Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
2 Security Hotspots
11.7% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants