Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defining and Clarifying the Verification Process for NMT Proofs Generated from Non-Lexicographically Ordered Trees #97

Open
staheri14 opened this issue Feb 13, 2023 · 4 comments
Milestone

Comments

@staheri14
Copy link
Contributor

Problem

The issue at hand pertains to the generation of NMTs from sets of leaves that are not ordered lexicographically based on their namespaces. The behavior of NMT proofs generated from such malformed trees is unclear and needs to be addressed. The main objectives of this issue are to:

  • Determine whether the current NMT proof verification system is able to detect out-of-order leaves and respond accordingly.
  • If the current system is not capable, then develop and implement measures to invalidate NMT proofs generated from malformed NMTs.
@staheri14 staheri14 changed the title Defining the Verification Process for NMT Proofs Generated from Non-Lexicographically Ordered Trees Defining and Clarifying the Verification Process for NMT Proofs Generated from Non-Lexicographically Ordered Trees Feb 13, 2023
@evan-forbes
Copy link
Member

just leaving a thought here: I think we'll have the functionality to verify an out of order nmt inclusion proof, but also have some way of notifying the user that one or more nodes was out of lexicographical order. This way, we could prove that an invalid order was committed to.

@staheri14
Copy link
Contributor Author

Per conversation with @evan-forbes, I'd like to provide more detail on the attack scenario that prompted the opening of this issue:

Malicious consensus nodes could produce and commit a block, denoted as B, in which the shares are not ordered lexicographically based on their namespace IDs. A DA (data availability) full node obtains the block header, including the data root denoted as dataRoot, and the underlying shares, and will attempt to construct the block. However, if the node detects that the shares used to construct dataRoot are unordered, it must provide a proof of bad encoding to the light clients. Specifically, it must indicate that there are out-of-order leaves in the tree represented by dataRoot.

The problem is that the current NMT (namespace Merkle tree) implementation does not support this kind of bad-encoding proof. Furthermore, the library does not even allow for the construction of a tree with unordered leaves, since it is protected through the Push method, which checks the namespace ID of the inserted data items. Therefore, the issue at hand is to implement new logic in the NMT library that would allow a full node to provide such bad-encoding proof to the light clients.

staheri14 added a commit that referenced this issue Feb 28, 2023
… to predict panics (#113)

## Overview
An incremental PR toward #109 
**Next**: The validation utility functions developed in this PR must be
further integrated into both the tree construction and proof
verification process to detect invalid inputs and notify the caller of
any errors. This will close #109 and is in line with the objectives
outlined in issues #99 and #97. I will address this in a subsequent pull
request.
## Checklist

- [x] New and updated code has appropriate documentation
- [x] New and updated code has new and/or updated testing
- [x] Required CI checks are passing
- [x] Visual proof for any user facing features like CLI or
documentation updates
- [x] Linked issues closed with keywords
@staheri14
Copy link
Contributor Author

staheri14 commented Apr 4, 2023

It might be already covered by the existing bad encoding fraud proof https://github.com/celestiaorg/celestia-node/blob/main/share/eds/byzantine/bad_encoding.go, we need to check.

Update: The current bad encoding fraud proof logic does not capture this issue, the NMT construction errors out on a set of unordered leaves, disallowing the full node from calculating the row and column root. This means the full node cannot generate a bad encoding proof if shares are not ordered in a row or column.

@staheri14
Copy link
Contributor Author

staheri14 commented Apr 18, 2023

During my call with @evan-forbes, we discussed the potential benefits of enabling light nodes to identify out-of-order leaves when sampling and verifying their inclusion proof, in addition to the full nodes. The adversarial setting is similar to the one described in this comment, but with the added possibility of the full node being malicious and not providing fraud proofs for unordered leaves.

During sampling, a light client may encounter proof of inclusion of shares that are out-of-order (for which the proof verification results in false). While light clients reject such a proof, we would like to enable them to craft a bad encoding proof and propagate it in the network. This may require some changes in the verification methods currently available in NMT, such as the ability to calculate the root even if a violation is detected in the proof.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants