This repository has been archived by the owner on Mar 24, 2023. It is now read-only.

Add fork choice rule with DAS analysis #179

Merged
merged 8 commits into from
Jul 7, 2021
1 change: 1 addition & 0 deletions src/README.md
@@ -11,4 +11,5 @@
- [Rationale](./rationale/README.md)
- [Block Rewards](./rationale/rewards.md)
- [Distributing Rewards and Penalties](./rationale/distributing_rewards.md)
- [Fork Choice Rule with Data Availability Sampling](./rationale/fork_choice_das.md)
- [Message Layout](./rationale/message_block_layout.md)
1 change: 1 addition & 0 deletions src/SUMMARY.md
@@ -13,4 +13,5 @@
- [Rationale](./rationale/README.md)
- [Block Rewards](./rationale/rewards.md)
- [Distributing Rewards and Penalties](./rationale/distributing_rewards.md)
- [Fork Choice Rule with Data Availability Sampling](./rationale/fork_choice_das.md)
- [Message Layout](./rationale/message_block_layout.md)
1 change: 1 addition & 0 deletions src/rationale/README.md
@@ -2,4 +2,5 @@

- [Block Rewards](./rewards.md)
- [Distributing Rewards and Penalties](./distributing_rewards.md)
- [Fork Choice Rule with Data Availability Sampling](./fork_choice_das.md)
- [Message Layout](./message_block_layout.md)
39 changes: 39 additions & 0 deletions src/rationale/fork_choice_das.md
@@ -0,0 +1,39 @@
# Fork Choice Rule with Data Availability Sampling

- [Preamble](#preamble)
- [Invalid vs Unavailable](#invalid-vs-unavailable)
- [Scenarios](#scenarios)

## Preamble

Tendermint provides finality under an honest 2/3 of stake assumption. It is one of several ["BFT" consensus protocols](https://arxiv.org/abs/1807.04938) (also known as "classical" consensus protocols). Under that assumption, new _valid_ blocks are immediately and forever final as soon as 2/3 of stake commits to the block. Therefore, under that assumption, Tendermint is fork-free.

Contemporary blockchains support full nodes (which are secure under no assumption on stake honesty) and light nodes (which are secure under an honest majority of stake assumption). LazyLedger is unique in [supporting light nodes with stronger security guarantees](./../specs/node_types.md#node-type-definitions):

1. full nodes are secure under no assumptions on stake honesty
1. light nodes (and partial nodes) are secure under [an honest minority of nodes and synchronous communication](https://arxiv.org/abs/1809.09044), and no assumptions on stake honesty
1. superlight nodes are secure under an honest majority of stake assumption

The introduction of light nodes that do not depend on an honest majority assumption also introduces additional cases that must be analyzed.

## Invalid vs Unavailable

Tendermint (and other consensus protocols) requires blocks to be _valid_, i.e. pass a [validity predicate](https://arxiv.org/abs/1807.04938) before they are accepted by an honest node. Note that both validity and invalidity are deterministic and monotonic, i.e. that once a block is valid or invalid, it will be valid or invalid for all future time.

With [Data Availability Sampling](https://arxiv.org/abs/1809.09044) (DAS), there is a notion of _available_ and _unavailable_ blocks. Both are probabilistic rather than deterministic. Availability is assumed monotonic (i.e. once a block is available, it will remain available since The Internet Never Forgets), but unavailability is not. A block proposer may hide a block to make currently-online nodes see the block as unavailable, then reveal the entire (valid) block at a later time.
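
To make "probabilistic" concrete, here is a minimal sketch of the sampling bound. It assumes the 2D Reed-Solomon scheme from the linked paper, in which a `k x k` block is extended to `2k x 2k` shares and at least `(k + 1)^2` shares must be withheld to make the block unrecoverable. The function names and the example parameters (`k = 128`, 15 samples) are illustrative assumptions, not values from this document.

```python
def min_withheld_fraction(k: int) -> float:
    """Smallest fraction of the 2k x 2k extended square that must be withheld
    to make the block unrecoverable (assumption: (k + 1)^2 of (2k)^2 shares)."""
    return (k + 1) ** 2 / (2 * k) ** 2


def false_available_bound(k: int, samples: int) -> float:
    """Upper bound on the probability that every sample succeeds even though
    the block is unrecoverable. Sampling with replacement is assumed, which
    can only overestimate the miss probability of sampling without replacement."""
    return (1.0 - min_withheld_fraction(k)) ** samples


# Hypothetical parameters: k = 128, 15 samples per light node.
bound = false_available_bound(128, 15)
print(f"P(unrecoverable block passes DAS) < {bound:.4f}")
```

Because more than a quarter of the shares must be withheld, each extra sample multiplies the miss probability by less than 3/4, so the bound shrinks exponentially in the number of samples.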

## Scenarios

We consider two scenarios.

**A dishonest majority hide a committed block, commit to a second block at the same height within the weak subjectivity window to fork the chain, then reveal the first block**. This is trivially equivocation and requires social consensus to resolve which fork to accept. The unavailability of the first block is orthogonal. Nodes that detect equivocation by a majority of stake within the weak subjectivity window must halt regardless.
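
As a rough illustration of what "detecting equivocation" means here, the sketch below flags heights at which two distinct blocks each carry a commit. The `Commit` type, its fields, and `equivocating_heights` are hypothetical names introduced for this example, and the strict "more than 2/3 of stake" threshold is an assumption about how a commit is counted.

```python
from collections import defaultdict
from fractions import Fraction
from typing import Dict, Iterable, NamedTuple, Set


class Commit(NamedTuple):
    height: int
    block_hash: str
    signed_stake: int  # total stake that signed this commit (hypothetical field)


def equivocating_heights(commits: Iterable[Commit], total_stake: int) -> Set[int]:
    """Return heights at which two distinct blocks each have a commit
    signed by more than 2/3 of total stake (assumed commit threshold)."""
    threshold = Fraction(2, 3) * total_stake
    committed: Dict[int, Set[str]] = defaultdict(set)
    for c in commits:
        if c.signed_stake > threshold:
            committed[c.height].add(c.block_hash)
    # More than one committed hash at a height means a majority equivocated.
    return {h for h, hashes in committed.items() if len(hashes) > 1}


seen = [
    Commit(10, "a", 70),
    Commit(10, "b", 70),  # second committed block at height 10: equivocation
    Commit(11, "c", 70),
    Commit(12, "d", 50),  # below threshold, not a commit
    Commit(12, "e", 70),
]
print(equivocating_heights(seen, total_stake=100))
```

A node that gets a non-empty result for heights inside the weak subjectivity window would halt, as described above.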

**A dishonest majority hide a committed block, commit additional blocks on top of it, then reveal the first block within the weak subjectivity window**. There is no equivocation. Note that a node cannot distinguish a dishonest majority in this scenario from an honest majority combined with a transient network failure on the node's own end.

A requirement is that full nodes and light nodes agree on the same head of the chain automatically in this case, i.e. without human intervention.

Light nodes follow consensus (i.e. validator set changes and commits) and perform DAS. If a block is seen as unavailable but has a commit, DAS is performed on the block continuously until either DAS passes or the weak subjectivity window is exceeded, at which point the node halts.

This comment was marked as resolved.

Member

I assume performing DAS continuously on a block is OK because, if there was a fork, the chain should halt anyway.

Is this "halting" implemented in Tendermint in any way?

Member

What happens in the Tendermint implementation currently in the case of 2/3 of the consensus equivocating?

Member

N.B. I don't think Tendermint has a weak subjectivity window.

Member

> What happens in the Tendermint implementation currently in the case of 2/3 of the consensus equivocating?

I'd need to double-check, but it is certainly something that would require social coordination to resolve. IIRC, this would lead to a) full nodes gossiping double-sign evidence to notify each other (the 2/3 can obviously still choose not to include it in a block), and b) a consensus failure if a Tendermint (full) node sees both forks (it panics and stops).

> N.B. I don't think Tendermint has a weak subjectivity window.

N.B. == nota bene? Tendermint full nodes can either catch up from genesis until they have caught up with consensus (after that, the above holds in case of forks) or use state sync. In the latter case they are indeed initialized subjectively like any Tendermint light client (with a hash in that weak subjectivity window).


Full nodes fully download and execute blocks. If a block is seen as unavailable but has a commit, full downloading is re-attempted continuously until either it succeeds or the weak subjectivity window is exceeded, at which point the node halts.

Member Author

@liamsi this is one point that might warrant a closer look. I suggest here just doing re-downloading/DAS continuously because it's simpler. But you could also do it by triggering a downloading/DAS attempt when you see a new block with a commit. I figured the latter would be more complicated for full nodes since you'd have to fall back to light client mode, but maybe I'm wrong.

Member

> If a block is seen as unavailable but has a commit

Isn't that basically against the assumption that at least one validator (e.g. the block proposer) has the data and has gossiped it to the network of validators (and full nodes)? How else would there be a commit?

Member

If that is a purely theoretical case, then this does not matter much.

Member Author (@adlerjohn, Jun 15, 2021)

No, this case isn't theoretical; it could happen if 2/3 of stake is malicious. Even in that case, we want full nodes and light nodes to agree on the same head of the chain. Note that there won't be a fork, since this scenario involves no equivocation, but the two nodes could still be on different heads without continuous re-downloading/DAS attempts: different heights on the same fork.

Member (@liamsi, Jun 22, 2021)

For this scenario to occur you'd need:

- 2/3 of stake malicious
- 2/3 of stake signed and finalized a block (and the data during that consensus round was broadcast to all peers running the consensus reactor/protocol, even to those that are just passively following along and not signing)
- validators and all other nodes that saw the data during or after consensus withhold that data: not only 2/3 of the stake but everyone else too

It seems to me that this is an extreme edge case (and is basically equivalent to the data not being available).

IMO, in practice, if a node does not get the data within a few minutes (or at most hours), it should not simply continue attempting to download the block. Most likely this would require human intervention instead.

Member (@liamsi, Jun 14, 2021)

> Full nodes fully download and execute blocks.

Does the time of this matter at all? Do they download and execute blocks as they are produced or could that also be at a much later point in time?

Member Author

Hmm. This paragraph only really covers what happens in the consensus reactor, or more specifically what happens to a node that keeps up with consensus within the weak subjectivity window. IBD isn't covered, but that's because there's not much to cover: just get a trusted checkpoint and that's your fork choice rule. Uninteresting. If the node wants to do IBD by first downloading all blocks, then executing them, that's an implementation detail.

Member

What does IBD stand for?

> just get a trusted checkpoint and that's your fork choice rule. Uninteresting.

IIRC, trusted checkpoints are only used in combination with state sync (state sync requires a checkpoint); full nodes that sync from genesis and replay all blocks do not require a checkpoint.

Member Author

> What does IBD stand for?

Initial Block Download, i.e. what happens in the blockchain reactor.

> full nodes that sync from genesis and replay all blocks do not require a checkpoint.

This is actually a problem. The weak subjectivity trusted checkpoint informs the fork choice rule when doing any sync (full, state, only headers) after being offline for longer than the unbonding window. It does not affect block validity. We absolutely need to have a trusted checkpoint even when doing a full sync.

Member

> This is actually a problem.

I think in the case of IBD, the node will switch to the consensus reactor once it has caught up, and any full node would at least quickly detect that it was fed a fork. In the case that 2/3 of validators are byzantine, the network has to resolve this outside of the protocol anyway.

Member Author

> that it was fed a fork

Over a long enough time scale, the probability of there being a long-range fork in the distant past approaches 100%. But as was previously discussed in a call, it doesn't actually matter if you feed in a trusted checkpoint before starting in expectation of at least one fork or after halting on fork detection. It's an implementation detail.


Under [an honest minority of nodes and synchronous communication](https://arxiv.org/abs/1809.09044) assumptions, passing DAS probabilistically guarantees the block can be fully downloaded. Therefore, the above protocol guarantees light nodes and full nodes will agree on the same head.
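
The retry-or-halt behaviour described for both node types can be sketched as a single loop. This is a minimal sketch under stated assumptions, not the Tendermint or LazyLedger implementation: `follow_committed_block`, `Outcome`, `FakeClock`, the retry interval, and the choice to measure the weak subjectivity window from the time the commit was first seen are all hypothetical. A light node would pass a DAS check as `check_available`; a full node would pass a full-download attempt.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    ACCEPTED = "accepted"
    HALTED = "halted"


def follow_committed_block(
    check_available: Callable[[], bool],  # DAS (light node) or full download (full node)
    now: Callable[[], float],
    commit_seen_at: float,                # assumption: window measured from this time
    weak_subjectivity_window: float,
    sleep: Callable[[float], None],
    retry_interval: float = 1.0,
) -> Outcome:
    """Retry the availability check for a committed block until it passes,
    or halt once the weak subjectivity window has been exceeded."""
    while True:
        if check_available():
            return Outcome.ACCEPTED
        if now() - commit_seen_at > weak_subjectivity_window:
            return Outcome.HALTED
        sleep(retry_interval)


class FakeClock:
    """Deterministic clock so the sketch can be run without real waiting."""

    def __init__(self) -> None:
        self.t = 0.0

    def now(self) -> float:
        return self.t

    def sleep(self, dt: float) -> None:
        self.t += dt


# The block is hidden, then revealed at t = 5, inside a window of 10.
clock = FakeClock()
outcome = follow_committed_block(
    check_available=lambda: clock.now() >= 5.0,
    now=clock.now,
    commit_seen_at=0.0,
    weak_subjectivity_window=10.0,
    sleep=clock.sleep,
)
print(outcome, clock.now())
```

Because both node types run the same loop against the same commit, a block revealed inside the window is eventually accepted by both, and a block that stays hidden causes both to halt rather than settle on different heads.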