
Deal with case where data becomes available after time out #321

Closed
2 tasks
musalbas opened this issue May 11, 2021 · 6 comments

Comments

@musalbas
Member

musalbas commented May 11, 2021

In the current implementation, the data availability check fails if the data sampling timeout is exceeded.

We need to deal with the case where (i) the block data becomes available after the timeout and (ii) the block is part of the canonical chain that has consensus. In this case, the block should be treated as available, otherwise we will end up with a chain split with newly bootstrapped clients who see the block as available.

Perhaps one way of dealing with this could be: if a new block is received that is part of the canonical chain that has consensus, but the previous block was seen as unavailable, the node should re-validate the data availability of the previous block as a precondition to accepting that new block.
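
The proposed precondition could look roughly like this. This is only a minimal sketch under assumptions: `Node`, `Block`, and `AcceptBlock` are hypothetical names (not the real lazyledger-core API), and actual DAS sampling is stubbed out with a lookup table so the example is self-contained.

```go
package main

import "fmt"

// Block is a stand-in for a block that already has consensus.
type Block struct {
	Height int64
}

// Node is a hypothetical DAS node; `available` stubs out real sampling
// results per height, and `lastValidated` is the highest height that
// passed all checks, including availability.
type Node struct {
	available     map[int64]bool
	lastValidated int64
}

// sampleAvailability stands in for actual data availability sampling.
func (n *Node) sampleAvailability(h int64) bool { return n.available[h] }

// AcceptBlock applies the proposed rule: re-validate availability of every
// height since the last fully validated one, as a precondition to
// accepting the new block.
func (n *Node) AcceptBlock(b Block) bool {
	for h := n.lastValidated + 1; h <= b.Height; h++ {
		if !n.sampleAvailability(h) {
			return false // an earlier height is still unavailable
		}
	}
	n.lastValidated = b.Height
	return true
}

func main() {
	n := &Node{available: map[int64]bool{1: true, 2: false}}
	fmt.Println(n.AcceptBlock(Block{Height: 2})) // false: height 2 still unavailable
	n.available[2] = true                        // data shows up after the timeout
	fmt.Println(n.AcceptBlock(Block{Height: 2})) // true: re-validation now passes
}
```

The point of the sketch is that availability is rechecked lazily, triggered by the arrival of a newer canonical block, rather than by the node polling unavailable heights forever.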

Action Items

  • clarify Tendermint full nodes' behavior if they ignored a block (because it was invalid) and then receive another one with a larger height (do they recheck the in-between blocks?)
  • lay out a plan depending on the outcome of the above
@liamsi
@liamsi
Member

liamsi commented May 22, 2021

In the current implementation, the data availability check fails if the data sampling timeout is exceeded.

Yes, but (a) it will retry, and (b) if it is triggered to look for a new height, it will of course revisit past heights that did not pass validation. In the case of DAS, this means the current light client would also recheck availability.

Perhaps one way of dealing with this could be: if a new block is received that is part of the canonical chain that has consensus, but the previous block was seen as unavailable, the node should re-validate the data availability of the previous block as a precondition to accepting that new block.

As mentioned above, if the light client is requested to update to a new height, it will check from the last height that passed all validation (incl. availability) to the latest height. It does not matter if the heights in between previously failed or not. The current implementation always checks for consensus first, and then for availability.
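
The check order described above can be sketched as follows. This is a hedged illustration, not the real light client code: `checkHeight` and its two function parameters are hypothetical stand-ins passed in as stubs.

```go
package main

import "fmt"

// checkHeight mirrors the order described above: consensus is verified
// first, and availability is sampled only afterwards. Whether the height
// previously failed is irrelevant; both checks simply run again.
func checkHeight(h int64, verifyConsensus, sampleAvailability func(int64) error) error {
	if err := verifyConsensus(h); err != nil {
		return err // don't bother sampling data for a header without consensus
	}
	return sampleAvailability(h)
}

func main() {
	var order []string
	_ = checkHeight(42,
		func(h int64) error { order = append(order, "consensus"); return nil },
		func(h int64) error { order = append(order, "availability"); return nil },
	)
	fmt.Println(order) // [consensus availability]
}
```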

Even if we switched to a pure p2p light client instead of the current RPC/proxy one, this behavior is baked into the verification logic. Does this mean we can close this issue? Or is there more to it?

@musalbas
Member Author

As mentioned above, if the light client is requested to update to a new height, it will check from the last height that passed all validation (incl. availability) to the latest height. It does not matter if the heights in between previously failed or not. The current implementation always checks for consensus first, and then for availability.

What about Tendermint nodes? Do they also have this behaviour?

@liamsi
Member

liamsi commented Jun 8, 2021

Need to double-check. I updated the opening comment with actionable tasks. It's worth noting that we don't yet have sampling integrated into Tendermint full nodes. So I assume the question is: what happens if a full node rejects a block at height h but receives an otherwise valid block at height h+d?

It would help if we had clarity on what a lazyledger full node actually is. Currently, in the specs:

  1. Full nodes provide the strongest security guarantees. Block bodies do not need to be stored.
    • Block headers: Extended Block Headers
    • Block bodies: Full Bodies
    • Transactions: Full Transactions

https://github.com/lazyledger/lazyledger-specs/blob/a32cba77ab2d3b754b6bf2edb056bf84f2e6be7d/src/specs/node_types.md#node-type-definitions

@adlerjohn
Member

Putting my thoughts down:

  • full nodes fully download all block bodies, therefore they won't ever accept a block unless it is deterministically (rather than probabilistically, i.e. via DAS) available. As such, they don't care about blocks that are made available after the fact, since they'll never re-org; the revealed blocks would simply be equivocation. That being said, the node should halt if it detects an equivocation from >1/3 of stake.
  • partial nodes don't fully download all block bodies, so things get more tricky here.
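
The halt condition in the first bullet can be written down directly. A minimal sketch, assuming the >1/3 threshold is measured in voting power; `shouldHalt` is a hypothetical helper, not an existing Tendermint function.

```go
package main

import "fmt"

// shouldHalt reports whether validators observed equivocating control
// strictly more than 1/3 of total voting power. Integer arithmetic
// (3*e > total) avoids floating-point comparison at the threshold.
func shouldHalt(equivocatingPower, totalPower int64) bool {
	return 3*equivocatingPower > totalPower
}

func main() {
	fmt.Println(shouldHalt(33, 100)) // false: not more than 1/3
	fmt.Println(shouldHalt(34, 100)) // true: crosses the 1/3 threshold
}
```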

@liamsi
Member

liamsi commented Jun 9, 2021

partial nodes don't fully download all block bodies, so things get more tricky here.

Yeah, I'm pretty sure that when @musalbas was asking if Tendermint full nodes have the same behaviour as I described for the light clients, he had partial nodes in mind. The wording is a bit confusing here. My understanding of how we want to change Tendermint full nodes is described here: #384 (reply in thread)

@adlerjohn adlerjohn self-assigned this Jun 16, 2021
@adlerjohn adlerjohn added this to Sprint Pool (Unassigned) in Weekly Sprint (core/app/libs) via automation Jun 16, 2021
@adlerjohn adlerjohn moved this from Sprint Pool (Unassigned) to Done in Weekly Sprint (core/app/libs) Jun 16, 2021
@liamsi liamsi moved this from Done to Assigned & Part of this Iteration in Weekly Sprint (core/app/libs) Jun 16, 2021
@liamsi
Member

liamsi commented Jun 30, 2021

I'm closing this issue: for lazyledger-core there are no changes necessary. We might have to revisit this in the context of light clients and partial nodes, but to me it seems clear how to proceed from celestiaorg/celestia-specs#179

@liamsi liamsi closed this as completed Jun 30, 2021
Weekly Sprint (core/app/libs) automation moved this from Assigned & Part of this Iteration to Done Jun 30, 2021