feat(permissionless batches): recovery mode after permissionless batches #1115
Conversation
Walkthrough

This pull request introduces new command-line flags for data availability (DA) recovery in both the main and utilities packages. It revises flag definitions, updates the blockchain block-building function with optional signing, and implements additional logging and context cancellation in various DA syncing components. The syncing pipeline now accepts new recovery parameters and includes control-flow modifications for DA syncing, along with a patch version bump.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant SP as SyncingPipeline
    participant DS as DASyncer
    participant BC as BlockChain
    participant LOG as Logger
    participant CONF as Config
    SP->>CONF: Load recovery settings (RecoveryMode, InitialL1Block, etc.)
    SP->>DS: Call SyncOneBlock(block, override, sign)
    DS->>BC: BuildAndWriteBlock(parentBlock, header, txs, sign)
    BC-->>DS: Return signed block, status, error
    DS->>SP: Return processing result
    LOG-->>SP: Log progress and errors
```

```mermaid
sequenceDiagram
    participant BQ as BatchQueue
    participant CTX as Context
    participant DA as DAQueue
    BQ->>CTX: Listen for cancellation signal
    alt Context active
        BQ->>DA: Process next DA entry
    else Context cancelled
        BQ-->>BQ: Return nil and cancellation error
    end
```
Actionable comments posted: 1
🧹 Nitpick comments (12)
rollup/da_syncer/da_queue.go (3)
14-15: Provide docstrings for clarity.

It would be helpful to add short explanatory comments describing the roles of `l1height` and `initialBatch`, making the code self-documenting and aiding maintainers in understanding how these fields are used and interrelated.
22-25: Include a doc comment for `NewDAQueue`.

Document the significance of `initialBatch` in the constructor so that developers understand how it filters out older data upon initialization.
40-45: Guard excessive calls to `getNextData`.

In scenarios where the data source is temporarily unavailable, repeatedly calling `getNextData` without a delay might degrade performance. A minor delay or backoff between fetch attempts may help reduce resource usage.

rollup/da_syncer/da_syncer.go (2)
18-19: Document `l2EndBlock`.

Add a short doc comment clarifying that `l2EndBlock` serves as the upper bound for synchronization. This helps future maintainers quickly grasp the purpose of this field.
43-51: Improve error messages for operators.

When `SetHead` fails, returning a more descriptive error can aid in diagnosing chain-override failures and reduce potential confusion for maintainers or operators.

rollup/da_syncer/syncing_pipeline.go (2)
27-31: Clarify the new recovery fields.

The new `RecoveryMode`, `InitialL1Block`, `InitialBatch`, `SignBlocks`, and `L2EndBlock` fields have clear functional roles but lack an overarching explanation. A higher-level doc comment tying them together would help future maintainers.
210-210: Enrich debug logging.

It might be advantageous to include the last block or batch index in the debug log for quicker troubleshooting when the pipeline is empty.
rollup/da_syncer/serrors/errors.go (1)
15-15: Consider using a consistent error type for `Terminated`.

For consistency with other errors in the package (`TemporaryError` and `EOFError`), consider wrapping the termination error in a `syncError` struct with a new error type.

```diff
-Terminated = fmt.Errorf("terminated")
+terminated Type = iota + 2 // Add new type after 'eof'
+Terminated = NewTerminatedError(nil)
+
+func NewTerminatedError(err error) error {
+	return &syncError{t: terminated, err: err}
+}
```

rollup/da_syncer/batch_queue.go (1)
39-43: LGTM! Context cancellation handling added.

The addition of context cancellation handling improves the responsiveness of the `NextBatch` method. Consider relying on a context check after `NextDA` returns an error, instead of a separate select at the top of the loop, to reduce the frequency of context checks.

```diff
 for {
-	select {
-	case <-ctx.Done():
-		return nil, ctx.Err()
-	default:
-	}
-
 	daEntry, err := bq.DAQueue.NextDA(ctx)
 	if err != nil {
+		if ctx.Err() != nil {
+			return nil, ctx.Err()
+		}
 		return nil, err
 	}
```

rollup/da_syncer/da/commitV0.go (1)
69-74: Consider initializing all fields in the empty constructor.

While the event field is now properly initialized, other fields are left at their zero values. Consider explicitly initializing all fields for clarity.

```diff
 func NewCommitBatchDAV0Empty(event *l1.CommitBatchEvent) *CommitBatchDAV0 {
 	return &CommitBatchDAV0{
+		version:                    encoding.CodecVersionV0,
 		batchIndex:                 0,
+		parentTotalL1MessagePopped: 0,
+		skippedL1MessageBitmap:     []byte{},
+		chunks:                     []*encoding.DAChunkRawTx{},
+		l1Txs:                      []*types.L1MessageTx{},
 		event:                      event,
 	}
 }
```
1674-1688: Consider adding validation for recovery mode parameters.

While the implementation is correct, consider adding validation for the recovery mode parameters to ensure:
- InitialL1Block is less than or equal to the current L1 block
- InitialBatch is valid and exists
- L2EndBlock is greater than the current L2 block

Example validation:

```diff
 if ctx.IsSet(DARecoveryModeFlag.Name) {
 	cfg.DA.RecoveryMode = ctx.Bool(DARecoveryModeFlag.Name)
+	if cfg.DA.RecoveryMode {
+		if ctx.IsSet(DARecoveryInitialL1BlockFlag.Name) {
+			initialL1Block := ctx.Uint64(DARecoveryInitialL1BlockFlag.Name)
+			currentL1Block, err := getCurrentL1Block()
+			if err != nil {
+				return fmt.Errorf("failed to get current L1 block: %w", err)
+			}
+			if initialL1Block > currentL1Block {
+				return fmt.Errorf("initial L1 block %d is greater than current L1 block %d", initialL1Block, currentL1Block)
+			}
+			cfg.DA.InitialL1Block = initialL1Block
+		}
+		// Add similar validation for InitialBatch and L2EndBlock
+	}
 }
```

core/blockchain.go (1)
1809-1901: Consider defining custom error types for better error handling.

While the current error handling is thorough, consider defining custom error types for specific failure scenarios (e.g., `ErrBlockSigningFailed`, `ErrBlockValidationFailed`). This would make it easier for callers to handle specific error cases programmatically.

Example:

```diff
+var (
+	ErrBlockSigningFailed    = errors.New("block signing failed")
+	ErrBlockValidationFailed = errors.New("block validation failed")
+)

 func (bc *BlockChain) BuildAndWriteBlock(parentBlock *types.Block, header *types.Header, txs types.Transactions, sign bool) (*types.Block, WriteStatus, error) {
 	// ...
 	if sign {
 		if err = bc.engine.Seal(bc, fullBlock, resultCh, stopCh); err != nil {
-			return nil, NonStatTy, fmt.Errorf("error sealing block %d: %w", fullBlock.Number().Uint64(), err)
+			return nil, NonStatTy, fmt.Errorf("%w: %v", ErrBlockSigningFailed, err)
 		}
 	}
 	// ...
 	if err = bc.validator.ValidateBody(fullBlock); err != nil {
 		bc.reportBlock(fullBlock, receipts, err)
-		return nil, NonStatTy, fmt.Errorf("error validating block %d: %w", fullBlock.Number().Uint64(), err)
+		return nil, NonStatTy, fmt.Errorf("%w: %v", ErrBlockValidationFailed, err)
 	}
 	// ...
 }
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (13)
- cmd/geth/main.go (1 hunks)
- cmd/utils/flags.go (2 hunks)
- core/blockchain.go (3 hunks)
- eth/backend.go (1 hunks)
- params/version.go (1 hunks)
- rollup/da_syncer/batch_queue.go (1 hunks)
- rollup/da_syncer/da/calldata_blob_source.go (3 hunks)
- rollup/da_syncer/da/commitV0.go (3 hunks)
- rollup/da_syncer/da/commitV7.go (2 hunks)
- rollup/da_syncer/da_queue.go (1 hunks)
- rollup/da_syncer/da_syncer.go (2 hunks)
- rollup/da_syncer/serrors/errors.go (1 hunks)
- rollup/da_syncer/syncing_pipeline.go (7 hunks)
🔇 Additional comments (23)
rollup/da_syncer/da_queue.go (2)
33-46: Validate robustly to avoid potential infinite loops.

The `for { ... }` loop strategy works if the data source consistently progresses to higher batch indices. If an external factor prevents `getNextData` from providing entries beyond `dq.initialBatch`, the loop may remain indefinite. Consider a safeguard or a suitable timeout/backoff in production scenarios.
50-52: Skips older entries effectively.

The updated logic to skip entries below `dq.initialBatch` is appropriate for the recovery scenario, ensuring only relevant batches are processed.

rollup/da_syncer/da_syncer.go (2)
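The skip logic amounts to a filter over the DA stream. A minimal stand-alone sketch (the `entry` struct and `nextRelevant` helper are hypothetical stand-ins for the real DA entry type and `DAQueue` loop):

```go
package main

import "fmt"

type entry struct{ batchIndex uint64 }

// nextRelevant returns the first entry at or beyond initialBatch,
// discarding older ones — mirroring the recovery-mode skip in DAQueue.
func nextRelevant(entries []entry, initialBatch uint64) (entry, bool) {
	for _, e := range entries {
		if e.batchIndex < initialBatch {
			continue // stale entry from before the recovery start point
		}
		return e, true
	}
	return entry{}, false
}

func main() {
	stream := []entry{{5}, {6}, {7}, {8}}
	e, ok := nextRelevant(stream, 7)
	fmt.Println(e.batchIndex, ok)
}
```

Entries 5 and 6 are silently dropped, so downstream stages only ever see batches from the configured recovery start onward.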
30-30: Unit-test override and sign behaviors.

The new `override` and `sign` parameters introduce flexible block handling. Ensure both true/false branches are tested to confirm correct chain rewriting and block signing.
75-77: Ensure termination handling is complete.

Terminating once `l2EndBlock` is reached is logical. Verify that upstream processes or GUIs properly respond to `serrors.Terminated` so the system doesn't remain in a partially updated state.

rollup/da_syncer/syncing_pipeline.go (5)
80-91: Recovery mode parameter checks look good.

Requiring `InitialL1Block` and `InitialBatch` is robust. Ensure that calls from scripts or CI/CD pipelines accurately toggle `RecoveryMode` to avoid partial or unintended states.
101-101: DAQueue initialization is consistent.

Instantiating `DAQueue` with the new `initialL1Block` and `config.InitialBatch` aligns well with the recovery process. Good job adhering to the newly introduced constructor signature.
127-128: Override and sign in one step.

Forwarding both `RecoveryMode` and `SignBlocks` to `SyncOneBlock` is convenient. Consider verifying whether partial usage (like only signing but not overriding) is valid in all contexts.
242-243: Terminated error flow is handled properly.

Returning and logging upon `serrors.Terminated` is a clean approach, ensuring explicit closure of the main loop with the desired log output.
261-261: Double-check chain revert logic.

Resetting to `s.config.InitialL1Block` may reorder or remove blocks if a partial override occurred. Confirm the chain store can handle any potential revert gracefully in concurrent or large-scale usage.

params/version.go (1)
27-27: LGTM! Version bump is appropriate.

The patch version increment from 7 to 8 correctly reflects the addition of new features (recovery mode) while maintaining backward compatibility.
rollup/da_syncer/da/commitV0.go (1)
177-177: LGTM! Informative logging added.

The log message provides useful context about L1 message availability, which will help with debugging and monitoring.
rollup/da_syncer/da/commitV7.go (1)
172-172: LGTM! Enhanced visibility with informative logging.

The added logging statement provides valuable visibility into L1 message availability status, which helps with monitoring and debugging the syncing process.
rollup/da_syncer/da/calldata_blob_source.go (2)
69-69: LGTM! Enhanced visibility with debug logging.

The added debug logging statement provides valuable visibility into the block range being fetched during rollup event retrieval.
199-201: LGTM! Improved empty batch creation with event context.

The modification ensures that empty commit batches are created with proper context from the commit event.
cmd/geth/main.go (1)
180-184: LGTM! Well-structured recovery mode configuration flags.

The added flags provide comprehensive control over the recovery mode process:
- Recovery mode enablement
- Initial L1 block and batch specifications
- Block signing control
- L2 end block specification
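Putting these flags together, an invocation might look like the following. This is a hypothetical example: the `geth` binary name and all values are illustrative, and `--da.recovery.mode` is an assumed spelling for the recovery-mode flag; only the four `--da.recovery.initiall1block`, `--da.recovery.initialbatch`, `--da.recovery.l2endblock`, and `--da.recovery.signblocks` names appear verbatim in this PR.

```shell
# Hypothetical recovery-mode invocation; values are examples only,
# and any flags not listed in the PR description are assumptions.
geth \
  --da.recovery.mode \
  --da.recovery.initiall1block 19000000 \
  --da.recovery.initialbatch 1234 \
  --da.recovery.signblocks \
  --da.recovery.l2endblock 987654
```

The node would then replay state from L1 block 19000000 / batch 1234 up to L2 block 987654, re-signing blocks as it goes.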
eth/backend.go (3)
591-594: LGTM! Proper handling of ENR updates during DA syncing.

The conditional check ensures that ENR updates are skipped when DA syncing is enabled, preventing potential conflicts.
609-612: LGTM! Proper handler management during DA syncing.

The conditional check ensures that the handler is not started when DA syncing is enabled, maintaining operational consistency.
622-625: LGTM! Consistent handler cleanup during shutdown.

The conditional check ensures that handler cleanup is skipped when DA syncing is enabled, maintaining shutdown consistency.
cmd/utils/flags.go (2)
889-904: LGTM! Well-structured DA syncing flags.

The DA syncing flags are well-defined, with clear descriptions and appropriate types for enabling DA syncing and configuring the required API endpoints.
905-924: LGTM! Comprehensive recovery mode configuration flags.

The recovery mode flags provide a complete set of configuration options for the L2 network recovery process, including:
- Recovery mode activation
- Initial L1 block and batch specifications
- Optional block signing with Clique signer support
- L2 end block target
core/blockchain.go (3)
1809-1814: LGTM! Function signature changes align with recovery mode requirements.

The updated signature with the `sign` parameter and the additional return value (`*types.Block`) provides the necessary flexibility for the recovery mode implementation.
1834-1871: LGTM! Robust block signing implementation.

The signing logic:
- Preserves the original timestamp
- Prepares and signs the block
- Verifies the signed block with the local consensus engine

This ensures the integrity of the signed blocks during recovery.
1889-1897: LGTM! Comprehensive block validation.

The implementation performs thorough validation:
- Validates block body
- Validates block state
- Reports detailed errors with context

This ensures only valid blocks are processed during recovery.
Actionable comments posted: 0
🧹 Nitpick comments (3)
core/blockchain.go (1)
1831-1833: TODO: Implement Clique verification.

The comment indicates missing verification of the block signature with Clique. This should be addressed in a follow-up PR.

Would you like me to help track this TODO by opening a new issue for implementing Clique verification?
cmd/utils/flags.go (2)
909-928: Consider adding validation for numeric values.

The numeric flags (`InitialL1Block`, `InitialBatch`, `L2EndBlock`) should include validation to ensure:
- Values are non-negative
- Initial block/batch is less than end block
- Values are within reasonable ranges

Example validation in `setDA`:

```diff
 if ctx.IsSet(DARecoveryInitialL1BlockFlag.Name) {
+	initialL1Block := ctx.Uint64(DARecoveryInitialL1BlockFlag.Name)
+	if initialL1Block < 0 {
+		Fatalf("Initial L1 block must be non-negative")
+	}
 	cfg.DA.InitialL1Block = ctx.Uint64(DARecoveryInitialL1BlockFlag.Name)
 }
```
1681-1695: Add validation for related recovery parameters.

The configuration setup should validate the relationships between recovery parameters to ensure a valid recovery state. Consider adding checks for:
- If recovery mode is enabled, ensure all required parameters are set
- Validate that the initial L1 block and batch are compatible
- Ensure the L2 end block is greater than the starting point

Example validation:

```diff
 func setDA(ctx *cli.Context, cfg *ethconfig.Config) {
+	// Validate recovery mode parameters
+	if ctx.Bool(DARecoveryModeFlag.Name) {
+		if !ctx.IsSet(DARecoveryInitialL1BlockFlag.Name) {
+			Fatalf("Recovery mode requires initial L1 block")
+		}
+		if !ctx.IsSet(DARecoveryInitialBatchFlag.Name) {
+			Fatalf("Recovery mode requires initial batch")
+		}
+		if !ctx.IsSet(DARecoveryL2EndBlockFlag.Name) {
+			Fatalf("Recovery mode requires L2 end block")
+		}
+
+		initialL1Block := ctx.Uint64(DARecoveryInitialL1BlockFlag.Name)
+		initialBatch := ctx.Uint64(DARecoveryInitialBatchFlag.Name)
+		l2EndBlock := ctx.Uint64(DARecoveryL2EndBlockFlag.Name)
+
+		// Add your validation logic here
+		if l2EndBlock <= initialBatch {
+			Fatalf("L2 end block must be greater than initial batch")
+		}
+	}
```
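A self-contained version of these cross-parameter checks might look like the following (the `RecoveryConfig` struct and the specific rules are illustrative, not the PR's actual config type):

```go
package main

import (
	"errors"
	"fmt"
)

// RecoveryConfig is a hypothetical flattened view of the recovery flags.
type RecoveryConfig struct {
	RecoveryMode   bool
	InitialL1Block uint64
	InitialBatch   uint64
	L2EndBlock     uint64
	CurrentL2Block uint64
}

// validate enforces the relationships suggested in the review: required
// parameters must be set, and the end block must lie ahead of the node.
func validate(c RecoveryConfig) error {
	if !c.RecoveryMode {
		return nil // nothing to check when recovery is disabled
	}
	if c.InitialL1Block == 0 || c.InitialBatch == 0 {
		return errors.New("recovery mode requires initial L1 block and initial batch")
	}
	if c.L2EndBlock != 0 && c.L2EndBlock <= c.CurrentL2Block {
		return errors.New("L2 end block must be greater than the current L2 block")
	}
	return nil
}

func main() {
	ok := RecoveryConfig{RecoveryMode: true, InitialL1Block: 100, InitialBatch: 7, L2EndBlock: 500, CurrentL2Block: 400}
	fmt.Println(validate(ok))

	bad := ok
	bad.L2EndBlock = 300 // behind the node's current head
	fmt.Println(validate(bad) != nil)
}
```

Failing fast with a single `validate` call at startup keeps the error surface in one place, rather than scattering `Fatalf` calls across the flag parsing.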
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- cmd/geth/main.go (1 hunks)
- cmd/utils/flags.go (2 hunks)
- core/blockchain.go (3 hunks)
- params/version.go (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- params/version.go
- cmd/geth/main.go
🔇 Additional comments (5)
core/blockchain.go (3)
1809-1814: LGTM! Function signature changes are well-designed.

The updated signature with the `sign` parameter and enhanced return type provides good flexibility for the recovery mode implementation.
1834-1871: LGTM! Block signing implementation is robust and secure.

The implementation:
- Preserves the original timestamp during signing
- Verifies the signed block against the local consensus engine
- Handles potential sealing failures
1889-1899: LGTM! Comprehensive error handling and validation.

The implementation includes multiple validation stages with proper error handling:
- Block body validation ensures correct L1 message ordering
- State validation confirms the block's integrity
- Error messages are descriptive and properly wrapped
cmd/utils/flags.go (2)
893-908: LGTM! Simplified flag definitions.

The removal of pointer references for DA flags improves code clarity while maintaining functionality.
909-928: LGTM! Well-structured recovery mode flags.

The new flags provide comprehensive control over the recovery process, allowing users to:
- Enable recovery mode
- Specify the initial L1 block and batch
- Control block signing
- Define the L2 end block
```go
	if override {
		parentBlockNumber = block.PartialHeader.Number - 1
		// reset the chain head to the parent block so that the new block can be inserted as part of the new canonical chain.
		err := s.blockchain.SetHead(parentBlockNumber)
```
Any concern that the removed blocks will be re-propagated to us and we'll overwrite the chain? Or that's not the case because we disabled the network protocols?
Yeah, we disabled the network protocols (this is just an extension of the L1 follower mode, basically), so we operate in isolation. This is needed as a preparation step either:
- for the sequencer to recover from permissionless mode, or
- for a user who wants to submit batches in permissionless mode
But if there is a long unfinalized chain of L2 blocks, then after we recover from enforced batch mode and re-enable L2 sync, it's possible that this old chain will overwrite the new chain, right?
1. Purpose or design rationale of this PR

Replaces #1073 as it is rebased onto the latest changes.

This PR implements a recovery mode after batches have been submitted permissionlessly in the context of permissionless batches. During permissionless mode, the L2 network will not produce any valid/canonical L2 chain; batches can be submitted by anyone directly to the L1.

This PR extends the L1 follower mode to be able to recover state from any given L1 block height (`--da.recovery.initiall1block`) and batch index (`--da.recovery.initialbatch`) up to a given L2 end block (`--da.recovery.l2endblock`), while overriding any existing local state of the node. Additionally, it allows re-signing the read blocks with `--da.recovery.signblocks` so that, once the sequencer comes back online, a valid canonical L2 chain can again be created from the permissionlessly committed batches.

2. PR title
Your PR title must follow conventional commits (as we are doing squash merge for each PR), so it must start with one of the following types:
3. Deployment tag versioning
Has the version in `params/version.go` been updated?

4. Breaking change label
Does this PR have the `breaking-change` label?