Skip to content

feat(upgrade-signal): wire upgrade-signal support into consensus nodes#3726

Open
PelleKrab wants to merge 14 commits into
mainfrom
pr3/upgrade-signal-consensus
Open

feat(upgrade-signal): wire upgrade-signal support into consensus nodes#3726
PelleKrab wants to merge 14 commits into
mainfrom
pr3/upgrade-signal-consensus

Conversation

@PelleKrab

Copy link
Copy Markdown
Contributor

Summary

  • wires L1 upgrade-signal schedule reads into standalone and integrated consensus startup
  • adds consensus runtime admin refresh support for applying upgrade-signal schedules
  • applies startup upgrade-signal schedules to both EL and CL runtime views in integrated nodes
  • defaults integrated upgrade-signal L1 RPC config from the consensus L1 RPC when no explicit upgrade-signal RPC is provided
  • adds separate opt-in EL and CL upgrade-signal metrics with layer and upgrade labels
  • routes startup apply and runtime admin refresh through shared upgrade-signal schedule validation/application helpers
  • renames public upgrade-signal surfaces from hardfork terminology to upgrade terminology
  • narrows base-upgrade-signal exports

Testing

  • cargo fmt --check
  • git diff --check origin/main..HEAD
  • cargo check -p base-upgrade-signal -p base-consensus-node -p base-execution-cli -p base-node-core -p base
  • cargo test -p base-upgrade-signal --lib
  • cargo test -p base-consensus-cli -p base-execution-cli -p base-node-core --lib upgrade_signal
  • cargo test -p base --bin base upgrade_signal
  • tested on devnet

@PelleKrab PelleKrab requested a review from refcell June 23, 2026 18:14
@cb-heimdall

cb-heimdall commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

🟡 Heimdall Review Status

Requirement Status More Info
Reviews 🟡 -1/1
Denominator calculation
Show calculation
1 if user is bot 0
1 if user is external 0
2 if repo is sensitive 0
From .codeflow.yml 1
Additional review requirements
Show calculation
Max 0
0
From CODEOWNERS 0
Global minimum 0
Max 1
1
1 if commit is unverified 0
Sum 1

Comment thread bin/base/src/commands/sequencer.rs Outdated
Comment thread bin/base/src/commands/rpc.rs Outdated
Comment thread crates/execution/cli/src/standard_node.rs Outdated
@PelleKrab PelleKrab force-pushed the pr3/upgrade-signal-consensus branch from d39a50b to 1715ab8 Compare June 23, 2026 20:34
Comment thread crates/consensus/rpc/src/admin.rs Outdated
@PelleKrab PelleKrab force-pushed the pr3/upgrade-signal-consensus branch from 1715ab8 to a1ea274 Compare June 23, 2026 21:22
Comment thread crates/utilities/upgrade-signal/src/runtime/summary.rs
Comment thread crates/utilities/upgrade-signal/src/metrics.rs
@PelleKrab PelleKrab force-pushed the pr3/upgrade-signal-consensus branch from a1ea274 to 79bbd92 Compare June 24, 2026 00:49
Comment thread bin/base/src/commands/integrated_upgrade_signal.rs Outdated
Comment thread crates/consensus/service/src/actors/rpc/actor.rs
Comment thread crates/consensus/service/src/service/node.rs Outdated
Comment thread bin/base/src/commands/integrated_upgrade_signal.rs Outdated
sender: oneshot::Sender<RollupConfig>,
) {
if let Err(error) = sender.send((*self.rollup_config).clone()) {
if let Err(error) = sender.send(self.rollup_config.with_runtime_upgrade_overrides()) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is .with_runtime_upgrade_overrides expensive here? We don't want to do any expensive operations here ideally

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The most expensive part is a clone, we clone RollupConfig and then we do a pass over to update any timestamp changes. So as it stands its pretty cheap, but we could definitely get this back to a single clone in future. I think it would be good to do this in a separate Dynamic upgrade activation optimization PR, since there are some other changes I would like to make.

Comment thread crates/consensus/service/src/actors/upgrade_signal.rs
Comment thread crates/consensus/service/src/service/builder.rs
Comment thread crates/consensus/service/src/service/node.rs
Comment thread crates/utilities/upgrade-signal/src/config/args.rs Outdated
Comment thread crates/utilities/upgrade-signal/src/config/schedule.rs Outdated

@refcell refcell left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking very good. Left a few comments that I think are worth addressing before we merge.

@cb-heimdall

Copy link
Copy Markdown
Collaborator

Review Error for refcell @ 2026-06-24 11:38:25 UTC
User failed mfa authentication, see go/mfa-help

Comment thread bin/base/src/commands/rpc.rs Outdated
Comment thread bin/base/src/commands/sequencer.rs Outdated
Comment on lines +28 to +37
/// Hardfork IDs that are allowed to mutate the local schedule.
/// Upgrade IDs that are allowed to mutate the local schedule.
///
/// If omitted, every read hardfork ID is eligible for application when the selected mode
/// If omitted, every read upgrade ID is eligible for application when the selected mode
/// permits schedule mutation.
#[arg(
long = "upgrade-signal.apply-hardfork-id",
env = "BASE_NODE_UPGRADE_SIGNAL_APPLY_HARDFORK_ID",
long = "upgrade-signal.apply-upgrade-id",
env = "BASE_NODE_UPGRADE_SIGNAL_APPLY_UPGRADE_ID",
value_delimiter = ','
)]
pub apply_hardfork_ids: Vec<String>,
pub apply_upgrade_ids: Vec<String>,

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a subtle runtime issue here when upgrade-signal.apply-upgrade-id is configured to a subset of upgrades. Let me walk through what I'm seeing happens.

The runtime refresh path reads the L1 schedule for upgrade_ids, then filters it down to apply_upgrade_ids via application_schedule(). The filtered schedule is then passed into UpgradeSignalRuntimeApplier::apply_schedule(), which creates a fresh RuntimeRegistrySink.

When apply_schedule_to_sink() finishes, it calls staged_sink.finalize(), and RuntimeRegistrySink::finalize() commits by calling RuntimeUpgradeRegistry::replace_overrides(chain_id, updates) or clear_chain() if the filtered updates are empty. So the filtered application schedule is treated as the complete replacement set for the whole chain.

This means a runtime refresh can unintentionally drop existing runtime overrides for upgrades that were not included in apply_upgrade_ids. For example, if the registry currently has {Azul: 100, Cobalt: 300}, and the node is configured to read [Azul, Beryl, Cobalt] but apply only [Beryl], then the L1 read may correctly return all three signals, but application_schedule() reduces it to only {Beryl: 200}. finalize() then replaces the entire chain registry with just {Beryl: 200}, removing Azul and Cobalt even though L1 did not clear them and the user only scoped which upgrade should be applied.

I'm not sure we should allow this behavior.

One option is to just remove apply_upgrade_ids and use upgrade_ids, requiring the applied schedule to be complete. Alternatively, we could support a subset by making the runtime path merge touched IDs instead of replacing the whole registry, e.g. update only the IDs present in the filtered schedule and avoid calling clear_chain() for an empty scoped schedule. If both full replacement and scoped updates are desired, the code probably needs to carry that distinction explicitly so finalize() knows whether it is committing a complete schedule or only a selected subset.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went for the full schedule apply to keep it simple for now, as we don't gain too much from individual upgrade reads. We may want to add a read full schedule to the smart contract to further simplify it.

Comment thread bin/base/src/commands/rpc.rs

@refcell refcell left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took another look and I think there's a couple issues with how we configure applying upgrades as well as blocking node startup on failure.

@PelleKrab PelleKrab force-pushed the pr3/upgrade-signal-consensus branch from 79bbd92 to 6156bad Compare June 24, 2026 23:33
Comment thread crates/consensus/cli/src/node.rs Outdated
PelleKrab and others added 14 commits June 24, 2026 17:30
Adds the upgrade signal metrics actor, the admin_refreshUpgradeSignal RPC,
startup schedule application to RollupConfig, and the three rollout modes
(metrics-only -> startup-apply -> runtime-admin) in the consensus CLI/service.

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
The base rpc command reads one pinned startup schedule and applies it to
both the execution and consensus configs before launching embedded services.

Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
Default the execution upgrade-signal reader to the consensus L1 RPC in integrated commands, honor the upgrade-signal L1 RPC override during consensus startup reads, and fix the rebased protocol test call sites for the updated L1BlockInfoTx::try_new signature.

Amp-Thread-ID: https://ampcode.com/threads/T-019ef559-ecab-707e-bbb4-f9222db8a5a3
Co-authored-by: Amp <amp@ampcode.com>
@PelleKrab PelleKrab force-pushed the pr3/upgrade-signal-consensus branch from 6156bad to c276f0d Compare June 25, 2026 00:34
@github-actions

Copy link
Copy Markdown
Contributor

Review Summary

This PR refactors upgrade signal handling across execution and consensus layers, introducing unified startup application (apply_startup_to_sinks), the ConsensusNodeStartOptions builder, and extracting the upgrade signal actor into its own module. The overall design is clean and the refactoring reduces duplication.

No new critical issues found beyond what previous inline comments have raised. The prior findings remain relevant:

Still-relevant prior findings

  1. Wire format breaking change (summary.rs, metrics.rs): The field renames (hardfork_idupgrade_id, label hardforkupgrade) change the JSON RPC response shape and Prometheus metric names. This should be documented in release notes so operators can update dashboards and tooling in lockstep.

  2. Refresher silently dropped when admin RPC disabled (rpc/actor.rs:130): When --upgrade-signal.mode=runtime-admin is set but --rpc.enable-admin is not, the refresher is constructed but never wired into the RPC module. A startup warning or configuration validation would prevent operator confusion.

  3. Duplicate info logs on admin refresh (admin.rs + refresher.rs): Both the UpgradeSignalRefresher::refresh() and the previous RPC handler (now removed) logged the same summary fields. The refresher still logs at info! level; verify this is the desired single-log behavior after the handler cleanup.

Resolved prior findings

  • The long method name (start_with_overrides_and_cancellation_and_upgrade_signal_startup) has been properly replaced with the ConsensusNodeStartOptions struct and start_with_options method.
  • The integrated_upgrade_signal.rs concern no longer applies — the startup validation now goes through apply_startup_to_sinks which calls runtime_validation.validate_schedule().

Architecture notes

  • The UpgradeSignalNodeConfig in the consensus service properly resolves to fail-closed validation when no activation admin is provided, which is the right default for standalone consensus nodes.
  • The AlreadyApplied startup mode correctly prevents double-application in embedded (rpc/sequencer) commands where execution already applied the schedule.
  • The UpgradeSignalMetricsActor uses a proper double-select! pattern for cancellation safety during L1 polling.

@github-actions

Copy link
Copy Markdown
Contributor

✅ base-std fork tests: all 616 passed

base/base is fully in sync with the base-std spec.

Dependency Ref Commit
base-std main 4658f1b7
base-anvil 0092692587d8d064dd2c6923ce26a682c58f3694 00926925

@PelleKrab PelleKrab requested a review from refcell June 25, 2026 01:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants