perf(trie): proofmanager optimisation WIP #18829

yongkangc · 2025-10-02T04:49:26Z

core changes:

add worker pooling logic
simplify channel design

impact:

reduce overhead from task creation
simplify abstractions

closes:

Copilot

Pull Request Overview

This PR implements worker pooling optimization for the ProofTaskManager to reduce overhead from task creation and simplify channel design. The optimization introduces separate worker pools for storage and account proofs, using long-lived database transactions instead of creating new ones for each request.

Adds storage and account worker pools with persistent database transactions
Refactors channel architecture from mpsc to crossbeam channels for better performance
Introduces new metrics for tracking worker pool efficiency and queue depths
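
The pooling idea summarized above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `FakeTx`, `Job`, and `spawn_pool` are hypothetical names, and the sketch uses `std::sync::mpsc` behind a mutex as a stand-in for the crossbeam channels the PR adopts, so that it needs nothing outside the standard library.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a long-lived database transaction.
struct FakeTx {
    worker_id: usize,
    proofs_served: usize,
}

/// Hypothetical job: a key to "prove" plus a channel for the reply.
struct Job {
    key: u64,
    reply: mpsc::Sender<(usize, u64)>,
}

/// Spawns `workers` threads that share one job queue. Each worker creates its
/// stand-in transaction once and reuses it for every job it handles.
fn spawn_pool(workers: usize) -> (mpsc::Sender<Job>, Vec<thread::JoinHandle<usize>>) {
    let (tx, rx) = mpsc::channel::<Job>();
    // std's Receiver is single-consumer, so it is shared behind a mutex here;
    // a crossbeam receiver could be cloned per worker instead.
    let rx = Arc::new(Mutex::new(rx));
    let handles = (0..workers)
        .map(|worker_id| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || {
                let mut db_tx = FakeTx { worker_id, proofs_served: 0 };
                loop {
                    // The lock is released at the end of this statement,
                    // before any work on the job starts.
                    let job = match rx.lock().unwrap().recv() {
                        Ok(job) => job,
                        Err(_) => break, // all senders dropped: shut down
                    };
                    db_tx.proofs_served += 1;
                    // Placeholder result; real work would walk the trie.
                    let _ = job.reply.send((db_tx.worker_id, job.key * 2));
                }
                db_tx.proofs_served
            })
        })
        .collect();
    (tx, handles)
}
```

Dropping the returned sender lets the workers drain the queue and exit, and joining the handles yields how many jobs each worker served with its single reused transaction.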

Reviewed Changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| crates/trie/parallel/src/stats.rs | Adds new metrics for tracking storage proof immediate vs blocked operations |
| crates/trie/parallel/src/proof_task_metrics.rs | Extends metrics with queue depth gauges and wait time histograms |
| crates/trie/parallel/src/proof_task.rs | Core refactor implementing worker pools, new job types, and crossbeam channels |
| crates/trie/parallel/src/proof.rs | Updates proof building to use new channel types and on-demand storage fetching |
| crates/trie/parallel/Cargo.toml | Adds crossbeam-channel dependency |
| crates/engine/tree/src/tree/payload_validator.rs | Propagates error from ProofTaskManager::new |
| crates/engine/tree/src/tree/payload_processor/multiproof.rs | Updates to use dual manager architecture |
| crates/engine/tree/src/tree/payload_processor/mod.rs | Creates separate storage and account proof managers |
| crates/engine/tree/benches/state_root_task.rs | Updates benchmark to handle new error handling |
| crates/engine/primitives/src/config.rs | Adds configuration for storage and account worker counts |

yongkangc · 2025-10-03T07:29:05Z

Do we even need this type, and queuing in general, if we could wire
multiproof -- StorageProof/AccProof --> Workers
without going through the additional manager?

If we use multiple receivers (one per worker), there's no need to do manual queueing unless we can optimize chunking or some other preprocessing,
so to me the ProofTaskManager intermediary seems redundant now.
If we swap out

    /// Sender to the account proof task manager.
    account_proof_task_handle: ProofTaskManagerHandle<FactoryTx<Factory>>,
    /// Sender to the storage proof task manager.
    storage_proof_task_handle: ProofTaskManagerHandle<FactoryTx<Factory>>,
in `pub struct MultiproofManager<Factory: DatabaseProviderFactory> {`

to

    /// Channel for dispatching storage proof work to workers
    storage_work_tx: crossbeam_channel::Sender<ProofJob>,
    /// Channel for dispatching account multiproof work to workers
    account_work_tx: crossbeam_channel::Sender<AccountProofJob<FactoryTx<Factory>>>,
from `pub struct ProofTaskManager<Factory: DatabaseProviderFactory> {`

then we can:

spawn the multiproof manager and wire it to the proof workers
have the multiproof manager dispatch requests (storage, account) directly to the workers
drop the ProofTaskManager intermediary
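
A minimal sketch of that wiring, under stated assumptions: `Dispatcher`, `StorageJob`, `AccountJob`, and `spawn_workers` are hypothetical names, one worker thread per job kind stands in for a pool, and `std::sync::mpsc` replaces `crossbeam_channel` so the example is self-contained. It shows only the shape of "the manager holds the work senders and talks to workers directly", not the actual reth types.

```rust
use std::sync::mpsc::{self, Receiver, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical job types; real ones would carry proof inputs.
struct StorageJob {
    slot: u64,
    reply: Sender<String>,
}

struct AccountJob {
    account: u64,
    reply: Sender<String>,
}

/// Owns the work senders directly, with no manager task in between.
struct Dispatcher {
    storage_work_tx: Sender<StorageJob>,
    account_work_tx: Sender<AccountJob>,
}

impl Dispatcher {
    /// Queues a storage proof and returns the receiver for its result.
    fn storage_proof(&self, slot: u64) -> Receiver<String> {
        let (reply, rx) = mpsc::channel();
        self.storage_work_tx
            .send(StorageJob { slot, reply })
            .expect("storage worker should be alive");
        rx
    }

    /// Queues an account proof and returns the receiver for its result.
    fn account_proof(&self, account: u64) -> Receiver<String> {
        let (reply, rx) = mpsc::channel();
        self.account_work_tx
            .send(AccountJob { account, reply })
            .expect("account worker should be alive");
        rx
    }
}

/// Spawns one worker per job kind and returns the dispatcher wired to them.
fn spawn_workers() -> (Dispatcher, Vec<JoinHandle<()>>) {
    let (storage_work_tx, storage_rx) = mpsc::channel::<StorageJob>();
    let (account_work_tx, account_rx) = mpsc::channel::<AccountJob>();
    let storage = thread::spawn(move || {
        // The loop ends once every sender for this channel is dropped.
        for job in storage_rx {
            let _ = job.reply.send(format!("storage:{}", job.slot));
        }
    });
    let account = thread::spawn(move || {
        for job in account_rx {
            let _ = job.reply.send(format!("account:{}", job.account));
        }
    });
    (Dispatcher { storage_work_tx, account_work_tx }, vec![storage, account])
}
```

Dropping the `Dispatcher` closes both channels, which is the only shutdown signal the workers need in this sketch.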

- add storage proof workers configuration and integrate into proof task manager

- Introduced `AccountMultiproofInput` structure to handle account multiproof tasks.
- Added `execute_account_multiproof_worker` function to manage multiproof execution.
- Updated `ProofTaskManager` to support account multiproof workers and task dispatching.
- Refactored proof building logic to utilize pre-computed storage proofs for account multiproof generation.

- Added `account_proof_task_handle` to `MultiproofManager` for managing account multiproof tasks.
- Updated multiproof calculation logic to queue tasks to the account proof manager.
- Improved error handling for account manager task failures.
- Refactored test setup to create separate storage and account proof managers.

- wait just in time
- Replaced standard mpsc channels with crossbeam channels for improved performance and flexibility in proof task management.
- Updated `execute_account_multiproof_worker` to handle storage proof requests on-demand, enhancing efficiency during trie traversal.
- Introduced standardized error handling for closed storage managers.
- Refactored related structures and functions to accommodate the new channel types and improve overall code clarity.

- Updated `build_account_multiproof_with_storage` to accept storage proof receivers, allowing for lazy fetching of storage proofs during trie traversal.
- Removed pre-computed storage proof handling in favor of real-time proof retrieval, enhancing performance and efficiency.
- Introduced standardized error handling for closed storage proof channels.
- Adjusted related functions and structures to support the new fetching mechanism.

- Added metrics for tracking the depth of storage and account queues.
- Implemented recording of wait times for storage and account jobs in the queue.
- Updated `ProofTaskMetrics` and `ProofTaskManager` to support new metrics functionality.
- Enhanced job structures to include timestamps for enqueued jobs, facilitating wait time calculations.
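
As a rough illustration of the enqueue-timestamp approach described in this commit message, the sketch below stamps each job on entry and records the wait when it is dequeued. `QueuedJob`, `QueueMetrics`, and `MeteredQueue` are hypothetical names, and a plain `Vec<Duration>` stands in for a histogram; none of this is taken from `ProofTaskMetrics`.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// A job plus the moment it entered the queue.
struct QueuedJob<T> {
    payload: T,
    enqueued_at: Instant,
}

/// Hypothetical metrics sink: a depth gauge and raw wait-time samples.
#[derive(Default)]
struct QueueMetrics {
    depth: usize,
    wait_times: Vec<Duration>,
}

struct MeteredQueue<T> {
    jobs: VecDeque<QueuedJob<T>>,
    metrics: QueueMetrics,
}

impl<T> MeteredQueue<T> {
    fn new() -> Self {
        Self { jobs: VecDeque::new(), metrics: QueueMetrics::default() }
    }

    /// Stamps the job with the current time and updates the depth gauge.
    fn push(&mut self, payload: T) {
        self.jobs.push_back(QueuedJob { payload, enqueued_at: Instant::now() });
        self.metrics.depth = self.jobs.len();
    }

    /// Pops the oldest job and records how long it waited in the queue.
    fn pop(&mut self) -> Option<T> {
        let job = self.jobs.pop_front()?;
        self.metrics.depth = self.jobs.len();
        self.metrics.wait_times.push(job.enqueued_at.elapsed());
        Some(job.payload)
    }
}
```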

- Enhanced the storage proof fetching mechanism to include non-blocking receive attempts, improving efficiency during trie traversal.
- Implemented error handling for various receive scenarios, ensuring robustness in proof retrieval.
- Added tests for streaming fetcher with mixed storage proofs and ordering independence, validating the correctness of the parallel proof task execution.
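
The "non-blocking receive attempt first, blocking receive as fallback" pattern mentioned here might look something like the following. It is a sketch on `std::sync::mpsc` rather than the crossbeam receivers used in the PR, and `FetchStats` and `fetch_proof` are hypothetical names; the immediate/blocked counters mirror the metrics idea in the PR only loosely.

```rust
use std::sync::mpsc::{Receiver, RecvError, TryRecvError};

/// Counts how often a proof was already waiting versus had to be waited for.
#[derive(Debug, Default, PartialEq)]
struct FetchStats {
    immediate: u64,
    blocked: u64,
}

/// Tries a non-blocking receive first and only blocks if nothing is ready.
fn fetch_proof<T>(rx: &Receiver<T>, stats: &mut FetchStats) -> Result<T, RecvError> {
    match rx.try_recv() {
        Ok(proof) => {
            stats.immediate += 1;
            Ok(proof)
        }
        Err(TryRecvError::Empty) => {
            // Not ready yet: fall back to a blocking receive.
            let proof = rx.recv()?;
            stats.blocked += 1;
            Ok(proof)
        }
        // The sender is gone and the channel is empty.
        Err(TryRecvError::Disconnected) => Err(RecvError),
    }
}
```

A high `blocked` count relative to `immediate` would suggest the consumer reaches proofs before the workers finish them.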

…nd ordering independence
- Implemented two new tests to validate the behavior of parallel proof handling with mixed storage targets and varying storage sizes.
- Ensured that the parallel proof results are consistent regardless of the order of completion, enhancing the robustness of the proof task execution.
- Updated the test setup to create diverse states for comprehensive coverage of edge cases in proof computation.

…elTrieTracker
- Introduced metrics for tracking immediate and blocked storage proofs in `ParallelTrieStats`.
- Added methods to increment and retrieve these metrics, enhancing performance monitoring.
- Updated `ParallelTrieTracker` to support the new storage proof metrics, ensuring comprehensive statistics collection during trie operations.

…proof managers return type
- Added error handling for spawn failures in the state root benchmark to ensure robustness.
- Updated the return type of `create_proof_managers` to a type alias for improved clarity and maintainability.

- Updated `ProofTaskManager` to allow separate configuration for storage and account workers, enhancing flexibility in task execution.
- Added tests to validate transaction reuse across multiple proofs and ensure robust handling of concurrent storage proofs without deadlocks.
- Implemented checks for expected worker counts and transaction management, improving performance monitoring and reliability in proof tasks.

- Add type alias for complex proof managers return type
- Fix needless borrow in proof manager creation
- Add backticks to documentation
- Fix explicit_iter_loop warnings in tests
- Fix spawn result handling in benchmarks

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>

… tasks
- Updated `execute_account_multiproof_worker` to return errors directly instead of using a result sender, improving error propagation.
- Wrapped proof computation in panic recovery to prevent worker failures from causing zombie states.
- Clamped worker counts to a minimum of 1 to avoid deadlocks, ensuring robust task execution.
- Removed unused metrics from `ParallelTrieTracker`, streamlining the structure and focusing on essential statistics.
- Adjusted tests to reflect changes in worker count configurations, enhancing test reliability.
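
Two of the points in this commit message, clamping the worker count and wrapping the computation in panic recovery, can be illustrated with standard-library tools. The helper names `clamp_worker_count` and `run_job_recovering` are hypothetical, and the sketch assumes the default unwinding panic strategy (with `panic = "abort"` nothing can be caught).

```rust
use std::panic::{self, AssertUnwindSafe};

/// A pool with zero workers would never drain its queue, so use at least one.
fn clamp_worker_count(requested: usize) -> usize {
    requested.max(1)
}

/// Runs a job and converts a panic into an error value, so the worker loop
/// can report the failure and keep serving later jobs.
fn run_job_recovering<T>(job: impl FnOnce() -> T) -> Result<T, String> {
    panic::catch_unwind(AssertUnwindSafe(job)).map_err(|payload| {
        // Panic payloads are usually `&str` or `String`.
        if let Some(msg) = payload.downcast_ref::<&str>() {
            (*msg).to_string()
        } else if let Some(msg) = payload.downcast_ref::<String>() {
            msg.clone()
        } else {
            "worker panicked".to_string()
        }
    })
}
```

`AssertUnwindSafe` is a promise by the caller that any state touched by the job is either discarded or still consistent after a panic, which a real worker would need to check for its cursors and transaction.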

- Updated `execute_account_multiproof_worker` to directly return results, improving error handling and simplifying the function signature.
- Wrapped account multiproof execution in panic recovery to prevent worker failures from causing disruptions.
- Enhanced storage proof fetching with non-blocking receive attempts, improving efficiency and responsiveness.
- Introduced a new method `decoded_multiproof_with_stats` to return both multiproof results and performance statistics, aiding in performance monitoring.
- Added tests to validate the new metrics tracking for storage proofs, ensuring comprehensive coverage and reliability in proof task execution.

- Reorganized the logic for queuing storage proof requests during trie traversal to ensure all accounts in the extended prefix set are processed, even those with no storage changes.
- Removed redundant code for storage proof request queuing, enhancing clarity and maintainability.
- Updated the handling of destroyed accounts to avoid unnecessary cloning, improving performance and memory efficiency.

- No storage proof receiver found

- accounts NOT in targets are being encountered

- config.frozen_prefix_sets is frozen ONCE at config creation time (beginning of block execution), but it should be updated with each transaction's state changes

- The config should NOT cache frozen_prefix_sets

- if config.prefix_sets stays at 0

… NOT in frozen_prefix_sets.account_prefix_set
- The refactor only queued storage proofs for the account prefix set. When the trie walker hit an address that existed only in the storage prefix set, there was no queued receiver and it fell back to a slow synchronous proof. Queuing storage proofs for the union of account and storage prefix sets (like main) fixes the regression.
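
The fix described above amounts to computing the storage proof targets from the union of the two sets rather than from the account set alone. A simplified sketch follows; `HashedAddress` and `storage_proof_targets` are hypothetical, and plain hash collections stand in for reth's prefix set types, which hold nibble paths rather than raw addresses.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Stand-in for a hashed account address.
type HashedAddress = [u8; 32];

/// Returns every address that needs a queued storage proof: accounts in the
/// account prefix set plus accounts that appear only in the storage prefix sets.
fn storage_proof_targets(
    account_prefix_set: &HashSet<HashedAddress>,
    storage_prefix_sets: &HashMap<HashedAddress, Vec<u64>>,
) -> BTreeSet<HashedAddress> {
    account_prefix_set
        .iter()
        .copied()
        .chain(storage_prefix_sets.keys().copied())
        .collect()
}
```

With only the account set as the source, an address present solely in `storage_prefix_sets` would get no receiver, which is the slow-path fallback the commit message describes.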

- The refactor removed the cache usage from commit 8effbf2, causing every sibling leaf encountered during account trie walking to trigger a full storage trie walk.

Copilot

Pull Request Overview

Copilot reviewed 12 out of 13 changed files in this pull request and generated 3 comments.

yongkangc · 2025-10-06T09:48:36Z

crates/trie/parallel/src/proof_task.rs

-pub struct ProofTaskManager<Factory: DatabaseProviderFactory> {
-    /// Max number of database transactions to create
-    max_concurrency: usize,
-    /// Number of database transactions created
-    total_transactions: usize,
-    /// Consistent view provider used for creating transactions on-demand
-    view: ConsistentDbView<Factory>,
-    /// Proof task context shared across all proof tasks
-    task_ctx: ProofTaskCtx,
-    /// Proof tasks pending execution
-    pending_tasks: VecDeque<ProofTaskKind>,
-    /// The underlying handle from which to spawn proof tasks
-    executor: Handle,
-    /// The proof task transactions, containing owned cursor factories that are reused for proof
-    /// calculation.
-    proof_task_txs: Vec<ProofTaskTx<FactoryTx<Factory>>>,
-    /// A receiver for new proof tasks.
-    proof_task_rx: Receiver<ProofTaskMessage<FactoryTx<Factory>>>,
-    /// A sender for sending back transactions.
-    tx_sender: Sender<ProofTaskMessage<FactoryTx<Factory>>>,
-    /// The number of active handles.
-    ///
-    /// Incremented in [`ProofTaskManagerHandle::new`] and decremented in
-    /// [`ProofTaskManagerHandle::drop`].
-    active_handles: Arc<AtomicUsize>,
-    /// Metrics tracking blinded node fetches.
-    #[cfg(feature = "metrics")]
-    metrics: ProofTaskMetrics,
+/// Error when storage manager is closed.
+#[inline]
+fn storage_manager_closed_error() -> ParallelStateRootError {
+    ParallelStateRootError::Other("storage manager closed".into())
 }


Removing this unnecessary abstraction completely.

yongkangc · 2025-10-06T10:11:20Z

Note: this is a POC. It will be broken down into multiple sub-PRs, with this PR as the main reference, because it is too breaking and consists of many complex changes.

mediocregopher · 2025-10-06T09:51:19Z

crates/engine/tree/src/tree/payload_processor/mod.rs

+                    TrieInput::from_state(hashed_state),
+                    &TreeConfig::default(),
+                )
+                .unwrap();


nit: let's expect here

mediocregopher · 2025-10-06T10:05:29Z

crates/trie/parallel/src/stats.rs

            trie: self.trie.finish(),
            precomputed_storage_roots: self.precomputed_storage_roots,
            missed_leaves: self.missed_leaves,
+            // TODO: Remove this after testing. This is to understand the efficiency of the worker


Needs removing?

mediocregopher · 2025-10-06T10:07:13Z

crates/engine/tree/src/tree/payload_processor/multiproof.rs

+        debug!(
+            target: "engine::root",
+            "MultiProofConfig::new_from_input: INITIAL prefix_sets account_len={} storage_len={}",
+            prefix_sets.account_prefix_set.len(),
+            prefix_sets.storage_prefix_sets.len(),
+        );


Suggested change

-        debug!(
-            target: "engine::root",
-            "MultiProofConfig::new_from_input: INITIAL prefix_sets account_len={} storage_len={}",
-            prefix_sets.account_prefix_set.len(),
-            prefix_sets.storage_prefix_sets.len(),
-        );
+        debug!(
+            target: "engine::root",
+            account_prefix_sets_len = ?prefix_sets.account_prefix_set.len(),
+            storage_prefix_sets_len = ?prefix_sets.storage_prefix_sets.len(),
+            "MultiProofConfig::new_from_input",
+        );

mediocregopher · 2025-10-06T10:09:42Z

crates/engine/tree/src/tree/payload_processor/multiproof.rs

        if sequence >= self.next_to_deliver {
            self.pending_proofs.insert(sequence, update);
+        } else {
+            debug!(


nit: might be better as a trace
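
For context on the branch being discussed, the sequencing logic visible in this diff (buffer updates whose sequence is at or past `next_to_deliver`, then release the consecutive run) can be sketched as below. `Sequencer` and `add` are hypothetical names and the sketch silently drops stale sequences, which is the case the new `else` branch logs; it is not the actual `multiproof.rs` implementation.

```rust
use std::collections::BTreeMap;

/// Buffers out-of-order updates and releases them in sequence order.
struct Sequencer<T> {
    next_to_deliver: u64,
    pending: BTreeMap<u64, T>,
}

impl<T> Sequencer<T> {
    fn new() -> Self {
        Self { next_to_deliver: 0, pending: BTreeMap::new() }
    }

    /// Adds an update and returns every update that is now deliverable.
    fn add(&mut self, sequence: u64, update: T) -> Vec<T> {
        if sequence >= self.next_to_deliver {
            self.pending.insert(sequence, update);
        }
        // A sequence below `next_to_deliver` was already delivered; the
        // update is dropped here (the spot where the PR adds a log line).

        let mut ready = Vec::new();
        let mut next = self.next_to_deliver;
        while let Some(update) = self.pending.remove(&next) {
            ready.push(update);
            next += 1;
        }
        self.next_to_deliver = next;
        ready
    }
}
```

Since a stale sequence is expected occasionally rather than exceptional, logging it at trace level keeps it available without adding noise at debug level.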

mediocregopher · 2025-10-06T10:10:37Z

crates/engine/tree/src/tree/payload_processor/multiproof.rs

+            if !self.pending_proofs.is_empty() {
+                let pending_sequences: Vec<u64> =
+                    self.pending_proofs.keys().take(10).copied().collect();
+                debug!(


I think this might be some leftover debugging?

mediocregopher · 2025-10-06T10:11:36Z

crates/engine/tree/src/tree/payload_processor/multiproof.rs

+        let old_next = self.next_to_deliver;
        self.next_to_deliver += consecutive_proofs.len() as u64;

+        debug!(


More leftover debugging

mediocregopher · 2025-10-06T10:14:03Z

crates/engine/tree/src/tree/payload_processor/multiproof.rs

        consistent_view: ConsistentDbView<Factory>,
        mut input: TrieInput,
    ) -> (TrieInput, Self) {
+        let prefix_sets = Arc::new(std::mem::take(&mut input.prefix_sets));


I think we could freeze the prefix set at this point, rather than cloning/freezing for every multiproof?

mediocregopher · 2025-10-06T10:14:32Z

crates/engine/tree/src/tree/payload_processor/multiproof.rs

+        account_proof_task_handle: ProofTaskManagerHandle<FactoryTx<Factory>>,
        storage_proof_task_handle: ProofTaskManagerHandle<FactoryTx<Factory>>,
        max_concurrent: usize,
+        missed_leaves_storage_roots: Arc<DashMap<B256, B256>>,


Does this get used outside of the manager?

mediocregopher · 2025-10-06T10:19:35Z

crates/engine/tree/src/tree/payload_processor/multiproof.rs

                        }
+
+                        // Timeout detection: if no progress for 10 seconds, dump diagnostic info
+                        if updates_finished && last_progress_time.elapsed().as_secs() > 10 {


More debugging to be removed?

mediocregopher · 2025-10-06T10:41:27Z

crates/engine/tree/src/tree/payload_processor/multiproof.rs

-                storage_targets,
-                ?source,
-                "Starting multiproof calculation",
+        self.executor.spawn_blocking(move || {


I think after this round of changes is done we should take another pass and see if we can factor out this spawn_blocking.

yongkangc · 2025-10-07T10:21:17Z

POC done

github-project-automation bot added this to Reth Tracker Oct 2, 2025

github-project-automation bot moved this to Backlog in Reth Tracker Oct 2, 2025

yongkangc force-pushed the yk/refactor_storage_multiproof branch from d3e816b to ed56919 Compare October 2, 2025 04:59

yongkangc self-assigned this Oct 2, 2025

yongkangc changed the title ~~wip: proofmanager optimisation~~ perf(trie): proofmanager optimisation WIP Oct 2, 2025

yongkangc requested a review from Copilot October 2, 2025 10:23

Copilot AI reviewed Oct 2, 2025

mattsse reviewed Oct 2, 2025

yongkangc force-pushed the yk/refactor_storage_multiproof branch 2 times, most recently from 927ad6a to 9b121c0 Compare October 4, 2025 13:19

yongkangc and others added 19 commits October 4, 2025 13:21

feat(engine): wiring up of storage proof with tokio pool

f20d5c8

- add storage proof workers configuration and integrate into proof task manager

preliminary integration of acc proofs

2b38f20

fix for test

2c424fe

fix clippy

e24037f

comment

76675d9

added back stats

47ded5b

yongkangc added 14 commits October 6, 2025 04:31

fix: attempt to fix multiproof generation

3c77e7d

debug log:

5bda712

- No storage proof receiver found

add debug logs again to understand the bug

aa0e9d0

- accounts NOT in targets are being encountered

add debug log:

b89e38d

- config.frozen_prefix_sets is frozen ONCE at config creation time (beginning of block execution), but it should be updated with each transaction's state changes

fix: remove caching of frozen_prefix_set

5aadea6

- The config should NOT cache frozen_prefix_sets

fmt, clippy

0552828

debug logs:

befaf10

- if config.prefix_sets stays at 0

feat: restore caching for missed leaf storage roots

2103862

- The refactor removed the cache usage from commit 8effbf2, causing every sibling leaf encountered during account trie walking to trigger a full storage trie walk.

fixes to compile

38bb377

rm stats

85f95fe

add logs: identify root cause of stall

bb3d9bd

fix clippy

e012683

yongkangc requested a review from Copilot October 6, 2025 09:31

Copilot AI reviewed Oct 6, 2025

yongkangc added 4 commits October 6, 2025 09:34

improve comments

36a6390

update comments

2f6de56

fmt

cf86a17

rm comments

1a41987

yongkangc commented Oct 6, 2025

mediocregopher reviewed Oct 6, 2025

jenpaff moved this from Backlog to In Progress in Reth Tracker Oct 6, 2025

This was referenced Oct 7, 2025

perf(tree): worker pooling for storage in multiproof generation #18883

Closed

perf(tree): worker pooling for storage in multiproof generation #18887

Merged

yongkangc moved this from In Progress to Done in Reth Tracker Oct 7, 2025

yongkangc closed this Oct 7, 2025

yongkangc mentioned this pull request Oct 8, 2025

perf(tree): worker pooling for account proofs #18901

Open

3 participants