yongkangc
Member

@yongkangc yongkangc commented Oct 10, 2025

Context:

  • As part of our performance work to reduce overhead and improve scheduling, we added worker pooling for multiproof generation.
  • This PR cleans up after that work by removing the `ProofTaskManager` abstraction, since proof jobs can now be dispatched directly to the workers.

closes: #18801

impact:

| Change | Impact |
| --- | --- |
| Remove `run()` loop thread | -1 thread, -1 channel hop |
| Direct channel sends | ~some time saved per task |
| Eliminate enum wrapping | ~2 allocations saved per task |
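
To illustrate what "direct dispatch" means here, the following is a minimal sketch using only the standard library, not the actual reth code. The names `ProofJob`, `ProofHandle`, `send_proof`, and `spawn_workers` are hypothetical, and the "proof" is just a doubling of the input. A cloneable handle sends each job straight into the workers' shared queue, with no intermediate routing thread, and the workers exit once every handle has been dropped.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

/// Hypothetical job: an input plus the channel on which to return the result.
struct ProofJob {
    input: u64,
    result_sender: Sender<u64>,
}

/// Hypothetical cloneable handle; every clone shares the same job queue.
#[derive(Clone)]
struct ProofHandle {
    work_tx: Sender<ProofJob>,
}

impl ProofHandle {
    /// Sends a job directly to the worker queue and returns the result receiver.
    fn send_proof(&self, input: u64) -> Result<Receiver<u64>, &'static str> {
        let (result_sender, result_rx) = channel();
        self.work_tx
            .send(ProofJob { input, result_sender })
            .map_err(|_| "workers unavailable")?;
        Ok(result_rx)
    }
}

/// Spawns `worker_count` workers that all pull from one shared queue.
fn spawn_workers(worker_count: usize) -> (ProofHandle, Vec<JoinHandle<()>>) {
    let (work_tx, work_rx) = channel::<ProofJob>();
    // std's Receiver is single-consumer, so the workers share it behind a mutex.
    let work_rx = Arc::new(Mutex::new(work_rx));
    let workers = (0..worker_count)
        .map(|_| {
            let work_rx = Arc::clone(&work_rx);
            thread::spawn(move || loop {
                let job = match work_rx.lock().unwrap().recv() {
                    Ok(job) => job,
                    // All handles were dropped: shut down.
                    Err(_) => break,
                };
                // Placeholder "proof": double the input.
                let _ = job.result_sender.send(job.input * 2);
            })
        })
        .collect();
    (ProofHandle { work_tx }, workers)
}
```

A caller would hold a `ProofHandle`, call `send_proof`, and block on the returned receiver only when the result is needed. A multi-consumer channel would remove the mutex around the receiver; the standard-library channel is used here only to keep the sketch dependency-free.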

reference PRs:

yongkangc and others added 30 commits October 7, 2025 06:35
- Added configuration for maximum and minimum storage proof workers.
- Implemented a worker pool for processing storage proof tasks, improving efficiency by reusing transactions.
- Updated `ProofTaskManager` to handle storage proof tasks via a dedicated channel.
- Enhanced metrics to track storage proof requests and fallback scenarios.
- Adjusted existing tests to accommodate the new storage worker functionality.
- Enhanced documentation for `StorageProofJob` to clarify its current unused status and potential for future type-safe design.
- Updated comments in `ProofTaskManager` regarding the handling of on-demand tasks and the possibility of refactoring to a more type-safe enum.
- Improved logging for worker pool disconnection scenarios, emphasizing fallback to on-demand execution.
…clarity

- Updated comments in `ProofTaskManager` to enhance clarity regarding on-demand transaction handling and queue management.
- Renamed `pending_on_demand` to `on_demand_queue` for better understanding of its purpose.
- Adjusted the `new` function documentation to reflect the correct allocation of concurrency budget between storage workers and on-demand transactions.
- Improved the `queue_proof_task` method to use the new queue name.
…ement

- Removed the unused `OnDemandTask` enum and updated comments in `ProofTaskManager` to clarify the distinction between storage worker pool and on-demand execution.
- Enhanced documentation to better describe the public interface and task submission process.
- Improved clarity regarding transaction handling and execution paths for proof requests.
- Eliminated the `storage_proof_workers` field and related constants from `TreeConfig`.
- Updated the default implementation and related methods to reflect the removal, streamlining the configuration structure.
- Improved comments in `ProofTaskManager` and related functions for better clarity on task management and processing.
- Updated queue capacity calculation to use 4x buffering, reducing fallback to slower on-demand execution during burst loads.
- Removed redundant variable assignments to streamline the code.
…ursor factories

- Introduced pre-created cursor factories in `storage_worker_loop` to reduce overhead during proof computation.
- Updated `compute_storage_proof` to accept cursor factories as parameters, enhancing efficiency and clarity.
- Improved logging to provide better insights into proof task calculations.
- Kept the logic for `pending_tasks` and `proof_tasks_txs` (on-demand proofs) unchanged and continued using it for `BlindedAccountNode` requests, while routing `StorageProof` and `BlindedStorageNode` requests to dedicated storage workers.
- Added a function to determine the default number of storage worker threads based on available parallelism.
- Updated TreeConfig to include a storage_worker_count field, initialized with the default value.
- Modified payload processor to utilize the new storage_worker_count instead of a hardcoded value.
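
Two of the commit notes above describe small calculations: a default worker count derived from available parallelism, and a queue capacity with 4x buffering that falls back to on-demand execution when the queue is full. The sketch below shows one way these could look; it is an assumption based on the commit notes, not the code from these commits. The names `default_storage_worker_count`, `queue_capacity`, `Dispatch`, and `dispatch` are hypothetical, and the exact formula used in reth may differ.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};
use std::thread::available_parallelism;

/// Hypothetical default: one worker per available core, falling back to 1
/// if the parallelism cannot be determined.
fn default_storage_worker_count() -> usize {
    available_parallelism().map(|n| n.get()).unwrap_or(1)
}

/// Hypothetical 4x buffering: room for four queued tasks per worker.
fn queue_capacity(worker_count: usize) -> usize {
    worker_count.saturating_mul(4).max(1)
}

/// Where a task ended up.
#[derive(Debug, PartialEq, Eq)]
enum Dispatch {
    Pooled,
    OnDemand,
}

/// Tries the bounded worker queue first; a full or disconnected queue
/// means the caller should fall back to on-demand execution.
fn dispatch(work_tx: &SyncSender<u64>, task: u64) -> Dispatch {
    match work_tx.try_send(task) {
        Ok(()) => Dispatch::Pooled,
        Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => Dispatch::OnDemand,
    }
}
```

With one worker, `queue_capacity(1)` is 4, so a burst of four tasks is absorbed by the queue and only the fifth falls back. Note that the final code in this PR creates its job channels with `unbounded`, so this applies to the intermediate design described in these notes.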
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated no new comments.

Collaborator

@mattsse mattsse left a comment

lgtm

left some suggestions, iirc additional changes to the worker pool are planned anyway and we could tackle those separately

let _ = self.proof_task_handle.queue_task(ProofTaskKind::StorageProof(input, sender));
receiver
self.proof_task_handle
    .queue_storage_proof(input)
Collaborator

Can we rename these after we merge? I find `queue` very confusing here because this only sends.

Member Author

That's true; we should rename it to `send_task`.

Comment on lines +882 to +883
for worker_id in 0..storage_worker_count {
    let provider_ro = view.provider_ro()?;
Collaborator

I believe this is still something we pay for upfront, meaning this isn't done in the background.

Does this mean it currently takes more time to set this up if we bump the worker count?

I think ideally we return the channels right away and do this setup in the background so that we don't block here:

let proof_handle = match ProofTaskManagerHandle::new(

Member Author

this makes sense, good idea for us to do in bg
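
A rough sketch of the suggestion above, assuming nothing about the real reth types: the caller gets the job sender back immediately, while a setup thread creates the per-worker resource and spawns the workers. `Provider`, `open_provider`, `Job`, `spawn_pool_in_background`, and `worker_loop` are hypothetical stand-ins, and the sleep only simulates setup cost. Jobs sent before any worker is ready simply wait in the channel.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for an expensive per-worker resource,
/// such as a read-only database provider.
struct Provider {
    worker_id: usize,
}

/// Hypothetical setup step; the sleep simulates the upfront cost.
fn open_provider(worker_id: usize) -> Provider {
    thread::sleep(Duration::from_millis(10));
    Provider { worker_id }
}

/// Hypothetical job: an input plus a channel for (worker id, result).
struct Job {
    input: u64,
    result_sender: Sender<(usize, u64)>,
}

/// Returns the job sender right away; providers and workers are
/// created on a separate setup thread.
fn spawn_pool_in_background(worker_count: usize) -> Sender<Job> {
    let (work_tx, work_rx) = channel::<Job>();
    let work_rx = Arc::new(Mutex::new(work_rx));
    thread::spawn(move || {
        for worker_id in 0..worker_count {
            let provider = open_provider(worker_id);
            let work_rx = Arc::clone(&work_rx);
            thread::spawn(move || worker_loop(provider, work_rx));
        }
    });
    work_tx
}

/// Pulls jobs from the shared queue until every sender is dropped.
fn worker_loop(provider: Provider, work_rx: Arc<Mutex<Receiver<Job>>>) {
    loop {
        let job = match work_rx.lock().unwrap().recv() {
            Ok(job) => job,
            Err(_) => break,
        };
        // Placeholder work: increment the input.
        let _ = job.result_sender.send((provider.worker_id, job.input + 1));
    }
}
```

One design question this sketch leaves open is error handling: in the reviewed code `view.provider_ro()?` can fail, so a background setup would need some way to report that failure to callers, for example through the result channel.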

Comment on lines +909 to +910
for worker_id in 0..account_worker_count {
    let provider_ro = view.provider_ro()?;
Collaborator

same here

Base automatically changed from yk/worker_pool_acc to main October 15, 2025 00:41
@yongkangc yongkangc requested a review from gakonst as a code owner October 15, 2025 00:41
/// eliminating the need for a routing thread. All handles share reference-counted
/// channels, and workers shut down gracefully when all handles are dropped.
#[derive(Debug, Clone)]
pub struct ProofTaskManagerHandle {
Collaborator

I find it a tiny bit confusing that we have a `ProofTaskManagerHandle` but no `ProofTaskManager`, as there's usually a pattern of having a `fn handle(&self) -> Handle` method. Not necessary to address in this PR; it can be a followup.

self.account_work_tx
    .send(AccountWorkerJob::AccountMultiproof { input, result_sender: tx })
    .map_err(|_| {
        ProviderError::other(std::io::Error::other("account workers unavailable"))
Collaborator

would also like to address this in a followup cleanup, this feels like misuse of std::io::Error
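
One possible shape for that followup is a small dedicated error type in place of a wrapped `std::io::Error`. The sketch below is only an illustration: `ProofWorkerError` and `send_job` are hypothetical names, and how such an error would be converted into `ProviderError` is left out because it depends on that type's API.

```rust
use std::error::Error;
use std::fmt;
use std::sync::mpsc::Sender;

/// Hypothetical error type describing why a job could not be sent.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ProofWorkerError {
    AccountWorkersUnavailable,
    StorageWorkersUnavailable,
}

impl fmt::Display for ProofWorkerError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::AccountWorkersUnavailable => f.write_str("account workers unavailable"),
            Self::StorageWorkersUnavailable => f.write_str("storage workers unavailable"),
        }
    }
}

impl Error for ProofWorkerError {}

/// Sends a job and maps a closed channel to the given domain error.
fn send_job<T>(
    work_tx: &Sender<T>,
    job: T,
    on_closed: ProofWorkerError,
) -> Result<(), ProofWorkerError> {
    work_tx.send(job).map_err(|_| on_closed)
}
```

Callers can then match on the variant rather than inspecting an I/O error message.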

let (storage_work_tx, storage_work_rx) = unbounded::<StorageWorkerJob>();
let (account_work_tx, account_work_rx) = unbounded::<AccountWorkerJob>();

tracing::info!(
Member Author

Suggested change
tracing::info!(
tracing::debug!(

@yongkangc yongkangc enabled auto-merge October 15, 2025 01:36
@yongkangc yongkangc added this pull request to the merge queue Oct 15, 2025
Merged via the queue into main with commit 11c9949 Oct 15, 2025
39 of 40 checks passed
@yongkangc yongkangc deleted the yk/pool_clean branch October 15, 2025 02:07
@github-project-automation github-project-automation bot moved this from In Progress to Done in Reth Tracker Oct 15, 2025
@jenpaff jenpaff moved this from Done to Completed in Reth Tracker Oct 15, 2025
Projects

Status: Completed

Development

Successfully merging this pull request may close these issues.

Remove ProofTaskManager to simplify

5 participants