Skip to content

EOA Production Readiness #14

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from
Open

EOA Production Readiness #14

wants to merge 6 commits into from

Conversation

d4mr
Copy link
Member

@d4mr d4mr commented Jul 10, 2025

Summary by CodeRabbit

  • New Features

    • Introduced an atomic Redis-backed transaction store layer for externally owned accounts, ensuring consistent transaction lifecycle management with concurrency control.
    • Added comprehensive cleanup and reconciliation of submitted transactions with confirmed states, including detailed violation detection and reporting.
    • Enhanced execution routing and queue management with direct Redis connection handling and namespace support.
    • Added event structures and webhook notification envelopes for EOA executor stages, enabling detailed webhook integrations.
    • Introduced a generic webhook notification envelope type and a function to queue webhook notifications for asynchronous delivery.
    • Added a method to create transaction contexts from existing Redis pipelines for atomic job queuing.
  • Improvements

    • Replaced previous transaction store and locking mechanisms with atomic operations using Redis transactions and optimistic concurrency control.
    • Updated transaction lifecycle terminology from "pending" to "submitted" throughout the worker flows.
    • Refined transaction sending logic with scaled delays, improved balance fetching conditions, and explicit signing credential usage.
    • Enhanced error handling with distinct error variants for transaction sending failures and lock loss.
    • Simplified transaction signing interfaces to accept explicit credentials and addresses.
    • Streamlined lock acquisition and release using direct Redis commands for better concurrency management.
    • Centralized Redis client creation and connection management in server initialization and queue management.

Copy link

coderabbitai bot commented Jul 10, 2025

Walkthrough

The entire previous EOA executor store implementation was replaced by a new modular Redis-backed store with atomic transaction support and refined concurrency control. The EOA worker was updated to use this atomic store, with transaction lifecycle states renamed and flows refactored for explicit credential handling, improved locking, and error classification. The execution router and queue manager were adapted to use Redis connection managers and namespaces instead of a shared store instance. New webhook event structures and queuing functions were added to support notification workflows.

Changes

File(s) Change Summary
executors/src/eoa/store.rs Deleted entire file; previous monolithic Redis-backed EOA executor store removed.
executors/src/eoa/store/mod.rs Added new Redis-backed EOA executor store with structured transaction and nonce management, error handling, and namespace support.
executors/src/eoa/store/atomic.rs Added atomic Redis transaction layer with optimistic concurrency control and retry logic for EOA transaction state transitions.
executors/src/eoa/store/submitted.rs Added submitted transaction cleanup logic with atomic reconciliation of confirmed transactions and violation detection.
executors/src/eoa/store/error.rs Added helper methods for constructing lock lost and nonce sync required errors.
executors/src/eoa/worker.rs Refactored worker to use atomic store and Redis connection manager; renamed transaction states (pending→submitted, failed→replaced); updated signing and sending flows to accept explicit signing credentials; improved locking with lease tokens; enhanced error handling.
server/src/execution_router/mod.rs Removed shared EOA executor store field; added Redis connection manager and namespace fields; instantiate EOA executor store locally per request; propagate signing credentials without worker ID.
server/src/queue/manager.rs Removed EOA executor store field; updated constructor to accept Redis client; updated EOA executor worker to use Redis connection manager and namespace instead of store.
server/src/main.rs Created Redis client explicitly; passed Redis client to queue manager; initialized execution router with Redis connection manager and namespace; removed EOA executor store field.
executors/src/eoa/mod.rs Added new public submodule events.
executors/src/eoa/events.rs Added new event structs and enums for EOA executor webhook notifications with methods to generate webhook envelopes for various executor stages.
executors/src/webhook/envelope.rs Added generic BareWebhookNotificationEnvelope struct and conversion method to full webhook notification envelope.
executors/src/webhook/mod.rs Added function to queue webhook envelopes into a webhook job queue with serialization and logging.
twmq/src/lib.rs Added method to create transaction context from existing Redis pipeline for atomic job queuing.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ExecutionRouter
    participant EoaExecutorWorker
    participant AtomicEoaExecutorStore
    participant Chain
    participant Redis

    Client->>ExecutionRouter: execute_eoa(request)
    ExecutionRouter->>EoaExecutorWorker: Submit EOA job (with noop signing credential)
    EoaExecutorWorker->>AtomicEoaExecutorStore: acquire_eoa_lock_aggressively(lease_token)
    AtomicEoaExecutorStore->>Redis: WATCH keys and validate lock ownership
    EoaExecutorWorker->>AtomicEoaExecutorStore: get_submitted_transactions_count()
    alt Work remains
        EoaExecutorWorker->>Chain: send transaction (signed with explicit credential)
        EoaExecutorWorker->>AtomicEoaExecutorStore: atomic_move_borrowed_to_submitted()
        EoaExecutorWorker->>AtomicEoaExecutorStore: add_gas_bump_attempt() / update_health_data()
    else No work
        EoaExecutorWorker->>AtomicEoaExecutorStore: release_eoa_lock()
    end
    EoaExecutorWorker->>AtomicEoaExecutorStore: clean_submitted_transactions()
    AtomicEoaExecutorStore->>Redis: MULTI/EXEC atomic cleanup pipeline
Loading

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Clippy (1.86.0)
Updating crates.io index
Updating git repository `ssh://[email protected]/thirdweb-dev/vault.git`

error: failed to get vault-sdk as a dependency of package engine-aa-core v0.1.0 (/aa-core)

Caused by:
failed to load source for dependency vault-sdk

Caused by:
Unable to update ssh://[email protected]/thirdweb-dev/vault.git?branch=pb%2Fupdate-alloy#b0a72f93

Caused by:
failed to create directory /usr/local/git/db/vault-02bb070e2aa0045e

Caused by:
Permission denied (os error 13)


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0925946 and 1862517.

📒 Files selected for processing (1)
  • twmq/src/lib.rs (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • twmq/src/lib.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: coverage
  • GitHub Check: test
✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
executors/src/eoa/worker.rs (2)

759-786: Consider extracting balance update logic into a separate method.

While the optimization to reduce RPC calls is good, the nested conditionals and TODO comment indicate this needs refactoring. Consider extracting this into a dedicated method for clarity:

-        // TODO: refactor this, very ugly
-        if health.balance <= health.balance_threshold {
-            if now - health.balance_fetched_at > HEALTH_CHECK_INTERVAL {
-                let balance = chain
-                    .provider()
-                    .get_balance(scoped.eoa())
-                    .await
-                    .map_err(|e| {
-                        let engine_error = e.to_engine_error(chain);
-                        EoaExecutorWorkerError::RpcError {
-                            message: format!("Failed to get balance: {}", engine_error),
-                            inner_error: engine_error,
-                        }
-                    })?;
-
-                health.balance = balance;
-                health.balance_fetched_at = now;
-                scoped.update_health_data(&health).await?;
-            }
-
-            if health.balance <= health.balance_threshold {
-                tracing::warn!(
-                    "EOA has insufficient balance (<= {} wei), skipping send flow",
-                    health.balance_threshold
-                );
-                return Ok(0);
-            }
+        // Update balance if needed and check if sufficient
+        if !self.ensure_sufficient_balance(&mut health, scoped, chain, now).await? {
+            tracing::warn!(
+                "EOA has insufficient balance (<= {} wei), skipping send flow", 
+                health.balance_threshold
+            );
+            return Ok(0);
         }

Then implement ensure_sufficient_balance as a separate method that handles the balance fetching logic cleanly.


526-527: Performance opportunity: Batch Redis operations as noted in TODO.

The TODO correctly identifies that sequential Redis operations for state transitions could be batched for better performance. This would reduce round trips to Redis and improve throughput.

Would you like me to help implement batched state transitions for borrowed -> submitted and borrowed -> recycled operations?

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00832c3 and cc4ff1d.

📒 Files selected for processing (3)
  • executors/src/eoa/store.rs (9 hunks)
  • executors/src/eoa/worker.rs (27 hunks)
  • server/src/execution_router/mod.rs (2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
executors/src/eoa/store.rs (2)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
executors/src/eoa/worker.rs (2)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
🧬 Code Graph Analysis (1)
executors/src/eoa/worker.rs (3)
executors/src/eoa/store.rs (4)
  • nonce (454-459)
  • eoa (2349-2351)
  • from (1988-1992)
  • from (1996-2001)
twmq/src/multilane.rs (1)
  • count (283-347)
core/src/error.rs (5)
  • from (251-268)
  • from (272-276)
  • from (452-456)
  • from (460-464)
  • from (468-472)
🔇 Additional comments (11)
server/src/execution_router/mod.rs (1)

432-432: Clarify the purpose of noop_signing_credential field name.

The field name noop_signing_credential is confusing. If this credential is used for actual signing operations in the worker, the "noop" prefix is misleading. Consider renaming it to signing_credential or documenting why it's called "noop".

executors/src/eoa/worker.rs (10)

14-19: LGTM! Import organization looks good.

The addition of chain::{Chain, ChainService} imports improves code organization by consolidating related imports together.


47-51: Good refactor: Direct credential passing improves security.

Replacing worker_id with noop_signing_credential is a solid improvement that allows direct credential handling for no-op transactions, eliminating the need for credential lookup and improving security.


86-96: Excellent error classification improvement.

Separating TransactionSendError from generic RpcError provides better granularity for handling transaction broadcasting failures specifically, which improves debugging and error recovery strategies.


246-278: Excellent terminology improvements for transaction states.

The renaming from Pending→Submitted and Failed→Replaced provides much clearer semantics:

  • "Submitted" accurately reflects transactions that have been broadcast to the network
  • "Replaced" is more precise than "Failed" for transactions superseded by others with the same nonce

This improves code readability and reduces confusion about transaction lifecycle states.


338-393: Smart improvement: Using lease_token for lock management.

Switching from worker_id in job data to lease_token from the job context is a more robust approach. The lease token is managed by the job queue system and ensures proper lock ownership verification, which aligns well with the WATCH-based coordination mentioned in the learnings.


442-456: Good addition: Tracking submitted transactions for complete lifecycle.

Including submitted_count in the work remaining check ensures the worker continues processing until all transactions reach a terminal state (confirmed or replaced). This prevents premature worker termination when transactions are still awaiting confirmation.


1180-1188: Good optimization: Scaled delays prevent nonce conflicts.

The change from a fixed delay to 50 * i milliseconds creates a sensible spacing between transaction broadcasts, reducing the likelihood of nonce conflicts while maintaining reasonable throughput.


1343-1377: Clean refactor: Explicit credential passing and simplified no-op transactions.

The changes improve the no-op transaction flow by:

  1. Accepting credentials explicitly rather than relying on implicit data
  2. Simplifying transaction construction with sensible defaults
  3. Using the new TransactionSendError for better error classification

The implementation is more straightforward and maintainable.


1853-1892: Good separation of concerns in transaction signing.

Decoupling the signing process by explicitly passing from address and credential improves flexibility and makes the signing flow more testable. This allows signing transactions without requiring the full transaction data structure.


1204-1210: Ensure pending transaction handling aligns with the provider’s return type

The call in executors/src/eoa/worker.rs (around lines 1204–1210) now does:

Ok(pending) => {
    // Transaction sent successfully
    match scoped.move_borrowed_to_submitted(
        prepared.nonce,
        &pending.tx_hash().to_string(),
        &prepared.transaction_id,

• Please confirm that chain.provider().send_tx_envelope(...) indeed returns a type exposing tx_hash() (i.e. a pending‐transaction struct) and not, for example, a receipt or unit.
• If the Alloy provider’s API has changed, adjust this extraction or mapping accordingly.

Comment on lines 1852 to 1861
for transaction_id in &oldest_transactions {
let tx_data_key = self.transaction_data_key_name(transaction_id);
let _: () = conn.del(&tx_data_key).await?;
}

// Remove from tracking sets
for transaction_id in &oldest_transactions {
let _: () = conn.zrem(&completed_eoa_key, transaction_id).await?;
let _: () = conn.zrem(&completed_global_key, transaction_id).await?;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Batch Redis operations in pruning methods for better performance.

The pruning methods execute individual del and zrem operations in loops, which is inefficient. For better performance, use Redis pipelines to batch these operations.

Apply this pattern to both pruning methods:

-        // Remove transaction data and tracking
-        for transaction_id in &oldest_transactions {
-            let tx_data_key = self.transaction_data_key_name(transaction_id);
-            let _: () = conn.del(&tx_data_key).await?;
-        }
-
-        // Remove from tracking sets
-        for transaction_id in &oldest_transactions {
-            let _: () = conn.zrem(&completed_eoa_key, transaction_id).await?;
-            let _: () = conn.zrem(&completed_global_key, transaction_id).await?;
-        }
+        // Batch remove transaction data and tracking
+        let mut pipeline = twmq::redis::pipe();
+        for transaction_id in &oldest_transactions {
+            let tx_data_key = self.transaction_data_key_name(transaction_id);
+            pipeline.del(&tx_data_key);
+            pipeline.zrem(&completed_eoa_key, transaction_id);
+            pipeline.zrem(&completed_global_key, transaction_id);
+        }
+        let _: Vec<twmq::redis::Value> = pipeline.query_async(&mut conn).await?;

Also applies to: 1894-1902

🤖 Prompt for AI Agents
In executors/src/eoa/store.rs around lines 1852 to 1861 and also 1894 to 1902,
the code performs individual Redis `del` and `zrem` operations inside loops,
which is inefficient. Refactor these loops to use Redis pipelines by batching
all `del` commands together and all `zrem` commands together in a single
pipeline execution. This will reduce the number of round trips to Redis and
improve performance.

};

// Get all EOA-specific completed transaction keys
let keys: Vec<String> = conn.keys(&pattern).await?;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Replace KEYS command with SCAN to avoid blocking Redis.

Using keys() can block Redis in production environments with many keys. Use SCAN for non-blocking iteration.

Replace the KEYS usage with SCAN:

-        // Get all EOA-specific completed transaction keys
-        let keys: Vec<String> = conn.keys(&pattern).await?;
+        // Get all EOA-specific completed transaction keys using SCAN
+        let mut keys = Vec::new();
+        let mut cursor = 0u64;
+        loop {
+            let (new_cursor, batch): (u64, Vec<String>) = twmq::redis::cmd("SCAN")
+                .arg(cursor)
+                .arg("MATCH")
+                .arg(&pattern)
+                .arg("COUNT")
+                .arg(100)
+                .query_async(&mut conn)
+                .await?;
+            keys.extend(batch);
+            cursor = new_cursor;
+            if cursor == 0 {
+                break;
+            }
+        }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let keys: Vec<String> = conn.keys(&pattern).await?;
// Get all EOA-specific completed transaction keys using SCAN
let mut keys = Vec::new();
let mut cursor = 0u64;
loop {
let (new_cursor, batch): (u64, Vec<String>) = twmq::redis::cmd("SCAN")
.arg(cursor)
.arg("MATCH")
.arg(&pattern)
.arg("COUNT")
.arg(100)
.query_async(&mut conn)
.await?;
keys.extend(batch);
cursor = new_cursor;
if cursor == 0 {
break;
}
}
🤖 Prompt for AI Agents
In executors/src/eoa/store.rs at line 1924, replace the blocking Redis KEYS
command with the non-blocking SCAN command. Modify the code to asynchronously
iterate over keys matching the pattern using SCAN, collecting them into a
Vec<String> without blocking Redis, ensuring efficient and safe key retrieval in
production environments.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (7)
executors/src/eoa/store/submitted.rs (1)

188-190: Address the TODO for queuing success jobs.

The cleanup operation moves transactions to success state but doesn't queue the success jobs. This could lead to missing webhook notifications or other post-success processing.

Would you like me to implement the success job queuing logic or create an issue to track this task?

executors/src/eoa/store/atomic.rs (1)

259-259: Fix trailing comma in documentation example.

Remove the trailing comma after namespace parameter.

-/// let base_store = EoaExecutorStore::new(redis, namespace, );
+/// let base_store = EoaExecutorStore::new(redis, namespace);
executors/src/eoa/store/mod.rs (1)

67-68: Add TODO comment for attempt_number field.

The attempt_number field is initialized to 0 and updated later when reading all attempts. Consider adding a TODO comment to make this clearer.

     pub details: Signed<TypedTransaction>,
     pub sent_at: u64, // Unix timestamp in milliseconds
+    // TODO: This is set to 0 initially and updated when reading all attempts
     pub attempt_number: u32,
executors/src/eoa/worker.rs (4)

427-444: Consider using AtomicEoaExecutorStore's release method.

The current implementation manually deletes the lock key, but AtomicEoaExecutorStore provides a release_eoa_lock() method that handles this with proper error handling. However, this would require passing the store instance, which might not be available in the hook context.


492-492: Valid TODO: Batch Redis operations for better performance.

The sequential Redis operations could be slow with many borrowed transactions. Consider using Redis pipelines to batch the state transitions.

Would you like me to implement the batched version or create an issue to track this optimization?


694-721: Refactor nested balance checking logic.

The balance update logic works but the nested conditions make it hard to follow. Consider extracting to a helper method.

async fn should_update_balance(&self, health: &EoaHealth, now: u64) -> bool {
    health.balance <= health.balance_threshold 
        && now - health.balance_fetched_at > HEALTH_CHECK_INTERVAL
}

// Then in send_flow:
if self.should_update_balance(&health, now).await {
    let balance = chain.provider().get_balance(scoped.eoa()).await
        .map_err(|e| /* ... */)?;
    
    health.balance = balance;
    health.balance_fetched_at = now;
    scoped.update_health_data(&health).await?;
}

if health.balance <= health.balance_threshold {
    tracing::warn!(/* ... */);
    return Ok(0);
}

763-769: Remove commented-out code.

This commented code appears to be unused and should be removed to keep the codebase clean.

-        // let highest_submitted_nonce_txs =
-        //     scoped.get_highest_submitted_nonce_tranasactions().await?;
-
-        // let highest_submitted_nonce = highest_submitted_nonce_txs
-        //     .first()
-        //     .and_then(|tx| Some(tx.nonce));
-
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cc4ff1d and 81d56cc.

📒 Files selected for processing (10)
  • executors/src/eoa/mod.rs (1 hunks)
  • executors/src/eoa/store.rs (0 hunks)
  • executors/src/eoa/store/atomic.rs (1 hunks)
  • executors/src/eoa/store/error.rs (1 hunks)
  • executors/src/eoa/store/mod.rs (1 hunks)
  • executors/src/eoa/store/submitted.rs (1 hunks)
  • executors/src/eoa/worker.rs (46 hunks)
  • server/src/execution_router/mod.rs (4 hunks)
  • server/src/main.rs (2 hunks)
  • server/src/queue/manager.rs (2 hunks)
💤 Files with no reviewable changes (1)
  • executors/src/eoa/store.rs
✅ Files skipped from review due to trivial changes (1)
  • executors/src/eoa/mod.rs
🧰 Additional context used
🧠 Learnings (8)
📓 Common learnings
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
executors/src/eoa/store/error.rs (2)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
server/src/execution_router/mod.rs (2)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
server/src/queue/manager.rs (2)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
executors/src/eoa/store/submitted.rs (1)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
executors/src/eoa/store/atomic.rs (2)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
executors/src/eoa/store/mod.rs (2)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
executors/src/eoa/worker.rs (2)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
🧬 Code Graph Analysis (1)
executors/src/eoa/store/error.rs (1)
executors/src/eoa/store/atomic.rs (1)
  • eoa (282-284)
🔇 Additional comments (11)
server/src/main.rs (1)

54-54: Clean architectural refactoring for Redis client management.

The centralization of Redis client creation and explicit passing of connection managers improves modularity and makes the dependencies clearer.

Also applies to: 57-57, 78-79

executors/src/eoa/store/error.rs (1)

1-22: Well-structured error helper methods.

The error constructors properly encapsulate the context needed for debugging lock loss and nonce synchronization issues, which aligns with the WATCH-based coordination pattern mentioned in the learnings.

server/src/execution_router/mod.rs (1)

413-421: Clean refactoring to local store instantiation.

The change from shared store to local instantiation with explicit Redis connection manager improves isolation and makes the store lifecycle clearer.

server/src/queue/manager.rs (1)

50-50: Consistent refactoring for Redis connection management.

The changes properly propagate the new Redis client management pattern through the queue layer, maintaining consistency with the broader architectural changes.

Also applies to: 215-216

executors/src/eoa/store/submitted.rs (2)

19-65: Well-implemented transaction serialization with robust error handling.

The SubmittedTransaction struct properly handles Redis string parsing with appropriate error logging for malformed data. The format documentation is clear and helpful.


216-276: Comprehensive violation detection for data integrity.

The detect_violations function thoroughly identifies anomalies including cross-nonce duplications, multiple confirmations per nonce, and missing receipts. This provides excellent observability for debugging transaction processing issues.

executors/src/eoa/store/atomic.rs (3)

19-34: Well-designed trait for atomic Redis operations.

The separation of validation and operation phases is excellent, allowing precondition checks before pipeline execution. The trait provides a clean abstraction for WATCH/MULTI/EXEC patterns.


387-464: Robust implementation of lock-protected operations.

The method properly implements WATCH/MULTI/EXEC with exponential backoff retry logic. Lock ownership is validated both before and after operations, ensuring consistency.


594-812: Consistent and well-structured atomic operations.

All atomic methods follow the same pattern using the execute_with_watch_and_retry helper, ensuring consistency and proper error handling across operations.

executors/src/eoa/store/mod.rs (1)

590-621: Well-implemented transaction addition with atomic operations.

Good use of pipeline to atomically store transaction data and add to pending queue. The timestamp-based ordering in the sorted set allows processing transactions in FIFO order.

executors/src/eoa/worker.rs (1)

254-424: Excellent refactoring to use atomic store.

The worker has been successfully refactored to use the new AtomicEoaExecutorStore. The aggressive lock acquisition and WATCH-based coordination mentioned in the learnings are properly implemented. The error handling and state transitions are well-structured.

"eoa_{}_{}",
eoa_execution_options.from, base_execution_options.chain_id
),
noop_signing_credential: signing_credential,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Misleading field name noop_signing_credential.

The field name suggests this is a placeholder or no-operation credential, but it's actually passed as the real signing credential for EOA transactions. Consider renaming to signing_credential or fallback_signing_credential for clarity.

-            noop_signing_credential: signing_credential,
+            signing_credential: signing_credential,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
noop_signing_credential: signing_credential,
signing_credential: signing_credential,
🤖 Prompt for AI Agents
In server/src/execution_router/mod.rs at line 440, the field named
`noop_signing_credential` is misleading because it is used as the actual signing
credential for EOA transactions. Rename this field to a clearer name such as
`signing_credential` or `fallback_signing_credential` to accurately reflect its
purpose and improve code readability.

Comment on lines +225 to +226
// Add nonce to recycled set (with timestamp as score)
pipeline.zadd(&self.recycled_key, self.nonce, self.nonce);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Comment inconsistency with implementation.

The comment mentions "with timestamp as score" but the implementation uses nonce as both member and score. Either update the comment to reflect the actual behavior or use a timestamp if that was the intended design.

-        // Add nonce to recycled set (with timestamp as score)
-        pipeline.zadd(&self.recycled_key, self.nonce, self.nonce);
+        // Add nonce to recycled set (ordered by nonce value)
+        pipeline.zadd(&self.recycled_key, self.nonce, self.nonce);
🤖 Prompt for AI Agents
In executors/src/eoa/store/atomic.rs around lines 225 to 226, the comment states
that the nonce is added to the recycled set with a timestamp as the score, but
the code uses the nonce itself as the score. To fix this, either update the
comment to say that the nonce is used as the score or modify the code to use the
current timestamp as the score if that was the intended behavior.

Comment on lines +491 to +492
/// self.execute_with_watch_and_retry(, worker_id, &safe_tx).await?;
/// ```
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix incorrect example in documentation.

The example shows passing worker_id as a parameter, but the method signature doesn't include this parameter.

-/// self.execute_with_watch_and_retry(, worker_id, &safe_tx).await?;
+/// self.execute_with_watch_and_retry(&safe_tx).await?;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/// self.execute_with_watch_and_retry(, worker_id, &safe_tx).await?;
/// ```
/// self.execute_with_watch_and_retry(&safe_tx).await?;
///
🤖 Prompt for AI Agents
In executors/src/eoa/store/atomic.rs around lines 491 to 492, the documentation
example incorrectly shows passing `worker_id` as a parameter to the method
`execute_with_watch_and_retry`, but the method signature does not accept this
parameter. Update the example to remove `worker_id` from the argument list so it
matches the actual method signature.

///
/// This will return all hash:id pairs for the highest nonce
#[tracing::instrument(skip_all)]
pub async fn get_highest_submitted_nonce_tranasactions(
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix typo in method name.

The method name contains a typo: "tranasactions" should be "transactions".

-    pub async fn get_highest_submitted_nonce_tranasactions(
+    pub async fn get_highest_submitted_nonce_transactions(
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub async fn get_highest_submitted_nonce_tranasactions(
pub async fn get_highest_submitted_nonce_transactions(
🤖 Prompt for AI Agents
In executors/src/eoa/store/mod.rs at line 638, the method name
get_highest_submitted_nonce_tranasactions contains a typo. Rename the method to
get_highest_submitted_nonce_transactions by correcting "tranasactions" to
"transactions" to fix the spelling error.

@@ -1178,7 +1129,7 @@ where
.map(|(i, prepared)| async move {
// Add delay for ordering (except first transaction)
if i > 0 {
sleep(Duration::from_millis(50)).await; // 50ms delay between consecutive nonces
sleep(Duration::from_millis(50 * i as u64)).await; // 50ms delay between consecutive nonces
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider capping the send delay to prevent excessive waits.

The linear scaling of delays (50ms * index) could lead to large delays for batches. For 20 transactions, the last one waits 950ms. Consider capping or using a different strategy.

-sleep(Duration::from_millis(50 * i as u64)).await; // 50ms delay between consecutive nonces
+// Cap delay at 200ms to prevent excessive waits
+let delay_ms = (50 * i as u64).min(200);
+sleep(Duration::from_millis(delay_ms)).await;
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
sleep(Duration::from_millis(50 * i as u64)).await; // 50ms delay between consecutive nonces
// Cap delay at 200ms to prevent excessive waits
let delay_ms = (50 * i as u64).min(200);
sleep(Duration::from_millis(delay_ms)).await;
🤖 Prompt for AI Agents
In executors/src/eoa/worker.rs at line 1132, the delay before sending
transactions increases linearly with the index, which can cause excessive wait
times for large batches. Modify the code to cap the delay at a maximum value,
for example, use the minimum between 50ms times the index and a fixed upper
limit like 500ms. This prevents the delay from growing indefinitely while
maintaining staggered sends.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
executors/src/eoa/worker.rs (1)

1137-1137: Consider capping the send delay to prevent excessive waits.

The linear scaling of delays (50ms * index) could lead to large delays for batches. For 20 transactions, the last one waits 950ms. Consider capping or using a different strategy.

🧹 Nitpick comments (7)
executors/src/webhook/mod.rs (1)

482-482: Remove unnecessary return keyword.

The return keyword is not needed here as this is the last expression in the closure.

Apply this diff:

-                return (payload, webhook_notification_envelope);
+                (payload, webhook_notification_envelope)
executors/src/eoa/events.rs (1)

48-119: Consider removing unnecessary clones for better performance.

The methods are well-structured, but several .clone() calls could be avoided by taking ownership of the parameters instead of borrowing.

For example, in send_attempt_success_envelopes:

 pub fn send_attempt_success_envelopes(
     &self,
-    submitted_transaction: SubmittedTransaction,
+    submitted_transaction: SubmittedTransaction,
 ) -> BareWebhookNotificationEnvelope<SerializableSuccessData<SubmittedTransaction>> {
     BareWebhookNotificationEnvelope {
         transaction_id: self.transaction_data.transaction_id.clone(),
         executor_name: EXECUTOR_NAME.to_string(),
         stage_name: EoaExecutorStage::SendAttempt.to_string(),
         event_type: StageEvent::Success,
         payload: SerializableSuccessData {
-            result: submitted_transaction.clone(),
+            result: submitted_transaction,
         },
     }
 }

Similar optimizations can be applied to other methods where the parameter is already owned.

executors/src/eoa/store/submitted.rs (3)

89-89: Fix typo in comment.

"uknown" should be "unknown".

Apply this diff:

-    /// Any nonces that have no confirmations (transactions we sent got replaced by a different one uknown to us)
+    /// Any nonces that have no confirmations (transactions we sent got replaced by a different one unknown to us)

190-191: Address TODO comment for production readiness.

The TODO comment indicates missing success job queuing functionality that should be implemented for production readiness.

This appears to be part of the webhook notification system. The success jobs should likely trigger webhook notifications for confirmed transactions. Would you like me to help implement this or create an issue to track this requirement?


22-48: Consider improving error handling in parsing logic.

The from_redis_strings method filters out invalid entries but only logs errors. For production readiness, consider whether parsing failures should be handled more explicitly or if additional validation is needed.

Consider adding metrics or more structured error handling:

                } else {
-                    tracing::error!("Invalid queued_at timestamp: {}", tx.0);
+                    tracing::error!(
+                        redis_string = %tx.0,
+                        "Invalid queued_at timestamp format in submitted transaction"
+                    );
+                    // Consider adding metrics here for monitoring
                    None
                }
executors/src/eoa/worker.rs (2)

497-497: Address the batching TODO for performance optimization.

The sequential processing of borrowed state transitions could become a bottleneck under high load. Consider implementing batch operations for better performance.

Would you like me to help design a batched approach for these atomic operations?


768-773: Remove commented code or implement the functionality.

The commented code for highest submitted nonce transactions should either be removed if not needed or properly implemented if required.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81d56cc and 67a7ab7.

📒 Files selected for processing (7)
  • executors/src/eoa/events.rs (1 hunks)
  • executors/src/eoa/mod.rs (1 hunks)
  • executors/src/eoa/store/submitted.rs (1 hunks)
  • executors/src/eoa/worker.rs (46 hunks)
  • executors/src/webhook/envelope.rs (1 hunks)
  • executors/src/webhook/mod.rs (3 hunks)
  • server/src/queue/manager.rs (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • executors/src/eoa/mod.rs
  • server/src/queue/manager.rs
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
executors/src/eoa/events.rs (2)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
executors/src/eoa/store/submitted.rs (1)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
executors/src/eoa/worker.rs (2)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
🔇 Additional comments (12)
executors/src/webhook/envelope.rs (1)

44-71: Well-designed webhook envelope pattern.

The separation between BareWebhookNotificationEnvelope and WebhookNotificationEnvelope is a clean design pattern that allows for composable webhook notification creation. The conversion method appropriately generates metadata like notification IDs while preserving the core notification data.

executors/src/webhook/mod.rs (1)

435-507: Well-implemented webhook queuing function.

The function properly handles serialization, job creation, and logging. The error handling with Result types is appropriate, and the use of structured logging provides good observability.

executors/src/eoa/events.rs (1)

35-44: Clean enum design with proper Display implementation.

The EoaExecutorStage enum with its Display implementation follows Rust conventions well and provides clear string representations for the different stages.

executors/src/eoa/store/submitted.rs (1)

107-216: Well-implemented atomic transaction pattern.

The SafeRedisTransaction implementation properly handles Redis WATCH operations, validation, and atomic updates. The cleanup logic correctly reconciles submitted transactions with confirmed ones while detecting various violation scenarios. This follows the learnings about WATCH-based coordination for safe Redis operations.

executors/src/eoa/worker.rs (8)

14-39: LGTM! Import changes align with architectural improvements.

The new imports for Redis connection management, webhook functionality, and atomic store operations support the transition to direct Redis management and improved concurrency control.


97-101: Excellent error classification improvement.

Adding a dedicated TransactionSendError variant provides better distinction between transaction broadcast failures and generic RPC errors, which will improve debugging and error handling logic.


241-248: Good enhancement with rich receipt data.

Including the full TransactionReceipt provides comprehensive transaction information for confirmed transactions, which is valuable for downstream processing and debugging.


279-283: Approve architectural changes to worker structure.

The shift to direct Redis connection management and webhook queue integration aligns with the new atomic store design and improves worker autonomy.


505-530: Excellent use of atomic operations for crash recovery.

The atomic state transitions from borrowed to submitted/recycled with proper error classification ensure data consistency during crash recovery scenarios.


1795-1836: Approve the improved credential handling.

Making SigningCredential an explicit parameter improves method clarity and allows better separation of signing logic from transaction data management.


630-685: Excellent improvements to confirmation flow processing.

The new approach with CleanupReport, rich receipt data collection, and proper transaction categorization provides much better visibility into transaction processing outcomes.


432-449: Good implementation of explicit lock management.

The direct Redis DEL operation for lock release with proper error logging provides clear lock lifecycle management and good visibility into potential issues.

Comment on lines +699 to +700
// TODO: refactor this, very ugly
if health.balance <= health.balance_threshold {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Refactor the health checking logic.

The TODO comment indicates this logic needs cleanup. The nested conditionals and repeated balance fetching make the code hard to follow.

Consider extracting the balance checking logic into a separate method:

-        // TODO: refactor this, very ugly
-        if health.balance <= health.balance_threshold {
-            if now - health.balance_fetched_at > HEALTH_CHECK_INTERVAL {
-                let balance = chain
-                    .provider()
-                    .get_balance(scoped.eoa())
-                    .await
-                    .map_err(|e| {
-                        let engine_error = e.to_engine_error(chain);
-                        EoaExecutorWorkerError::RpcError {
-                            message: format!("Failed to get balance: {}", engine_error),
-                            inner_error: engine_error,
-                        }
-                    })?;
-
-                health.balance = balance;
-                health.balance_fetched_at = now;
-                scoped.update_health_data(&health).await?;
-            }
-
-            if health.balance <= health.balance_threshold {
-                tracing::warn!(
-                    "EOA has insufficient balance (<= {} wei), skipping send flow",
-                    health.balance_threshold
-                );
-                return Ok(0);
-            }
-        }
+        if !self.ensure_sufficient_balance(&mut health, scoped, chain, now).await? {
+            return Ok(0);
+        }

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In executors/src/eoa/worker.rs around lines 699 to 700, the health checking
logic is cluttered and hard to read due to nested conditionals and repeated
balance checks. Refactor by extracting the balance comparison into a dedicated
method that returns a boolean indicating if the balance is below or equal to the
threshold. Replace the existing inline conditional with a call to this new
method to improve readability and maintainability.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
executors/src/eoa/events.rs (1)

78-82: Consider making retry configuration more explicit.

The hardcoded values in the nack envelope (delay_ms: None, max_attempts: None, etc.) may be intentional design decisions, but consider whether these should be configurable parameters for different deployment environments.

If retry behavior needs to be configurable, consider adding a configuration parameter:

 pub fn send_attempt_nack_envelope(
     &self,
     nonce: u64,
     error: EoaExecutorWorkerError,
     attempt_number: u32,
+    retry_config: Option<&RetryConfig>,
 ) -> BareWebhookNotificationEnvelope<SerializableNackData<EoaSendAttemptNackData>>
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 67a7ab7 and 1c72974.

📒 Files selected for processing (2)
  • executors/src/eoa/events.rs (1 hunks)
  • executors/src/webhook/mod.rs (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • executors/src/webhook/mod.rs
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
executors/src/eoa/events.rs (2)
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: In the EOA executor system, aggressive lock acquisition is safe because every Redis state mutation uses WATCH operations on the lock key. If the lock is lost during a transaction, the WATCH causes the transaction to fail and the worker exits gracefully. This provides coordination between workers even when using forceful lock takeover.
Learnt from: d4mr
PR: thirdweb-dev/engine-core#5
File: executors/src/eoa/worker.rs:173-176
Timestamp: 2025-07-06T15:44:13.701Z
Learning: The EOA executor store uses comprehensive WATCH-based coordination where every Redis state mutation watches the lock key and validates ownership before proceeding. If the lock is lost during any operation, the transaction fails with LockLost error. This makes aggressive lock acquisition safe because only the actual lock holder can successfully perform state mutations, regardless of who claims the lock.
🔇 Additional comments (6)
executors/src/eoa/events.rs (6)

1-15: Well-organized imports with clear module separation.

The imports are logically grouped and provide clear access to the necessary dependencies for event handling and webhook notifications.


17-19: Simple and focused event structure.

The EoaExecutorEvent struct provides a clean wrapper around transaction data for event generation.


21-25: Comprehensive nack data structure with proper serialization support.

The struct correctly captures both the nonce and error details needed for nack events, with appropriate derives for the webhook system.


27-42: Well-defined stage enum with consistent string representation.

The EoaExecutorStage enum properly covers the key transaction lifecycle stages, and the Display implementation uses consistent snake_case naming for external APIs.


46-134: Consistent and well-structured envelope creation methods.

All envelope creation methods follow a consistent pattern and properly populate the webhook notification structure. The use of cloning is appropriate for the webhook system's needs.


118-133: Stage mapping is correct as implemented.

The EoaExecutorStage enum only defines three variants—SendAttempt, TransactionReplaced, and TransactionConfirmed—and the failure envelope appropriately uses SendAttempt. There is no dedicated “failure” stage in the enum, so mapping failures to the send‐attempt stage is intentional and consistent with the other envelope methods:

  • transaction_replaced_envelopeTransactionReplaced
  • transaction_confirmed_envelopeTransactionConfirmed
  • transaction_failed_envelopeSendAttempt

No changes required here.

d4mr added 2 commits July 12, 2025 07:25
- Introduced `transaction_context_from_pipeline` method to allow atomic job queuing within an existing Redis transaction.
- This enhancement improves the flexibility of transaction handling in the queue implementation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant