Deduplicate transactions on BatchTxPool prior to submission
#916
base: main
Conversation
Walkthrough: Adds txHash tracking to pooled EVM transactions and prevents duplicate submissions by checking for the txHash's presence in the pool, returning an error when a duplicate is found.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Client
    participant BatchPool
    participant Submitter
    participant Gateway
    Client->>BatchPool: Add(signedTx, txHash, nonce)
    BatchPool->>BatchPool: create pooledEvmTx {txHash, nonce, raw}
    BatchPool->>BatchPool: check pool for txHash (slices.Contains)
    alt duplicate found
        BatchPool-->>Client: ErrDuplicateTransaction
    else new tx
        BatchPool-->>Client: accepted
    end
    Note right of BatchPool: periodic flush / build batch
    BatchPool->>BatchPool: sort/compact unique by txHash
    BatchPool->>Submitter: send batch of unique hexTxs
    Submitter->>Gateway: submit batch
    Gateway-->>Submitter: response
    Submitter-->>BatchPool: update/remove submitted txs
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
📜 Recent review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- services/requester/batch_tx_pool.go (4 hunks)
- tests/tx_batching_test.go (1 hunk)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.
Applied to files:
services/requester/batch_tx_pool.go
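The learning above can be illustrated with a short, self-contained sketch. The names here are generic stand-ins, not gateway code; only the deep-copy pattern itself is the point:

```go
package main

import "fmt"

type pooledEvmTx struct{ nonce uint64 }

// deepCopy copies each slice value individually, so the returned map
// shares no backing arrays with the original. A plain map copy would
// still alias the slices, letting one goroutine's mutation corrupt
// the other's snapshot.
func deepCopy(src map[string][]pooledEvmTx) map[string][]pooledEvmTx {
	dst := make(map[string][]pooledEvmTx, len(src))
	for addr, txs := range src {
		cp := make([]pooledEvmTx, len(txs))
		copy(cp, txs)
		dst[addr] = cp
	}
	return dst
}

func main() {
	orig := map[string][]pooledEvmTx{"a": {{nonce: 1}}}
	snap := deepCopy(orig)
	orig["a"][0].nonce = 99        // mutate the original after snapshotting
	fmt.Println(snap["a"][0].nonce) // still 1: the snapshot is unaffected
}
```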
🪛 GitHub Check: Lint
services/requester/batch_tx_pool.go
[failure] 214-214:
S1008: should use 'return a.txHash == b.txHash' instead of 'if a.txHash == b.txHash { return true }; return false' (staticcheck)
[failure] 213-213:
unusedresult: result of slices.CompactFunc call not used (govet)
⏰ Context from checks skipped due to timeout. (1)
- GitHub Check: Test
🔇 Additional comments (4)
services/requester/batch_tx_pool.go (3)
6-6: LGTM: Import added for deduplication. The `slices` package import is correctly added to support the `CompactFunc` used for filtering duplicate transactions.

28-28: LGTM: Transaction hash field added for deduplication. The `txHash` field enables duplicate detection during batch submission, which addresses the issue of identical transactions being submitted multiple times.

148-148: LGTM: Transaction hash correctly captured. The transaction hash is properly stored when adding transactions to the pool, enabling deduplication logic downstream.

tests/tx_batching_test.go (1)

517-562: LGTM: Test validates duplicate transaction handling. The test correctly validates the deduplication feature:
- Sends an initial transaction with nonce=0
- Sends 5 duplicate transactions with nonce=1 (identical payloads)
- Verifies all receipts eventually succeed

Since duplicate transactions produce identical hashes, the `hashes` slice contains repeated values, and the test verifies the same receipts multiple times. This correctly validates that duplicates are handled without causing execution failures.
Force-pushed from c62c88b to d023295 (Compare)
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- services/requester/batch_tx_pool.go (4 hunks)
- tests/tx_batching_test.go (1 hunk)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.
Applied to files:
services/requester/batch_tx_pool.go
⏰ Context from checks skipped due to timeout. (1)
- GitHub Check: Test
Force-pushed from d023295 to c2b098a (Compare)
Actionable comments posted: 0
🧹 Nitpick comments (1)
services/requester/batch_tx_pool.go (1)
222-225: LGTM! Payload building is clean and efficient. Preallocating `hexEncodedTxs` with the exact length of `uniqueTxs` and directly iterating over the deduplicated transactions is the correct approach.

Optional: Consider adding observability for duplicate detection. For monitoring bot behavior and validating the fix, you could log or track metrics on the number of duplicates filtered:

```diff
 	seen[tx.txHash] = struct{}{}
 	uniqueTxs = append(uniqueTxs, tx)
 }
+
+if duplicates := len(pooledTxs) - len(uniqueTxs); duplicates > 0 {
+	t.logger.Debug().
+		Int("original", len(pooledTxs)).
+		Int("unique", len(uniqueTxs)).
+		Int("duplicates", duplicates).
+		Msg("filtered duplicate transactions in batch")
+}
```

This would help confirm the deduplication is working as intended and provide insights into how often duplicates occur.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- services/requester/batch_tx_pool.go (3 hunks)
- tests/tx_batching_test.go (1 hunk)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/tx_batching_test.go
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.
Applied to files:
services/requester/batch_tx_pool.go
⏰ Context from checks skipped due to timeout. (1)
- GitHub Check: Test
🔇 Additional comments (3)
services/requester/batch_tx_pool.go (3)
25-29: LGTM! Transaction hash tracking added correctly. Adding the `txHash` field to `pooledEvmTx` enables deduplication by transaction hash, which is the right approach for identifying identical transactions.

147-149: LGTM! Transaction hash captured correctly when pooling. The `txHash` is computed from `tx.Hash()` and stored alongside the payload and nonce. This is the correct location to capture the hash, as it's protected by the mutex and only applies to transactions being pooled for batch submission.

211-220: LGTM! Deduplication correctly handles non-adjacent duplicates. The map-based deduplication approach correctly addresses the concern from previous reviews about `slices.CompactFunc` only removing adjacent duplicates. Using a `seen` set ensures that all duplicate transactions are filtered out regardless of their position in the sorted slice, preventing the "insufficient computation" errors caused by redundant submissions. The `pooledTxs[:0]` pattern efficiently reuses the underlying array while safely building the deduplicated slice.
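The seen-set approach described above can be sketched as a standalone function. The struct and variable names follow the review; the surrounding code is an assumption, not the PR's exact implementation:

```go
package main

import "fmt"

type pooledEvmTx struct {
	txHash [32]byte
	nonce  uint64
}

// dedupe filters out duplicate hashes wherever they appear in the slice,
// unlike slices.CompactFunc, which only drops adjacent duplicates.
// The pooledTxs[:0] pattern reuses the input's backing array: it is safe
// here because unique elements are only ever written at or before the
// index they were read from.
func dedupe(pooledTxs []pooledEvmTx) []pooledEvmTx {
	seen := make(map[[32]byte]struct{}, len(pooledTxs))
	uniqueTxs := pooledTxs[:0]
	for _, tx := range pooledTxs {
		if _, ok := seen[tx.txHash]; ok {
			continue
		}
		seen[tx.txHash] = struct{}{}
		uniqueTxs = append(uniqueTxs, tx)
	}
	return uniqueTxs
}

func main() {
	txs := []pooledEvmTx{
		{txHash: [32]byte{1}, nonce: 0},
		{txHash: [32]byte{2}, nonce: 1},
		{txHash: [32]byte{1}, nonce: 0}, // non-adjacent duplicate of the first
	}
	fmt.Println(len(dedupe(txs))) // 2
}
```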
janezpodhostnik
left a comment
The code looks good, but should we instead prevent adding duplicates to the pool in the first place, instead of deduplicating them later?
Force-pushed from c2b098a to d22b265 (Compare)
Actionable comments posted: 0
🧹 Nitpick comments (1)
services/requester/batch_tx_pool.go (1)
149-154: LGTM: Duplicate detection correctly prevents redundant submissions. The approach of rejecting duplicates at submission time (via `slices.Contains`) is cleaner than deduplicating during batch processing; it provides immediate feedback and avoids the "adjacent duplicates only" pitfall of `slices.CompactFunc`. The `txMux` lock ensures thread-safe duplicate checking.

Note: `slices.Contains` performs an O(n) scan, but given that the pool is cleared every `TxBatchInterval` and the expected transaction count per EOA is small, this is acceptable.

If you observe high transaction volumes per EOA in production, consider switching to a map-based lookup for O(1) duplicate detection:

```go
// In BatchTxPool struct, add:
pooledTxHashes map[gethCommon.Address]map[gethCommon.Hash]struct{}

// In Add(), replace slices.Contains with:
if t.pooledTxHashes[from] == nil {
	t.pooledTxHashes[from] = make(map[gethCommon.Hash]struct{})
}
if _, exists := t.pooledTxHashes[from][tx.Hash()]; exists {
	return errs.ErrDuplicateTransaction
}
t.pooledTxHashes[from][tx.Hash()] = struct{}{}
t.pooledTxs[from] = append(t.pooledTxs[from], userTx)

// In processPooledTransactions(), also clear:
t.pooledTxHashes = make(map[gethCommon.Address]map[gethCommon.Hash]struct{})
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- models/errors/errors.go (1 hunk)
- services/requester/batch_tx_pool.go (3 hunks)
- tests/tx_batching_test.go (1 hunk)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/tx_batching_test.go
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-06-19T11:36:25.478Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 831
File: services/requester/batch_tx_pool.go:0-0
Timestamp: 2025-06-19T11:36:25.478Z
Learning: In Go, when copying maps that contain slices (like `map[gethCommon.Address][]pooledEvmTx`), perform deep copies by iterating over the map and copying each slice individually using `make()` and `copy()` to avoid shared references that could lead to race conditions and data corruption.
Applied to files:
services/requester/batch_tx_pool.go
📚 Learning: 2025-03-07T01:35:09.751Z
Learnt from: peterargue
Repo: onflow/flow-evm-gateway PR: 772
File: services/requester/keystore/key_store.go:50-62
Timestamp: 2025-03-07T01:35:09.751Z
Learning: In the flow-evm-gateway codebase, panics are acceptable in scenarios where immediate detection of critical bugs is desired during development and testing, particularly for invariant violations that should never occur in a correctly functioning system (e.g., when a key is available but locked in the keystore implementation).
Applied to files:
services/requester/batch_tx_pool.go
🧬 Code graph analysis (1)
services/requester/batch_tx_pool.go (1)
models/errors/errors.go (1)
ErrDuplicateTransaction(33-33)
⏰ Context from checks skipped due to timeout. (1)
- GitHub Check: Test
🔇 Additional comments (4)
services/requester/batch_tx_pool.go (3)
6-6: LGTM: Import additions are appropriate. The `slices` standard library import and the `errs` alias are used correctly for duplicate detection and error handling respectively. Also applies to: 21-21.

27-31: LGTM: txHash field addition enables duplicate detection. The new `txHash` field correctly supports duplicate transaction detection. Since `gethCommon.Hash` is a comparable array type `[32]byte`, struct equality via `slices.Contains` will work correctly for identifying duplicate transactions.

206-221: LGTM: Batch submission logic correctly processes deduplicated transactions. Sorting by nonce ensures the correct execution order, and since duplicates are already filtered at submission time in `Add()`, no additional deduplication is needed here. Clean separation of concerns.

models/errors/errors.go (1)

31-33: LGTM: Error definition is clear and correctly categorized. `ErrDuplicateTransaction` appropriately wraps `ErrInvalid` and provides a clear message for rejected duplicate submissions. The error is used correctly in the batch transaction pool to provide meaningful feedback when duplicate transactions are detected.
@janezpodhostnik That's a good idea. This way we can even be explicit to the user that they have submitted a tx which is already in the pool. Updated in d22b265.
Force-pushed from d22b265 to 86a8dac (Compare)
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/tx_batching_test.go (2)

532-534: Consider clarifying the nonce strategy. The nonce is incremented here to set up the duplicate scenario, where all subsequent transactions in the loop use nonce 1. This is correct but could benefit from a brief comment explaining that the nonce is deliberately not incremented in the loop to create identical transactions.

Example:

```diff
 signed, _, err := evmSign(big.NewInt(10), 21000, eoaKey, nonce, &testAddr, nil)
 require.NoError(t, err)
+// Increment nonce for the duplicate test transactions that follow
 nonce += 1
```

540-558: Consider documenting the iteration count. The loop runs 5 times to test multiple duplicate submissions, which is reasonable. However, the number "5" is a magic constant without explanation. Consider adding a brief comment or using a named constant to clarify the intent.

Example:

```diff
+// Submit 5 identical transactions to test duplicate detection:
+// the first should succeed, the rest should be rejected as duplicates
 for i := range 5 {
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- models/errors/errors.go (1 hunk)
- services/requester/batch_tx_pool.go (3 hunks)
- tests/tx_batching_test.go (1 hunk)
🚧 Files skipped from review as they are similar to previous changes (2)
- services/requester/batch_tx_pool.go
- models/errors/errors.go
⏰ Context from checks skipped due to timeout. (1)
- GitHub Check: Test
🔇 Additional comments (1)
tests/tx_batching_test.go (1)
517-570: Well-structured test for duplicate transaction prevention. The test correctly validates the duplicate detection mechanism by:
- Submitting an initial transaction (nonce 0)
- Submitting 5 identical transactions (all nonce 1) where only the first succeeds
- Verifying the duplicate error message matches expectations
- Confirming successful transactions execute correctly
The test follows established patterns in this file and properly uses assertion helpers.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/tx_batching_test.go (1)
539-561: Sign once and replay the identical raw payload. Right now we call `evmSign` inside the loop, so every duplicate submission is freshly signed. That happens to work because go-ethereum currently produces deterministic signatures, but if the signer ever adds entropy (or we swap libraries), these calls would yield different `v`/`r`/`s` values, the hashes would diverge, and the test would stop exercising actual duplicate detection. To make the scenario robust, and to mirror bots resending the exact same raw payload, sign once, keep the bytes, and reuse them for the remaining submissions.

```diff
-	// Submit 5 identical transactions to test duplicate detection:
-	// the first should succeed, the rest should be rejected as duplicates
-	for i := range 5 {
-		// All these transactions are duplicates, since we don't change any
-		// of the payload data. These will end up having the same tx hash
-		// as well.
-		signed, _, err := evmSign(big.NewInt(10), 15_000_000, eoaKey, nonce, &testAddr, nil)
-		require.NoError(t, err)
-
-		// only the first transaction is valid, the rest 4 are duplicates
-		// of the 1st one.
-		if i == 0 {
-			txHash, err := rpcTester.sendRawTx(signed)
-			require.NoError(t, err)
-			hashes = append(hashes, txHash)
-		} else {
-			_, err := rpcTester.sendRawTx(signed)
-			require.Error(t, err)
-			require.ErrorContains(t, err, "invalid: transaction already in pool")
-		}
-	}
+	dupSigned, _, err := evmSign(big.NewInt(10), 15_000_000, eoaKey, nonce, &testAddr, nil)
+	require.NoError(t, err)
+
+	// Submit 5 identical transactions to test duplicate detection:
+	// the first should succeed, the rest should be rejected as duplicates.
+	for i := range 5 {
+		if i == 0 {
+			txHash, err := rpcTester.sendRawTx(dupSigned)
+			require.NoError(t, err)
+			hashes = append(hashes, txHash)
+		} else {
+			_, err := rpcTester.sendRawTx(dupSigned)
+			require.Error(t, err)
+			require.ErrorContains(t, err, "invalid: transaction already in pool")
+		}
+	}
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- tests/tx_batching_test.go (1 hunk)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-12T13:21:59.066Z
Learnt from: m-Peter
Repo: onflow/flow-evm-gateway PR: 890
File: tests/integration_test.go:692-707
Timestamp: 2025-11-12T13:21:59.066Z
Learning: In the flow-evm-gateway integration tests (tests/integration_test.go), the gateway's RPC server continues to operate independently even after the context passed to bootstrap.Run() is cancelled. Context cancellation affects upstream dependencies (like the Emulator acting as an Access Node) but does not immediately shut down the RPC server, allowing tests to verify backup failover scenarios.
Applied to files:
tests/tx_batching_test.go
⏰ Context from checks skipped due to timeout. (1)
- GitHub Check: Test
janezpodhostnik
left a comment
Nice
```diff
-userTx := pooledEvmTx{txPayload: hexEncodedTx, nonce: tx.Nonce()}
+userTx := pooledEvmTx{txPayload: hexEncodedTx, txHash: tx.Hash(), nonce: tx.Nonce()}
+
+// Prevent submission of duplicate transactions, based on their tx hash
+if slices.Contains(t.pooledTxs[from], userTx) {
```
Instead of putting the txHash to the pooledEvmTx, what about maintaining a separate index for the txHash?
So that this existence check can be done in O(1), instead of O(n), n being the number of txs in the pool for a single address. Otherwise, if some bot sends lots of txs, this call would become slow.
For instance:

```go
type BatchTxPool struct {
	*SingleTxPool
	pooledTxs      map[gethCommon.Address][]pooledEvmTx
	pooledTxHashes map[txHash]struct{}
	txMux          sync.Mutex
	eoaActivity    *expirable.LRU[gethCommon.Address, time.Time]
}
```
The pooledTxs get submitted to the network around every ~2 seconds, so we are talking about a very small number of transactions for a single address. The largest I have observed is 12 transactions for a single address, which is quite small. Do you think it's worth the complexity of maintaining a separate index for de-duplicating? We will also have to remove entries from that separate index, when transactions do get submitted, which means it will require some sync mechanism.
Closes: #907
Description
There were some cases where trading bots would submit 6-7 identical transactions, and because their gas limit was around 5.5 million, the batch would fail with an insufficient computation error, even though one of the transactions could have executed successfully.