@UdjinM6 UdjinM6 commented Nov 20, 2025

Issue being fixed or feature implemented

Previously, evodb_verify_or_repair_impl held cs_main for the entire operation, which could take minutes when verifying/repairing large block ranges. This caused significant lock contention and blocked other operations requiring cs_main.

This commit reduces the cs_main lock scope to only the initial setup phase where we resolve block indexes from the active chain. The actual verification and repair work (applying diffs, rebuilding lists from blocks, verifying snapshots) now runs without holding cs_main.

What was done?

Changes:

  • Wrap block index resolution in a scoped cs_main lock
  • Remove AssertLockHeld(cs_main) from helper functions:
    • RecalculateAndRepairDiffs
    • CollectSnapshotBlocks
    • VerifySnapshotPair
    • RepairSnapshotPair
    • RebuildListFromBlock (CSpecialTxProcessor)
  • Update function signatures to remove EXCLUSIVE_LOCKS_REQUIRED(cs_main)
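
A minimal sketch of the scoped-lock pattern described in the list above, assuming simplified stand-in types: `BlockIndex`, `g_chain_mutex`, `g_active_chain`, and `CountBlocksInRange` are hypothetical names for illustration, not the Dash Core code. The mutex is held only while the start and stop entries are resolved; the walk over the range runs after the lock has been released.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a block index entry (not the real CBlockIndex).
struct BlockIndex {
    int height;
    const BlockIndex* pprev; // parent link, assumed to be set once and never changed
};

std::mutex g_chain_mutex;                      // stands in for cs_main
std::vector<const BlockIndex*> g_active_chain; // stands in for the active chain; guarded by g_chain_mutex

// Resolves the range endpoints under the lock, then walks the range without it.
// Returns the number of entries visited, or -1 if the range is invalid.
int CountBlocksInRange(int start_height, int stop_height)
{
    const BlockIndex* start = nullptr;
    const BlockIndex* stop = nullptr;
    {
        // Setup phase: the only part that touches the shared chain container.
        std::lock_guard<std::mutex> lock(g_chain_mutex);
        if (start_height < 0 || stop_height < start_height ||
            static_cast<std::size_t>(stop_height) >= g_active_chain.size()) {
            return -1;
        }
        start = g_active_chain[start_height];
        stop = g_active_chain[stop_height];
    } // lock released here

    // Long-running phase: only follows parent links of already resolved entries.
    int visited = 0;
    for (const BlockIndex* p = stop; p != nullptr; p = p->pprev) {
        ++visited;
        if (p == start) break;
    }
    return visited;
}
```

The inner braces are what narrows the lock: everything after the closing brace runs while other threads are free to take the mutex.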

How Has This Been Tested?

Ran evodb verify/repair on a mainnet node and monitored the logs: the node keeps processing other work while the RPC command is still running.

Breaking Changes

This is safe because:

  • CBlockIndex pointers remain valid after lock release (never deleted)
  • Block parent relationships (pprev, GetAncestor) are immutable
  • ReadBlockFromDisk takes cs_main internally when accessing nFile/nDataPos
  • Helper functions only process already-loaded block data and snapshots
  • ChainLocks prevent deep reorgs in Dash anyway
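
The first two points can be illustrated with a small sketch, under stated assumptions: `Node`, `Index`, and `AncestorAt` are hypothetical names, and a `std::deque` stands in for whatever container actually owns the block index entries. The point is only that when entries are never deleted or moved and parent links are never reassigned, a pointer obtained under the lock can be followed later without it.

```cpp
#include <deque>
#include <mutex>

// Hypothetical stand-in for a block index entry.
struct Node {
    int height;
    const Node* pprev; // never reassigned after insertion
};

class Index {
public:
    // Appends an entry on top of the current tip and returns a stable pointer to it.
    const Node* Append()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        const Node* prev = m_nodes.empty() ? nullptr : &m_nodes.back();
        const int height = prev ? prev->height + 1 : 0;
        m_nodes.push_back(Node{height, prev});
        return &m_nodes.back();
    }

private:
    std::mutex m_mutex;
    // push_back on a deque does not invalidate references to existing elements,
    // and entries are never erased, so returned pointers stay valid.
    std::deque<Node> m_nodes;
};

// Walks parent links without any lock; returns the ancestor at `height`,
// or nullptr if `from` is below that height.
const Node* AncestorAt(const Node* from, int height)
{
    while (from != nullptr && from->height > height) from = from->pprev;
    return (from != nullptr && from->height == height) ? from : nullptr;
}
```

Entries appended after a pointer was resolved do not affect the walk, because it only reads fields that are fixed at insertion time.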

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

@UdjinM6 UdjinM6 added this to the 23.0.1 milestone Nov 20, 2025
github-actions bot commented Nov 20, 2025

✅ No Merge Conflicts Detected

This PR currently has no conflicts with other open PRs.

coderabbitai bot commented Nov 20, 2025

Walkthrough

This pull request relaxes or removes cs_main lock assertions and thread-safety annotations in evo codepaths. Several CDeterministicMNManager methods (RecalculateAndRepairDiffs, CollectSnapshotBlocks, VerifySnapshotPair, RepairSnapshotPair) have had ::cs_main requirements removed from their declarations and internal assertions. CSpecialTxProcessor::RebuildListFromBlock also has its EXCLUSIVE_LOCKS_REQUIRED(cs_main) annotation and an internal cs_main assertion removed. A new method MigrateLegacyDiffs(const CBlockIndex* const tip_index) with EXCLUSIVE_LOCKS_REQUIRED(!cs, ::cs_main) was added. In the RPC layer, cs_main scoping was narrowed during block resolution and the callback used for list building was switched to RebuildListFromBlock (renamed from BuildNewListFromBlock). No functional logic changes beyond locking preconditions are introduced.
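
For readers unfamiliar with lock annotations, here is a hedged sketch of the general idea using Clang's thread-safety attributes spelled out directly. It does not use the project's `EXCLUSIVE_LOCKS_REQUIRED` macro, and `TSA`, `Mutex`, `ScopedLock`, `g_main`, `ReadTipHeight`, `VerifyRange`, and `VerifyUpToTip` are all hypothetical names. On compilers other than Clang the attributes expand to nothing, so the code still builds but is not checked.

```cpp
#include <mutex>

// Clang thread-safety attributes; they expand to nothing on other compilers.
#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x)
#endif

// A mutex type the analysis can reason about.
class TSA(capability("mutex")) Mutex {
public:
    void lock() TSA(acquire_capability()) { m_.lock(); }
    void unlock() TSA(release_capability()) { m_.unlock(); }

private:
    std::mutex m_;
};

// RAII guard that the analysis treats as holding the mutex for its lifetime.
class TSA(scoped_lockable) ScopedLock {
public:
    explicit ScopedLock(Mutex& m) TSA(acquire_capability(m)) : m_(m) { m_.lock(); }
    ~ScopedLock() TSA(release_capability()) { m_.unlock(); }

private:
    Mutex& m_;
};

Mutex g_main;                             // stands in for cs_main
int g_tip_height TSA(guarded_by(g_main)) = 0;

// Annotated: callers must hold g_main because guarded state is read.
int ReadTipHeight() TSA(requires_capability(g_main)) { return g_tip_height; }

// Not annotated: works only on its arguments, so no lock is part of its contract.
int VerifyRange(int from, int to) { return to >= from ? to - from + 1 : 0; }

int VerifyUpToTip()
{
    int tip;
    {
        ScopedLock lock(g_main); // lock held only while resolving the tip
        tip = ReadTipHeight();
    }
    return VerifyRange(0, tip);  // heavy work runs without the lock
}
```

With Clang and `-Wthread-safety`, calling `ReadTipHeight()` outside the scoped block would be reported at compile time, while `VerifyRange` can be called from anywhere. Dropping a lock requirement from a declaration therefore widens where a function may be called, which is why the callers and the function bodies need auditing.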

Sequence Diagram(s)

No sequence diagram included — the changes are restricted to synchronization contract/annotation adjustments rather than new control-flow paths.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Areas requiring extra attention:

  • Verify all callers satisfy the new lock preconditions for:
    • RecalculateAndRepairDiffs (now EXCLUSIVE_LOCKS_REQUIRED(!cs))
    • CollectSnapshotBlocks, VerifySnapshotPair, RepairSnapshotPair (no longer require ::cs_main)
    • RebuildListFromBlock (no longer requires cs_main)
  • Audit removed runtime assertions (cs_main) to ensure callers provide equivalent synchronization where needed.
  • Inspect the new MigrateLegacyDiffs declaration and its intended callers/usage for correct locking.
  • Confirm the narrowed cs_main scope in RPC block resolution does not open race windows between resolution and subsequent operations.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: reducing cs_main lock scope in evodb verify/repair operations, which is the primary focus of all modifications across the affected files. |
| Description check | ✅ Passed | The description is directly related to the changeset, explaining the performance issue being addressed, detailing what changes were made, and providing justification for why the changes are safe. |
📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ce506bc and 084bb62.

📒 Files selected for processing (2)
  • src/evo/deterministicmns.h (2 hunks)
  • src/evo/specialtxman.h (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: kwvg
Repo: dashpay/dash PR: 6543
File: src/wallet/receive.cpp:240-251
Timestamp: 2025-02-06T14:34:30.466Z
Learning: Pull request #6543 is focused on move-only changes and refactoring, specifically backporting from Bitcoin. Behavior changes should be proposed in separate PRs.
Learnt from: knst
Repo: dashpay/dash PR: 6692
File: src/llmq/blockprocessor.cpp:217-224
Timestamp: 2025-08-19T14:57:31.801Z
Learning: In PR #6692, knst acknowledged a null pointer dereference issue in ProcessBlock() method where LookupBlockIndex may return nullptr but is passed to gsl::not_null, and created follow-up PR #6789 to address it, consistent with avoiding scope creep in performance-focused PRs.
Learnt from: UdjinM6
Repo: dashpay/dash PR: 6786
File: ci/test/04_install.sh:99-101
Timestamp: 2025-08-01T07:46:37.840Z
Learning: In backport PRs like #6786, UdjinM6 prefers to defer non-critical fixes (such as shell command expansion issues) to separate commits/PRs to maintain focus on the primary backport objectives, consistent with the project's pattern of avoiding scope creep.
Learnt from: kwvg
Repo: dashpay/dash PR: 6761
File: src/chainlock/signing.cpp:247-250
Timestamp: 2025-07-29T14:32:48.369Z
Learning: In PR #6761, kwvg acknowledged a null pointer check issue in ChainLockSigner::Cleanup() method but deferred it to follow-up, consistent with the pattern of avoiding scope creep in refactoring PRs.
Learnt from: knst
Repo: dashpay/dash PR: 6511
File: src/evo/deterministicmns.cpp:1369-1373
Timestamp: 2025-01-07T18:50:44.838Z
Learning: The functions `MigrateDBIfNeeded` and `MigrateDBIfNeeded2` in `src/evo/deterministicmns.cpp` are temporary and will be removed in a future version. Refactoring suggestions for these functions should be avoided.
Learnt from: kwvg
Repo: dashpay/dash PR: 6718
File: test/functional/test_framework/test_framework.py:2102-2102
Timestamp: 2025-06-09T16:43:20.996Z
Learning: In the test framework consolidation PR (#6718), user kwvg prefers to limit functional changes to those directly related to MasternodeInfo, avoiding scope creep even for minor improvements like error handling consistency.
📚 Learning: 2025-11-13T20:02:55.480Z
Learnt from: UdjinM6
Repo: dashpay/dash PR: 6969
File: src/evo/deterministicmns.h:441-479
Timestamp: 2025-11-13T20:02:55.480Z
Learning: In `src/evo/deterministicmns.h`, the `internalId` field in `CDeterministicMN` and the `mnInternalIdMap` in `CDeterministicMNList` are non-deterministic and used only for internal bookkeeping and efficient lookups. Different nodes can assign different internalIds to the same masternode depending on their sync history. Methods like `IsEqual()` intentionally ignore internalId mappings and only compare consensus-critical deterministic fields (proTxHash, collateral, state, etc.).

Applied to files:

  • src/evo/specialtxman.h
  • src/evo/deterministicmns.h
📚 Learning: 2025-01-07T18:50:44.838Z
Learnt from: knst
Repo: dashpay/dash PR: 6511
File: src/evo/deterministicmns.cpp:1369-1373
Timestamp: 2025-01-07T18:50:44.838Z
Learning: The functions `MigrateDBIfNeeded` and `MigrateDBIfNeeded2` in `src/evo/deterministicmns.cpp` are temporary and will be removed in a future version. Refactoring suggestions for these functions should be avoided.

Applied to files:

  • src/evo/deterministicmns.h
🧬 Code graph analysis (1)
src/evo/deterministicmns.h (2)
src/validation.cpp (1)
  • ChainstateManager (6051-6061)
src/instantsend/signing.h (1)
  • Consensus (16-18)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: x86_64-pc-linux-gnu / Build depends
  • GitHub Check: x86_64-w64-mingw32 / Build depends
  • GitHub Check: x86_64-apple-darwin / Build depends
  • GitHub Check: x86_64-pc-linux-gnu_nowallet / Build depends
  • GitHub Check: x86_64-pc-linux-gnu_multiprocess / Build depends
  • GitHub Check: arm-linux-gnueabihf / Build depends
  • GitHub Check: Lint / Run linters
🔇 Additional comments (3)
src/evo/deterministicmns.h (2)

743-747: Lock contract correctly implemented: RecalculateAndRepairDiffs safely runs heavy work without cs_main

Verification confirms the refactoring is sound:

  • Call site (evo.cpp:1767–1799): LOCK(::cs_main) is scoped and released before line 1815 invokes RecalculateAndRepairDiffs, so the caller obeys the !cs contract.
  • Implementation (deterministicmns.cpp:1585–1700): Only performs read-only access to m_evoDb, read-only access to const CBlockIndex pointers, and calls internal helpers. No lock acquisition or cs_main assumption.
  • Lock annotation correctly reflects the change: EXCLUSIVE_LOCKS_REQUIRED(!cs) alone, removing the former cs_main dependency, allowing the heavy verify/repair work to run concurrently without blocking the global chainstate lock.

Helper methods (CollectSnapshotBlocks, VerifySnapshotPair, RepairSnapshotPair) are internal and inherit the parent method's lock contract safely.


758-765: Snapshot helpers verified as lock-agnostic with no external cache dependencies

Verification confirms the refactored signatures and implementations are sound:

  • CollectSnapshotBlocks: Pure block index navigation, no manager state access
  • VerifySnapshotPair: Reads diffs from m_evoDb only (line 1760), no cache access
  • RepairSnapshotPair: Disk-based block reading and list building, no cache access
  • WriteRepairedDiffs: Properly holds lock before clearing caches (lines 1900–1903), with EXCLUSIVE_LOCKS_REQUIRED(!cs) enforcing the lock policy

All three helpers work exclusively with materialized snapshots and block index pointers, never touching mnListsCache or mnListDiffsCache without holding cs.

src/evo/specialtxman.h (1)

85-87: RebuildListFromBlock removal of cs_main requirement is correct and properly implemented

Verification confirms the change is sound:

  • Implementation (lines 183-283+ in specialtxman.cpp) contains no AssertLockHeld(cs_main) and uses only lock-free operations: CCoinsViewCache reads, DeploymentActiveAfter on consensus parameters, and local list mutations.

  • RPC call site (evodb verify/repair at line 1815) releases cs_main before calling RecalculateAndRepairDiffs, which is declared with EXCLUSIVE_LOCKS_REQUIRED(!cs) to explicitly reflect that it does not require the global lock. The scoped LOCK(::cs_main) ends at line 1807, well before the callback is invoked.

  • Normal validation path (line 180 in specialtxman.cpp) calls RebuildListFromBlock from a different overload that has AssertLockHeld(cs_main), so that call site remains protected.

The header annotation change aligns the declaration with actual requirements, enabling minutes-long evodb verify/repair operations without blocking the main lock.

Collaborator

@kwvg kwvg left a comment

utACK 084bb62

Collaborator

@knst knst left a comment

utACK 084bb62

@PastaPastaPasta PastaPastaPasta merged commit a3118c1 into dashpay:develop Nov 21, 2025
32 of 35 checks passed