Conversation

@PastaPastaPasta (Member)

Issue being fixed or feature implemented

cs_db isn't heavily contended; it accounts for only a few percent of overall contention time (around 3-4% of contentions). However, every spot where it is contended could be replaced with a read lock, which should reduce contention on cs_db to near zero. While this isn't a huge gain, it's valuable for avoiding context switches or stalls in the islock hot loop.

====================================================================================================
TOP 40 LOCATIONS BY TOTAL CONTENTION TIME (cs_db rows shown)
====================================================================================================
Lock Name                                Location                               Count    Total(μs)    Avg(μs)    Max(μs)  Nodes
----------------------------------------------------------------------------------------------------
cs_db                                    ./instantsend/db.h:139                  2389       854134      357.5     143876      1
cs_db                                    instantsend/db.cpp:239                  1744       699770      401.2      20189      1
cs_db                                    instantsend/db.cpp:313                   352        63247      179.7       8314      1


====================================================================================================
SUMMARY STATISTICS
====================================================================================================
Total lock contentions: 157051
Total contention time: 126780252 μs (126780.25 ms)
Unique lock locations: 164
Average contention time: 807.3 μs

What was done?

Updated CInstantSendDb to use shared locks instead of exclusive locks for read operations, improving concurrency. This change affects methods such as KnownInstantSendLock, GetInstantSendLockCount, and others, ensuring that multiple threads can read from the database simultaneously without blocking each other. The cs_db mutex has also been changed to a SharedMutex to support this new locking strategy.
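
As a rough illustration of the pattern, here is a minimal sketch assuming Dash's sync.h primitives (SharedMutex, LOCK, READ_LOCK, and the thread-safety annotations); the class and members are hypothetical simplifications, not the actual CInstantSendDb:

#include <sync.h> // SharedMutex, LOCK, READ_LOCK, GUARDED_BY, LOCKS_EXCLUDED

class ExampleDb
{
private:
    // Was: mutable Mutex cs_db; now a shared mutex so readers no longer serialize.
    mutable SharedMutex cs_db;
    size_t lock_count GUARDED_BY(cs_db){0}; // hypothetical stand-in for real state

public:
    // Read path: any number of threads may hold the shared lock at once.
    size_t GetCount() const LOCKS_EXCLUDED(cs_db)
    {
        READ_LOCK(cs_db);
        return lock_count;
    }

    // Write path: still takes the exclusive lock, blocking all readers while it mutates.
    void Add() LOCKS_EXCLUDED(cs_db)
    {
        LOCK(cs_db);
        ++lock_count;
    }
};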

How Has This Been Tested?

Builds locally

Breaking Changes

Checklist:

Go over all the following points, and put an x in all the boxes that apply.

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

PastaPastaPasta added this to the 23.1 milestone Nov 11, 2025
@github-actions

✅ No Merge Conflicts Detected

This PR currently has no conflicts with other open PRs.

 size_t CInstantSendDb::GetInstantSendLockCount() const
 {
-    LOCK(cs_db);
+    READ_LOCK(cs_db);
@knst (Collaborator) Nov 11, 2025


Is the internal db object (leveldb::DB* pdb;) ready for parallel reads?

@PastaPastaPasta (Member, Author)

LevelDB appears to say it is internally safe: https://github.com/google/leveldb/blob/main/doc/index.md#concurrency

@PastaPastaPasta (Member, Author) Nov 11, 2025

gpt agrees

'''
Looked at dbwrapper.h to confirm how CDBWrapper uses LevelDB (leveldb::DB* pdb).
Yes: leveldb::DB supports concurrent reads. Multiple threads can call DB::Get and create/use independent iterators concurrently without external locking.
Don’t share iterators across threads. Each thread must use its own leveldb::Iterator.
Reads and writes may run concurrently; writes are serialized internally. For a consistent view across keys during concurrent writes, use snapshots (ReadOptions::snapshot).
In our wrapper, pdb->Get(...), pdb->NewIterator(iteroptions), and pdb->GetApproximateSizes(...) map directly to LevelDB calls and are safe for parallel reads. The shared readoptions/iteroptions structs are read-only.
Caveat: while LevelDB reads are thread-safe, our higher-level caches (e.g., unordered_lru_cache, Uint256LruHashMap) are not thread-safe on their own. They still need protection by cs_db even during read paths that update the caches.
'''
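
As a minimal standalone sketch of the LevelDB behavior described above (the database path, keys, and values here are hypothetical; this is not Dash code):

#include <cassert>
#include <string>
#include <thread>

#include <leveldb/db.h>

int main()
{
    leveldb::DB* db = nullptr;
    leveldb::Options options;
    options.create_if_missing = true;
    leveldb::Status status = leveldb::DB::Open(options, "/tmp/isdb-demo", &db);
    assert(status.ok());

    db->Put(leveldb::WriteOptions(), "islock:abc", "value");

    // Concurrent reads need no external locking; each thread uses its own
    // ReadOptions. For a consistent multi-key view while writes may be
    // running, pin a snapshot.
    auto reader = [db] {
        leveldb::ReadOptions ro;
        ro.snapshot = db->GetSnapshot();
        std::string value;
        db->Get(ro, "islock:abc", &value);
        db->ReleaseSnapshot(ro.snapshot);
    };
    std::thread t1(reader), t2(reader);
    t1.join();
    t2.join();

    delete db;
}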

@knst (Collaborator)

Yes, I am reviewing instantsend/db just in case we can drop cs_db completely somewhere.

It doesn't seem easy due to islockCache, which is used almost everywhere.

Feasible candidates are:

  1. CInstantSendDb::RemoveInstantSendLock in case of keep_cache=true
  2. CInstantSendDb::WriteInstantSendLockMined

Also see #6964

@coderabbitai

coderabbitai bot commented Nov 11, 2025

Walkthrough

This change optimizes concurrency in the instantsend database layer by converting the primary synchronization primitive from an exclusive Mutex to a SharedMutex, enabling multiple threads to concurrently perform read-only queries. All const accessor and query methods are updated to acquire shared read locks instead of exclusive locks. Lock annotations on public methods are relaxed from EXCLUSIVE_LOCKS_REQUIRED(!cs_db) to LOCKS_EXCLUDED(cs_db), while internal methods are updated to reflect SHARED_LOCKS_REQUIRED(cs_db). The change preserves write-path exclusivity while allowing parallel read access to the instantsend database.

Sequence Diagrams

sequenceDiagram
    participant T1 as Thread 1
    participant T2 as Thread 2
    participant DB as InstantSend DB<br/>(SharedMutex)

    rect rgb(200, 220, 255)
    Note over T1,DB: BEFORE: Exclusive Access
    T1->>DB: LOCK(cs_db) — Query
    Note over DB: Exclusive lock held
    T2->>DB: LOCK(cs_db) — Query (blocked)
    T1->>DB: Release lock
    T2->>DB: Acquires lock, Query executes
    end

    rect rgb(220, 250, 220)
    Note over T1,DB: AFTER: Shared Reads + Exclusive Writes
    T1->>DB: READ_LOCK(cs_db) — Query
    Note over DB: Shared lock held
    T2->>DB: READ_LOCK(cs_db) — Query (concurrent)
    Note over DB: Both threads read simultaneously
    par
        T1->>DB: Release read lock
        T2->>DB: Release read lock
    end
    T1->>DB: LOCK(cs_db) — Write
    Note over DB: Exclusive lock for mutation
    T1->>DB: Release lock
    end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Lock classification verification: Confirm that all methods converted to shared-read semantics (READ_LOCK, SHARED_LOCKS_REQUIRED) are truly read-only and do not mutate shared state.
  • Synchronization semantics: Verify that no write paths were inadvertently converted to shared locks and that exclusive locks remain for all mutation operations.
  • Internal consistency: Ensure that lock annotation changes on internal helper methods (GetInstantSendLockByHashInternal, GetInstantSendLockHashByTxidInternal, GetInstantSendLocksByParent) are correctly applied and that assertion types (AssertSharedLockHeld vs. AssertLockHeld) match their contexts.
  • Pattern consistency: While changes follow a homogeneous pattern across multiple methods, the systematic nature reduces review burden; however, correctness of the locking model is critical.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: docstring coverage is 9.09%, below the required threshold of 80.00%. Resolution: run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Title check ✅ Passed: the title accurately and concisely describes the main change, replacing exclusive locks with shared locks in CInstantSendDb.
  • Description check ✅ Passed: the description is directly related to the changeset, providing motivation, explaining the changes made, and documenting testing.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 368eebb and f620ec7.

📒 Files selected for processing (2)
  • src/instantsend/db.cpp (5 hunks)
  • src/instantsend/db.h (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.{cpp,h,cc,cxx,hpp}

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.{cpp,h,cc,cxx,hpp}: Dash Core C++ codebase must be written in C++20 and require at least Clang 16 or GCC 11.1
Dash uses unordered_lru_cache for efficient caching with LRU eviction

Files:

  • src/instantsend/db.cpp
  • src/instantsend/db.h
🧠 Learnings (9)
📓 Common learnings
Learnt from: kwvg
Repo: dashpay/dash PR: 6543
File: src/wallet/receive.cpp:240-251
Timestamp: 2025-02-06T14:34:30.466Z
Learning: Pull request #6543 is focused on move-only changes and refactoring, specifically backporting from Bitcoin. Behavior changes should be proposed in separate PRs.
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks. The implementation correctly uses cs_net with GUARDED_BY annotations and EXCLUSIVE_LOCKS_REQUIRED to synchronize socket access between SendDirectly() and ReconnectThread().
Learnt from: kwvg
Repo: dashpay/dash PR: 6504
File: src/llmq/context.cpp:42-43
Timestamp: 2025-01-02T21:50:00.967Z
Learning: LLMQContext manages concurrency for the `CInstantSendManager`. Previously, this was handled globally; now it's handled as a class member in `LLMQContext`, but the concurrency control remains consistent.
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket access operations (m_sock and m_server) should be protected by the cs_net mutex, not avoiding synchronization entirely. While lock overhead concerns are valid in general, socket operations require proper synchronization via cs_net.
📚 Learning: 2025-09-09T21:36:11.833Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks. The implementation correctly uses cs_net with GUARDED_BY annotations and EXCLUSIVE_LOCKS_REQUIRED to synchronize socket access between SendDirectly() and ReconnectThread().

Applied to files:

  • src/instantsend/db.cpp
  • src/instantsend/db.h
📚 Learning: 2025-10-21T11:09:34.688Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6849
File: src/governance/governance.cpp:1339-1343
Timestamp: 2025-10-21T11:09:34.688Z
Learning: In the Dash Core codebase, `CacheMap` (defined in src/cachemap.h) is internally thread-safe and uses its own `mutable CCriticalSection cs` to protect access to its members. Methods like `GetSize()`, `Insert()`, `Get()`, `HasKey()`, etc., can be called without holding external locks.

Applied to files:

  • src/instantsend/db.cpp
  • src/instantsend/db.h
📚 Learning: 2025-09-09T21:36:58.969Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket access operations (m_sock and m_server) should be protected by the cs_net mutex, not avoiding synchronization entirely. While lock overhead concerns are valid in general, socket operations require proper synchronization via cs_net.

Applied to files:

  • src/instantsend/db.h
📚 Learning: 2025-09-09T21:36:11.833Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks.

Applied to files:

  • src/instantsend/db.h
📚 Learning: 2025-09-09T21:36:31.504Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.504Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access is protected by the cs_net mutex. This mutex coordinates between the TCP send path in SendDirectly() and the reconnection operations in Connect()/Reconnect() methods, ensuring proper synchronization of socket state.

Applied to files:

  • src/instantsend/db.h
📚 Learning: 2025-09-09T21:36:58.969Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket operations are properly synchronized using the cs_net mutex. The m_sock and m_server members are GUARDED_BY(cs_net), and methods like Connect(), SendDirectly(), and ReconnectThread() use appropriate locking with cs_net to prevent race conditions on socket access.

Applied to files:

  • src/instantsend/db.h
📚 Learning: 2025-01-02T21:50:00.967Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6504
File: src/llmq/context.cpp:42-43
Timestamp: 2025-01-02T21:50:00.967Z
Learning: LLMQContext manages concurrency for the `CInstantSendManager`. Previously, this was handled globally; now it's handled as a class member in `LLMQContext`, but the concurrency control remains consistent.

Applied to files:

  • src/instantsend/db.h
📚 Learning: 2025-09-09T21:36:31.504Z
Learnt from: kwvg
Repo: dashpay/dash PR: 6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.504Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access requires synchronization using the cs_net mutex. The race condition exists between SendDirectly() (callable from any thread) and Connect() (called by ReconnectThread), where both read and write m_sock concurrently. The cs_net mutex properly coordinates these network operations.

Applied to files:

  • src/instantsend/db.h
🧬 Code graph analysis (2)
src/instantsend/db.cpp (1)
src/instantsend/db.h (3)
  • cs_db (39-39)
  • cs_db (41-162)
  • GetInstantSendLockCount (120-141)
src/instantsend/db.h (2)
src/instantsend/db.cpp (14)
  • GetInstantSendLocksByParent (324-352)
  • GetInstantSendLocksByParent (324-324)
  • GetInstantSendLockByHashInternal (267-290)
  • GetInstantSendLockByHashInternal (267-267)
  • GetInstantSendLockHashByTxidInternal (292-303)
  • GetInstantSendLockHashByTxidInternal (292-292)
  • KnownInstantSendLock (237-242)
  • KnownInstantSendLock (237-237)
  • GetInstantSendLockCount (244-265)
  • GetInstantSendLockCount (244-244)
  • GetInstantSendLockByTxid (305-309)
  • GetInstantSendLockByTxid (305-305)
  • GetInstantSendLockByInput (311-322)
  • GetInstantSendLockByInput (311-311)
src/instantsend/instantsend.cpp (6)
  • GetInstantSendLockCount (921-924)
  • GetInstantSendLockCount (921-921)
  • GetInstantSendLockByHash (840-863)
  • GetInstantSendLockByHash (840-840)
  • GetInstantSendLockByTxid (865-872)
  • GetInstantSendLockByTxid (865-865)
🪛 GitHub Actions: Clang Diff Format Check
src/instantsend/db.h

[error] 73-78: Clang format differences found. The diff tool reported formatting differences and the step exited with code 1. Run clang-format/clang-format-diff.py to fix formatting in this file.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: mac-build / Build source
  • GitHub Check: linux64_tsan-build / Build source
  • GitHub Check: linux64-build / Build source
  • GitHub Check: linux64_sqlite-build / Build source
  • GitHub Check: linux64_ubsan-build / Build source
  • GitHub Check: linux64_fuzz-build / Build source
  • GitHub Check: win64-build / Build source
  • GitHub Check: linux64_nowallet-build / Build source
  • GitHub Check: Lint / Run linters


 void WriteBlockInstantSendLocks(const gsl::not_null<std::shared_ptr<const CBlock>>& pblock, gsl::not_null<const CBlockIndex*> pindexConnected) EXCLUSIVE_LOCKS_REQUIRED(!cs_db);
 void RemoveBlockInstantSendLocks(const gsl::not_null<std::shared_ptr<const CBlock>>& pblock, gsl::not_null<const CBlockIndex*> pindexDisconnected) EXCLUSIVE_LOCKS_REQUIRED(!cs_db);
-bool KnownInstantSendLock(const uint256& islockHash) const EXCLUSIVE_LOCKS_REQUIRED(!cs_db);
+bool KnownInstantSendLock(const uint256& islockHash) const LOCKS_EXCLUDED(cs_db);

Why was this changed? (same for other similar changes below)
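
For context on what the two annotations mean under Clang's thread-safety analysis, a toy sketch (assuming the usual macros from threadsafety.h; this is not the PR's code and does not assert the PR's rationale):

#include <sync.h> // Mutex, LOCK, EXCLUSIVE_LOCKS_REQUIRED, LOCKS_EXCLUDED

class Demo
{
    Mutex cs;

public:
    // Negative capability: every annotated caller must prove it does NOT
    // hold cs; enforced transitively with -Wthread-safety-negative.
    void A() EXCLUSIVE_LOCKS_REQUIRED(!cs) { LOCK(cs); }

    // Advisory exclusion: warns only where the analysis can see cs held at
    // the call site; callers need not restate the requirement.
    void B() LOCKS_EXCLUDED(cs) { LOCK(cs); }
};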

 InstantSendLockPtr CInstantSendDb::GetInstantSendLockByHashInternal(const uint256& hash, bool use_cache) const
 {
-    AssertLockHeld(cs_db);
+    AssertSharedLockHeld(cs_db);

Can you actually do this? We update islockCache in this method.

 uint256 CInstantSendDb::GetInstantSendLockHashByTxidInternal(const uint256& txid) const
 {
-    AssertLockHeld(cs_db);
+    AssertSharedLockHeld(cs_db);

Same, but for txidCache.
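
To illustrate the concern raised in these two comments: mutating a non-thread-safe cache while holding only a shared lock is a data race, because two readers can hold the lock simultaneously. Below is a minimal sketch of one possible remedy, a narrow mutex guarding just the cache (hypothetical class and names; a sketch, not something this PR implements):

#include <sync.h> // SharedMutex, Mutex, READ_LOCK, LOCK, GUARDED_BY
#include <optional>
#include <string>
#include <unordered_map>

class CacheUnderSharedLock
{
    mutable SharedMutex cs_db;
    // RACY if GUARDED_BY(cs_db): two threads holding READ_LOCK(cs_db) could
    // both call cache.emplace() concurrently. Instead, give the cache its
    // own mutex, independent of cs_db.
    mutable Mutex cs_cache;
    mutable std::unordered_map<std::string, std::string> cache GUARDED_BY(cs_cache);

public:
    std::optional<std::string> Get(const std::string& key) const
    {
        READ_LOCK(cs_db); // shared: many readers at once
        {
            LOCK(cs_cache);
            if (auto it = cache.find(key); it != cache.end()) return it->second;
        }
        std::string value = "from-db"; // stand-in for a pdb->Get(...) lookup
        LOCK(cs_cache);
        cache.emplace(key, value); // cache update now serialized separately
        return value;
    }
};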

