perf: convert m_nodes_mutex from recursive mutex to shared mutex #6912

base: develop
Conversation
Guix Automation has begun to build this PR, tagged as v23.0.0-devpr6912.37c097de. A new comment will be made when the image is pushed.

Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6912.37c097de. The image should be on dockerhub soon.

(marked ready for review for code rabbit)

Guix Automation has begun to build this PR, tagged as v23.0.0-devpr6912.63cf92d8. A new comment will be made when the image is pushed.
Walkthrough

Refactors networking code for const-correctness and shared locking. m_nodes_mutex changed from RecursiveMutex to SharedMutex; several mutexes made mutable. Added AssertSharedLockHeldInline and an AssertSharedLockHeld macro. Introduced const FindNodeLocked overloads and mutable FindNodeLockedMutable wrappers, plus template helpers WithNodeShared, WithNodeExclusive, ExistsNodeShared, and FindNodeLockedBy. Added new ForNode overloads (const and non-const) for NodeId and CService, made CNode::CopyStats const, and exposed GetEvictionCandidates and NodeEvictionCandidate. Replaced call sites to use the new lock-aware APIs and enforced a hard cap of three high-bandwidth header-announcing peers.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks: ❌ 1 failed (warning), ✅ 2 passed.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 3
🧹 Nitpick comments (2)
src/net.cpp (1)
3464-3469: Misleading comment about lock access level.

The comment states "Use shared access since we already hold m_nodes_mutex", but at this point the code holds an exclusive lock (via LOCK2 at line 3526, called before getPendingQuorumNodes at line 3547). While the code is correct (FindNodeLocked, the const version, can be called under either a shared or an exclusive lock), the comment is confusing. Clarify or remove the comment:

- // Use shared access since we already hold m_nodes_mutex
  bool slowHandshake = false;
  if (const CNode* pnode = FindNodeLocked(addr2)) {

src/net.h (1)
1658-1662: optional<invoke_result_t<...>> can fail for reference results; decay the type.

If a callable returns a reference (or a cv/ref-qualified type), std::optional<std::invoke_result_t<...>> is ill-formed. Decay the result type before wrapping it in optional.

Apply this diff to both helpers:

 template<typename Key, typename Callable>
-std::optional<std::invoke_result_t<Callable, const CNode*>> WithNodeShared(const Key& key, Callable&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
+std::optional<std::decay_t<std::invoke_result_t<Callable, const CNode*>>> WithNodeShared(const Key& key, Callable&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
 {
     READ_LOCK(m_nodes_mutex);
-    if (const CNode* p = FindNodeLockedBy(key)) return std::optional<std::invoke_result_t<Callable, const CNode*>>{fn(p)};
+    if (const CNode* p = FindNodeLockedBy(key)) return std::optional<std::decay_t<std::invoke_result_t<Callable, const CNode*>>>{fn(p)};
     return std::nullopt;
 }

 template<typename Key, typename Callable>
-std::optional<std::invoke_result_t<Callable, CNode*>> WithNodeExclusive(const Key& key, Callable&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
+std::optional<std::decay_t<std::invoke_result_t<Callable, CNode*>>> WithNodeExclusive(const Key& key, Callable&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
 {
     LOCK(m_nodes_mutex);
-    if (const CNode* cp = FindNodeLockedBy(key)) return std::optional<std::invoke_result_t<Callable, CNode*>>{fn(const_cast<CNode*>(cp))};
+    if (const CNode* cp = FindNodeLockedBy(key)) return std::optional<std::decay_t<std::invoke_result_t<Callable, CNode*>>>{fn(const_cast<CNode*>(cp))};
     return std::nullopt;
 }

No additional includes needed; <type_traits> is already added.
Also applies to: 1663-1668, 1669-1686, 1687-1695, 1697-1705, 1708-1713, 1715-1721
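To see why the decay matters, here is a standalone illustration (names are hypothetical, not from the PR): a callable that returns a reference would make std::optional<std::invoke_result_t<...>> name std::optional<T&>, which is ill-formed, while decaying first yields a valid std::optional<T>.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <type_traits>
#include <utility>

// Wraps the result of invoking fn on p in an optional; nullopt if p is null.
// decay_t strips the reference so optional<int&> is never formed.
template <typename Callable>
auto invoke_to_optional(Callable&& fn, int* p)
    -> std::optional<std::decay_t<std::invoke_result_t<Callable, int*>>>
{
    if (p == nullptr) return std::nullopt;
    return std::invoke(std::forward<Callable>(fn), p); // int& decays to int
}

int g_value = 42;
// Returns a reference: invoke_result_t is int&, decay_t<int&> is int.
int& ref_returning(int*) { return g_value; }
```

Without the decay_t, merely instantiating the wrapper with ref_returning would fail to compile, because std::optional cannot hold a reference type.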
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- src/net.cpp(36 hunks)
- src/net.h(26 hunks)
- src/net_processing.cpp(1 hunks)
- src/sync.h(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.{cpp,h,cc,cxx,hpp}
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.{cpp,h,cc,cxx,hpp}: Dash Core C++ codebase must be written in C++20 and require at least Clang 16 or GCC 11.1
Dash uses unordered_lru_cache for efficient caching with LRU eviction
Files:
- src/sync.h
- src/net_processing.cpp
- src/net.cpp
- src/net.h
🧠 Learnings (6)
📚 Learning: 2025-09-09T21:36:58.969Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket operations are properly synchronized using the cs_net mutex. The m_sock and m_server members are GUARDED_BY(cs_net), and methods like Connect(), SendDirectly(), and ReconnectThread() use appropriate locking with cs_net to prevent race conditions on socket access.
Applied to files:
- src/net.h
📚 Learning: 2025-09-09T21:36:11.833Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks. The implementation correctly uses cs_net with GUARDED_BY annotations and EXCLUSIVE_LOCKS_REQUIRED to synchronize socket access between SendDirectly() and ReconnectThread().
Applied to files:
- src/net.h
📚 Learning: 2025-09-09T21:36:31.504Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.504Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access is protected by the cs_net mutex. This mutex coordinates between the TCP send path in SendDirectly() and the reconnection operations in Connect()/Reconnect() methods, ensuring proper synchronization of socket state.
Applied to files:
- src/net.h
📚 Learning: 2025-09-09T21:36:31.504Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.504Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access requires synchronization using the cs_net mutex. The race condition exists between SendDirectly() (callable from any thread) and Connect() (called by ReconnectThread), where both read and write m_sock concurrently. The cs_net mutex properly coordinates these network operations.
Applied to files:
- src/net.h
📚 Learning: 2025-09-09T21:36:11.833Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks.
Applied to files:
- src/net.h
📚 Learning: 2025-09-09T21:36:58.969Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket access operations (m_sock and m_server) should be protected by the cs_net mutex, not avoiding synchronization entirely. While lock overhead concerns are valid in general, socket operations require proper synchronization via cs_net.
Applied to files:
- src/net.h
🧬 Code graph analysis (3)
src/sync.h (1)
src/sync.cpp (5)
AssertLockHeldInternal(280-285)
AssertLockHeldInternal(280-280)
AssertLockHeldInternal(286-286)
AssertLockHeldInternal(287-287)
AssertLockHeldInternal(288-288)
src/net.cpp (2)
src/node/interfaces.cpp (12)
ip(496-503)
ip(496-496)
id(511-517)
id(511-511)
stats(453-479)
stats(453-453)
LOCK(541-545)
LOCK(551-558)
LOCK(559-566)
LOCK(828-837)
LOCK(871-875)
LOCK(1055-1059)
src/net.h (3)
m_nodes_mutex(1688-1693)
CopyStats(1032-1040)
string(107-282)
src/net.h (1)
src/net.cpp (54)
CopyStats(674-724)
CopyStats(674-674)
StopNodes(4177-4228)
StopNodes(4177-4177)
use_v2transport(2025-2025)
CheckIncomingNonce(425-433)
CheckIncomingNonce(425-425)
ForNode(4822-4833)
ForNode(4822-4822)
ForNode(4835-4846)
ForNode(4835-4835)
ForNode(4848-4859)
ForNode(4848-4848)
ForNode(4861-4872)
ForNode(4861-4861)
CNode(4693-4731)
addr(2941-2941)
addr(3393-3393)
AddNode(4298-4310)
AddNode(4298-4298)
GetAddedNodeInfo(3317-3375)
GetAddedNodeInfo(3317-3317)
GetNodeStats(4519-4532)
GetNodeStats(4519-4519)
FindNodeLocked(342-353)
FindNodeLocked(342-342)
FindNodeLocked(355-366)
FindNodeLocked(355-355)
FindNodeLocked(368-379)
FindNodeLocked(368-368)
FindNodeLocked(381-392)
FindNodeLocked(381-381)
FindNodeLockedMutable(394-398)
FindNodeLockedMutable(394-394)
FindNodeLockedMutable(400-404)
FindNodeLockedMutable(400-400)
FindNodeLockedMutable(406-410)
FindNodeLockedMutable(406-406)
FindNodeLockedMutable(412-416)
FindNodeLockedMutable(412-412)
AlreadyConnectedToAddress(419-423)
AlreadyConnectedToAddress(419-419)
GetEvictionCandidates(1820-1868)
GetEvictionCandidates(1820-1820)
AttemptToEvictConnection(1878-1890)
AttemptToEvictConnection(1878-1878)
ConnectNode(449-595)
ConnectNode(449-449)
SocketRecvData(2601-2656)
SocketRecvData(2601-2601)
GetCurrentBlockRelayOnlyConns(3304-3315)
GetCurrentBlockRelayOnlyConns(3304-3304)
MaybePickPreferredNetwork(2910-2924)
MaybePickPreferredNetwork(2910-2910)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: arm-linux-build / Build source
- GitHub Check: mac-build / Build source
- GitHub Check: linux64_tsan-build / Build source
- GitHub Check: linux64_sqlite-build / Build source
- GitHub Check: linux64_fuzz-build / Build source
- GitHub Check: linux64_ubsan-build / Build source
- GitHub Check: linux64-build / Build source
- GitHub Check: win64-build / Build source
- GitHub Check: Lint / Run linters
🔇 Additional comments (13)
src/net_processing.cpp (1)
1308-1318: Good call keeping SENDCMPCT fan-out bounded.

Capping lNodesAnnouncingHeaderAndIDs at three entries, and explicitly telling the oldest peer to drop out of high-bandwidth mode, brings us back in line with the BIP152 limit without upsetting the existing outbound-protection logic above. Looks solid to me.

src/sync.h (1)
164-167: LGTM: Shared lock assertion helper implemented correctly.

The new AssertSharedLockHeldInline function and AssertSharedLockHeld macro provide the necessary infrastructure for asserting shared lock ownership on SharedMutex. The NO_THREAD_SAFETY_ANALYSIS annotation is correctly applied since the underlying AssertLockHeldInternal requires an exclusive-lock annotation but we are checking shared-lock ownership at runtime.

src/net.cpp (6)
341-392: LGTM: Const-correct FindNodeLocked methods.

The const-qualified FindNodeLocked overloads correctly use const CNode* and enable read-only node lookups under either shared or exclusive locks. The new NodeId overload at lines 381-392 is a useful addition.
419-433: LGTM: Proper const-correctness and read-lock usage.

Both AlreadyConnectedToAddress and CheckIncomingNonce correctly use READ_LOCK for read-only node lookups and are appropriately const-qualified.
1820-1868: LGTM: GetEvictionCandidates properly encapsulated.

The method correctly uses READ_LOCK for read-only node access and returns a value (not references), eliminating lifetime concerns. Good encapsulation of the eviction-candidate collection logic.
4835-4859: LGTM: Const-correct ForNode overloads.

The new const ForNode overloads for CService and NodeId correctly use READ_LOCK and accept const functors, enabling read-only node operations without exclusive-lock contention. The pattern complements the existing mutable overloads well.
674-674: LGTM: CopyStats correctly made const.

Making CopyStats const is appropriate since it only reads node state without modification, enabling use in read-only contexts.
421-421: Good read/write lock separation throughout.

The systematic conversion to READ_LOCK for read-only node access, while retaining exclusive locks for modifications, is well executed. This enables concurrent reads and reduces contention, which is the primary benefit of the RecursiveMutex-to-SharedMutex conversion.

src/net.h (5)
51-51: Includes and forward declaration: LGTM.

type_traits is needed for invoke_result_t below; forward-declaring NodeEvictionCandidate here is appropriate for the declarations that return vectors.
Also applies to: 65-65
735-735: Const-correctness via mutable locks and CopyStats: LGTM.

Making cs_vSend, cs_vRecv, and m_subver_mutex mutable enables const methods like CopyStats() to lock internals safely. CopyStats() const with negative lock requirements matches the net.cpp implementation.
Also applies to: 737-737, 761-761, 1032-1032
1301-1304: SharedMutex migration and API annotations: LGTM.

ForNode/ForEach*/ForEach*Then now explicitly require the nodes lock not to be held and take shared/exclusive locks internally. Const variants use READ_LOCK; non-const variants use LOCK. This is the right direction for removing recursive locking.
Also applies to: 1307-1317, 1333-1342, 1350-1359, 1366-1375, 1381-1395, 1397-1406, 1413-1423, 1425-1429
1811-1811: m_nodes_mutex as SharedMutex and SHARED_LOCKS_REQUIRED usage: LGTM.

Declaring m_nodes_mutex as SharedMutex and introducing SHARED_LOCKS_REQUIRED for read-only paths (e.g., MultipleManualOrFullOutboundConns) aligns with the non-recursive, shared/exclusive model.
Also applies to: 1543-1543
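The shared-lock assertion machinery praised in these comments has to work around the fact that a platform mutex cannot be queried for ownership. One common approach, sketched here with hypothetical names (Dash's real implementation delegates to its lock-tracking code in sync.h/sync.cpp, which records per-lock, per-mode state), is thread-local bookkeeping:

```cpp
#include <cassert>
#include <shared_mutex>

class TrackedSharedMutex
{
    std::shared_mutex m_inner;
    // Simplification: one per-thread depth counter shared by all instances;
    // a real tracker records which lock is held and in which mode.
    static thread_local int t_shared_depth;

public:
    void lock_shared() { m_inner.lock_shared(); ++t_shared_depth; }
    void unlock_shared() { --t_shared_depth; m_inner.unlock_shared(); }
    void lock() { m_inner.lock(); }
    void unlock() { m_inner.unlock(); }

    bool SharedLockHeld() const { return t_shared_depth > 0; }
    // Rough analogue of AssertSharedLockHeld(mutex): abort if this thread
    // does not currently hold the lock in shared mode.
    void AssertSharedHeld() const { assert(SharedLockHeld()); }
};

thread_local int TrackedSharedMutex::t_shared_depth = 0;
```

The runtime check complements the compile-time SHARED_LOCKS_REQUIRED annotation: the annotation catches most misuse statically, while the assertion catches paths the analysis cannot see (hence the NO_THREAD_SAFETY_ANALYSIS escape hatch noted above).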
1729-1732: Public APIs with explicit node-lock contracts: LGTM.

GetEvictionCandidates()/AttemptToEvictConnection(), GetCurrentBlockRelayOnlyConns(), MaybePickPreferredNetwork(), GetExtra* counts, GetNodeStats(), and DisconnectNode* are all updated to declare !m_nodes_mutex. Matches the corresponding READ/LOCK usage in net.cpp.
Also applies to: 1758-1758, 1770-1770, 1462-1464, 1504-1508
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/net.cpp(36 hunks)
- src/net.h(26 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/net.h (1)
src/net.cpp (62)
CopyStats(674-724)
CopyStats(674-674)
StopNodes(4177-4228)
StopNodes(4177-4177)
use_v2transport(2025-2025)
ForNode(4822-4833)
ForNode(4822-4822)
ForNode(4835-4846)
ForNode(4835-4835)
ForNode(4848-4859)
ForNode(4848-4848)
ForNode(4861-4872)
ForNode(4861-4861)
CNode(4693-4731)
addr(2941-2941)
addr(3393-3393)
GetExtraFullOutboundCount(2859-2875)
GetExtraFullOutboundCount(2859-2859)
GetExtraBlockRelayCount(2877-2889)
GetExtraBlockRelayCount(2877-2877)
GetNodeStats(4519-4532)
GetNodeStats(4519-4519)
DisconnectNode(4534-4541)
DisconnectNode(4534-4534)
DisconnectNode(4543-4555)
DisconnectNode(4543-4543)
DisconnectNode(4557-4560)
DisconnectNode(4557-4557)
DisconnectNode(4562-4573)
DisconnectNode(4562-4562)
MultipleManualOrFullOutboundConns(2904-2908)
MultipleManualOrFullOutboundConns(2904-2904)
FindNodeLocked(342-353)
FindNodeLocked(342-342)
FindNodeLocked(355-366)
FindNodeLocked(355-355)
FindNodeLocked(368-379)
FindNodeLocked(368-368)
FindNodeLocked(381-392)
FindNodeLocked(381-381)
FindNodeLockedMutable(394-398)
FindNodeLockedMutable(394-394)
FindNodeLockedMutable(400-404)
FindNodeLockedMutable(400-400)
FindNodeLockedMutable(406-410)
FindNodeLockedMutable(406-406)
FindNodeLockedMutable(412-416)
FindNodeLockedMutable(412-412)
GetEvictionCandidates(1820-1868)
GetEvictionCandidates(1820-1820)
AttemptToEvictConnection(1878-1890)
AttemptToEvictConnection(1878-1878)
ConnectNode(449-595)
ConnectNode(449-449)
SocketRecvData(2601-2656)
SocketRecvData(2601-2601)
GetCurrentBlockRelayOnlyConns(3304-3315)
GetCurrentBlockRelayOnlyConns(3304-3304)
MaybePickPreferredNetwork(2910-2924)
MaybePickPreferredNetwork(2910-2910)
PerformReconnections(4917-4944)
PerformReconnections(4917-4917)
src/net.cpp (2)
src/node/interfaces.cpp (12)
ip(496-503)
ip(496-496)
id(511-517)
id(511-511)
stats(453-479)
stats(453-453)
LOCK(541-545)
LOCK(551-558)
LOCK(559-566)
LOCK(828-837)
LOCK(871-875)
LOCK(1055-1059)
src/net.h (4)
m_nodes_mutex(1688-1693)
CopyStats(1032-1040)
AllNodes(1299-1299)
string(107-282)
const CNode* FindNodeLocked(const CNetAddr& ip, bool fExcludeDisconnecting = true) const SHARED_LOCKS_REQUIRED(m_nodes_mutex);
const CNode* FindNodeLocked(const std::string& addrName, bool fExcludeDisconnecting = true) const SHARED_LOCKS_REQUIRED(m_nodes_mutex);
const CNode* FindNodeLocked(const CService& addr, bool fExcludeDisconnecting = true) const SHARED_LOCKS_REQUIRED(m_nodes_mutex);
const CNode* FindNodeLocked(NodeId id, bool fExcludeDisconnecting = true) const SHARED_LOCKS_REQUIRED(m_nodes_mutex);

// Mutable find helpers for callers that already hold the exclusive lock
CNode* FindNodeLockedMutable(const CNetAddr& ip, bool fExcludeDisconnecting = true) EXCLUSIVE_LOCKS_REQUIRED(m_nodes_mutex);
CNode* FindNodeLockedMutable(const std::string& addrName, bool fExcludeDisconnecting = true) EXCLUSIVE_LOCKS_REQUIRED(m_nodes_mutex);
CNode* FindNodeLockedMutable(const CService& addr, bool fExcludeDisconnecting = true) EXCLUSIVE_LOCKS_REQUIRED(m_nodes_mutex);
CNode* FindNodeLockedMutable(NodeId id, bool fExcludeDisconnecting = true) EXCLUSIVE_LOCKS_REQUIRED(m_nodes_mutex);

// Type-agnostic node matching helpers
static inline bool NodeMatches(const CNode* p, const CService& addr)
{
    return static_cast<CService>(p->addr) == addr;
}
static inline bool NodeMatches(const CNode* p, const CNetAddr& ip)
{
    return static_cast<CNetAddr>(p->addr) == ip;
}
static inline bool NodeMatches(const CNode* p, const std::string& addrName)
{
    return p->m_addr_name == addrName;
}
static inline bool NodeMatches(const CNode* p, const NodeId id)
{
    return p->GetId() == id;
}

template<typename Key>
const CNode* FindNodeLockedBy(const Key& key, bool fExcludeDisconnecting = true) const SHARED_LOCKS_REQUIRED(m_nodes_mutex)
{
    for (const CNode* pnode : m_nodes) {
        if (fExcludeDisconnecting && pnode->fDisconnect) continue;
        if (NodeMatches(pnode, key)) return pnode;
    }
    return nullptr;
}

// Callback helpers with explicit lock semantics (templated on key type)
// Lambda-based shared accessor returning optional result (nullopt = not found)
template<typename Key, typename Callable>
std::optional<std::invoke_result_t<Callable, const CNode*>> WithNodeShared(const Key& key, Callable&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
{
    READ_LOCK(m_nodes_mutex);
    if (const CNode* p = FindNodeLockedBy(key)) return std::optional<std::invoke_result_t<Callable, const CNode*>>{fn(p)};
    return std::nullopt;
}

// Fast existence check under shared lock
template<typename Key>
bool ExistsNodeShared(const Key& key) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
{
    READ_LOCK(m_nodes_mutex);
    return FindNodeLockedBy(key) != nullptr;
}

template<typename Key, typename Callable>
std::optional<std::invoke_result_t<Callable, CNode*>> WithNodeExclusive(const Key& key, Callable&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
{
    LOCK(m_nodes_mutex);
    if (const CNode* cp = FindNodeLockedBy(key)) return std::optional<std::invoke_result_t<Callable, CNode*>>{fn(const_cast<CNode*>(cp))};
    return std::nullopt;
}
Keep find helpers visible to disconnect-node callers
With the new helpers we now default fExcludeDisconnecting = true (and WithNodeExclusive forwards that default). As soon as a peer has been marked fDisconnect = true, FindNodeLockedBy(...) stops finding it, so DisconnectNode(...) returns false while the socket is still being torn down. Previously the RPC would keep reporting success until the peer was actually removed. This makes the disconnectnode RPC racy/regressive. Please include disconnecting peers when servicing these helpers (e.g. restore the old default or explicitly pass false from WithNodeExclusive) so callers can still find the node until it is gone.
🤖 Prompt for AI Agents
In src/net.h around lines 1658-1721, the new find helpers default
fExcludeDisconnecting = true which causes disconnecting peers to be ignored and
breaks disconnectnode; restore behavior to include disconnecting peers by
changing the default to fExcludeDisconnecting = false in the FindNodeLockedBy
template signature (and any other FindNodeLocked* helpers if applicable), or
alternately ensure WithNodeShared, ExistsNodeShared and WithNodeExclusive call
FindNodeLockedBy(key, /*fExcludeDisconnecting=*/false) so disconnecting peers
are still found until removed.
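The regression flagged here is easy to reproduce in miniature. Below is a hedged sketch (Peer and FindPeer are hypothetical stand-ins for CNode and the find helpers): with fExcludeDisconnecting defaulting to true, a peer marked fDisconnect becomes invisible even though it is still in the list, while passing false keeps it findable until it is actually removed.

```cpp
#include <cassert>
#include <vector>

struct Peer { int id; bool fDisconnect = false; };

// Mimics FindNodeLockedBy's filtering: disconnecting peers are skipped
// by default, which is what breaks the disconnect-node path.
const Peer* FindPeer(const std::vector<Peer>& peers, int id,
                     bool fExcludeDisconnecting = true)
{
    for (const Peer& p : peers) {
        if (fExcludeDisconnecting && p.fDisconnect) continue;
        if (p.id == id) return &p;
    }
    return nullptr;
}
```

A disconnect-node caller that relies on the default would report failure for peer 2 below, even though the peer still exists; the fix the review suggests is to pass false on that path (or restore the old default).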
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/net.cpp (2)
494-500: Consider using WithNodeShared here for consistency and better concurrency.

This check for an existing connection uses WithNodeExclusive, but the lambda doesn't modify the node. For consistency with the similar check at line 460 (which uses WithNodeShared) and to allow better concurrency, consider changing to:

-if (WithNodeExclusive(static_cast<CService>(addrConnect), [&](CNode* /*pnode*/){ return true; })) {
+if (WithNodeShared(static_cast<CService>(addrConnect), [](const CNode*){ return true; })) {

Note: This also removes the unnecessary [&] capture.
3464-3464: Clarify comment to avoid confusion with shared vs. exclusive locks.

The comment "Use shared access since we already hold m_nodes_mutex" could be misinterpreted in the context of this PR's shared/exclusive locking model. Consider clarifying:

-// Use shared access since we already hold m_nodes_mutex
+// Use const access (FindNodeLocked) since we already hold m_nodes_mutex

This makes it clear that "shared access" refers to const/read-only access, not a shared lock.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/net.cpp(35 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/net.cpp (1)
src/net.h (4)
m_nodes_mutex(1688-1693)
CopyStats(1032-1040)
AllNodes(1299-1299)
string(107-282)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: linux64_nowallet-build / Build source
- GitHub Check: linux64_ubsan-build / Build source
- GitHub Check: linux64_sqlite-build / Build source
- GitHub Check: linux64_fuzz-build / Build source
- GitHub Check: linux64-build / Build source
- GitHub Check: arm-linux-build / Build source
- GitHub Check: win64-build / Build source
- GitHub Check: Lint / Run linters
🔇 Additional comments (4)
src/net.cpp (4)
341-392: LGTM! Const-correct FindNodeLocked variants enable shared-lock access.

The addition of const overloads for FindNodeLocked is the correct approach to support both shared and exclusive lock contexts. These methods properly return const CNode* for read-only access.

1820-1868: LGTM! GetEvictionCandidates correctly uses shared locking.

The method correctly uses READ_LOCK for building the eviction candidate list from read-only node data. The const qualifier appropriately reflects that this method doesn't modify CConnman state.

4833-4857: LGTM! Const ForNode overloads properly use shared locking.

The new const overloads correctly use READ_LOCK for read-only node iteration and accept const lambdas, enabling concurrent read access to nodes.

4878-4894: LGTM! NodesSnapshot correctly uses shared locking.

The constructor appropriately uses READ_LOCK for creating a read-only snapshot of the nodes list, and properly calls AddRef() to ensure node lifetime.
Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6912.63cf92d8. The image should be on dockerhub soon.

Guix Automation has begun to build this PR, tagged as v23.0.0-devpr6912.420f6148. A new comment will be made when the image is pushed.

Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6912.420f6148. The image should be on dockerhub soon.

…des; CNode is internally thread safe

Guix Automation has begun to build this PR, tagged as v23.0.0-devpr6912.c1b3047d. A new comment will be made when the image is pushed.

Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6912.c1b3047d. The image should be on dockerhub soon.

pls see 20aedd4
Issue being fixed or feature implemented
m_nodes_mutex (mostly) just guards m_nodes, which is read-only roughly 90% of the time: the data structure is only mutated when we add or remove connections / peers. All fields inside the CNode* entries in m_nodes are themselves thread safe, typically atomics. As such, we only really care about protecting the structure itself and the lifetime of the pointers in it, and both remain safe under a shared mutex model.
What was done?
Implemented usage of a shared mutex. We see lock contention drop significantly.
Contention on m_nodes_mutex went from
It is interesting that the average wait time on this mutex does increase, though that is not too surprising. Overall, this is still a huge win.
How Has This Been Tested?
Tested on testnet with one node.
Breaking Changes
None
Checklist:
Go over all the following points, and put an x in all the boxes that apply.