perf: convert m_nodes_mutex from recursive mutex to shared mutex #6912

base: develop
Conversation
Guix Automation has begun to build this PR, tagged as v23.0.0-devpr6912.37c097de. A new comment will be made when the image is pushed.

Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6912.37c097de. The image should be on dockerhub soon.

(marked ready for review for code rabbit)

Guix Automation has begun to build this PR, tagged as v23.0.0-devpr6912.63cf92d8. A new comment will be made when the image is pushed.
Walkthrough

Refactors networking code for const-correctness and shared locking. m_nodes_mutex changed from RecursiveMutex to SharedMutex; several mutexes made mutable. Added AssertSharedLockHeldInline and an AssertSharedLockHeld macro. Introduced const FindNodeLocked overloads and mutable FindNodeLockedMutable wrappers, plus template helpers WithNodeShared, WithNodeExclusive, ExistsNodeShared, and FindNodeLockedBy. Added new ForNode overloads (const and non-const) for NodeId and CService, made CNode::CopyStats const, and exposed GetEvictionCandidates and NodeEvictionCandidate. Replaced call sites to use the new lock-aware APIs and enforced a hard cap of three high-bandwidth header-announcing peers.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks: ❌ 1 failed (warning), ✅ 2 passed.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 3
🧹 Nitpick comments (2)
src/net.cpp (1)
3464-3469: Misleading comment about lock access level.

The comment states "Use shared access since we already hold m_nodes_mutex", but at this point the code holds an exclusive lock (via LOCK2 at line 3526, called before getPendingQuorumNodes at line 3547). While the code is correct (FindNodeLocked, the const version, can be called under either a shared or an exclusive lock), the comment is confusing. Clarify or remove the comment:

- // Use shared access since we already hold m_nodes_mutex
  bool slowHandshake = false;
  if (const CNode* pnode = FindNodeLocked(addr2)) {

src/net.h (1)
1658-1662: optional<invoke_result_t<...>> can fail for reference results; decay the type.

If a callable returns a reference (or a cv/ref-qualified type), std::optional<std::invoke_result_t<...>> is ill-formed. Decay the result type before wrapping it in optional.

Apply this diff to both helpers:

 template<typename Key, typename Callable>
-std::optional<std::invoke_result_t<Callable, const CNode*>> WithNodeShared(const Key& key, Callable&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
+std::optional<std::decay_t<std::invoke_result_t<Callable, const CNode*>>> WithNodeShared(const Key& key, Callable&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
 {
     READ_LOCK(m_nodes_mutex);
-    if (const CNode* p = FindNodeLockedBy(key)) return std::optional<std::invoke_result_t<Callable, const CNode*>>{fn(p)};
+    if (const CNode* p = FindNodeLockedBy(key)) return std::optional<std::decay_t<std::invoke_result_t<Callable, const CNode*>>>{fn(p)};
     return std::nullopt;
 }

 template<typename Key, typename Callable>
-std::optional<std::invoke_result_t<Callable, CNode*>> WithNodeExclusive(const Key& key, Callable&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
+std::optional<std::decay_t<std::invoke_result_t<Callable, CNode*>>> WithNodeExclusive(const Key& key, Callable&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
 {
     LOCK(m_nodes_mutex);
-    if (const CNode* cp = FindNodeLockedBy(key)) return std::optional<std::invoke_result_t<Callable, CNode*>>{fn(const_cast<CNode*>(cp))};
+    if (const CNode* cp = FindNodeLockedBy(key)) return std::optional<std::decay_t<std::invoke_result_t<Callable, CNode*>>>{fn(const_cast<CNode*>(cp))};
     return std::nullopt;
 }

No additional includes needed; <type_traits> is already added.
Also applies to: 1663-1668, 1669-1686, 1687-1695, 1697-1705, 1708-1713, 1715-1721
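To see why the decay matters, here is a standalone illustration (names are hypothetical, not from the PR): a callable that returns a reference would make std::optional<std::invoke_result_t<...>> name std::optional<T&>, which is ill-formed, while decaying first yields a valid std::optional<T>.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <type_traits>
#include <utility>

// Wraps the result of invoking fn on p in an optional; nullopt if p is null.
// decay_t strips the reference so optional<int&> is never formed.
template <typename Callable>
auto invoke_to_optional(Callable&& fn, int* p)
    -> std::optional<std::decay_t<std::invoke_result_t<Callable, int*>>>
{
    if (p == nullptr) return std::nullopt;
    return std::invoke(std::forward<Callable>(fn), p); // int& decays to int
}

int g_value = 42;
// Returns a reference: invoke_result_t is int&, decay_t<int&> is int.
int& ref_returning(int*) { return g_value; }
```

Without the decay_t, merely instantiating the wrapper with ref_returning would fail to compile, because std::optional cannot hold a reference type.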
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- src/net.cpp(36 hunks)
- src/net.h(26 hunks)
- src/net_processing.cpp(1 hunks)
- src/sync.h(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.{cpp,h,cc,cxx,hpp}
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.{cpp,h,cc,cxx,hpp}: Dash Core C++ codebase must be written in C++20 and require at least Clang 16 or GCC 11.1
Dash uses unordered_lru_cache for efficient caching with LRU eviction
Files:
- src/sync.h
- src/net_processing.cpp
- src/net.cpp
- src/net.h
🧠 Learnings (6)
📚 Learning: 2025-09-09T21:36:58.969Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket operations are properly synchronized using the cs_net mutex. The m_sock and m_server members are GUARDED_BY(cs_net), and methods like Connect(), SendDirectly(), and ReconnectThread() use appropriate locking with cs_net to prevent race conditions on socket access.
Applied to files:
- src/net.h
📚 Learning: 2025-09-09T21:36:11.833Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks. The implementation correctly uses cs_net with GUARDED_BY annotations and EXCLUSIVE_LOCKS_REQUIRED to synchronize socket access between SendDirectly() and ReconnectThread().
Applied to files:
- src/net.h
📚 Learning: 2025-09-09T21:36:31.504Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.504Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access is protected by the cs_net mutex. This mutex coordinates between the TCP send path in SendDirectly() and the reconnection operations in Connect()/Reconnect() methods, ensuring proper synchronization of socket state.
Applied to files:
- src/net.h
📚 Learning: 2025-09-09T21:36:31.504Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.504Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access requires synchronization using the cs_net mutex. The race condition exists between SendDirectly() (callable from any thread) and Connect() (called by ReconnectThread), where both read and write m_sock concurrently. The cs_net mutex properly coordinates these network operations.
Applied to files:
- src/net.h
📚 Learning: 2025-09-09T21:36:11.833Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.833Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks.
Applied to files:
- src/net.h
📚 Learning: 2025-09-09T21:36:58.969Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.969Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket access operations (m_sock and m_server) should be protected by the cs_net mutex, not avoiding synchronization entirely. While lock overhead concerns are valid in general, socket operations require proper synchronization via cs_net.
Applied to files:
- src/net.h
🧬 Code graph analysis (3)
src/sync.h (1)
src/sync.cpp (5)
AssertLockHeldInternal(280-285)
AssertLockHeldInternal(280-280)
AssertLockHeldInternal(286-286)
AssertLockHeldInternal(287-287)
AssertLockHeldInternal(288-288)
src/net.cpp (2)
src/node/interfaces.cpp (12)
ip(496-503)
ip(496-496)
id(511-517)
id(511-511)
stats(453-479)
stats(453-453)
LOCK(541-545)
LOCK(551-558)
LOCK(559-566)
LOCK(828-837)
LOCK(871-875)
LOCK(1055-1059)
src/net.h (3)
m_nodes_mutex(1688-1693)
CopyStats(1032-1040)
string(107-282)
src/net.h (1)
src/net.cpp (54)
CopyStats(674-724)
CopyStats(674-674)
StopNodes(4177-4228)
StopNodes(4177-4177)
use_v2transport(2025-2025)
CheckIncomingNonce(425-433)
CheckIncomingNonce(425-425)
ForNode(4822-4833)
ForNode(4822-4822)
ForNode(4835-4846)
ForNode(4835-4835)
ForNode(4848-4859)
ForNode(4848-4848)
ForNode(4861-4872)
ForNode(4861-4861)
CNode(4693-4731)
addr(2941-2941)
addr(3393-3393)
AddNode(4298-4310)
AddNode(4298-4298)
GetAddedNodeInfo(3317-3375)
GetAddedNodeInfo(3317-3317)
GetNodeStats(4519-4532)
GetNodeStats(4519-4519)
FindNodeLocked(342-353)
FindNodeLocked(342-342)
FindNodeLocked(355-366)
FindNodeLocked(355-355)
FindNodeLocked(368-379)
FindNodeLocked(368-368)
FindNodeLocked(381-392)
FindNodeLocked(381-381)
FindNodeLockedMutable(394-398)
FindNodeLockedMutable(394-394)
FindNodeLockedMutable(400-404)
FindNodeLockedMutable(400-400)
FindNodeLockedMutable(406-410)
FindNodeLockedMutable(406-406)
FindNodeLockedMutable(412-416)
FindNodeLockedMutable(412-412)
AlreadyConnectedToAddress(419-423)
AlreadyConnectedToAddress(419-419)
GetEvictionCandidates(1820-1868)
GetEvictionCandidates(1820-1820)
AttemptToEvictConnection(1878-1890)
AttemptToEvictConnection(1878-1878)
ConnectNode(449-595)
ConnectNode(449-449)
SocketRecvData(2601-2656)
SocketRecvData(2601-2601)
GetCurrentBlockRelayOnlyConns(3304-3315)
GetCurrentBlockRelayOnlyConns(3304-3304)
MaybePickPreferredNetwork(2910-2924)
MaybePickPreferredNetwork(2910-2910)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: arm-linux-build / Build source
- GitHub Check: mac-build / Build source
- GitHub Check: linux64_tsan-build / Build source
- GitHub Check: linux64_sqlite-build / Build source
- GitHub Check: linux64_fuzz-build / Build source
- GitHub Check: linux64_ubsan-build / Build source
- GitHub Check: linux64-build / Build source
- GitHub Check: win64-build / Build source
- GitHub Check: Lint / Run linters
🔇 Additional comments (13)
src/net_processing.cpp (1)
1308-1318: Good call keeping SENDCMPCT fan-out bounded.

Capping lNodesAnnouncingHeaderAndIDs at three entries, and explicitly telling the oldest peer to drop out of high-bandwidth mode, brings us back in line with the BIP152 limit without upsetting the existing outbound-protection logic above. Looks solid to me.

src/sync.h (1)
164-167: LGTM: Shared lock assertion helper implemented correctly.

The new AssertSharedLockHeldInline function and AssertSharedLockHeld macro provide the necessary infrastructure for asserting shared lock ownership on SharedMutex. The NO_THREAD_SAFETY_ANALYSIS annotation is correctly applied since the underlying AssertLockHeldInternal requires an exclusive-lock annotation but we are checking shared-lock ownership at runtime.

src/net.cpp (6)
341-392: LGTM: Const-correct FindNodeLocked methods.

The const-qualified FindNodeLocked overloads correctly use const CNode* and enable read-only node lookups under either shared or exclusive locks. The new NodeId overload at lines 381-392 is a useful addition.
419-433: LGTM: Proper const-correctness and read-lock usage.

Both AlreadyConnectedToAddress and CheckIncomingNonce correctly use READ_LOCK for read-only node lookups and are appropriately const-qualified.
1820-1868: LGTM: GetEvictionCandidates properly encapsulated.

The method correctly uses READ_LOCK for read-only node access and returns a value (not references), eliminating lifetime concerns. Good encapsulation of the eviction-candidate collection logic.
4835-4859: LGTM: Const-correct ForNode overloads.

The new const ForNode overloads for CService and NodeId correctly use READ_LOCK and accept const functors, enabling read-only node operations without exclusive-lock contention. The pattern complements the existing mutable overloads well.
674-674: LGTM: CopyStats correctly made const.

Making CopyStats const is appropriate since it only reads node state without modification, enabling use in read-only contexts.
421-421: Good read/write lock separation throughout.

The systematic conversion to READ_LOCK for read-only node access, while retaining exclusive locks for modifications, is well executed. This enables concurrent reads and reduces contention, which is the primary benefit of the RecursiveMutex-to-SharedMutex conversion.

src/net.h (5)
51-51: Includes and forward declaration: LGTM.

type_traits is needed for invoke_result_t below; forward-declaring NodeEvictionCandidate here is appropriate for the declarations that return vectors.
Also applies to: 65-65
735-735: Const-correctness via mutable locks and CopyStats: LGTM.

Making cs_vSend, cs_vRecv, and m_subver_mutex mutable enables const methods like CopyStats() to lock internals safely. CopyStats() const with negative lock requirements matches the net.cpp implementation.
Also applies to: 737-737, 761-761, 1032-1032
1301-1304: SharedMutex migration and API annotations: LGTM.

ForNode/ForEach*/ForEach*Then now explicitly require the nodes lock not to be held and take shared/exclusive locks internally. Const variants use READ_LOCK; non-const variants use LOCK. This is the right direction for removing recursive locking.
Also applies to: 1307-1317, 1333-1342, 1350-1359, 1366-1375, 1381-1395, 1397-1406, 1413-1423, 1425-1429
1811-1811: m_nodes_mutex as SharedMutex and SHARED_LOCKS_REQUIRED usage: LGTM.

Declaring m_nodes_mutex as SharedMutex and introducing SHARED_LOCKS_REQUIRED for read-only paths (e.g., MultipleManualOrFullOutboundConns) aligns with the non-recursive, shared/exclusive model.
Also applies to: 1543-1543
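The shared-lock assertion machinery praised in these comments has to work around the fact that a platform mutex cannot be queried for ownership. One common approach, sketched here with hypothetical names (Dash's real implementation delegates to its lock-tracking code in sync.h/sync.cpp, which records per-lock, per-mode state), is thread-local bookkeeping:

```cpp
#include <cassert>
#include <shared_mutex>

class TrackedSharedMutex
{
    std::shared_mutex m_inner;
    // Simplification: one per-thread depth counter shared by all instances;
    // a real tracker records which lock is held and in which mode.
    static thread_local int t_shared_depth;

public:
    void lock_shared() { m_inner.lock_shared(); ++t_shared_depth; }
    void unlock_shared() { --t_shared_depth; m_inner.unlock_shared(); }
    void lock() { m_inner.lock(); }
    void unlock() { m_inner.unlock(); }

    bool SharedLockHeld() const { return t_shared_depth > 0; }
    // Rough analogue of AssertSharedLockHeld(mutex): abort if this thread
    // does not currently hold the lock in shared mode.
    void AssertSharedHeld() const { assert(SharedLockHeld()); }
};

thread_local int TrackedSharedMutex::t_shared_depth = 0;
```

The runtime check complements the compile-time SHARED_LOCKS_REQUIRED annotation: the annotation catches most misuse statically, while the assertion catches paths the analysis cannot see (hence the NO_THREAD_SAFETY_ANALYSIS escape hatch noted above).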
1729-1732: Public APIs with explicit node-lock contracts: LGTM.

GetEvictionCandidates()/AttemptToEvictConnection(), GetCurrentBlockRelayOnlyConns(), MaybePickPreferredNetwork(), GetExtra* counts, GetNodeStats(), and DisconnectNode* are all updated to declare !m_nodes_mutex. Matches the corresponding READ/LOCK usage in net.cpp.
Also applies to: 1758-1758, 1770-1770, 1462-1464, 1504-1508
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/net.cpp(36 hunks)
- src/net.h(26 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/net.h (1)
src/net.cpp (62)
CopyStats(674-724)
CopyStats(674-674)
StopNodes(4177-4228)
StopNodes(4177-4177)
use_v2transport(2025-2025)
ForNode(4822-4833)
ForNode(4822-4822)
ForNode(4835-4846)
ForNode(4835-4835)
ForNode(4848-4859)
ForNode(4848-4848)
ForNode(4861-4872)
ForNode(4861-4861)
CNode(4693-4731)
addr(2941-2941)
addr(3393-3393)
GetExtraFullOutboundCount(2859-2875)
GetExtraFullOutboundCount(2859-2859)
GetExtraBlockRelayCount(2877-2889)
GetExtraBlockRelayCount(2877-2877)
GetNodeStats(4519-4532)
GetNodeStats(4519-4519)
DisconnectNode(4534-4541)
DisconnectNode(4534-4534)
DisconnectNode(4543-4555)
DisconnectNode(4543-4543)
DisconnectNode(4557-4560)
DisconnectNode(4557-4557)
DisconnectNode(4562-4573)
DisconnectNode(4562-4562)
MultipleManualOrFullOutboundConns(2904-2908)
MultipleManualOrFullOutboundConns(2904-2904)
FindNodeLocked(342-353)
FindNodeLocked(342-342)
FindNodeLocked(355-366)
FindNodeLocked(355-355)
FindNodeLocked(368-379)
FindNodeLocked(368-368)
FindNodeLocked(381-392)
FindNodeLocked(381-381)
FindNodeLockedMutable(394-398)
FindNodeLockedMutable(394-394)
FindNodeLockedMutable(400-404)
FindNodeLockedMutable(400-400)
FindNodeLockedMutable(406-410)
FindNodeLockedMutable(406-406)
FindNodeLockedMutable(412-416)
FindNodeLockedMutable(412-412)
GetEvictionCandidates(1820-1868)
GetEvictionCandidates(1820-1820)
AttemptToEvictConnection(1878-1890)
AttemptToEvictConnection(1878-1878)
ConnectNode(449-595)
ConnectNode(449-449)
SocketRecvData(2601-2656)
SocketRecvData(2601-2601)
GetCurrentBlockRelayOnlyConns(3304-3315)
GetCurrentBlockRelayOnlyConns(3304-3304)
MaybePickPreferredNetwork(2910-2924)
MaybePickPreferredNetwork(2910-2910)
PerformReconnections(4917-4944)
PerformReconnections(4917-4917)
src/net.cpp (2)
src/node/interfaces.cpp (12)
ip(496-503)
ip(496-496)
id(511-517)
id(511-511)
stats(453-479)
stats(453-453)
LOCK(541-545)
LOCK(551-558)
LOCK(559-566)
LOCK(828-837)
LOCK(871-875)
LOCK(1055-1059)
src/net.h (4)
m_nodes_mutex(1688-1693)
CopyStats(1032-1040)
AllNodes(1299-1299)
string(107-282)
const CNode* FindNodeLocked(const CNetAddr& ip, bool fExcludeDisconnecting = true) const SHARED_LOCKS_REQUIRED(m_nodes_mutex);
const CNode* FindNodeLocked(const std::string& addrName, bool fExcludeDisconnecting = true) const SHARED_LOCKS_REQUIRED(m_nodes_mutex);
const CNode* FindNodeLocked(const CService& addr, bool fExcludeDisconnecting = true) const SHARED_LOCKS_REQUIRED(m_nodes_mutex);
const CNode* FindNodeLocked(NodeId id, bool fExcludeDisconnecting = true) const SHARED_LOCKS_REQUIRED(m_nodes_mutex);

// Mutable find helpers for callers that already hold the exclusive lock
CNode* FindNodeLockedMutable(const CNetAddr& ip, bool fExcludeDisconnecting = true) EXCLUSIVE_LOCKS_REQUIRED(m_nodes_mutex);
CNode* FindNodeLockedMutable(const std::string& addrName, bool fExcludeDisconnecting = true) EXCLUSIVE_LOCKS_REQUIRED(m_nodes_mutex);
CNode* FindNodeLockedMutable(const CService& addr, bool fExcludeDisconnecting = true) EXCLUSIVE_LOCKS_REQUIRED(m_nodes_mutex);
CNode* FindNodeLockedMutable(NodeId id, bool fExcludeDisconnecting = true) EXCLUSIVE_LOCKS_REQUIRED(m_nodes_mutex);

// Type-agnostic node matching helpers
static inline bool NodeMatches(const CNode* p, const CService& addr)
{
    return static_cast<CService>(p->addr) == addr;
}
static inline bool NodeMatches(const CNode* p, const CNetAddr& ip)
{
    return static_cast<CNetAddr>(p->addr) == ip;
}
static inline bool NodeMatches(const CNode* p, const std::string& addrName)
{
    return p->m_addr_name == addrName;
}
static inline bool NodeMatches(const CNode* p, const NodeId id)
{
    return p->GetId() == id;
}

template<typename Key>
const CNode* FindNodeLockedBy(const Key& key, bool fExcludeDisconnecting = true) const SHARED_LOCKS_REQUIRED(m_nodes_mutex)
{
    for (const CNode* pnode : m_nodes) {
        if (fExcludeDisconnecting && pnode->fDisconnect) continue;
        if (NodeMatches(pnode, key)) return pnode;
    }
    return nullptr;
}

// Callback helpers with explicit lock semantics (templated on key type)
// Lambda-based shared accessor returning optional result (nullopt = not found)
template<typename Key, typename Callable>
std::optional<std::invoke_result_t<Callable, const CNode*>> WithNodeShared(const Key& key, Callable&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
{
    READ_LOCK(m_nodes_mutex);
    if (const CNode* p = FindNodeLockedBy(key)) return std::optional<std::invoke_result_t<Callable, const CNode*>>{fn(p)};
    return std::nullopt;
}

// Fast existence check under shared lock
template<typename Key>
bool ExistsNodeShared(const Key& key) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
{
    READ_LOCK(m_nodes_mutex);
    return FindNodeLockedBy(key) != nullptr;
}

template<typename Key, typename Callable>
std::optional<std::invoke_result_t<Callable, CNode*>> WithNodeExclusive(const Key& key, Callable&& fn) EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex)
{
    LOCK(m_nodes_mutex);
    if (const CNode* cp = FindNodeLockedBy(key)) return std::optional<std::invoke_result_t<Callable, CNode*>>{fn(const_cast<CNode*>(cp))};
    return std::nullopt;
}
Keep find helpers visible to disconnect-node callers
With the new helpers we now default fExcludeDisconnecting = true (and WithNodeExclusive forwards that default). As soon as a peer has been marked fDisconnect = true, FindNodeLockedBy(...) stops finding it, so DisconnectNode(...) returns false while the socket is still being torn down. Previously the RPC would keep reporting success until the peer was actually removed. This makes the disconnectnode RPC racy/regressive. Please include disconnecting peers when servicing these helpers (e.g. restore the old default or explicitly pass false from WithNodeExclusive) so callers can still find the node until it is gone.
🤖 Prompt for AI Agents
In src/net.h around lines 1658-1721, the new find helpers default
fExcludeDisconnecting = true which causes disconnecting peers to be ignored and
breaks disconnectnode; restore behavior to include disconnecting peers by
changing the default to fExcludeDisconnecting = false in the FindNodeLockedBy
template signature (and any other FindNodeLocked* helpers if applicable), or
alternately ensure WithNodeShared, ExistsNodeShared and WithNodeExclusive call
FindNodeLockedBy(key, /*fExcludeDisconnecting=*/false) so disconnecting peers
are still found until removed.
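The regression flagged here is easy to reproduce in miniature. Below is a hedged sketch (Peer and FindPeer are hypothetical stand-ins for CNode and the find helpers): with fExcludeDisconnecting defaulting to true, a peer marked fDisconnect becomes invisible even though it is still in the list, while passing false keeps it findable until it is actually removed.

```cpp
#include <cassert>
#include <vector>

struct Peer { int id; bool fDisconnect = false; };

// Mimics FindNodeLockedBy's filtering: disconnecting peers are skipped
// by default, which is what breaks the disconnect-node path.
const Peer* FindPeer(const std::vector<Peer>& peers, int id,
                     bool fExcludeDisconnecting = true)
{
    for (const Peer& p : peers) {
        if (fExcludeDisconnecting && p.fDisconnect) continue;
        if (p.id == id) return &p;
    }
    return nullptr;
}
```

A disconnect-node caller that relies on the default would report failure for peer 2 below, even though the peer still exists; the fix the review suggests is to pass false on that path (or restore the old default).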
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/net.cpp (2)
494-500: Consider using WithNodeShared here for consistency and better concurrency.

This check for an existing connection uses WithNodeExclusive, but the lambda doesn't modify the node. For consistency with the similar check at line 460 (which uses WithNodeShared) and to allow better concurrency, consider changing to:

-if (WithNodeExclusive(static_cast<CService>(addrConnect), [&](CNode* /*pnode*/){ return true; })) {
+if (WithNodeShared(static_cast<CService>(addrConnect), [](const CNode*){ return true; })) {

Note: This also removes the unnecessary [&] capture.
3464-3464: Clarify comment to avoid confusion with shared vs. exclusive locks.

The comment "Use shared access since we already hold m_nodes_mutex" could be misinterpreted in the context of this PR's shared/exclusive locking model. Consider clarifying:

-// Use shared access since we already hold m_nodes_mutex
+// Use const access (FindNodeLocked) since we already hold m_nodes_mutex

This makes it clear that "shared access" refers to const/read-only access, not a shared lock.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/net.cpp(35 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/net.cpp (1)
src/net.h (4)
m_nodes_mutex(1688-1693)
CopyStats(1032-1040)
AllNodes(1299-1299)
string(107-282)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: linux64_nowallet-build / Build source
- GitHub Check: linux64_ubsan-build / Build source
- GitHub Check: linux64_sqlite-build / Build source
- GitHub Check: linux64_fuzz-build / Build source
- GitHub Check: linux64-build / Build source
- GitHub Check: arm-linux-build / Build source
- GitHub Check: win64-build / Build source
- GitHub Check: Lint / Run linters
🔇 Additional comments (4)
src/net.cpp (4)
341-392: LGTM! Const-correct FindNodeLocked variants enable shared-lock access.

The addition of const overloads for FindNodeLocked is the correct approach to support both shared and exclusive lock contexts. These methods properly return const CNode* for read-only access.

1820-1868: LGTM! GetEvictionCandidates correctly uses shared locking.

The method correctly uses READ_LOCK for building the eviction candidate list from read-only node data. The const qualifier appropriately reflects that this method doesn't modify CConnman state.

4833-4857: LGTM! Const ForNode overloads properly use shared locking.

The new const overloads correctly use READ_LOCK for read-only node iteration and accept const lambdas, enabling concurrent read access to nodes.

4878-4894: LGTM! NodesSnapshot correctly uses shared locking.

The constructor appropriately uses READ_LOCK for creating a read-only snapshot of the nodes list, and properly calls AddRef() to ensure node lifetime.
Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6912.63cf92d8. The image should be on dockerhub soon.

Guix Automation has begun to build this PR, tagged as v23.0.0-devpr6912.420f6148. A new comment will be made when the image is pushed.

Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6912.420f6148. The image should be on dockerhub soon.

…des; CNode is internally thread safe

Guix Automation has begun to build this PR, tagged as v23.0.0-devpr6912.c1b3047d. A new comment will be made when the image is pushed.

Guix Automation has completed; a release should be present here: https://github.com/dashpay/dash-dev-branches/releases/tag/v23.0.0-devpr6912.c1b3047d. The image should be on dockerhub soon.

pls see 20aedd4
Issue being fixed or feature implemented
m_nodes_mutex (mostly) just guards m_nodes, which is read-only roughly 90% of the time: the data structure is only mutated when we add or remove connections / peers. All fields inside the CNode* entries in m_nodes are themselves thread safe, typically atomics. As such, we only really care about protecting the structure itself and the lifetime of the pointers in it, and both remain safe under a shared mutex model.
What was done?
Implemented usage of a shared mutex. We see lock contention drop significantly.
Contention on m_nodes_mutex went from
It is interesting that the average wait time on this mutex does increase, though that is not too surprising. Overall, this is still a huge win.
How Has This Been Tested?
Tested on testnet with one node.
Breaking Changes
None
Checklist:
Go over all the following points, and put an x in all the boxes that apply.