-
Notifications
You must be signed in to change notification settings - Fork 1.2k
feat(stats): add support for IPv6 and TCP, introduce functional tests, deprecate -statsport
#6837
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
@coderabbitai review |
✅ No Merge Conflicts DetectedThis PR currently has no conflicts with other open PRs. |
✅ Actions performedReview triggered.
|
WalkthroughThe patch refactors StatsD client initialization to a URL-aware factory Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Pre-merge checks (2 passed, 1 warning)❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
src/stats/client.h (1)
1-66
: Apply clang-format to fix formatting issuesThe pipeline check indicates formatting differences. Please run
clang-format -i
on this file to ensure consistent formatting.#!/bin/bash # Verify the exact formatting issues clang-format --version 2>/dev/null || echo "clang-format not available"src/stats/client.cpp (2)
1-223
: Apply clang-format to fix formatting issuesThe pipeline check indicates formatting differences. Please run
clang-format -i
on this file.
210-213
: Potential precision loss in format stringUsing
%f
for formatting the value might cause precision issues. Consider using a more precise format specifier or std::to_string for exact representation.- RawMessage msg{strprintf("%s%s%s:%f|%s", m_prefix, key, m_suffix, value, type)}; + // Use higher precision for doubles to avoid data loss + RawMessage msg; + if constexpr (std::is_floating_point_v<T1>) { + msg = strprintf("%s%s%s:%.17g|%s", m_prefix, key, m_suffix, value, type); + } else { + msg = strprintf("%s%s%s:%s|%s", m_prefix, key, m_suffix, std::to_string(value), type); + }src/stats/rawsender.cpp (1)
1-277
: Apply clang-format to fix formatting issuesThe pipeline check indicates formatting differences. Please run
clang-format -i
on this file.
🧹 Nitpick comments (16)
src/bench/checkblock.cpp (1)
41-43
: Stub StatsdClient init — LGTM; add missing header for make_uniqueAvoid relying on transitive includes.
Apply:
#include <bench/data.h> +#include <memory>
doc/release-notes-6837.md (1)
4-7
: Add IPv6 URL example to prevent ambiguityShowing bracketed literal avoids misconfiguration.
Append e.g.:
- `-statshost` now accepts URLs to allow specifying the protocol, host and port in one argument. + For IPv6 literals, use brackets (e.g., `udp://[::1]:8125`). @@ - `tcp://127.0.0.1:8125`). UDP (`udp://`) remains the default and will be assumed if a protocol is not specified. + `tcp://127.0.0.1:8125`). UDP (`udp://`) remains the default and will be assumed if a protocol is not specified (e.g., `127.0.0.1:8125`).Also applies to: 13-15
src/init.cpp (2)
781-781
: Deprecate -statsport while keeping backward compatibility — looks goodRegistering -statsport as hidden preserves compatibility without surfacing it in help. Consider logging a one-time deprecation warning when -statsport is present to aid users migrating to URL-based configuration.
1643-1654
: Early, fail-fast Statsd client init is appropriate; minor nit on return styleThe early initialization avoids null dereferences in code paths that may emit stats during startup. Error propagation via util::ErrorString is clear. Nit: you can directly return the InitError value to be consistent with nearby code.
Apply this diff for the minor style tweak:
- if (!stats_client) { - InitError(_("Cannot init Statsd client") + Untranslated(" (") + util::ErrorString(stats_client) + Untranslated(")")); - return false; - } + if (!stats_client) { + return InitError(_("Cannot init Statsd client") + Untranslated(" (") + util::ErrorString(stats_client) + Untranslated(")")); + }src/stats/rawsender.h (3)
107-108
: Vector as a queue can degrade performance; prefer deque or batch-swapSwitching m_queue to std::vector is fine if you always flush in bulk via swap-and-clear. If you ever erase from the front, complexity becomes O(n). Please confirm the .cpp only does bulk flushes or switch to std::deque.
If bulk flush is intended, consider documenting it in the member comment.
- /* Queue of (batches of) messages to be sent */ - std::vector<RawMessage> m_queue GUARDED_BY(cs); + /* Queue of (batches of) messages to be sent. NOTE: treated as append-and-bulk-flush only (no front erases). */ + std::vector<RawMessage> m_queue GUARDED_BY(cs);Also applies to: 88-90
58-60
: Constructor error-out parameter is awkwardUsing std::optional<bilingual_str>& as an out-param is unusual and easy to misuse (caller must pass a lvalue). Consider returning util::Result or std::optional<bilingual_str> directly from a static factory to keep the ctor noexcept.
If changing the signature is out of scope, at least document ownership and expected state of ‘error’ on success/failure.
111-112
: Thread member naming nitm_reconn is a std::thread; consider renaming to m_reconn_thread for clarity, matching the comment, or update the comment to match the name.
test/functional/feature_stats.py (4)
49-53
: Use explicit guards instead of assert in test infra startupassert may be stripped with -O. Prefer an explicit guard and AssertionError for reliability in all runs.
- assert not self.running + if self.running: + raise AssertionError("StatsServer already running")
61-71
: Tighten wait loop and reduce flakinessConsider shorter get timeouts with a small sleep to reduce blocking and overall wait, and optionally log sampled messages to aid debugging on failure. This helps CI stability.
115-131
: Exact debug-log matching can be brittle across platforms/build variantsIf possible, match stable substrings or use regex to avoid failures due to minor wording changes. Example:
-with self.nodes[0].assert_debug_log(expected_msgs=['Transmitting stats are disabled, will not init Statsd client']): +with self.nodes[0].assert_debug_log(expected_msgs=['stats are disabled', 'will not init Statsd client']):Or add a regex-based helper in the framework.
25-33
: AF selection: consider binding to the resolved sockaddr directlyaddrinfo may return multiple entries; picking the first is usually fine, but for hosts that resolve to multiple families/records, iterating until bind succeeds is more robust.
Optional improvement:
- Loop over addr_info entries, try socket/bind per entry.
- Close the socket on bind failure before next attempt.
src/stats/client.h (2)
44-56
: Consider making sample_rate parameter default = 1.0f instead of 1.fWhile
1.f
and1.0f
are functionally equivalent, using1.0f
would be more consistent with common C++ style guides and improve readability.- virtual bool dec(std::string_view key, float sample_rate = 1.f) { return false; } - virtual bool inc(std::string_view key, float sample_rate = 1.f) { return false; } - virtual bool count(std::string_view key, int64_t delta, float sample_rate = 1.f) { return false; } - virtual bool gauge(std::string_view key, int64_t value, float sample_rate = 1.f) { return false; } - virtual bool gaugeDouble(std::string_view key, double value, float sample_rate = 1.f) { return false; } - virtual bool timing(std::string_view key, uint64_t ms, float sample_rate = 1.f) { return false; } + virtual bool dec(std::string_view key, float sample_rate = 1.0f) { return false; } + virtual bool inc(std::string_view key, float sample_rate = 1.0f) { return false; } + virtual bool count(std::string_view key, int64_t delta, float sample_rate = 1.0f) { return false; } + virtual bool gauge(std::string_view key, int64_t value, float sample_rate = 1.0f) { return false; } + virtual bool gaugeDouble(std::string_view key, double value, float sample_rate = 1.0f) { return false; } + virtual bool timing(std::string_view key, uint64_t ms, float sample_rate = 1.0f) { return false; }
52-56
: Same float literal consistency suggestion for send methods- virtual bool send(std::string_view key, double value, std::string_view type, float sample_rate = 1.f) { return false; } - virtual bool send(std::string_view key, int32_t value, std::string_view type, float sample_rate = 1.f) { return false; } - virtual bool send(std::string_view key, int64_t value, std::string_view type, float sample_rate = 1.f) { return false; } - virtual bool send(std::string_view key, uint32_t value, std::string_view type, float sample_rate = 1.f) { return false; } - virtual bool send(std::string_view key, uint64_t value, std::string_view type, float sample_rate = 1.f) { return false; } + virtual bool send(std::string_view key, double value, std::string_view type, float sample_rate = 1.0f) { return false; } + virtual bool send(std::string_view key, int32_t value, std::string_view type, float sample_rate = 1.0f) { return false; } + virtual bool send(std::string_view key, int64_t value, std::string_view type, float sample_rate = 1.0f) { return false; } + virtual bool send(std::string_view key, uint32_t value, std::string_view type, float sample_rate = 1.0f) { return false; } + virtual bool send(std::string_view key, uint64_t value, std::string_view type, float sample_rate = 1.0f) { return false; }src/stats/client.cpp (1)
135-142
: Complex lambda could be simplifiedThe inline lambda for counting forward slashes after the colon could be extracted to a named function or simplified for better readability.
- // Remove all forward slashes found after the port delimiter (colon) - host = std::string( - host.begin(), host.end() - [&colon_idx, &host]() { - const size_t slash_idx{host.find('/', /*pos=*/colon_idx + 1)}; - return slash_idx != std::string::npos ? host.length() - slash_idx : 0; - }()); + // Remove all forward slashes found after the port delimiter (colon) + const size_t slash_idx{host.find('/', /*pos=*/colon_idx + 1)}; + if (slash_idx != std::string::npos) { + host = host.substr(0, slash_idx); + }src/stats/rawsender.cpp (2)
182-192
: Consider extracting error codes to named constantsThe long list of error codes for detecting broken connections could be more maintainable as a helper function or a set of constants.
+ static bool IsConnectionError(int err_code) { + return err_code == WSAECONNABORTED || + err_code == WSAECONNREFUSED || + err_code == WSAECONNRESET || + err_code == WSAEHOSTUNREACH || + err_code == WSAENETDOWN || + err_code == WSAENETRESET || + err_code == WSAENETUNREACH || + err_code == WSAENOTCONN || + err_code == WSAEPIPE || + err_code == WSAETIMEDOUT; + } + if (m_sock->Send(reinterpret_cast<const char*>(msg.data()), msg.size(), send_flags) == SOCKET_ERROR) { const auto err_code = WSAGetLastError(); - if (err_code == WSAECONNABORTED || - err_code == WSAECONNREFUSED || - err_code == WSAECONNRESET || - err_code == WSAEHOSTUNREACH || - err_code == WSAENETDOWN || - err_code == WSAENETRESET || - err_code == WSAENETUNREACH || - err_code == WSAENOTCONN || - err_code == WSAEPIPE || - err_code == WSAETIMEDOUT) - { + if (IsConnectionError(err_code)) {
250-260
: Consider checking SendDirectly return value
QueueFlush
callsSendDirectly
but ignores the return value. While this might be intentional to avoid blocking on errors, consider at least logging failures for debugging purposes.void RawSender::QueueFlush(std::vector<RawMessage>& queue) { AssertLockNotHeld(cs); for (auto& msg : queue) { // Add delimiter to prevent unexpected concat if sends are consolidated if (m_use_tcp && !msg.empty() && msg.back() != m_batching_opts.second) { msg += m_batching_opts.second; } - SendDirectly(msg); + if (auto error = SendDirectly(msg)) { + LogPrint(BCLog::NET, "%s: Failed to send message: %s\n", __func__, error->original); + } } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (11)
-
doc/release-notes-6837.md
(1 hunks) -
src/bench/checkblock.cpp
(1 hunks) -
src/compat/compat.h
(1 hunks) -
src/init.cpp
(2 hunks) -
src/stats/client.cpp
(4 hunks) -
src/stats/client.h
(2 hunks) -
src/stats/rawsender.cpp
(2 hunks) -
src/stats/rawsender.h
(4 hunks) -
src/test/util/setup_common.cpp
(1 hunks) -
test/functional/feature_stats.py
(1 hunks) -
test/functional/test_runner.py
(1 hunks)
🧰 Additional context used
🪛 LanguageTool
doc/release-notes-6837.md
[grammar] ~17-~17: There might be a mistake here.
Context: ...tionand
-statsbatchsize` and flushed when connection is restored. This buffer is ...
(QB_NEW_EN)
[grammar] ~22-~22: There might be a mistake here.
Context: ...ew URL syntax supported by -statshost
. -statsport
will be removed in a futu...
(QB_NEW_EN)
[grammar] ~26-~26: There might be a mistake here.
Context: ... -statsport
value will be ignored.
(QB_NEW_EN)
🪛 markdownlint-cli2 (0.17.2)
doc/release-notes-6837.md
11-11: Heading style
Expected: setext; Actual: atx
(MD003, heading-style)
20-20: Heading style
Expected: setext; Actual: atx
(MD003, heading-style)
🪛 Ruff (0.12.2)
test/functional/feature_stats.py
44-44: Do not catch blind exception: Exception
(BLE001)
46-46: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
49-49: Use of assert
detected
(S101)
70-70: Avoid specifying long messages outside the exception class
(TRY003)
🪛 GitHub Actions: Clang Diff Format Check
src/stats/client.h
[error] 1-1: Clang format differences detected by clang-format-diff.py. The formatting check failed (exit code 1). Command: git diff -U0 origin/develop --
src/stats/rawsender.cpp
[error] 1-1: Clang format differences detected by clang-format-diff.py. The formatting check failed (exit code 1). Command: git diff -U0 origin/develop --
src/stats/client.cpp
[error] 1-1: Clang format differences detected by clang-format-diff.py. The formatting check failed (exit code 1). Command: git diff -U0 origin/develop --
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: x86_64-apple-darwin / Build depends
- GitHub Check: x86_64-w64-mingw32 / Build depends
- GitHub Check: x86_64-pc-linux-gnu_multiprocess / Build depends
- GitHub Check: arm-linux-gnueabihf / Build depends
- GitHub Check: x86_64-pc-linux-gnu_nowallet / Build depends
- GitHub Check: x86_64-pc-linux-gnu / Build depends
🔇 Additional comments (13)
test/functional/test_runner.py (1)
310-310
: Approve inclusion of feature_stats.py in BASE_SCRIPTS — test file present at test/functional/feature_stats.py defines StatsTest(BitcoinTestFramework).src/test/util/setup_common.cpp (1)
201-207
: Factory-based StatsdClient initialization with explicit failure — LGTMClear error propagation and scoped temporary keep things tidy. No changes requested.
src/stats/rawsender.h (2)
66-72
: Thread-safety annotations look consistent; LGTMThe EXCLUSIVE_LOCKS_REQUIRED annotations for Send/SendDirectly/QueueFlush/Connect/Reconect are consistent with cs usage and aid static analysis.
Also applies to: 76-84, 85-90
113-121
: Compile break: uses chrono literal ‘1s’ in header without literals namespaceHeaders shouldn’t rely on std::chrono_literals; this will fail to compile for users not importing the literal. Initialize with std::chrono::seconds{1} instead.
- std::chrono::seconds m_timeout{1s}; + std::chrono::seconds m_timeout{std::chrono::seconds{1}};⛔ Skipped due to learnings
Learnt from: kwvg PR: dashpay/dash#6815 File: src/chainlock/chainlock.h:68-69 Timestamp: 2025-08-13T16:10:32.972Z Learning: The Dash codebase extensively uses std::chrono_literals throughout, including in header files. Chrono literals like 0s, 1min, etc. are acceptable and preferred over explicit constructors like std::chrono::seconds::zero().
Learnt from: kwvg PR: dashpay/dash#6815 File: src/chainlock/chainlock.h:68-69 Timestamp: 2025-08-13T16:10:32.972Z Learning: The Dash codebase extensively uses std::chrono_literals throughout, including in header files. The pattern using namespace std::chrono_literals; is used globally in src/util/time.h, and chrono literals like 0s are commonly used for atomic member initialization in headers (e.g., src/net.h). Chrono literals like 0s, 1min, etc. are preferred over explicit constructors.
src/stats/client.h (1)
40-41
: Good API design with factory patternThe migration to a factory pattern with
util::Result
for error handling is a solid improvement. This provides better error propagation and a more modern C++ interface.src/stats/client.cpp (3)
104-108
: Good input validation for port rangeThe validation correctly ensures the port is within the valid range (1-65535) when explicitly set via
-statsport
.
120-159
: Excellent URL parsing implementationThe URL parsing logic is well-structured:
- Properly validates URL format and scheme
- Correctly handles both tcp:// and udp:// schemes
- Appropriately warns when URL port overrides -statsport
- Good error messages for malformed URLs
200-207
: Thread-safe random sampling implementationGood use of mutex protection for the random number generator in the sampling logic.
src/stats/rawsender.cpp (5)
70-100
: Well-structured connection logicThe
Connect()
method properly handles both UDP and TCP socket creation with appropriate error handling.
102-120
: Good TCP reconnection strategy with exponential backoffThe reconnection logic correctly implements exponential backoff with a maximum timeout, which is a best practice for network reconnection.
54-68
: Proper thread lifecycle managementThe destructor correctly handles thread interruption and joining for both the main queue thread and reconnection thread, followed by queue flushing to prevent data loss.
225-248
: Efficient batching implementationThe queue batching logic is well-implemented with proper size checks and delimiter handling.
122-154
: Verify TCP reconnection behavior in productionThe reconnection thread implementation looks correct, but TCP reconnection behavior under various network conditions should be thoroughly tested.
Could you verify that the TCP reconnection properly handles these scenarios:
- Server restart while client is running
- Network partition/recovery
- Long-duration disconnections
- Rapid connect/disconnect cycles
- Specifying invalid values will no longer result in silent disablement of the StatsD client and will now cause errors | ||
at startup. | ||
|
||
### New Features |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix headings and minor grammar/spacing in release notes
Conform to repo’s setext heading style and tighten wording.
Apply:
-### New Features
+New Features
+------------
@@
- - Unlike with UDP, if a TCP connection is interrupted, stats will _not_ be dropped and instead buffered according to
- `-statsduration` and `-statsbatchsize` and flushed when connection is restored. This buffer is discarded every 5
+ - Unlike with UDP, if a TCP connection is interrupted, stats will not be dropped and instead buffered according to
+ `-statsduration` and `-statsbatchsize` and flushed when the connection is restored. This buffer is discarded every 5
minutes if connection is not restored by then.
@@
-### Deprecations
+Deprecations
+------------
@@
- - If both `-statsport` and `-statshost` with a URL specifying a port is supplied, the `-statsport` value will be
- ignored.
+ - If both `-statsport` and `-statshost` with a URL specifying a port are supplied, the `-statsport` value will be ignored.
Also applies to: 20-20, 16-18, 22-26
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)
11-11: Heading style
Expected: setext; Actual: atx
(MD003, heading-style)
🤖 Prompt for AI Agents
In doc/release-notes-6837.md around line 11, the "### New Features" ATX-style
heading must be converted to the repo's setext heading style and minor
grammar/spacing tightened; replace the ATX heading with a setext-level underline
(e.g., "New Features" followed by a line of hyphens), normalize surrounding
blank lines, fix any punctuation/spacing inconsistencies, and tighten wording as
needed; apply the same conversion and edits to the other affected regions (lines
16-18, 20-20, and 22-26).
except Exception as e: | ||
if self.running: | ||
raise AssertionError("Unexpected exception raised: " + type(e).__name__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid catching broad Exception; preserve traceback context
Catch concrete exceptions (e.g., OSError, UnicodeDecodeError) and chain the assertion for debuggability.
- except Exception as e:
- if self.running:
- raise AssertionError("Unexpected exception raised: " + type(e).__name__)
+ except (OSError, UnicodeDecodeError) as e:
+ if self.running:
+ raise AssertionError(f"Unexpected exception raised: {type(e).__name__}") from e
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
except Exception as e: | |
if self.running: | |
raise AssertionError("Unexpected exception raised: " + type(e).__name__) | |
except (OSError, UnicodeDecodeError) as e: | |
if self.running: | |
raise AssertionError(f"Unexpected exception raised: {type(e).__name__}") from e |
🧰 Tools
🪛 Ruff (0.12.2)
44-44: Do not catch blind exception: Exception
(BLE001)
46-46: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
🤖 Prompt for AI Agents
In test/functional/feature_stats.py around lines 44 to 46, the code catches a
broad Exception and raises an AssertionError without preserving traceback;
change the except to catch the specific exceptions you expect (for example
OSError, UnicodeDecodeError) instead of Exception, and when raising the
AssertionError chain the original exception (raise AssertionError("Unexpected
exception raised: " + type(e).__name__) from e) so the original traceback is
preserved for debugging.
1630cc3
to
9c7ec07
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/stats/rawsender.h (1)
95-101
: Data race on m_sock/m_server: guard socket state with a dedicated mutex.m_sock and m_server are read/written from multiple threads (queue, reconnect, direct send) without synchronization, leading to UB. Add a mutex (e.g., cs_sock) and guard all accesses.
@@ - /* Socket used to communicate with host */ - std::unique_ptr<Sock> m_sock{nullptr}; - /* Socket address containing host information */ - std::pair<struct sockaddr_storage, socklen_t> m_server{{}, sizeof(struct sockaddr_storage)}; + /* Protect socket state (must be acquired before cs when both are needed) */ + mutable Mutex cs_sock; + /* Socket used to communicate with host */ + std::unique_ptr<Sock> m_sock GUARDED_BY(cs_sock){nullptr}; + /* Socket address containing host information */ + std::pair<struct sockaddr_storage, socklen_t> m_server GUARDED_BY(cs_sock){{}, sizeof(struct sockaddr_storage)};Follow-up: wrap all reads/writes to m_sock/m_server in src/stats/rawsender.cpp with LOCK(cs_sock)/WITH_LOCK(cs_sock, ...).
Also applies to: 125-135
src/stats/client.cpp (1)
209-218
: Bug: integer metrics are formatted with “%f”.Using “%f” for all numeric types emits counters/timers as floats (e.g., 1.000000), which is non-standard and may break parsers. Format integers as integers; floats as floats.
- RawMessage msg{strprintf("%s%s%s:%f|%s", m_prefix, key, m_suffix, value, type)}; + auto fmt_value = [&](){ + if constexpr (std::is_floating_point_v<T1>) { + return strprintf("%f", static_cast<double>(value)); + } else { + return ToString(static_cast<int64_t>(value)); + } + }(); + RawMessage msg{strprintf("%s%s%s:%s|%s", m_prefix, key, m_suffix, fmt_value, type)};Alternatively, add a small helper near the class to avoid the lambda.
♻️ Duplicate comments (2)
doc/release-notes-6837.md (1)
16-18
: Minor grammar and clarity tweaks.Tighten phrasing and fix agreement.
- - Unlike with UDP, if a TCP connection is interrupted, stats will _not_ be dropped and instead buffered according to - `-statsduration` and `-statsbatchsize` and flushed when connection is restored. This buffer is discarded every 5 + - Unlike with UDP, if a TCP connection is interrupted, stats are not dropped; they are buffered according to + `-statsduration` and `-statsbatchsize` and flushed when the connection is restored. This buffer is discarded every 5 minutes if connection is not restored by then. @@ - - If both `-statsport` and `-statshost` with a URL specifying a port is supplied, the `-statsport` value will be - ignored. + - If both `-statsport` and `-statshost` with a URL specifying a port are supplied, the `-statsport` value is ignored.Also applies to: 25-26
test/functional/feature_stats.py (1)
44-47
: Avoid catching broad Exception; preserve traceback.Catch concrete exceptions and chain for debuggability.
- except Exception as e: - if self.running: - raise AssertionError("Unexpected exception raised: " + type(e).__name__) + except (OSError, UnicodeDecodeError) as e: + if self.running: + raise AssertionError(f"Unexpected exception raised: {type(e).__name__}") from e
🧹 Nitpick comments (4)
doc/release-notes-6837.md (1)
11-12
: Use setext headings (repo style) for sections.Convert “### New Features” and “### Deprecations” to setext style to satisfy markdownlint MD003.
-### New Features +New Features +------------ @@ -### Deprecations +Deprecations +------------Also applies to: 20-20
src/stats/rawsender.h (1)
125-135
: Minor: IPv6 log formatting.To avoid ambiguous “host:port” in logs for IPv6, bracket IPv6 hosts in ToStringHostPort().
- std::string ToStringHostPort() const; + std::string ToStringHostPort() const; // Consider [host]:port for IPv6I’ve proposed the concrete change in rawsender.cpp.
test/functional/feature_stats.py (1)
115-132
: Add TCP path coverage and reconnection buffering test.Current tests only cover UDP. Add a simple TCP StatsServer (or a minimal sink), simulate a brief disconnect, assert buffered resend after reconnection, and add IPv6 happy-path when test_ipv6_local() is re-enabled.
I can draft a TCP test harness and a reconnection test case mirroring test_conn().
src/stats/rawsender.cpp (1)
230-231
: Bracket IPv6 address in ToStringHostPort().Avoid ambiguous logs for IPv6.
-std::string RawSender::ToStringHostPort() const { return strprintf("%s:%d", m_host, m_port); } +std::string RawSender::ToStringHostPort() const { + // Bracket IPv6 for readability in logs + return (m_host.find(':') != std::string::npos && m_host.find(']') == std::string::npos) + ? strprintf("[%s]:%d", m_host, m_port) + : strprintf("%s:%d", m_host, m_port); +}
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
doc/release-notes-6837.md
(1 hunks)src/bench/checkblock.cpp
(2 hunks)src/compat/compat.h
(1 hunks)src/init.cpp
(2 hunks)src/stats/client.cpp
(4 hunks)src/stats/client.h
(2 hunks)src/stats/rawsender.cpp
(2 hunks)src/stats/rawsender.h
(4 hunks)src/test/util/setup_common.cpp
(1 hunks)test/functional/feature_stats.py
(1 hunks)test/functional/test_runner.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- test/functional/test_runner.py
- src/compat/compat.h
- src/bench/checkblock.cpp
- src/init.cpp
- src/test/util/setup_common.cpp
🧰 Additional context used
📓 Path-based instructions (3)
src/**/*.{cpp,h,cc,cxx,hpp}
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.{cpp,h,cc,cxx,hpp}
: Dash Core C++ codebase must be written in C++20 and require at least Clang 16 or GCC 11.1
Dash uses unordered_lru_cache for efficient caching with LRU eviction
Files:
src/stats/client.cpp
src/stats/rawsender.cpp
src/stats/rawsender.h
src/stats/client.h
doc/**
📄 CodeRabbit inference engine (CLAUDE.md)
Unless specifically prompted, avoid making changes to the doc directory (documentation)
Files:
doc/release-notes-6837.md
test/functional/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Functional tests should be written in Python and placed in test/functional/
Files:
test/functional/feature_stats.py
🧬 Code graph analysis (4)
src/stats/rawsender.cpp (1)
src/stats/rawsender.h (1)
RawSender
(56-135)
src/stats/rawsender.h (1)
src/stats/rawsender.cpp (18)
RawSender
(21-52)RawSender
(54-68)Send
(161-172)Send
(161-161)SendDirectly
(174-228)SendDirectly
(174-174)ToStringHostPort
(230-230)ToStringHostPort
(230-230)Connect
(70-100)Connect
(70-70)Reconnect
(102-120)Reconnect
(102-102)ReconnectThread
(122-159)ReconnectThread
(122-122)QueueAdd
(232-255)QueueAdd
(232-232)QueueFlush
(257-269)QueueFlush
(257-257)
src/stats/client.h (1)
src/stats/client.cpp (18)
make
(92-172)make
(92-92)key
(53-53)key
(53-53)key
(54-54)key
(54-54)key
(55-55)key
(55-55)key
(56-56)key
(56-56)key
(57-57)key
(57-57)key
(58-58)key
(58-58)key
(61-61)key
(61-61)key
(62-62)key
(62-62)
test/functional/feature_stats.py (2)
test/functional/test_framework/netutil.py (1)
test_ipv6_local
(148-162)test/functional/test_framework/test_node.py (2)
assert_start_raises_init_error
(606-649)assert_debug_log
(444-472)
🪛 GitHub Actions: Clang Diff Format Check
src/stats/client.cpp
[error] 52-68: Clang-format differences found in src/stats/client.cpp. Run 'clang-format-diff.py -p1' to apply formatting changes.
src/stats/rawsender.cpp
[error] 185-203: Clang-format differences found in src/stats/rawsender.cpp. Run 'clang-format-diff.py -p1' to apply formatting changes.
src/stats/client.h
[error] 49-59: Clang-format differences found in src/stats/client.h. Run 'clang-format-diff.py -p1' to apply formatting changes.
🪛 markdownlint-cli2 (0.17.2)
doc/release-notes-6837.md
11-11: Heading style
Expected: setext; Actual: atx
(MD003, heading-style)
20-20: Heading style
Expected: setext; Actual: atx
(MD003, heading-style)
🪛 Ruff (0.12.2)
test/functional/feature_stats.py
44-44: Do not catch blind exception: Exception
(BLE001)
46-46: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
70-70: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: win64-build / Build source
- GitHub Check: linux64_multiprocess-build / Build source
- GitHub Check: linux64_tsan-build / Build source
- GitHub Check: arm-linux-build / Build source
- GitHub Check: mac-build / Build source
- GitHub Check: linux64_nowallet-build / Build source
- GitHub Check: linux64_ubsan-build / Build source
- GitHub Check: linux64_sqlite-build / Build source
- GitHub Check: linux64_fuzz-build / Build source
- GitHub Check: linux64-build / Build source
🔇 Additional comments (5)
src/stats/client.h (2)
40-60
: Public API surface looks sane.Factory-returned inactive client on empty -statshost avoids nullptr checks; string_view overloads are good for perf.
49-59
: Install clang-format and reformat src/stats/client.h
CI reported formatting issues but clang-format isn’t available; install clang-format locally and runclang-format --style=file -i src/stats/client.hthen commit the updated file.
src/stats/rawsender.cpp (1)
185-203
: Verify clang-format formatting: clang-format isn’t available in this environment; please rungit clang-format --extensions=cpp HEAD~1 -- src/stats/rawsender.cpp
locally to confirm the hunk meets project style.
src/stats/client.cpp (2)
92-171
: URL/host parsing looks good; port precedence log is helpful.Factory path and error messages align with release notes; bilingual errors propagate correctly.
52-68
: Manual clang-format verification required
Unable to run clang-format in this environment—please manually runclang-format --style=file
onsrc/stats/client.cpp
and address any formatting diffs flagged by CI.
Also add some basic docs for RawSender and privatize functions
Also we should initialize the Statsd client *after* setting g_socket_events_mode or risk a crash when we try to establish a TCP connection later.
Doesn't make sense to track them anymore as logic triggering SendDirectly is more involved and we can't reliably add both stats to know how many messages were actually sent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/stats/client.cpp (2)
201-209
: Sampling uses std::uniform_real_distribution with FastRandomContext (likely not a URBG) and will fail to compile; replace with FastRandomContext::randrange().FastRandomContext doesn’t model UniformRandomBitGenerator; passing it to standard distributions is invalid. Use randrange() and compare against a scaled threshold. Also, keep the WITH_LOCK pattern but return a bool value, not a return-from-function.
- if (never_send || (!always_send && - WITH_LOCK(cs, return sample_rate < std::uniform_real_distribution<float>(0.f, 1.f)(insecure_rand)))) { + if (never_send || (!always_send && + WITH_LOCK(cs, return insecure_rand.randrange(1'000'000ULL) >= static_cast<uint64_t>(sample_rate * 1'000'000ULL)))) { return true; }
210-215
: Format string type-mismatch: using “%f” for all T1 breaks integer sends.strprintf with “%f” requires double; passing integers is UB. Format value type-agnostically, e.g., via ToString(value).
- RawMessage msg{strprintf("%s%s%s:%f|%s", m_prefix, key, m_suffix, value, type)}; + RawMessage msg{strprintf("%s%s%s:%s|%s", m_prefix, key, m_suffix, ToString(value), type)};
♻️ Duplicate comments (2)
doc/release-notes-6837.md (1)
11-18
: Unify heading style (setext) and tighten phrasingConvert ATX headings to the repo’s setext style and fix minor grammar/spacing. Matches prior feedback.
-### New Features +New Features +------------ - - Unlike with UDP, if a TCP connection is interrupted, stats will _not_ be dropped and instead buffered according to - `-statsduration` and `-statsbatchsize` and flushed when connection is restored. This buffer is discarded every 5 + - Unlike with UDP, if a TCP connection is interrupted, stats will not be dropped and instead buffered according to + `-statsduration` and `-statsbatchsize` and flushed when the connection is restored. This buffer is discarded every 5 minutes if connection is not restored by then. @@ -### Deprecations +Deprecations +------------ @@ - - If both `-statsport` and `-statshost` with a URL specifying a port is supplied, the `-statsport` value will be - ignored. + - If both `-statsport` and `-statshost` with a URL specifying a port are supplied, the `-statsport` value will be ignored.Also applies to: 20-26
test/functional/feature_stats.py (1)
44-46
: Don’t catch broad Exception; preserve tracebackLimit to expected exceptions and chain for debuggability. Mirrors earlier guidance.
- except Exception as e: - if self.running: - raise AssertionError("Unexpected exception raised: " + type(e).__name__) + except (OSError, UnicodeDecodeError) as e: + if self.running: + raise AssertionError(f"Unexpected exception raised: {type(e).__name__}") from e
🧹 Nitpick comments (8)
test/functional/feature_stats.py (2)
61-70
: Make queue polling respect the remaining deadlineAvoid blocking longer than the overall timeout when no messages arrive.
- def assert_msg_received(self, expected_msg: str, timeout: int = 30): + def assert_msg_received(self, expected_msg: str, timeout: int = 30): deadline = time.time() + timeout while time.time() < deadline: try: - msg = self.queue.get(timeout=5) + remaining = max(0.0, deadline - time.time()) + # Cap per-get wait to remaining time (and avoid 0 which makes get non-blocking). + msg = self.queue.get(timeout=max(0.1, min(5.0, remaining))) if expected_msg in msg: return except queue.Empty: continue raise AssertionError(f"Did not receive message containing '{expected_msg}' within {timeout} seconds")
15-15
: Simplify imports and avoid runtime-parametrized Queue typingUse a single import and a non-parameterized annotation to avoid potential runtime typing issues across Python versions.
-from queue import Queue +# Use the module form to reference Queue and Empty consistently- self.queue: Queue[str] = Queue() + self.queue: queue.Queue = queue.Queue()Also applies to: 23-23
src/stats/rawsender.cpp (3)
16-20
: Don’t rely on chrono literals in .cpp either
5min
requiresusing namespace std::chrono_literals;
. Make it self-contained to avoid hidden dependencies.-constexpr std::chrono::seconds RECONNECT_TIMEOUT_MAX{5min}; +constexpr std::chrono::seconds RECONNECT_TIMEOUT_MAX{std::chrono::minutes{5}};
125-163
: Reconnection “stale buffer” timestamp semantics can drop fresh messages
m_reconn_stats.m_timestamp
is set only on discard, not when items first enter the reconnection buffer. On first loop (timestamp==0), you treat the buffer as stale even if items were just added. Consider setting the timestamp when the reconnection queue first becomes non-empty (e.g., inSendDirectly()
when pushing the first item) and/or when you enqueue during send error handling.Minimal change sketch (update when first enqueue happens):
// inside the two places that enqueue to m_reconn_queue in SendDirectly() WITH_LOCK(cs, QueueAdd(m_reconn_queue, msg)); +WITH_LOCK(cs_net, if (m_reconn_stats.m_timestamp == 0) m_reconn_stats.m_timestamp = GetTime());
240-263
: Batch size type mixing
queue.back().size() + msg.size() >= batch_size
mixessize_t
withuint64_t
. Castbatch_size
tosize_t
to avoid signed/unsigned or width warnings on some platforms, and reserve with the same type.- const auto& [batch_size, batch_delim] = m_batching_opts; + const auto& [batch_size_u64, batch_delim] = m_batching_opts; + const auto batch_size = static_cast<size_t>(batch_size_u64);And adjust the two uses of
batch_size
accordingly.src/stats/client.cpp (3)
137-139
: Gate this notice behind -debug=net per PR intent.Switch LogPrintf to LogPrint(BCLog::NET, ...) to keep normal startup quiet.
- LogPrintf("%s: Supplied URL with port, ignoring -statsport\n", __func__); + LogPrint(BCLog::NET, "%s: Supplied URL with port, ignoring -statsport\n", __func__);
174-189
: Initialization log should also be gated behind -debug=net.Keep verbose network destination details under BCLog::NET per PR description.
- LogPrintf("%s: Initialized to transmit stats to %s:%d\n", __func__, host, port); + LogPrint(BCLog::NET, "%s: Initialized to transmit stats to %s:%d\n", __func__, host, port);
92-99
: StatsdClient is non-abstract and default-constructible
All virtual methods in StatsdClient have in-class definitions and there are no other constructors declared, so the compiler provides an implicit public default ctor. Thus,std::make_unique<StatsdClient>()
compiles and yields a valid no-op client. Optionally, you may prefer returningnullptr
to more explicitly signal “inactive.”
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
doc/release-notes-6837.md
(1 hunks)src/bench/checkblock.cpp
(2 hunks)src/compat/compat.h
(1 hunks)src/init.cpp
(3 hunks)src/stats/client.cpp
(4 hunks)src/stats/client.h
(2 hunks)src/stats/rawsender.cpp
(1 hunks)src/stats/rawsender.h
(3 hunks)src/test/util/setup_common.cpp
(1 hunks)test/functional/feature_stats.py
(1 hunks)test/functional/test_runner.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- src/test/util/setup_common.cpp
- src/bench/checkblock.cpp
- src/compat/compat.h
- test/functional/test_runner.py
🧰 Additional context used
📓 Path-based instructions (3)
src/**/*.{cpp,h,cc,cxx,hpp}
📄 CodeRabbit inference engine (CLAUDE.md)
src/**/*.{cpp,h,cc,cxx,hpp}
: Dash Core C++ codebase must be written in C++20 and require at least Clang 16 or GCC 11.1
Dash uses unordered_lru_cache for efficient caching with LRU eviction
Files:
src/stats/rawsender.cpp
src/init.cpp
src/stats/rawsender.h
src/stats/client.h
src/stats/client.cpp
test/functional/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Functional tests should be written in Python and placed in test/functional/
Files:
test/functional/feature_stats.py
doc/**
📄 CodeRabbit inference engine (CLAUDE.md)
Unless specifically prompted, avoid making changes to the doc directory (documentation)
Files:
doc/release-notes-6837.md
🧠 Learnings (8)
📚 Learning: 2025-09-09T21:36:31.495Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.495Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access is protected by the cs_net mutex. This mutex coordinates between the TCP send path in SendDirectly() and the reconnection operations in Connect()/Reconnect() methods, ensuring proper synchronization of socket state.
Applied to files:
src/stats/rawsender.cpp
src/stats/rawsender.h
📚 Learning: 2025-09-09T21:36:11.805Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.805Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks. The implementation correctly uses cs_net with GUARDED_BY annotations and EXCLUSIVE_LOCKS_REQUIRED to synchronize socket access between SendDirectly() and ReconnectThread().
Applied to files:
src/stats/rawsender.cpp
src/stats/rawsender.h
📚 Learning: 2025-09-09T21:36:31.495Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:31.495Z
Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access requires synchronization using the cs_net mutex. The race condition exists between SendDirectly() (callable from any thread) and Connect() (called by ReconnectThread), where both read and write m_sock concurrently. The cs_net mutex properly coordinates these network operations.
Applied to files:
src/stats/rawsender.cpp
src/stats/rawsender.h
📚 Learning: 2025-09-09T21:36:58.937Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.937Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket operations are properly synchronized using the cs_net mutex. The m_sock and m_server members are GUARDED_BY(cs_net), and methods like Connect(), SendDirectly(), and ReconnectThread() use appropriate locking with cs_net to prevent race conditions on socket access.
Applied to files:
src/stats/rawsender.cpp
src/stats/rawsender.h
📚 Learning: 2025-09-09T13:24:23.293Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:70-97
Timestamp: 2025-09-09T13:24:23.293Z
Learning: In RawSender class, m_sock and m_server members are accessed concurrently between SendDirectly() (which can be called from any thread) and Connect() (called by ReconnectThread), requiring proper synchronization even though Connect() has limited call sites.
Applied to files:
src/stats/rawsender.cpp
src/stats/rawsender.h
📚 Learning: 2025-09-09T21:36:58.937Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:223-232
Timestamp: 2025-09-09T21:36:58.937Z
Learning: In RawSender class (src/stats/rawsender.cpp), socket access operations (m_sock and m_server) should be protected by the cs_net mutex, not avoiding synchronization entirely. While lock overhead concerns are valid in general, socket operations require proper synchronization via cs_net.
Applied to files:
src/stats/rawsender.cpp
src/stats/rawsender.h
📚 Learning: 2025-09-09T21:36:11.805Z
Learnt from: kwvg
PR: dashpay/dash#6837
File: src/stats/rawsender.cpp:0-0
Timestamp: 2025-09-09T21:36:11.805Z
Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks.
Applied to files:
src/stats/rawsender.cpp
src/stats/rawsender.h
📚 Learning: 2025-07-29T14:32:48.369Z
Learnt from: kwvg
PR: dashpay/dash#6761
File: src/chainlock/signing.cpp:247-250
Timestamp: 2025-07-29T14:32:48.369Z
Learning: In PR #6761, kwvg acknowledged a null pointer check issue in ChainLockSigner::Cleanup() method but deferred it to follow-up, consistent with the pattern of avoiding scope creep in refactoring PRs.
Applied to files:
src/stats/rawsender.cpp
🧬 Code graph analysis (6)
src/stats/rawsender.cpp (3)
src/stats/rawsender.h (1)
RawSender
(57-136)src/stats/client.cpp (3)
Schedule
(191-194)Schedule
(191-191)scheduler
(53-53)src/util/time.h (1)
count_seconds
(56-56)
src/init.cpp (1)
src/stats/client.cpp (2)
make
(92-172)make
(92-92)
src/stats/rawsender.h (3)
src/stats/rawsender.cpp (22)
RawSender
(22-50)RawSender
(52-62)Schedule
(64-68)Schedule
(64-64)Send
(165-177)Send
(165-165)SendDirectly
(179-236)SendDirectly
(179-179)ToStringHostPort
(238-238)ToStringHostPort
(238-238)Connect
(70-102)Connect
(70-70)Reconnect
(104-123)Reconnect
(104-104)ReconnectThread
(125-163)ReconnectThread
(125-125)QueueAdd
(240-263)QueueAdd
(240-240)QueueFlush
(265-279)QueueFlush
(265-265)QueueThreadMain
(281-290)QueueThreadMain
(281-281)src/stats/client.cpp (2)
Schedule
(191-194)Schedule
(191-191)src/util/sock.cpp (8)
Send
(68-71)Send
(68-68)Connect
(78-81)Connect
(78-78)Sock
(46-46)Sock
(48-48)Sock
(50-54)Sock
(56-56)
test/functional/feature_stats.py (2)
test/functional/test_framework/netutil.py (1)
test_ipv6_local
(148-162)test/functional/test_framework/test_node.py (2)
assert_start_raises_init_error
(606-649)assert_debug_log
(444-472)
src/stats/client.h (2)
src/stats/client.cpp (21)
make
(92-172)make
(92-92)Schedule
(191-194)Schedule
(191-191)scheduler
(53-53)key
(55-55)key
(55-55)key
(56-56)key
(56-56)key
(57-57)key
(57-57)key
(58-58)key
(58-58)key
(59-59)key
(59-59)key
(60-60)key
(60-60)key
(62-62)key
(62-62)key
(63-63)key
(63-63)src/stats/rawsender.cpp (2)
Schedule
(64-68)Schedule
(64-64)
src/stats/client.cpp (3)
src/random.h (1)
max
(256-256)src/stats/rawsender.cpp (2)
Schedule
(64-68)Schedule
(64-64)src/node/interfaces.cpp (9)
Assert
(92-92)Assert
(93-93)Assert
(117-117)Assert
(259-259)Assert
(282-282)Assert
(404-404)Assert
(605-608)Assert
(817-817)Assert
(1123-1126)
🪛 GitHub Actions: Clang Diff Format Check
src/stats/rawsender.cpp
[error] 1-1: Clang-format differences detected in src/stats/rawsender.cpp. Run clang-format (or clang-format-diff.py) to fix formatting.
src/stats/client.h
[error] 1-1: Clang-format differences detected in src/stats/client.h. Run clang-format (or clang-format-diff.py) to fix formatting.
src/stats/client.cpp
[error] 1-1: Clang-format differences detected in src/stats/client.cpp. Run clang-format (or clang-format-diff.py) to fix formatting.
🪛 Ruff (0.12.2)
test/functional/feature_stats.py
44-44: Do not catch blind exception: Exception
(BLE001)
46-46: Within an except
clause, raise exceptions with raise ... from err
or raise ... from None
to distinguish them from errors in exception handling
(B904)
70-70: Avoid specifying long messages outside the exception class
(TRY003)
🪛 markdownlint-cli2 (0.17.2)
doc/release-notes-6837.md
11-11: Heading style
Expected: setext; Actual: atx
(MD003, heading-style)
20-20: Heading style
Expected: setext; Actual: atx
(MD003, heading-style)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: linux64_multiprocess-test / Test source
🔇 Additional comments (14)
src/init.cpp (3)
786-786
: Deprecate -statsport via hidden arg — LGTMHiding the flag while keeping it functional aligns with the deprecation plan and tests that verify precedence.
1648-1657
: Early StatsD init: confirm no dependency on later -dns gatingInitializing g_stats_client here avoids null access, but it precedes the point where fNameLookup/-dns is processed (Line 1833). Please confirm StatsdClient::make doesn’t rely on that toggle or other state set later; otherwise, consider deferring hostname DNS decisions to the sender’s Schedule path.
Would you like a quick grep script to locate any name-resolution in StatsdClient::make/RawSender for confirmation?
2305-2305
: Always calling Schedule() — LGTM, assuming it’s a no-op when inactiveIf Schedule() is a no-op for inactive clients, this is fine and keeps reconnection/queueing consistent.
src/stats/rawsender.h (2)
68-76
: Thread-safety contract on Send/SendDirectly is clear and correctThe annotations and split between queuing and direct send look good w.r.t.
cs
andcs_net
.
106-114
: Avoid header-only chrono literal; use plain seconds to prevent ADL issuesUsing
1s
in a header relies onstd::chrono_literals
being visible to every TU. Initialize with an integer to avoid that dependency.- /* Time between reconnection attempts */ - std::chrono::seconds m_timeout{1s}; + /* Time between reconnection attempts */ + std::chrono::seconds m_timeout{1};⛔ Skipped due to learnings
Learnt from: kwvg PR: dashpay/dash#6815 File: src/chainlock/chainlock.h:68-69 Timestamp: 2025-08-13T16:10:32.972Z Learning: The Dash codebase extensively uses std::chrono_literals throughout, including in header files. Chrono literals like 0s, 1min, etc. are acceptable and preferred over explicit constructors like std::chrono::seconds::zero().
Learnt from: kwvg PR: dashpay/dash#6815 File: src/chainlock/chainlock.h:68-69 Timestamp: 2025-08-13T16:10:32.972Z Learning: The Dash codebase extensively uses std::chrono_literals throughout, including in header files. The pattern using namespace std::chrono_literals; is used globally in src/util/time.h, and chrono literals like 0s are commonly used for atomic member initialization in headers (e.g., src/net.h). Chrono literals like 0s, 1min, etc. are preferred over explicit constructors.
Learnt from: kwvg PR: dashpay/dash#6837 File: src/stats/rawsender.cpp:0-0 Timestamp: 2025-09-09T21:36:31.495Z Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access is protected by the cs_net mutex. This mutex coordinates between the TCP send path in SendDirectly() and the reconnection operations in Connect()/Reconnect() methods, ensuring proper synchronization of socket state.
Learnt from: kwvg PR: dashpay/dash#6837 File: src/stats/rawsender.cpp:0-0 Timestamp: 2025-09-09T21:36:11.805Z Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks. The implementation correctly uses cs_net with GUARDED_BY annotations and EXCLUSIVE_LOCKS_REQUIRED to synchronize socket access between SendDirectly() and ReconnectThread().
Learnt from: kwvg PR: dashpay/dash#6837 File: src/stats/rawsender.cpp:0-0 Timestamp: 2025-09-09T21:36:31.495Z Learning: In RawSender class (src/stats/rawsender.cpp), m_sock access requires synchronization using the cs_net mutex. The race condition exists between SendDirectly() (callable from any thread) and Connect() (called by ReconnectThread), where both read and write m_sock concurrently. The cs_net mutex properly coordinates these network operations.
Learnt from: kwvg PR: dashpay/dash#6837 File: src/stats/rawsender.cpp:223-232 Timestamp: 2025-09-09T21:36:58.937Z Learning: In RawSender class (src/stats/rawsender.cpp), socket operations are properly synchronized using the cs_net mutex. The m_sock and m_server members are GUARDED_BY(cs_net), and methods like Connect(), SendDirectly(), and ReconnectThread() use appropriate locking with cs_net to prevent race conditions on socket access.
Learnt from: kwvg PR: dashpay/dash#6837 File: src/stats/rawsender.cpp:0-0 Timestamp: 2025-09-09T21:36:11.805Z Learning: In RawSender class (src/stats/rawsender.cpp), cs_net is the appropriate mutex for protecting socket access (m_sock) and network operations, not additional custom locks.
src/stats/rawsender.cpp (1)
1-1
: Fix clang-format diff checkCI reports clang-format differences for this file. Please run clang-format (or clang-format-diff.py -p1) on the touched hunks.
src/stats/client.h (2)
41-64
: API surface looks good; factory + string_view overloadsFactory-based init,
Schedule
, andstring_view
overloads are a solid improvement. No issues spotted.
1-1
: Fix clang-format diff checkCI flags formatting for this header as well. Please run clang-format on the changed regions.
src/stats/client.cpp (6)
44-87
: Impl structure looks solid.Nice encapsulation via a private implementation and clean surface on StatsdClient. Mutex + FastRandomContext state is scoped correctly.
100-109
: Port range validation is good.Covers 1..65535 and returns a bilingual error. No action needed.
151-159
: Prefix/suffix sanitizer looks good.Trims leading/trailing dots as expected. No action.
191-195
: Scheduler pass-through is clean.Assert + delegate to RawSender::Schedule() matches RawSender’s contract.
217-219
: Good: send-failure logs are gated to NET.Matches the “gate logs behind -debug=net” objective.
1-224
: Ensure clang-format is installed and reformat src/stats/client.cpp
CI formatting checks are failing because clang-format isn’t available in this environment; install the CI-matching clang-format, run:clang-format -i src/stats/client.cppand verify there are no formatting diffs to unblock the pipeline.
#include <logging.h> | ||
#include <random.h> | ||
#include <stats/rawsender.h> | ||
#include <sync.h> | ||
#include <util/check.h> | ||
#include <util/strencodings.h> | ||
#include <util/system.h> | ||
#include <util/translation.h> | ||
|
||
#include <cmath> | ||
#include <limits> | ||
#include <random> | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Missing include for std::clamp (compile error on some toolchains).
std::clamp is used later but isn’t included. Add it explicitly to avoid build breaks with stricter compilers.
#include <logging.h>
#include <random.h>
#include <stats/rawsender.h>
#include <sync.h>
#include <util/check.h>
#include <util/strencodings.h>
#include <util/system.h>
#include <util/translation.h>
#include <cmath>
+#include <algorithm>
#include <limits>
#include <random>
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
#include <logging.h> | |
#include <random.h> | |
#include <stats/rawsender.h> | |
#include <sync.h> | |
#include <util/check.h> | |
#include <util/strencodings.h> | |
#include <util/system.h> | |
#include <util/translation.h> | |
#include <cmath> | |
#include <limits> | |
#include <random> | |
#include <logging.h> | |
#include <random.h> | |
#include <stats/rawsender.h> | |
#include <sync.h> | |
#include <util/check.h> | |
#include <util/strencodings.h> | |
#include <util/system.h> | |
#include <util/translation.h> | |
#include <cmath> | |
#include <algorithm> | |
#include <limits> | |
#include <random> |
🤖 Prompt for AI Agents
In src/stats/client.cpp around lines 9 to 21, the file uses std::clamp later but
does not include <algorithm>, which causes compile failures on stricter
toolchains; add an explicit include for <algorithm> among the other headers at
the top of the file to ensure std::clamp is available and rebuild.
// Could be a URL, try to parse it. | ||
const size_t scheme_idx{host.find(URL_SCHEME_DELIMITER)}; | ||
if (scheme_idx != std::string::npos) { | ||
// Parse the scheme and trim it out of the URL if we succeed | ||
if (scheme_idx == 0) { | ||
return util::Error{_("No text before the scheme delimiter, malformed URL")}; | ||
} | ||
std::string scheme{ToLower(host.substr(/*pos=*/0, scheme_idx))}; | ||
if (scheme != "tcp" && scheme != "udp") { | ||
return util::Error{_("Unsupported URL scheme, must begin with tcp:// or udp://")}; | ||
} | ||
use_tcp = scheme == "tcp"; | ||
host = host.substr(scheme_idx + URL_SCHEME_DELIMITER.length()); | ||
|
||
// Strip trailing slashes and parse the port | ||
const size_t colon_idx{host.rfind(':')}; | ||
if (colon_idx != std::string::npos) { | ||
// Remove all forward slashes found after the port delimiter (colon) | ||
host = std::string( | ||
host.begin(), host.end() - [&colon_idx, &host]() { | ||
const size_t slash_idx{host.find('/', /*pos=*/colon_idx + 1)}; | ||
return slash_idx != std::string::npos ? host.length() - slash_idx : 0; | ||
}()); | ||
uint16_t port_url{0}; | ||
SplitHostPort(host, port_url, host); | ||
if (port_url != 0) { | ||
if (args.IsArgSet("-statsport")) { | ||
LogPrintf("%s: Supplied URL with port, ignoring -statsport\n", __func__); | ||
} | ||
port = port_url; | ||
} | ||
} else { | ||
// There was no port specified, remove everything after the first forward slash | ||
host = host.substr(/*pos=*/0, host.find("/")); | ||
} | ||
|
||
if (host.empty()) { | ||
return util::Error{_("No host specified, malformed URL")}; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
URL/host parsing: avoid aliasing in SplitHostPort and simplify path trimming; improve IPv6 edge handling.
- Passing the same variable as both input and output to SplitHostPort risks aliasing surprises depending on its implementation.
- The erase logic around slicing after the colon is hard to read; simpler substring ops are clearer.
- Consider handling bracketless IPv6 explicitly (require brackets when port is present), returning a helpful error.
- const size_t colon_idx{host.rfind(':')};
- if (colon_idx != std::string::npos) {
- // Remove all forward slashes found after the port delimiter (colon)
- host = std::string(
- host.begin(), host.end() - [&colon_idx, &host]() {
- const size_t slash_idx{host.find('/', /*pos=*/colon_idx + 1)};
- return slash_idx != std::string::npos ? host.length() - slash_idx : 0;
- }());
- uint16_t port_url{0};
- SplitHostPort(host, port_url, host);
+ const size_t colon_idx{host.rfind(':')};
+ if (colon_idx != std::string::npos) {
+ // Trim any path/query after the port
+ const size_t slash_idx{host.find('/', colon_idx + 1)};
+ if (slash_idx != std::string::npos) host.erase(slash_idx);
+ uint16_t port_url{0};
+ std::string host_in = host; // avoid aliasing input/output
+ SplitHostPort(host_in, port_url, host);
if (port_url != 0) {
- if (args.IsArgSet("-statsport")) {
- LogPrintf("%s: Supplied URL with port, ignoring -statsport\n", __func__);
- }
+ if (args.IsArgSet("-statsport")) {
+ LogPrint(BCLog::NET, "%s: Supplied URL with port, ignoring -statsport\n", __func__);
+ }
port = port_url;
}
} else {
// There was no port specified, remove everything after the first forward slash
host = host.substr(/*pos=*/0, host.find("/"));
}
Optionally, reject bracketless IPv6 with a port:
if (host.find(':') != std::string::npos && host.find(']') == std::string::npos && host.find('[') == std::string::npos && host.find(':', host.rfind(':')+1) == std::string::npos) {
return util::Error{_("IPv6 address with port must be enclosed in []")};
}
std::optional<bilingual_str> error_opt; | ||
auto statsd_ptr = std::make_unique<StatsdClientImpl>( | ||
host, port, use_tcp, args.GetIntArg("-statsbatchsize", DEFAULT_STATSD_BATCH_SIZE), | ||
args.GetIntArg("-statsduration", DEFAULT_STATSD_DURATION), | ||
sanitize_string(args.GetArg("-statsprefix", DEFAULT_STATSD_PREFIX)), | ||
sanitize_string(args.GetArg("-statssuffix", DEFAULT_STATSD_SUFFIX)), error_opt); | ||
if (error_opt.has_value()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Validate -statsbatchsize and -statsduration ranges (prevent negative/overflow misuse).
GetIntArg returns signed; negative values will wrap when cast to uint64_t. Either clamp to sane bounds or error out.
- auto statsd_ptr = std::make_unique<StatsdClientImpl>(
- host, port, use_tcp, args.GetIntArg("-statsbatchsize", DEFAULT_STATSD_BATCH_SIZE),
- args.GetIntArg("-statsduration", DEFAULT_STATSD_DURATION),
+ const int64_t batch_arg = args.GetIntArg("-statsbatchsize", DEFAULT_STATSD_BATCH_SIZE);
+ const int64_t dur_arg = args.GetIntArg("-statsduration", DEFAULT_STATSD_DURATION);
+ if (batch_arg < 1) return util::Error{_("statsbatchsize must be >= 1")};
+ if (dur_arg < 0) return util::Error{_("statsduration must be >= 0")};
+ auto statsd_ptr = std::make_unique<StatsdClientImpl>(
+ host, port, use_tcp, static_cast<uint64_t>(batch_arg),
+ static_cast<uint64_t>(dur_arg),
sanitize_string(args.GetArg("-statsprefix", DEFAULT_STATSD_PREFIX)),
sanitize_string(args.GetArg("-statssuffix", DEFAULT_STATSD_SUFFIX)), error_opt);
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
std::optional<bilingual_str> error_opt; | |
auto statsd_ptr = std::make_unique<StatsdClientImpl>( | |
host, port, use_tcp, args.GetIntArg("-statsbatchsize", DEFAULT_STATSD_BATCH_SIZE), | |
args.GetIntArg("-statsduration", DEFAULT_STATSD_DURATION), | |
sanitize_string(args.GetArg("-statsprefix", DEFAULT_STATSD_PREFIX)), | |
sanitize_string(args.GetArg("-statssuffix", DEFAULT_STATSD_SUFFIX)), error_opt); | |
if (error_opt.has_value()) { | |
std::optional<bilingual_str> error_opt; | |
const int64_t batch_arg = args.GetIntArg("-statsbatchsize", DEFAULT_STATSD_BATCH_SIZE); | |
const int64_t dur_arg = args.GetIntArg("-statsduration", DEFAULT_STATSD_DURATION); | |
if (batch_arg < 1) return util::Error{_("statsbatchsize must be >= 1")}; | |
if (dur_arg < 0) return util::Error{_("statsduration must be >= 0")}; | |
auto statsd_ptr = std::make_unique<StatsdClientImpl>( | |
host, port, use_tcp, | |
static_cast<uint64_t>(batch_arg), | |
static_cast<uint64_t>(dur_arg), | |
sanitize_string(args.GetArg("-statsprefix", DEFAULT_STATSD_PREFIX)), | |
sanitize_string(args.GetArg("-statssuffix", DEFAULT_STATSD_SUFFIX)), | |
error_opt); | |
if (error_opt.has_value()) { |
🤖 Prompt for AI Agents
In src/stats/client.cpp around lines 161-167 the args.GetIntArg results for
-statsbatchsize and -statsduration are used where unsigned/64-bit values are
expected, which allows negative signed ints to wrap on cast; read each into a
signed type, validate they are within sane non-negative bounds (or clamp) and
check an upper limit to avoid overflow, and if invalid set error_opt with a
clear bilingual_str error and return/fail rather than passing a wrapped value;
only after validation perform a safe static_cast<uint64_t> (or use std::clamp)
before constructing StatsdClientImpl.
// If there are threads, interrupt and stop it | ||
if (m_reconn.joinable()) { | ||
m_reconn_interrupt(); | ||
m_reconn.join(); | ||
} | ||
// Flush queue of uncommitted messages | ||
QueueFlush(); | ||
QueueFlush(m_reconn_queue); | ||
QueueFlush(m_queue); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: QueueFlush mutates the same vector it iterates (self-reenqueue under TCP)
QueueFlush(m_reconn_queue)
can re-enqueue into m_reconn_queue
via SendDirectly()
on TCP-not-connected paths, invalidating the iteration and causing UB. Swap to locals before flushing.
- // Flush queue of uncommitted messages
- QueueFlush(m_reconn_queue);
- QueueFlush(m_queue);
+ // Flush queues safely: swap into locals to avoid mutating while iterating
+ std::vector<RawMessage> q_reconn, q_main;
+ {
+ LOCK(cs);
+ m_reconn_queue.swap(q_reconn);
+ m_queue.swap(q_main);
+ }
+ QueueFlush(q_reconn);
+ QueueFlush(q_main);
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// If there are threads, interrupt and stop it | |
if (m_reconn.joinable()) { | |
m_reconn_interrupt(); | |
m_reconn.join(); | |
} | |
// Flush queue of uncommitted messages | |
QueueFlush(); | |
QueueFlush(m_reconn_queue); | |
QueueFlush(m_queue); | |
} | |
// If there are threads, interrupt and stop it | |
if (m_reconn.joinable()) { | |
m_reconn_interrupt(); | |
m_reconn.join(); | |
} | |
// Flush queues safely: swap into locals to avoid mutating while iterating | |
std::vector<RawMessage> q_reconn, q_main; | |
{ | |
LOCK(cs); | |
m_reconn_queue.swap(q_reconn); | |
m_queue.swap(q_main); | |
} | |
QueueFlush(q_reconn); | |
QueueFlush(q_main); | |
} |
🤖 Prompt for AI Agents
In src/stats/rawsender.cpp around lines 54 to 62, QueueFlush is called directly
on m_reconn_queue and m_queue but QueueFlush can call SendDirectly() which may
re-enqueue into the same vector and invalidate the iteration; to fix, swap each
member queue with a local empty container before calling QueueFlush (e.g.,
create a local vector, std::swap(local, m_reconn_queue), then call
QueueFlush(local)), so any re-enqueued items go into the original member queue
and do not affect the iteration; ensure this is done for both m_reconn_queue and
m_queue and preserve any necessary locking/ownership semantics around the swap.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK 1e750cc
{ | ||
auto stats_client = StatsdClient::make(*m_node.args); | ||
if (!stats_client) { | ||
throw std::runtime_error{strprintf("Cannot init Statsd client (%s)", util::ErrorString(stats_client).original)}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The app should start whatever stats_client
is initialized or not IMO.
It's not a critical failure.
nit: avoid code duplication between init.cpp & setup_common.cpp
argsman.AddArg("-statsduration=<ms>", strprintf("Specify the number of milliseconds between stats messages (default: %d)", DEFAULT_STATSD_DURATION), ArgsManager::ALLOW_ANY, OptionsCategory::STATSD); | ||
argsman.AddArg("-statshost=<ip>", strprintf("Specify statsd host (default: %s)", DEFAULT_STATSD_HOST), ArgsManager::ALLOW_ANY, OptionsCategory::STATSD); | ||
argsman.AddArg("-statsport=<port>", strprintf("Specify statsd port (default: %u)", DEFAULT_STATSD_PORT), ArgsManager::ALLOW_ANY, OptionsCategory::STATSD); | ||
hidden_args.emplace_back("-statsport"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO it should be different PRs; deprecation of statsport & refactorings.
std::string scheme{ToLower(host.substr(/*pos=*/0, scheme_idx))}; | ||
if (scheme != "udp") { | ||
return util::Error{_("Unsupported URL scheme, must begin with udp://")}; | ||
if (scheme != "tcp" && scheme != "udp") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why possible do we need tcp support actually?
stats daemon is just a utility to collect metrics; a single lost message is not an issue at all; there's no requirements for guaranteed delivery; no exact numbers are needed.
|
||
while (!m_reconn_interrupt) { | ||
std::deque<RawMessage> queue{}; | ||
std::vector<RawMessage> queue{}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::deque
is a better choice for metrics; because metrics should be:
- minimal overhead (in average)
- minimum latency (for each request)
- approximitally correct
std::vector
call re-allocation of all internal data to the copy every time when size is over some certain threshold. For overall during a day (or even minute) it's completely fine; but in some particular moment it could be 50ms delay because need to re-allocation 100k objects to a new copy of data; and it breaks "low latency" requirement for metrics.
std::deque
is a better fit because overall overhead is bigger than std::vector
, but it doesn't have delays for ms in random moment of time
- The StatsD client now supports TCP hosts, TCP can be opted for with the new URL syntax supported by `-statshost` (e.g. | ||
`tcp://127.0.0.1:8125`). UDP (`udp://`) remains the default and will be assumed if a protocol is not specified. | ||
|
||
- Unlike with UDP, if a TCP connection is interrupted, stats will _not_ be dropped and instead buffered according to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what will happen if queue
of new messages is bigger than bandwidth
to send statsd?
Also, if it is limited by firewall, not sure how stats
service will behave. I vote to drop TCP support if there's no strong demand from users. @PastaPastaPasta @UdjinM6
Motivation
Continuing the work in dash#6267, this pull request aims to allow IPv6 connections, add TCP connection support and message buffering if a TCP connection drops. This should allow hosts that want stronger connection guarantees to their Statsd server to opt for TCP expecting lower stats loss as stats generated within ~5 minutes are preserved in the buffer.
To allow opting for TCP, support for URLs have been introduced for
statshost
and can be formatted astcp://127.0.0.1:8125
. Asstatshost
can also include the port,statsport
has been deprecated and will be removed in a future release of Dash Core.Additional Information
Downtime buffering is only available for TCP, behavior of UDP connections remain unaffected and messages that fail to send will remain dropped.
The success and failure stats in debug logging has been removed as reconnection logic makes them no longer a reliable means of measuring how many messages overall were generated at runtime.
To avoid noise, all
RawSender
logs andStatsdClient
send failure logs are now gated behind-debug=net
.This does not affect startup validation as errors are no longer ignored and trigger an
InitError()
. Connection failure will not halt startup but lookup failure will, this is becauseLookupHost()
is used to identify if the connection is IPv4/IPv6.After startup, subsequent lookup failures will simply result in retrying at an interval up to every 5 minutes (not applicable for UDP).
Breaking Changes
See release notes.
Checklist