perf: using asynchronous worker to validate BLS signatures in quorum commitments #6692
base: develop
Conversation
Walkthrough: The changes introduce asynchronous BLS signature verification for quorum commitments. Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~15 minutes
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/llmq/blockprocessor.cpp (1)
208-220: Consider adding error logging when signature verification fails.

The asynchronous signature verification implementation is correct and follows proper patterns. However, when `queue_control.Wait()` returns false, it would be helpful to log which commitment(s) failed verification for debugging purposes. Consider adding a log message before returning false:

```diff
 if (!queue_control.Wait()) {
     // at least one check failed
+    LogPrintf("[ProcessBlock] BLS signature verification failed for block at height %d\n", pindex->nHeight);
     return false;
 }
```
src/llmq/commitment.cpp (1)
31-95: Consider refactoring for improved modularity.

The implementation correctly handles both synchronous and asynchronous BLS signature verification. However, the method is quite long (64 lines) and handles multiple responsibilities.
Consider extracting helper methods to improve readability and maintainability:

```cpp
bool CFinalCommitment::VerifySignatureAsync(CDeterministicMNManager& dmnman, CQuorumSnapshotManager& qsnapman,
                                            gsl::not_null<const CBlockIndex*> pQuorumBaseBlockIndex,
                                            CCheckQueueControl<utils::BlsCheck>* queue_control) const
{
    auto members = utils::GetAllQuorumMembers(llmqType, dmnman, qsnapman, pQuorumBaseBlockIndex);
    const auto& llmq_params_opt = Params().GetLLMQ(llmqType);
    if (!llmq_params_opt.has_value()) {
        LogPrint(BCLog::LLMQ, "CFinalCommitment -- q[%s] invalid llmqType=%d\n", quorumHash.ToString(), ToUnderlying(llmqType));
        return false;
    }
    const auto& llmq_params = llmq_params_opt.value();

    uint256 commitmentHash = BuildCommitmentHash(llmq_params.type, quorumHash, validMembers, quorumPublicKey, quorumVvecHash);
    LogMemberDetails(members, commitmentHash);

    if (!VerifyMemberSignatures(llmq_params, members, commitmentHash, queue_control)) {
        return false;
    }
    if (!VerifyQuorumSignature(commitmentHash, queue_control)) {
        return false;
    }
    return true;
}

private:
bool CFinalCommitment::VerifyMemberSignatures(const Consensus::LLMQParams& llmq_params,
                                              const std::vector<CDeterministicMNCPtr>& members,
                                              const uint256& commitmentHash,
                                              CCheckQueueControl<utils::BlsCheck>* queue_control) const
{
    // Implementation for member signature verification
}

bool CFinalCommitment::VerifyQuorumSignature(const uint256& commitmentHash,
                                             CCheckQueueControl<utils::BlsCheck>* queue_control) const
{
    // Implementation for quorum signature verification
}
```
🛑 Comments failed to post (1)
src/llmq/commitment.h (1)
34-37: ⚠️ Potential issue: Fix namespace formatting issue identified by pipeline.

The nested namespace declaration needs proper formatting according to clang-format requirements.
Apply this diff to fix the formatting:

```diff
-namespace utils
-{
-struct BlsCheck;
-} // namespace utils
+namespace utils {
+struct BlsCheck;
+} // namespace utils
```
Just to be sure that CCheckQueue works at all (and not just returns true), I modified the code a bit and it failed as expected:

```diff
diff --git a/src/llmq/commitment.cpp b/src/llmq/commitment.cpp
index c2cbe9b35c..b242b4c988 100644
--- a/src/llmq/commitment.cpp
+++ b/src/llmq/commitment.cpp
@@ -69,7 +69,12 @@ bool CFinalCommitment::VerifySignatureAsync(CDeterministicMNManager& dmnman, CQu
         strprintf("CFinalCommitment -- q[%s] invalid aggregated members signature", quorumHash.ToString())};
     if (queue_control) {
         std::vector<utils::BlsCheck> vChecks;
-        vChecks.emplace_back(membersSig, memberPubKeys, commitmentHash, members_id_string);
+        static int counter{0};
+        if (++counter == 42) {
+            vChecks.emplace_back(quorumSig, memberPubKeys, commitmentHash, members_id_string);
+        } else {
+            vChecks.emplace_back(membersSig, memberPubKeys, commitmentHash, members_id_string);
+        }
         queue_control->Add(vChecks);
     } else {
         if (!membersSig.VerifySecureAggregated(memberPubKeys, commitmentHash)) {
```

error: state is expected to not be set; same issue on
I squashed the commits together because they changed the same piece of code several times and moved it between files. Also see #6724 and #6692 (comment) for extra tests done for this PR. @UdjinM6 please help with review, the PR is ready now (assuming CI will succeed).
c9ef70a tests: add is_mature for quorum generation logs (Konstantin Akimov)
59060b5 fmt: order imports and fix gap in feature_llmq_dkgerrors.py (Konstantin Akimov)
58377f8 test: added functional tests for invalid CQuorumCommitment (Konstantin Akimov)
bb0b8b0 test: add serialization/deserialization of CFinalCommitmentPayload (Konstantin Akimov)

Pull request description:

## Issue being fixed or feature implemented
As I noticed while implementing #6692, if BlsChecker works incorrectly it won't be caught by unit or functional tests. See also #6692 (comment) for how 6692 was tested without this PR.

## What was done?
This PR introduces new functional tests to validate that `llmqType`, `membersSig`, `quorumSig` and `quorumPublicKey` are indeed validated by Dash Core as part of consensus.

## How Has This Been Tested?
See changes in `feature_llmq_dkgerrors.py`

## Breaking Changes
N/A

## Checklist:
- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [x] I have added or updated relevant unit/integration/functional/e2e tests
- [ ] I have made corresponding changes to the documentation
- [x] I have assigned this pull request to a milestone

ACKs for top commit:
  kwvg: utACK c9ef70a
  UdjinM6: utACK c9ef70a

Tree-SHA512: ad61f8c845f6681765105224b2a84e0b206791e2c9a786433b9aa91018ab44c1fa764528196fd079f42f08a55794756ba8c9249c6eb10af6fe97c33fa4757f44
This pull request has conflicts, please rebase.
It introduces a new command-line argument -parbls to set the number of parallel threads for BLS validation. The new parallel BlsChecker validates quorumSig and membersSig in the quorum commitment asynchronously.
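As a rough illustration of what wiring up such an option usually looks like in this codebase (the help text and the "0 = auto" convention below are my assumptions, not taken from the PR):

```cpp
// Illustrative only: registering -parbls the way -par is registered for script checks.
argsman.AddArg("-parbls=<n>",
               strprintf("Number of extra threads for BLS signature verification in quorum commitments "
                         "(up to %d; assumed convention: 0 = auto)", MAX_BLSCHECK_THREADS),
               ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
```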
This pull request has conflicts, please rebase.
I see ~15% speedup for ~20k blocks on mainnet on m1pro (7 threads). Good job! 👍
light ACK 3aec7a5
Co-authored-by: UdjinM6 <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
utACK c047c58
```cpp
if (!queue_control.Wait()) {
    // at least one check failed
    return state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, "bad-qc-invalid");
}
```
Suggested change:

```diff
 if (!queue_control.Wait()) {
-    // at least one check failed
+    LogPrint(BCLog::LLMQ, "CQuorumBlockProcessor: BLS verification failed for block %s\n", blockHash.ToString());
     return state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, "bad-qc-invalid");
 }
```
duplicated log, see: https://github.com/dashpay/dash/pull/6692/files#diff-cf3d9716cca5e57a6033574399f5be8a184f017da2c02c414ae2771c101a2339R952
LogPrint(BCLog::LLMQ, "%s\n", m_id_string);
What is that log? It seems very unclear / not detailed.
See the caller, which creates a BlsCheck object:

```cpp
std::string members_id_string{
    strprintf("CFinalCommitment -- q[%s] invalid aggregated members signature", quorumHash.ToString())};
if (queue_control) {
    std::vector<utils::BlsCheck> vChecks;
    vChecks.emplace_back(membersSig, memberPubKeys, commitmentHash, members_id_string);
    queue_control->Add(vChecks);
} else {
    if (!membersSig.VerifySecureAggregated(memberPubKeys, commitmentHash)) {
        LogPrint(BCLog::LLMQ, "%s\n", members_id_string);
        return false;
    }
}
```
(and a second one below)

So either `CFinalCommitment -- q[%s] invalid aggregated members signature` or `CFinalCommitment -- q[%s] invalid quorum signature` will be logged.
```cpp
for (const auto& [_, qc] : qcs) {
    if (qc.IsNull()) continue;
    const auto* pQuorumBaseBlockIndex = m_chainstate.m_blockman.LookupBlockIndex(qc.quorumHash);
    qc.VerifySignatureAsync(m_dmnman, m_qsnapman, pQuorumBaseBlockIndex, &queue_control);
}
```
I think I recall in the early days of testing and benchmarking BLS, we found it to be more efficient to aggregate then verify, rather than verify asynchronously or something like that. Did you investigate if this was possible?
It may be faster, but I can't be 100% sure that I wouldn't introduce a security issue and that my solution would be safe enough.
```cpp
std::vector<CBLSPublicKey> memberPubKeys;
for (const auto i : irange::range(members.size())) {
    if (!signers[i]) {
        continue;
    }
    memberPubKeys.emplace_back(members[i]->pdmnState->pubKeyOperator.Get());
}
```
we should probably reserve here; we know that for a qc to be valid, we need at least 80% (llmq_50_60 has minsize 40). As such, we can pretty safely reserve the entire size. For the LLMQ_400 this may reduce the number of allocations from ~10-ish reallocations to just a single one.
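For illustration, a minimal sketch of what that reservation could look like, reusing the names from the quoted snippet above (this is the suggestion spelled out, not the PR's code):

```cpp
// Reserve the full member count up front: a valid commitment needs ~80% of
// members to sign, so one allocation covers the common case with little waste.
std::vector<CBLSPublicKey> memberPubKeys;
memberPubKeys.reserve(members.size());
for (const auto i : irange::range(members.size())) {
    if (!signers[i]) {
        continue;
    }
    memberPubKeys.emplace_back(members[i]->pdmnState->pubKeyOperator.Get());
}
```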
Please note that this code is not new, but was moved in commit 219f223.
I may implement this optimization in a follow-up PR.
src/llmq/options.h
```
@@ -24,6 +24,11 @@ enum class QvvecSyncMode {
    OnlyIfTypeMember = 1,
};

/** Maximum number of dedicated script-checking threads allowed */
static const int MAX_BLSCHECK_THREADS = 31;
```
this seems excessive :D any testing that shows we need so many max threads?
Actually, I think 31 may not be a big enough value, because the typical number of commitments in one block is 0, 2, or 32.
One commitment requires 2 BLS validations, and these validations take different amounts of time: one is faster (the quorum's sig, because there is only 1 public key and no aggregation), the other is slower (the members' sig, because multiple public keys are aggregated).
So the optimum is somewhere between 32 and 64 extra threads, but I don't have a machine with 64 threads to do tests :)
I have chosen 31 as a "sane" value, but it's probably not the best one. Any recommendations?
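As a purely illustrative sketch of how such a cap is usually applied (the helper name and the "non-positive means relative to core count" convention are assumptions, not this PR's code):

```cpp
// Illustrative clamping of a user-supplied -parbls value, mirroring how -par
// is handled for script-check threads. MAX_BLSCHECK_THREADS is the constant
// from the quoted diff above; everything else here is assumed.
int ClampBlsCheckThreads(int requested)
{
    if (requested <= 0) {
        requested += GetNumCores(); // assumed: non-positive means "relative to core count"
    }
    return std::max(1, std::min(requested, MAX_BLSCHECK_THREADS));
}
```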
src/llmq/utils.cpp
```cpp
bool BlsCheck::operator()()
{
    if (m_pubkeys.size() > 1) {
        if (!m_sig.VerifySecureAggregated(m_pubkeys, m_msg_hash)) {
            LogPrint(BCLog::LLMQ, "%s\n", m_id_string);
            return false;
        }
    } else if (m_pubkeys.size() == 1) {
        if (!m_sig.VerifyInsecure(m_pubkeys.back(), m_msg_hash)) {
            LogPrint(BCLog::LLMQ, "%s\n", m_id_string);
            return false;
        }
    } else {
        // It is supposed to be at least one public key!
        return false;
    }
    return true;
}
```
I would think we could return an enum or std::expected to represent the error state if something goes wrong?
`VerifyInsecure` and `VerifySecureAggregated` return bool, not std::expected; std::expected won't add any extra value here.
Why not? There are three error paths here: "aggregated verification failed", "verification failed", and "no public key set".
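For illustration, a minimal sketch of what an enum-based result could look like (hypothetical names, not part of the PR):

```cpp
// Hypothetical result type mirroring the three error paths mentioned above.
enum class BlsCheckResult {
    Ok,
    AggregatedVerificationFailed, // VerifySecureAggregated() returned false
    VerificationFailed,           // VerifyInsecure() returned false
    NoPublicKeys,                 // the check was constructed without any public keys
};
```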
I'd prefer to put an exception here then, because this `return false` is more like an assert.
Logically, we should never get to this `} else {`; I added the `return false` just in case of any further misuse of BlsCheck by some possible new code.
Blocks can currently have 0, 2, or 32 commitments; further benchmarking is needed to find the balance point, but it is likely somewhere between 32 and 64, because each quorum commitment has 2 BLS signatures (so a heavy block yields 32 × 2 = 64 independent checks).
Issue being fixed or feature implemented
During block validation, quorum commitments are processed in a single thread.
Some blocks have up to 32 commitments (blocks that contain rotation quorum commitments), and each quorum commitment has hundreds of public keys to validate and 2 signatures (the quorum signature and the aggregated members signature). This takes up to 30% of total indexing time and up to 1 second for heavy blocks.
What was done?
`CCheckQueue`, which is used for validating ECDSA signatures, is now also used for validating BLS signatures in quorum commitments. The quorum signature and the members signatures are now validated simultaneously, which improves performance even for blocks that have only 1 commitment.
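Roughly, the flow looks like the sketch below, pieced together from the snippets quoted in the review threads above; the queue object's name is an assumption, and this is not a verbatim excerpt of the PR:

```cpp
// Enqueue one BlsCheck per signature instead of verifying inline, then wait
// for the worker threads to finish before accepting the block.
CCheckQueueControl<utils::BlsCheck> queue_control(&bls_check_queue); // bls_check_queue: assumed name of the worker pool
for (const auto& [_, qc] : qcs) {
    if (qc.IsNull()) continue;
    const auto* pQuorumBaseBlockIndex = m_chainstate.m_blockman.LookupBlockIndex(qc.quorumHash);
    qc.VerifySignatureAsync(m_dmnman, m_qsnapman, pQuorumBaseBlockIndex, &queue_control); // queues both BLS checks for this commitment
}
if (!queue_control.Wait()) {
    // at least one BLS check failed
    return state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, "bad-qc-invalid");
}
```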
How Has This Been Tested?
Invalidated + reconsidered 15k blocks (~1 month's worth).
This PR makes validation of quorum commitments 3.5x faster and overall indexing 25% faster in my 12-core environment.
PR:

develop:

Breaking Changes
N/A
Checklist: