Introduction of a Validator Score #21
Replies: 9 comments 73 replies
-
Fundamentally, we advocate some form of automated performance evaluation, e.g. covering security/maintenance practices and bug handling.
-
I agree too, but I think this system introduces additional risk for honest validators, which could be penalized for a simple system failure. To minimize this risk, I suggest taking this improvement as an opportunity to introduce automatic validator failover. As a high-level overview, this feature would allow a node to run as a "backup validator". Ideally, two validators would participate in the committee controlling the same stake, but only one of them would take part in consensus. If the active one fails, the backup becomes active. This feature doubles the validator's costs but cuts the failure risk, which would make the automated penalty system acceptable.
-
Agreed that an automated system is required. The system described here, if implemented, will need to be self-checked, meaning that the partial scores will need to agree to some degree: if one validator gives a poor score to another while all the rest give it a high score, then something went wrong with the reporter. What if consensus cannot be reached on the partial scoring mechanism? It will also need to be tamper-proof on the reporter's side, or this might give an incentive to "lie" in the report. In general, an automated system should probably rely on on-chain data more than on active monitoring by several peers. How feasible that is in IOTA I am not sure yet, but you can see a good example of this on Ethereum with Rated, even though their scoring does not affect rewards. Can't we rely on on-chain data with a delayed effect? Meaning: if your score calculated from on-chain data is low, then you get fewer rewards in the following epoch, not the current one.
-
Thanks for the proposal. Automated scoring is a natural next step, but it must be carefully aligned with IOTA's validator cooperative model and decentralization goals.

Let's avoid solutions looking for a problem. Without data, this remains speculative.

We support performance visibility. But penalties must come last, not first. Let's build trust with data, not assumptions.
-
This is a fantastic discussion with many valid points. We strongly agree with the core sentiment expressed by @starfish-one and the practical, operator-focused considerations from @dlt-green. A clear consensus seems to be forming: radical transparency and empowering the community should be the priority, and protocol-level penalties are a serious measure that should be a last resort. Building on this, we'd like to propose a concrete framework that puts this philosophy into action.

A Proposal for a Community-Driven Performance Framework

The next logical step is to standardize the rich data from community tools (like the excellent dashboards from @dlt-green) into a simple, public-facing Health Score. This would be displayed as an intuitive letter grade (A-F) with color-coding in the official IOTA Explorer. The score should represent a monthly average, with a daily drill-down for full transparency. To show our commitment to this idea, we at Tokenlabs have already started building a proof of concept for this "Quality Certificate" dashboard. We intend for this to be a true community effort from day one.

To further foster collaboration over competition, we propose that a dedicated section be added to the official IOTA documentation to act as a shared Knowledge Base, detailing solutions for common validator issues and helping everyone achieve an 'A' rating. The community, armed with these tools, becomes the ultimate regulator. Let's build this framework of transparency and support first.

Best regards,
The Tokenlabs Team
-
@oliviasaa We like your approach to this issue and would support it in principle. However, we would like to emphasize again that we should also consider eliminating the validator's commission rather than the staker rewards, at least at first. It will depend on the details, but we would fundamentally support this approach. In our case, this will also be decided via a governance vote in the future, but initial feedback on this is positive. We also find the possibility of deciding on thresholds together very positive. Initially, this factor, which we will also integrate into our analytics, should run as a dry run on the mainnet, so that any necessary adjustments can be made before activation.
-
Great discussion. Let me briefly summarize the direction that seems to emerge:
-
Comment on reducing the voting power of faulty nodes.

Currently, the protocol assumes static stake availability: if more than 1/3 of the stake goes offline, the network halts. This is a well-known limitation of BFT-style protocols, and a significant body of research explores dynamic availability, where voting power is adjusted on the fly based on live participation. However, dynamically adjusting voting power always comes at the cost of safety: in a partitioned network, each partition might adjust independently and resume consensus, leading to forks.

Still, the fact remains: if less than 2/3 of the stake is online, consensus halts. It is therefore crucial to have a path for removing voting power from persistently offline validators: not immediately, but based on observable and stable conditions.

We propose a simple rule: if a validator has been offline for two consecutive epochs, and more than 87.5% of the total stake participated during these epochs (the same threshold used for protocol upgrades), then its voting power is removed. This strikes a balance:
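To make the rule concrete, here is a minimal sketch of the check, in Rust; the `EpochRecord` and `ValidatorId` types and the field names are hypothetical and do not correspond to the actual IOTA node code.

```rust
use std::collections::HashSet;

/// Hypothetical identifier for a validator.
type ValidatorId = [u8; 32];

/// Hypothetical per-epoch summary used by the rule.
struct EpochRecord {
    /// Fraction of total stake that participated in the epoch (0.0..=1.0).
    participating_stake_fraction: f64,
    /// Validators observed offline for the whole epoch.
    offline_validators: HashSet<ValidatorId>,
}

/// Same 87.5% threshold as used for protocol upgrades.
const PARTICIPATION_THRESHOLD: f64 = 0.875;

/// Returns true if `validator` should have its voting power removed,
/// based on the last two finished epochs.
fn should_remove_voting_power(
    validator: &ValidatorId,
    previous_epoch: &EpochRecord,
    current_epoch: &EpochRecord,
) -> bool {
    let offline_both_epochs = previous_epoch.offline_validators.contains(validator)
        && current_epoch.offline_validators.contains(validator);
    let enough_participation = previous_epoch.participating_stake_fraction >= PARTICIPATION_THRESHOLD
        && current_epoch.participating_stake_fraction >= PARTICIPATION_THRESHOLD;
    offline_both_epochs && enough_participation
}
```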
-
Encapsulate is in favor of an automated, transparent performance score tied to rewards. Other ecosystems (e.g., Cosmos) already automate penalties for provable faults like double-signing and sustained downtime, which is a good precedent. The same should apply here: weight provable signals highest and keep "soft"/unprovable metrics low-impact to avoid gaming.

What we'd like to see
-
Motivation
Validators are the backbone of the IOTA network. The performance of each validator directly influences the network’s overall efficiency and usability. Therefore, it is essential to incentivize validators to operate reliably and effectively by tying their rewards to their observed performance.
Currently, the only mechanism for penalizing underperforming validators is a manual reporting system. This system requires a quorum of validators to explicitly report a misbehaving peer in order to reduce their rewards. Such an approach is impractical, as it demands continuous manual monitoring, which is an unreasonable burden for most operators. Moreover, no standard criteria exist to guide reporting decisions, resulting in inconsistent and arbitrary thresholds set independently by each validator.
We propose an automated and standardized system for monitoring validator behavior, culminating in a commonly agreed score that reflects each validator’s performance during an epoch. These scores would directly influence the rewards distributed at the epoch’s end.
Specification
Performance Metrics, Proofs, And Partial Scores
Each validator will monitor its peers throughout an epoch, collecting performance metrics and computing a local partial score for every other validator. Regardless of the exact set of metrics used, they are divided into two categories: provable metrics, which can be backed by proofs visible to all validators, and unprovable metrics, which are observed locally and cannot be proven to others.

For provable metrics, validators include the corresponding proofs in the misbehavior_reports fields of their newly created blocks. Whenever a block containing a MisbehaviorReport is committed, validators update their count for the corresponding misbehaviour.
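As a rough illustration of the counting step, the sketch below (Rust, with hypothetical type names; the actual MisbehaviorReport type is defined in the iota repository) keeps per-validator counters keyed by the misbehaviour kind and updates them when a block containing reports is committed:

```rust
use std::collections::HashMap;

/// Hypothetical identifier for a validator within the committee.
type ValidatorIndex = u32;

/// Kinds of provable misbehaviour; placeholder variants for illustration.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum Misbehaviour {
    InvalidBlock,
    Equivocation,
}

/// A misbehaviour report carried in a block's `misbehavior_reports` field.
struct MisbehaviorReport {
    accused: ValidatorIndex,
    kind: Misbehaviour,
}

/// Local counters of committed, provable misbehaviours per validator.
#[derive(Default)]
struct ProvableCounters {
    counts: HashMap<(ValidatorIndex, Misbehaviour), u64>,
}

impl ProvableCounters {
    /// Called when a block containing misbehaviour reports is committed.
    /// Every validator observes the same committed blocks, so these counters
    /// stay consistent across honest nodes without extra messages.
    fn on_block_committed(&mut self, reports: &[MisbehaviorReport]) {
        for report in reports {
            *self.counts.entry((report.accused, report.kind)).or_insert(0) += 1;
        }
    }
}
```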
Aggregating Scores
At the end of each epoch, validators broadcast their locally computed partial scores for all other validators. This is done by replacing the current EndOfPublish transaction with a new EndOfPublishV2 format that contains the computed vector of partial scores.

Once a quorum of EndOfPublishV2 transactions has been collected, the epoch can advance. The stake-weighted median of the partial scores submitted via the EndOfPublishV2 transactions is then taken and combined with the provable metric counts (based on the proofs), resulting in a deterministic aggregated score.
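A minimal sketch of the stake-weighted median step (hypothetical signature, not the actual iota implementation), applied per target validator to the partial scores collected from the quorum:

```rust
/// `submissions` holds, for one target validator, the (stake, partial_score)
/// pairs reported by the submitting validators.
fn stake_weighted_median(mut submissions: Vec<(u64 /* stake */, u64 /* score */)>) -> u64 {
    assert!(!submissions.is_empty());
    // Sort by score, then walk the cumulative stake until half of the
    // total submitting stake is covered.
    submissions.sort_by_key(|&(_, score)| score);
    let total_stake: u64 = submissions.iter().map(|&(stake, _)| stake).sum();
    let mut cumulative = 0u64;
    for (stake, score) in submissions {
        cumulative += stake;
        if 2 * cumulative >= total_stake {
            return score;
        }
    }
    unreachable!("cumulative stake always reaches the total");
}
```

Because each submitter's influence is proportional to its stake, splitting stake across identities does not change the result.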
Adjusting Rewards
The aggregated scores are passed to the advance_epoch function, where they are used to adjust validator rewards using the following formula:
adjusted_rewards[i] = unadjusted_rewards[i] * aggregated_scores[i] / max_score
where max_score is the maximum achievable score. As in the current protocol, the difference between adjusted and unadjusted rewards is burned.
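For concreteness, a sketch of that adjustment (a hypothetical helper, not the actual advance_epoch implementation):

```rust
/// Scales each validator's rewards by its aggregated score, mirroring
/// adjusted_rewards[i] = unadjusted_rewards[i] * aggregated_scores[i] / max_score.
/// The difference between unadjusted and adjusted rewards would be burned.
fn adjust_rewards(
    unadjusted_rewards: &[u64],
    aggregated_scores: &[u64],
    max_score: u64,
) -> Vec<u64> {
    assert!(max_score > 0, "max_score must be positive");
    unadjusted_rewards
        .iter()
        .zip(aggregated_scores)
        .map(|(&reward, &score)| {
            // 128-bit intermediate avoids overflow for large reward values.
            ((reward as u128) * (score as u128) / (max_score as u128)) as u64
        })
        .collect()
}
```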
Rationale
Performance Metrics
The categorization of metrics as provable or unprovable allows them to be treated differently. Unprovable metrics are highly gameable and should not lead to severe penalties. Provable metrics, if correctly designed, offer a reliable estimation of validator performance and potential malicious behaviour, and can therefore be used as part of a strong incentivization mechanism.
Since proofs are embedded in committed blocks, validators already have a common view of all provable metrics. Thus, there is no need to report any count or score relative to provable metrics at the epoch end.
Unprovable metrics, on the other hand, remain entirely local to each validator. Therefore, they must be shared and agreed upon through the consensus mechanism.
Aggregating Scores
The use of the EndOfPublish transaction to collect local scores is ideal, as it is the final message sent by each validator before closing the epoch. This timing ensures that the latest score data is captured for reward adjustment at the epoch's end, without delaying epoch advancement.

Additionally, by building on an already existing mechanism that requires a quorum of validators, we ensure that the final score used for reward adjustment is based on a quorum's local scores.
The choice of a stake-weighted median to aggregate local scores is both natural and necessary. It ensures that validators cannot manipulate the outcome by splitting their stake across multiple identities, as each validator's influence is proportional to its stake. Additionally, the median is a robust statistical measure that resists distortion from extreme values, unlike the mean, which can be skewed by outliers or intentionally biased scores.
Adjusting Rewards
Once consensus is reached on the aggregated score, adjusting rewards becomes straightforward. The aggregated score incorporates all locally computed metrics and is designed to ensure incentive compatibility. As such, it can be directly applied as a multiplicative factor to the unadjusted rewards.
Reference Implementation
An initial set of metrics has already been implemented in the iota repository, along with a simple scoring function that serves as a placeholder for a more complete version. This reference implementation is available in the (already merged) PR#7604 and PR#7921. The remaining components required to achieve consensus on an aggregated score and to adjust rewards do not yet have a reference implementation.
Backwards Compatibility
The version change in the EndOfPublish message is not backward compatible and must be implemented as a protocol upgrade, with a feature flag to enable the new functionality. All other changes are either local to the node (such as storing and counting metrics) or use already existing types and fields (such as the misbehavior_reports field in the block or the MisbehaviorReport type). These local changes should not cause any node behaviour or agreement problems.
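As a sketch of how the feature flag might gate the new message format (the type and flag names here are hypothetical and do not correspond to the actual iota protocol config):

```rust
/// Hypothetical message versions; the real types live in the iota repository.
struct EndOfPublish;
struct EndOfPublishV2 {
    partial_scores: Vec<u64>,
}

enum EndOfPublishMessage {
    V1(EndOfPublish),
    V2(EndOfPublishV2),
}

/// Hypothetical protocol configuration carrying the feature flag.
struct ProtocolConfig {
    validator_score_enabled: bool,
}

fn build_end_of_publish(config: &ProtocolConfig, partial_scores: Vec<u64>) -> EndOfPublishMessage {
    if config.validator_score_enabled {
        // New behaviour once the upgrade activates: include the partial scores.
        EndOfPublishMessage::V2(EndOfPublishV2 { partial_scores })
    } else {
        // Old behaviour until the protocol upgrade is enabled.
        EndOfPublishMessage::V1(EndOfPublish)
    }
}
```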