go/control: Show last consensus height seen by block history #6013
base: master
Conversation
✅ Deploy Preview for oasisprotocol-oasis-core canceled.
// Fetch last consensus height seen by block history.
height, err := rt.History().LastConsensusHeight()
it would be useful for oasis-node to report the current reindex round
Notice that I report the consensus height instead of the runtime round. I would argue this is also more appropriate, since the reindex iterates over consensus block heights...
Can we show the last height and the last round?
Would percent of indexed blocks help the node operators?
I agree, all of these would be useful.
Can we show the last height and the last round?
I have added those. I have mixed feelings about the percentage, though:
- Do we want to add it to consensus sync and storage syncing as well (to be consistent)?
- Should we only show it during reindex/syncing?
- The value is not deterministic, so we cannot expose it as status.
Does [reindex/syncing]_progress: 90.1% look good?
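For illustration, a minimal sketch of how such a progress value could be computed; the function name and the way the start/target heights are obtained are assumptions, not code from this PR:

// reindexProgress returns the percentage of consensus heights already seen by
// block history, given the height the reindex started from, the last indexed
// height, and the target (current) consensus height.
// Illustrative sketch only; not code from this PR.
func reindexProgress(startHeight, lastIndexed, targetHeight int64) float64 {
	if targetHeight <= startHeight {
		return 100.0
	}
	done := float64(lastIndexed - startHeight)
	total := float64(targetHeight - startHeight)
	return 100.0 * done / total
}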
- Do we want to add it to consensus sync and storage syncing as well (to be consistent)?
I said it would be nice to have percentages, so that you know how far along you are.
- Should we only show it if during reindex/syncing?
Probably not. If we showed 100%, that would mean that syncing is no longer in progress.
- The value is not deterministic so we cannot expose it as status
Not sure what you are trying to say here.
go/control/api/api.go (Outdated)
// BlockHistoryLastConsensusHeight is last consensus height seen by block history.
BlockHistoryLastConsensusHeight int64 `json:"block_history_last_consensus_height"`
so that the node operator could estimate when the reindex will finish.
I am not sure the current node control status is helpful unless the node operator is very familiar with Oasis.
A hackish alternative may be to only show it during the "history reindex" status, right below it, giving more context to the operator:
diff --git a/go/worker/common/api/api.go b/go/worker/common/api/api.go
index 41e0a46ff..0e050b421 100644
--- a/go/worker/common/api/api.go
+++ b/go/worker/common/api/api.go
@@ -107,6 +107,9 @@ type Status struct {
 	// Status is a concise status of the committee node.
 	Status StatusState `json:"status"`
+	// HistoryReindexHeight is last consensus height seen by block history.
+	HistoryReindexHeight *int64 `json:"history_reindex_height,omitempty"`
+
 	// ActiveVersion is the currently active version.
 	ActiveVersion *version.Version `json:"active_version"`
diff --git a/go/worker/common/committee/node.go b/go/worker/common/committee/node.go
index 9f06bcd15..668e43365 100644
--- a/go/worker/common/committee/node.go
+++ b/go/worker/common/committee/node.go
@@ -275,6 +275,20 @@ func (n *Node) GetStatus() (*api.Status, error) {
 	var status api.Status
 	status.Status = n.getStatusStateLocked()
+	// Include last consensus height if during history reindex.
+	if status.Status == api.StatusStateWaitingHistoryReindex {
+		height, err := n.Runtime.History().LastConsensusHeight()
+		if err != nil {
+			n.logger.Warn("failed to retrieve last consensus height seen by block history",
+				"err", err,
+				"id", n.Runtime.ID(),
+			)
+		} else {
+			status.HistoryReindexHeight = &height
+		}
+	}
I am not sure the current node control status is helpful unless the node operator is very familiar with Oasis.
The CLI can provide a nicer computed overlay from this information. But it would be good if the node could estimate the average reindex speed and then an ETA as the range of blocks to reindex is known.
But it would be good if the node could estimate the average reindex speed and then an ETA as the range of blocks to reindex is known.
I would prefer to tackle the ETA in another PR; here is why:
- consensus reindex >= storage sync >> history reindex (time needed for my testnet node to sync from scratch)
- The ETA also depends on whether you are doing the reindex during consensus sync or starting with an already synced node. I need to do some more experiments just to be sure...
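To make the ETA idea concrete, a minimal sketch assuming the caller tracks the heights already processed, the elapsed time, and the heights left; the helper name and inputs are assumptions, not code from this PR (time.Duration is from the standard library):

// reindexETA estimates the remaining reindex time from the average speed
// observed so far. Illustrative sketch only; not code from this PR.
func reindexETA(heightsDone int64, elapsed time.Duration, heightsLeft int64) time.Duration {
	if heightsDone <= 0 {
		return 0 // nothing processed yet, no estimate possible
	}
	perHeight := elapsed / time.Duration(heightsDone)
	return perHeight * time.Duration(heightsLeft)
}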
(force-pushed from 47f3130 to 1368dfc)
go/control/api/api.go (Outdated)
@@ -189,6 +189,9 @@ type RuntimeStatus struct {
	// Storage contains the storage worker status in case this node is a storage node.
	Storage *storageWorker.Status `json:"storage,omitempty"`

	// BlockHistoryLastConsensusHeight is last consensus height seen by block history.
	BlockHistoryLastConsensusHeight int64 `json:"block_history_last_consensus_height"`
Quite a long field name. I would try to find a shorter name or move it under history.
Should this field be grouped with LastRetainedRound?
Yes, we should shorten it.
Now that I think about it, the cleanest way would probably be to show the last round and height as seen by the block history right below the storage field. I am thinking about exposing this status from the roothash/api/BlockHistory interface (see):
"latest_round": 10160924,
"latest_hash": "85867c19bd6ad43521432e187962e3bb19a9731c79cd64f195821bfd90bd7b24",
"latest_time": "2025-01-31T11:08:05+01:00",
"latest_state_root": {
"ns": "000000000000000000000000000000000000000000000000a6d1e3ebf60dff6c",
"version": 10160924,
"root_type": 1,
"hash": "a914f8c8baf05a47757c1d7f7caecebdf614c8b3d1e20aa33d6855ec96a50ba1"
},
"genesis_round": 2995927,
"genesis_hash": "c9f3ca654531b775d944c85e1f00e76944aaf5de1902acfa082ff76e852dba5e",
"last_retained_round": 10145927,
"last_retained_hash": "fad686c7eb077ec1a1a5a1ba1c27027a3788b003c94130e78c7dfb25a1d0e8cd",
// other fields
"storage": {
"status": "syncing rounds",
"last_finalized_round": 10160923
},
"history": {
"last_round": 10160923,
"last_height": 25058099
},
Should this field be grouped with LastRetainedRound?
1. LastRetainedRound is actually the earliest (not the latest) runtime block that is also synced to storage.
   i. You may still get an error and show the latest retained round from the block history even if you don't have it synced to storage... (see). Is this a bug?
2. History reindex actually happens and finishes before storage round syncing, so ideally this should be a separate grouping.
I am thinking: could we move last_retained_round and last_retained_hash to storage? This is obviously breaking. More importantly, the storage status is received from the storage committee worker, whilst LastRetainedRound and its hash are technically overwritten directly from the runtime storage (see, described in 1.i). Per the docstring the former is synced + finalized, whilst when reading from storage directly it is only guaranteed to be synced? Not sure what the difference is (if any).
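A minimal sketch of how such a history grouping could be assembled from the BlockHistory interface; the helper name, the runtimeRegistry.Runtime parameter, and using roothash.RoundLatest with GetCommittedBlock are assumptions for illustration, not code from this PR:

// historyStatus builds the proposed history status from the block history.
// Illustrative sketch only; error handling and wiring are simplified.
func historyStatus(ctx context.Context, rt runtimeRegistry.Runtime) (*HistoryStatus, error) {
	height, err := rt.History().LastConsensusHeight()
	if err != nil {
		return nil, fmt.Errorf("last consensus height: %w", err)
	}
	blk, err := rt.History().GetCommittedBlock(ctx, roothash.RoundLatest)
	if err != nil {
		return nil, fmt.Errorf("latest committed block: %w", err)
	}
	return &HistoryStatus{
		LastRound:  blk.Header.Round,
		LastHeight: height,
	}, nil
}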
History reindex actually happens and finishes before storage round syncing, so ideally this should be a separate grouping.
Maybe we show the status of both. One part should probably be under storage, the other under history. Although I'm not sure users will understand the difference.
Should this field be grouped with LastRetainedRound?
The code which fetches these fields should also be together.
This is breaking
Breaking for whom? I think it is not a problem if we change the status. If node operators depend on it, they can check the changelog and modify their code.
Codecov Report
Attention: Patch coverage is …
Additional details and impacted files:

@@            Coverage Diff             @@
##           master    #6013      +/-   ##
==========================================
- Coverage   65.12%   64.64%   -0.49%
==========================================
  Files         631      631
  Lines       64419    64427       +8
==========================================
- Hits        41956    41647     -309
- Misses      17549    17867     +318
+ Partials     4914     4913       -1

☔ View full report in Codecov by Sentry.
(force-pushed from 1368dfc to ff6fd84)
// HistoryStatus is the status as observed by the block history.
type HistoryStatus struct {
	// LastRound is the round of the latest block.
	LastRound uint64 `json:"last_round"`

	// LastHeight is the consensus height of the latest block.
	LastHeight int64 `json:"last_height"`
}
Do we also want a status like we have for storage? E.g. initializing, reindexing blocks, syncing blocks?
If so, this logic would have to be moved to consensus roothash, or we add a method to query it. Albeit it feels weird adding this functionality to roothash/api/Backend...?
Do we also want a status like we have for storage? E.g. initializing, reindexing blocks, syncing blocks?
Yes, I would like to have percentages (and maybe an optional status).
From your example, I would like to have at least 3 fields: two that say which range reindexBlocks is working on (800-1000), and the percentage of reindexed blocks (30/200 = 15%). When this is done, the node will see that there are still blocks to be indexed, so it will update the status by updating the range (1000-1100) and the percentage (0/100 = 0%).
Having a LastRound field might be a problem once you start doing things in parallel, as some lower rounds might not be processed yet. A range might be better.
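To illustrate the range-based idea, a purely hypothetical variant of the struct; the field names are invented here and are not proposed in this PR:

// HistoryStatus, range-based variant: purely illustrative, field names are
// hypothetical and not part of this PR.
type HistoryStatus struct {
	// ReindexStartRound and ReindexEndRound bound the round range currently
	// being reindexed, e.g. 800-1000.
	ReindexStartRound uint64 `json:"reindex_start_round"`
	ReindexEndRound   uint64 `json:"reindex_end_round"`
	// ReindexProgress is the fraction of that range already processed,
	// e.g. 30/200 = 0.15.
	ReindexProgress float64 `json:"reindex_progress"`
}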
@@ -0,0 +1,8 @@
go/oasis_node: Add runtime block history status message
Suggested change:
- go/oasis_node: Add runtime block history status message
+ go/oasis-node: Add runtime block history status message
go/oasis_node: Add runtime block history status message

A new field `history` has been added to the `oasis-node control status`
output under the runtime status section. This field displays the latest round
Suggested change:
- output under the runtime status section. This field displays the latest round
+ output under the runtime status section. This field displays the round
@@ -48,6 +48,8 @@ type BlockHistory interface {

	// LastConsensusHeight returns the last consensus height which was seen
	// by block history.
	//
	// This method can return height for block not yet synced to storage.
This whole comment is a bit confusing, as the first part says "seen" while the second says that the block (which one, consensus or runtime?) doesn't need to be synced.
Can we change it to:
Suggested change:
- // This method can return height for block not yet synced to storage.
+ // LastConsensusHeight returns the consensus height of the last committed block.
@@ -57,6 +59,7 @@ type BlockHistory interface {
	GetCommittedBlock(ctx context.Context, round uint64) (*block.Block, error)

	// GetBlock returns the block at a specific round.
	//
You should be consistent and add a new line in all places or nowhere.
What & Why
#5998
How to test
- data/runtime/[runtimeID] (this way we trigger reindex)
- oasis-node control status -a unix:path/to/data/internal.sock and observe: whilst history reindex is happening, the consensus height should be increasing, e.g.:
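For illustration only (the numbers below are invented; the actual example output was omitted here), the relevant excerpt of the status output has the shape added by this PR, with last_height increasing on repeated calls while the reindex runs:

"history": {
  "last_round": 10160923,
  "last_height": 25042000
}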