Adding Health endpoint #836

Open
wants to merge 3 commits into
base: master

otherview
Member

Description

A health check service that lives inside the node as an API endpoint.

I think it's a good illustrative start for a PR. A few of the technical choices are worth discussing, such as:

  • Should this endpoint live in the Admin API?
  • Should this be a singleton service?
  • Is the naming on point?

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • New and existing E2E tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have not added any vulnerable dependencies to my code

health/health.go Outdated
h.lock.Lock()
defer h.lock.Unlock()

h.chainSynced = true
Member

What would happen if, once the node is synced, it falls out of sync for a period of time and needs to re-sync? Would the health status still signal that the node is in sync?

Member Author

Good point. I've changed ChainSync to ChainSyncStatus, and the node is now marked as unhealthy while it's syncing.

One concern is that this may make the ChainSync status flicker between synced and unsynced a bit too often.

}

// todo review time slots
healthy := time.Since(h.newBestBlock) >= 10*time.Second &&
Member

Is the first condition correct? Why do we consider a node healthy when its latest best block was seen more than 10 seconds ago?

Member Author

Yeah, you're right, that's why the 10 seconds is marked as a todo. What time gap do you think would make sense here?

codecov-commenter commented Oct 29, 2024

Codecov Report

Attention: Patch coverage is 12.16216% with 65 lines in your changes missing coverage. Please review.

Project coverage is 60.38%. Comparing base (b6380d5) to head (df30451).

Files with missing lines    Patch %    Lines
health/health.go            20.68%     23 Missing ⚠️
api/health/health.go        0.00%      22 Missing ⚠️
cmd/thor/main.go            0.00%      10 Missing ⚠️
comm/communicator.go        0.00%      4 Missing ⚠️
api/api.go                  0.00%      2 Missing ⚠️
cmd/thor/node/node.go       0.00%      2 Missing ⚠️
cmd/thor/utils.go           0.00%      2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #836      +/-   ##
==========================================
- Coverage   60.52%   60.38%   -0.15%     
==========================================
  Files         213      215       +2     
  Lines       22931    23000      +69     
==========================================
+ Hits        13879    13888       +9     
- Misses       7916     7976      +60     
  Partials     1136     1136              


@otherview otherview marked this pull request as ready for review October 29, 2024 10:49
@otherview otherview requested a review from a team as a code owner October 29, 2024 10:49
@leszek-vechain
Contributor

Do we have any tests for this new endpoint?
