Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added keepalive timout so that connections are not dropped. #2763

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

christos-h
Copy link
Contributor

Motivation

Connections with validators are dropped after 1 minute due to connections being inactive.

Proposal

Use gRPC keep-alive to send ping frames every 40 seconds to keep the connections alive.

This is hard-coded for now. Future improvement involves client <> server negotation of the keepalive timeout value per-validator.

Test Plan

Manually tested. CI will catch regressions.

Release Plan

  • These changes should be backported to the latest devnet branch, then
    • be released in a new SDK,
    • be released in a client hotfix.
  • These changes should be backported to the latest testnet branch, then
    • be released in a new SDK,
    • be released in a client hotfix.

Links

https://grpc.io/docs/guides/keepalive/#keepalive-configuration-specification

@christos-h christos-h requested a review from Twey October 31, 2024 13:53
@christos-h christos-h force-pushed the grpc-client-keep-alive branch from 52bead2 to b298a84 Compare October 31, 2024 14:14
@@ -42,6 +43,7 @@ cfg_if::cfg_if! {
if let Some(timeout) = options.timeout {
endpoint = endpoint.timeout(timeout);
}
endpoint = endpoint.keep_alive_timeout(Duration::from_millis(KEEPALIVE_TIME_MS));
Copy link
Contributor

@ma2bd ma2bd Oct 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it another option?

Copy link
Contributor

@ma2bd ma2bd Oct 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we could copy paste the code for options.timeout

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is intentionally not an option. The reason is that this should be negotiated on a per-validator basis. The reason is that different ingresses have different timeouts.

However after further discussion with @Twey , we think a better implementation may be to have the keepalive originate from the server - hence why this is still a draft.

The server-side option will be configurable.

@christos-h christos-h force-pushed the grpc-client-keep-alive branch 2 times, most recently from c91b697 to ff122c8 Compare December 16, 2024 21:41
@christos-h christos-h force-pushed the grpc-client-keep-alive branch from ff122c8 to c62b1f4 Compare January 6, 2025 15:46
@christos-h christos-h force-pushed the grpc-client-keep-alive branch from c62b1f4 to 62a297a Compare January 6, 2025 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants