A80: gRPC Metrics for TCP connection #428
Open: nanahpang wants to merge 21 commits into grpc:master from nanahpang:add-proposal
Changes from 12 commits
Commits:
dedfb16 Create A80-grpc-metrics-for-tcp-connection (nanahpang)
ffaeb22 Update A80-grpc-metrics-for-tcp-connection (nanahpang)
d413291 Update A80-grpc-metrics-for-tcp-connection (nanahpang)
5b5ba3f Update A80-grpc-metrics-for-tcp-connection (nanahpang)
583e6b3 Update and rename A80-grpc-metrics-for-tcp-connection to A80-grpc-met… (nanahpang)
8aa21c1 Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
9f8038c Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
59ab138 Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
ce27a69 Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
d239c39 Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
0726f6e Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
3bfe76b Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
2ccf768 Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
83ac908 Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
2a11aea Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
b6dc6d9 Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
0aceebe Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
052d5cf Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
7e5bc86 Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
092fbc1 Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
bd18940 Update A80-grpc-metrics-for-tcp-connection.md (nanahpang)
A80: gRPC Metrics for TCP connection
----
* Author(s): Yash Tibrewal (@yashykt), Nana Pang (@nanahpang), Yousuk Seung (@yousukseung)
* Approver: Craig Tiller (@ctiller), Mark Roth (@markdroth)
* Status: {Draft, In Review, Ready for Implementation, Implemented}
* language: {...}
* Last updated: 2024-04-18
* Discussion at: https://groups.google.com/g/grpc-io/c/AyT0LVgoqFs

## Abstract

This document proposes adding new TCP connection metrics to gRPC for improved network analysis and debugging.

## Background

To improve network debugging capabilities for gRPC users, we propose adding per-connection TCP metrics to gRPC. The metrics will use the metrics framework outlined in [A79].

### Related Proposals:
* [A79]: gRPC Non-Per-Call Metrics Framework

[A79]: A79-non-per-call-metrics-architecture.md

## Proposal

This document proposes changes to the following gRPC components.

### Per-Connection TCP Metrics

We will provide the following metrics:
- `grpc.tcp.min_rtt`
- `grpc.tcp.delivery_rate`
- `grpc.tcp.packets_sent`
- `grpc.tcp.packets_retransmitted`
- `grpc.tcp.packets_spurious_retransmitted`

The metrics will have the following labels:

| Name | Disposition | Description |
| ----------- | ----------- | ----------- |
| grpc.tcp.peer_address | optional | The peer address in URI format, such as `ipv4:1.2.3.4:567`. |
| grpc.tcp.local_address | optional | The local address in URI format, such as `ipv4:1.2.3.4:567`. |

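
The URI-style label value can be illustrated with a small sketch. `format_address_label` is a hypothetical helper, not a gRPC API. The proposal only shows the IPv4 form, so the bracketed `ipv6:[addr]:port` form used here is an assumption based on gRPC's general address naming conventions.

```python
import ipaddress


def format_address_label(host: str, port: int) -> str:
    """Render an IP address and port as a URI-style label value.

    Hypothetical helper: the proposal only gives the IPv4 example
    `ipv4:1.2.3.4:567`; the IPv6 form below is an assumption.
    """
    addr = ipaddress.ip_address(host)  # raises ValueError on invalid input
    if addr.version == 4:
        return f"ipv4:{addr}:{port}"
    # Assumed form: brackets keep the port separate from the IPv6 colons.
    return f"ipv6:[{addr}]:{port}"


print(format_address_label("1.2.3.4", 567))
```

Because both labels are optional, an exporter that does not opt in to them would simply omit these values.
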
The metrics will be exported as:

| Name | Type | Unit | Labels | Description |
| ------------- | ----- | ----- | ------- | ----------- |
| grpc.tcp.min_rtt | Histogram (double) | s | grpc.tcp.peer_address, grpc.tcp.local_address | Records TCP's current estimate of the minimum round trip time (RTT), typically used as an indication of the network health between two endpoints. |
| grpc.tcp.delivery_rate | Histogram (double) | bit/s | grpc.tcp.peer_address, grpc.tcp.local_address | Records the latest measured throughput of the TCP connection. |
| grpc.tcp.packets_sent | Counter (uint64) | {packet} | grpc.tcp.peer_address, grpc.tcp.local_address | Records the total number of packets TCP sent in the calculation period. |
| grpc.tcp.packets_retransmitted | Counter (uint64) | {packet} | grpc.tcp.peer_address, grpc.tcp.local_address | Records the total number of packets retransmitted in the calculation period, including retransmissions of lost packets and spurious retransmissions. |
| grpc.tcp.packets_spurious_retransmitted | Counter (uint64) | {packet} | grpc.tcp.peer_address, grpc.tcp.local_address | Records the total number of spuriously retransmitted packets in the calculation period. These are retransmissions that TCP later discovered to be unnecessary. |

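
As a rough sketch of how raw per-connection values could be turned into the units listed in the table, the example below converts one pair of samples into metric updates. `TcpSnapshot` and `to_metric_updates` are hypothetical names. It assumes the raw minimum RTT arrives in microseconds and the raw delivery rate in bytes per second (as in Linux's `struct tcp_info`), and it reads "in the calculation period" as the difference between two cumulative samples; the proposal does not spell out either point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TcpSnapshot:
    """Cumulative per-connection values at one point in time (hypothetical)."""
    min_rtt_us: int                      # assumed unit: microseconds
    delivery_rate_bytes_per_s: int       # assumed unit: bytes per second
    packets_sent: int                    # cumulative since connect
    packets_retransmitted: int           # cumulative since connect
    packets_spurious_retransmitted: int  # cumulative since connect


def to_metric_updates(prev: TcpSnapshot, cur: TcpSnapshot) -> dict:
    """Convert two snapshots into values in the units from the table."""
    return {
        # Histograms record the latest observation, converted to s and bit/s.
        "grpc.tcp.min_rtt": cur.min_rtt_us / 1_000_000,
        "grpc.tcp.delivery_rate": float(cur.delivery_rate_bytes_per_s * 8),
        # Counters are incremented by the change over the period.
        "grpc.tcp.packets_sent": cur.packets_sent - prev.packets_sent,
        "grpc.tcp.packets_retransmitted":
            cur.packets_retransmitted - prev.packets_retransmitted,
        "grpc.tcp.packets_spurious_retransmitted":
            cur.packets_spurious_retransmitted
            - prev.packets_spurious_retransmitted,
    }


prev = TcpSnapshot(300, 1_000_000, 100, 4, 1)
cur = TcpSnapshot(250, 1_250_000, 160, 7, 2)
print(to_metric_updates(prev, cur))
```

Reporting counter deltas rather than cumulative totals keeps each counter monotonic per connection even when samples are taken at irregular intervals.
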
The metrics are acquired by enabling the `SO_TIMESTAMPING` option in the kernel's TCP stack via the `setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING, &val, sizeof(val))` system call. This configuration allows the kernel to capture packet timestamps during transmission and subsequently provide relevant socket information when `getsockopt(TCP_INFO)` is invoked.

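
The two calls described above can be sketched as follows. This is a Linux-only illustration, not the C-core implementation. The helper names are hypothetical, the numeric constants are copied from the Linux headers because Python's `socket` module may not export them, the choice of timestamping flags is an assumption, and the byte offsets assume the `struct tcp_info` layout of a recent kernel. The proposal does not say which `tcp_info` fields feed each metric, so the fields picked here are also assumptions.

```python
import socket
import struct

# Values from <asm-generic/socket.h> and <linux/net_tstamp.h> (Linux only).
SO_TIMESTAMPING = 37
SOF_TIMESTAMPING_TX_SOFTWARE = 1 << 1
SOF_TIMESTAMPING_SOFTWARE = 1 << 4
SOF_TIMESTAMPING_OPT_ID = 1 << 7
SOF_TIMESTAMPING_TX_ACK = 1 << 9
SOF_TIMESTAMPING_OPT_TSONLY = 1 << 11

# (byte offset, struct format) of selected struct tcp_info fields.
# Assumption: layout of <linux/tcp.h> on a recent kernel.
_FIELDS = {
    "total_retrans": (100, "I"),      # tcpi_total_retrans, packets
    "min_rtt_us": (148, "I"),         # tcpi_min_rtt, microseconds
    "data_segs_out": (156, "I"),      # tcpi_data_segs_out, packets
    "delivery_rate_Bps": (160, "Q"),  # tcpi_delivery_rate, bytes/second
    "dsack_dups": (216, "I"),         # tcpi_dsack_dups, packets
}
TCP_INFO_LEN = 232  # buffer size requested; older kernels return fewer bytes


def enable_tx_timestamping(sock: socket.socket) -> None:
    """Enable SO_TIMESTAMPING on a socket (flag choice is an assumption)."""
    flags = (SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_TX_SOFTWARE
             | SOF_TIMESTAMPING_TX_ACK | SOF_TIMESTAMPING_OPT_ID
             | SOF_TIMESTAMPING_OPT_TSONLY)
    sock.setsockopt(socket.SOL_SOCKET, SO_TIMESTAMPING, flags)


def parse_tcp_info(buf: bytes) -> dict:
    """Decode the selected fields; a field the kernel did not return is None."""
    out = {}
    for name, (offset, fmt) in _FIELDS.items():
        size = struct.calcsize("=" + fmt)
        if offset + size > len(buf):
            out[name] = None
        else:
            out[name] = struct.unpack_from("=" + fmt, buf, offset)[0]
    return out


def read_tcp_info(sock: socket.socket) -> dict:
    """Call getsockopt(TCP_INFO) on a connected TCP socket and decode it."""
    raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_INFO, TCP_INFO_LEN)
    return parse_tcp_info(raw)


# Usage on Linux with a connected TCP socket `sock` (not executed here):
#   enable_tx_timestamping(sock)
#   print(read_tcp_info(sock))
```

Keeping the decoding step separate from the system calls makes it possible to exercise the parser with a synthetic buffer on machines where `TCP_INFO` is unavailable.
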
#### Reference:
* Fathom: https://dl.acm.org/doi/pdf/10.1145/3603269.3604815
* Kernel TCP Timestamping: https://www.kernel.org/doc/Documentation/networking/timestamping.rst

### Metric Stability

All metrics added in this proposal will start as experimental. The long-term goal is to
de-experimentalize them and have them be on by default, but the exact
criteria for that change are TBD.

### Temporary environment variable protection

This proposal does not include any features enabled via external I/O, so
it does not need environment variable protection.

## Implementation

This proposal will be implemented in C-core. There are currently no plans to implement it in other languages.

nit: this can be `###` instead of `####`.
Done, thanks for the suggestion.