Unix sockets support #1436

Merged
merged 11 commits into from
Dec 6, 2024

Conversation

Contributor

@grcevski grcevski commented Dec 5, 2024

This PR adds support for TCP and HTTP connections that run over unix sockets. These differ from regular TCP sockets because they run streams across mapped files on disk. Both ends are always on the same host, so context propagation can be done with our black-box context propagation.

A couple of things related to this addition:

  1. I managed to find Linux kernel APIs that work specifically with unix sockets, rather than using probes on read/write/readv/writev/recvfrom. This keeps the implementation performant.
  2. I had to come up with a way to correlate the sending and receiving sides. After a bunch of research I realized that we can use the inode numbers of these sockets to correlate them. When these unix_socket APIs are used, we can cast the struct sock pointers to struct unix_sock, which allows us to see the peer socket.
  3. I overloaded the meaning of the connection information we use everywhere for keys: for unix sockets it holds the inode number and the peer inode number as a pair.
  4. There is a slight complication with the initial send request that the client makes over the unix socket. Until the first send completes, the peer inode number appears to be 0, so we don't have it. I had to add code to fix it up after the receiving side replies and the two sides start talking.
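
The correlation scheme in points 2 to 4 can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the code from this PR: `unix_conn_key`, `make_key`, `key_is_partial`, `fixup_peer`, and `key_equal` are hypothetical names, and ordering the pair by value is just one assumed way to make both ends arrive at the same key. It also assumes real socket inode numbers are never 0, so 0 can mean "peer not known yet".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical connection key: the two socket inode numbers stand in
 * for the address/port pairs used to key regular TCP connections. */
typedef struct {
    uint64_t lo_ino; /* smaller inode number of the pair */
    uint64_t hi_ino; /* larger inode number of the pair */
} unix_conn_key;

/* Build a key that comes out the same no matter which end builds it
 * (assumed normalization: order the pair by value). */
unix_conn_key make_key(uint64_t ino, uint64_t peer_ino) {
    unix_conn_key k;
    if (ino <= peer_ino) {
        k.lo_ino = ino;
        k.hi_ino = peer_ino;
    } else {
        k.lo_ino = peer_ino;
        k.hi_ino = ino;
    }
    return k;
}

/* A key is incomplete while the peer inode is still unknown (0). */
bool key_is_partial(const unix_conn_key *k) {
    assert(k != NULL);
    return k->lo_ino == 0;
}

/* Once the peer inode is known, rebuild the key from the inode we
 * already had (stored in hi_ino while the other slot was 0). */
unix_conn_key fixup_peer(unix_conn_key partial, uint64_t peer_ino) {
    assert(key_is_partial(&partial));
    return make_key(partial.hi_ino, peer_ino);
}

bool key_equal(unix_conn_key a, unix_conn_key b) {
    return a.lo_ino == b.lo_ino && a.hi_ino == b.hi_ino;
}
```

In this sketch, a client that sends before the peer is visible would build `make_key(100, 0)`; after the fix-up with peer inode 200, that key equals the one the server builds with `make_key(200, 100)`.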

TODO:

  • Integration tests

codecov bot commented Dec 5, 2024

Codecov Report

Attention: Patch coverage is 83.33333% with 2 lines in your changes missing coverage. Please review.

Project coverage is 81.01%. Comparing base (8f545d9) to head (f50b92b).
Report is 6 commits behind head on main.

Files with missing lines | Patch % | Lines
pkg/internal/ebpf/common/tcp_detect_transform.go | 0.00% | 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1436      +/-   ##
==========================================
+ Coverage   80.88%   81.01%   +0.13%     
==========================================
  Files         149      149              
  Lines       15117    15130      +13     
==========================================
+ Hits        12227    12258      +31     
+ Misses       2290     2278      -12     
+ Partials      600      594       -6     
Flag | Coverage Δ
integration-test | 59.77% <83.33%> (+0.57%) ⬆️
k8s-integration-test | 59.42% <83.33%> (+0.06%) ⬆️
oats-test | 33.63% <83.33%> (+0.05%) ⬆️
unittests | 51.66% <0.00%> (-0.10%) ⬇️

Flags with carried forward coverage won't be shown.

Contributor

@rafaelroquetto rafaelroquetto left a comment

Amazing! Good stuff! LGTM 🥇

Comment on lines 17 to 25
struct sock *sk;
BPF_CORE_READ_INTO(&sk, sock, sk);

if (!sk) {
    return 0;
}

struct unix_sock *usock;
BPF_CORE_READ_INTO(&usock, sock, sk);
Contributor

non-rhetorical question: why is the double read required? Would something like the following not work?

const struct unix_sock *usock = BPF_CORE_READ(sock, sk);

?

Contributor Author

Ah, thanks for catching it; it's a leftover from some earlier changes.
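
For readers following the thread, here is a userspace sketch of why a single read of the `sk` pointer is enough. It relies on the kernel's `struct unix_sock` embedding `struct sock` as its first member, so the same pointer value can be viewed as either type. The `mock_*` types and functions below are hypothetical stand-ins rather than kernel or Beyla definitions, and the `ino` field is a simplification of the real `sk_socket -> file -> f_inode -> i_ino` chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sock; ino simplifies the real
 * sk_socket -> file -> f_inode -> i_ino chain. */
struct mock_sock {
    unsigned long ino;
};

/* Hypothetical stand-in for struct unix_sock. The embedded sock must
 * be the first member for the cast below to be valid. */
struct mock_unix_sock {
    struct mock_sock sk;
    struct mock_sock *peer;
};

/* Hypothetical stand-in for struct socket. */
struct mock_socket {
    struct mock_sock *sk;
};

/* Read sock->sk once and reinterpret it, instead of reading it twice.
 * Only valid when sk really belongs to a unix socket. */
struct mock_unix_sock *mock_unix_sock_from_socket(const struct mock_socket *sock) {
    assert(sock != NULL);
    struct mock_sock *sk = sock->sk; /* the single read */
    if (!sk) {
        return NULL;
    }
    return (struct mock_unix_sock *)sk;
}

/* Peer inode number, or 0 when the peer is not known yet. */
unsigned long mock_peer_ino(const struct mock_unix_sock *usock) {
    if (!usock || !usock->peer) {
        return 0;
    }
    return usock->peer->ino;
}
```

The same pointer serves both the NULL check and the `unix_sock` view, which is what the suggested one-line `BPF_CORE_READ` form expresses.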


u8 *buf = iovec_memory();
if (buf) {
    copied_len = read_iovec_ctx(iov_ctx, buf, copied_len);
Contributor

nit: I wouldn't reuse the variable here; I found it a bit confusing. For instance, it raises the question: after a successful call to read_iovec_ctx, should the new copied_len match the old one? If not, why are they expected to differ?

Contributor Author

I can split it into another variable, but iovecs are read in a loop with a limited number of iterations, so we may not read the full buffer.
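
A userspace sketch of the behaviour described in this reply, using the POSIX `struct iovec`: the copy loop has a fixed iteration cap, so the returned length can be smaller than the requested one. `copy_iovec_bounded` and `MAX_IOVEC_ITERS` are hypothetical names for illustration, and the cap of 4 is an arbitrary assumption; this is not the `read_iovec_ctx` implementation from the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical cap on loop iterations, standing in for the bounded
 * loops that eBPF programs have to use. */
#define MAX_IOVEC_ITERS 4

/* Copy up to `want` bytes from the iovec segments into buf, visiting
 * at most MAX_IOVEC_ITERS segments. Returns the bytes actually
 * copied, which may be fewer than `want`. */
size_t copy_iovec_bounded(const struct iovec *iov, size_t iovcnt,
                          unsigned char *buf, size_t want) {
    assert(iov != NULL && buf != NULL);
    size_t copied = 0;
    for (size_t i = 0; i < iovcnt && i < MAX_IOVEC_ITERS && copied < want; i++) {
        size_t n = iov[i].iov_len;
        if (n > want - copied) {
            n = want - copied; /* never write past the requested length */
        }
        memcpy(buf + copied, iov[i].iov_base, n);
        copied += n;
    }
    return copied;
}
```

Keeping the result in a separate variable at the call site, for example `size_t read_len = copy_iovec_bounded(iov, iovcnt, buf, requested_len);`, makes it visible that the two lengths are allowed to differ.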

unsigned long peer_inode_number = 0;
BPF_CORE_READ_INTO(&inode_number, sock, sk, sk_socket, file, f_inode, i_ino);

struct unix_sock *usock = unix_sock_from_socket(sock);
Contributor

in this case, is this equivalent to:

struct unix_sock *usock = unix_sock_from_sk(sk);

?
If so, then that will save us a few extra reads

Contributor Author

Yup, it's left over from another code iteration I guess; we don't need to re-read it.

@grcevski grcevski merged commit 21f9154 into grafana:main Dec 6, 2024
15 checks passed
@grcevski grcevski deleted the unix_sock branch December 6, 2024 18:51