
performance: upstream and downstream will never run concurrently #1327

Open
howardjohn opened this issue Sep 26, 2024 · 4 comments

@howardjohn
Member

copy_bidirectional uses tokio::join! for the upstream->downstream copy and vice versa. join! runs both futures concurrently on the same task, not in parallel, so it is impossible for the two copies to execute at the same time on different threads.

Intuitively it seems like parallelism should help here. On a simple call-and-response workload it won't, but with a continuous flow of data in both directions it should.

I put together a prototype, however, and do not see any benefits:

HBONE
Master
DEST           CLIENT  QPS   CONS  DUR  PAYLOAD  SUCCESS  THROUGHPUT   P50      P90      P99
fortio-server  fortio  0     1     5    0        96743    19348.29qps  0.048ms  0.066ms  0.113ms
fortio-server  fortio  0     1     5    1024     83755    16750.66qps  0.055ms  0.076ms  0.138ms
fortio-server  fortio  2000  1     5    0        9998     1999.53qps   0.094ms  0.143ms  0.277ms
fortio-server  fortio  2000  1     5    1024     10000    1999.89qps   0.106ms  0.156ms  0.289ms
fortio-server  fortio  0     2     5    0        146377   29274.81qps  0.061ms  0.093ms  0.180ms
fortio-server  fortio  0     2     5    1024     131639   26327.32qps  0.071ms  0.096ms  0.176ms
fortio-server  fortio  2000  2     5    0        10000    1999.50qps   0.118ms  0.175ms  0.331ms
fortio-server  fortio  2000  2     5    1024     10000    1999.59qps   0.134ms  0.198ms  0.358ms
fortio-server  fortio  0     4     5    0        222310   44460.78qps  0.085ms  0.117ms  0.192ms
fortio-server  fortio  0     4     5    1024     179137   35826.45qps  0.105ms  0.148ms  0.264ms
fortio-server  fortio  2000  4     5    0        9996     1998.58qps   0.125ms  0.183ms  0.304ms
fortio-server  fortio  2000  4     5    1024     9998     1998.93qps   0.150ms  0.236ms  0.606ms
fortio-server  fortio  0     64    5    0        407907   81573.11qps  0.744ms  1.430ms  1.965ms
fortio-server  fortio  0     64    5    1024     265418   53074.42qps  1.335ms  1.896ms  2.745ms
fortio-server  fortio  2000  64    5    0        9984     1987.19qps   0.142ms  0.200ms  0.296ms
fortio-server  fortio  2000  64    5    1024     9984     1987.05qps   0.177ms  0.258ms  0.385ms
ID   Interval          Transfer      Bitrate
[  0]   0.00..10.00 sec  10.46 GiB   8.99 Gbits/sec        sender
[  0]   0.00..10.00 sec  10.41 GiB   8.94 Gbits/sec        receiver

Spawning
DEST           CLIENT  QPS   CONS  DUR  PAYLOAD  SUCCESS  THROUGHPUT   P50      P90      P99
fortio-server  fortio  0     1     5    0        95146    19028.90qps  0.048ms  0.067ms  0.112ms
fortio-server  fortio  0     1     5    1024     81596    16318.86qps  0.057ms  0.078ms  0.137ms
fortio-server  fortio  2000  1     5    0        9998     1999.40qps   0.094ms  0.139ms  0.250ms
fortio-server  fortio  2000  1     5    1024     9998     1999.54qps   0.107ms  0.154ms  0.271ms
fortio-server  fortio  0     2     5    0        156544   31306.96qps  0.060ms  0.085ms  0.138ms
fortio-server  fortio  0     2     5    1024     128630   25725.32qps  0.073ms  0.099ms  0.186ms
fortio-server  fortio  2000  2     5    0        9998     1999.36qps   0.121ms  0.179ms  0.316ms
fortio-server  fortio  2000  2     5    1024     10000    1999.42qps   0.135ms  0.200ms  0.371ms
fortio-server  fortio  0     4     5    0        218797   43758.22qps  0.086ms  0.119ms  0.196ms
fortio-server  fortio  0     4     5    1024     182502   36499.22qps  0.102ms  0.145ms  0.268ms
fortio-server  fortio  2000  4     5    0        9998     1998.75qps   0.137ms  0.235ms  0.479ms
fortio-server  fortio  2000  4     5    1024     9998     1998.85qps   0.164ms  0.287ms  0.631ms
fortio-server  fortio  0     64    5    0        400404   80069.12qps  0.755ms  1.469ms  1.973ms
fortio-server  fortio  0     64    5    1024     286835   57358.42qps  1.231ms  1.860ms  2.070ms
fortio-server  fortio  2000  64    5    0        9984     1987.07qps   0.144ms  0.212ms  0.364ms
fortio-server  fortio  2000  64    5    1024     9984     1986.99qps   0.180ms  0.258ms  0.411ms
[  0]   0.00..10.00 sec  10.77 GiB   9.25 Gbits/sec        sender
[  0]   0.00..10.00 sec  10.71 GiB   9.20 Gbits/sec        receiver



TCP
Master
DEST           CLIENT  QPS   CONS  DUR  PAYLOAD  SUCCESS  THROUGHPUT    P50      P90      P99
fortio-server  fortio  0     1     3    0        93706    31234.58qps   0.029ms  0.040ms  0.075ms
fortio-server  fortio  0     1     3    64000    18341    6113.28qps    0.138ms  0.267ms  0.395ms
fortio-server  fortio  2000  1     3    0        5998     1998.92qps    0.062ms  0.103ms  0.245ms
fortio-server  fortio  2000  1     3    64000    5998     1999.05qps    0.181ms  0.364ms  0.595ms
fortio-server  fortio  0     64    3    0        368849   122926.89qps  0.507ms  0.682ms  1.397ms
fortio-server  fortio  0     64    3    64000    43652    14530.98qps   3.361ms  9.953ms  19.560ms
fortio-server  fortio  2000  64    3    0        5952     1978.43qps    0.098ms  0.144ms  0.316ms
fortio-server  fortio  2000  64    3    64000    5952     1978.35qps    0.284ms  0.645ms  1.627ms

[SUM]   0.00..10.00 sec  25.64 GiB   22.02 Gbits/sec        sender
[SUM]   0.00..9.97 sec  27.77 GiB   23.93 Gbits/sec        receiver

Spawning
DEST           CLIENT  QPS   CONS  DUR  PAYLOAD  SUCCESS  THROUGHPUT    P50      P90      P99
fortio-server  fortio  0     1     3    0        86532    28843.36qps   0.031ms  0.044ms  0.094ms
fortio-server  fortio  0     1     3    64000    14891    4963.41qps    0.164ms  0.330ms  0.640ms
fortio-server  fortio  2000  1     3    0        5998     1998.99qps    0.064ms  0.106ms  0.196ms
fortio-server  fortio  2000  1     3    64000    6000     1999.32qps    0.185ms  0.376ms  0.678ms
fortio-server  fortio  0     64    3    0        360397   120090.70qps  0.508ms  0.698ms  1.640ms
fortio-server  fortio  0     64    3    64000    44472    14802.99qps   3.186ms  9.915ms  19.528ms
fortio-server  fortio  2000  64    3    0        5952     1978.83qps    0.093ms  0.130ms  0.192ms
fortio-server  fortio  2000  64    3    64000    5952     1978.39qps    0.273ms  0.568ms  0.919ms

[SUM]   0.00..10.00 sec  25.04 GiB   21.51 Gbits/sec        sender
[SUM]   0.00..9.96 sec  28.12 GiB   24.25 Gbits/sec        receiver

We should investigate this further.

@ilrudie
Contributor

ilrudie commented Sep 27, 2024

Did you push the prototype code?

@bleggett
Contributor

bleggett commented Oct 1, 2024

It shouldn't be terribly hard to use spawn and join on the handles instead, if we want real parallelism.

A little more expensive for trivial cases, but probably worth it for all the others.

@ilrudie
Contributor

ilrudie commented Oct 1, 2024

I think we can collect multiple handles from tokio::spawn and then await them. I presume something like that is what @howardjohn already tried but didn't see any benefit from.

@bleggett
Contributor

bleggett commented Oct 1, 2024

> I think we can collect multiple handles from tokio::spawn and then await them. I presume something like that is what @howardjohn already tried but didn't see any benefit from.

Yeah, I misread. It probably doesn't make much difference because we already spawn the per-workload handler in a thread and are not remotely CPU bound even under load; distributing this specific operation across threads won't help much (and might make it easier for a greedy workload to starve other workloads on the node).

In general, sticking with a one-thread-per-conn-handler-instance model seems best.
