
Flow Control #631

Open
bmaurer opened this issue Apr 27, 2024 · 15 comments

Comments

@bmaurer

bmaurer commented Apr 27, 2024

I've noticed that ET struggles with Flow Control. If you run

cat /dev/zero | base64 -w 0

it is not possible to send a ^C to the server. Using SSH directly, it works fine.

@bmaurer
Author

bmaurer commented Apr 27, 2024

$ et --version
et version 6.2.1

@MisterTea
Owner

ET has flow control, but it's based on lines per second (currently set to 1024):

https://github.com/MisterTea/EternalTerminal/blob/master/src/terminal/UserTerminalHandler.cpp#L110

In your example there are no newlines, so the flow control doesn't kick in. Feel free to change the logic to do roughly 80k characters per second; that would fix the example you posted.
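
For illustration, here is a rough sketch of what a byte-based throttle in a select()-based read loop could look like. This is a sketch only, not ET's actual code; masterFd follows the linked snippet, and the 80 kB/s budget is the number suggested above.

// Hypothetical byte-based throttle for the terminal read loop (sketch only).
// Counts bytes instead of newlines and resets the counter once per second;
// while over budget it stops reading, which pushes backpressure onto the
// program writing to the pty.
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

#include <chrono>
#include <cstddef>

constexpr size_t kBytesPerSecond = 80 * 1024;  // ~80k chars/sec as suggested above

void readLoop(int masterFd) {
  size_t bytesThisSecond = 0;
  auto windowStart = std::chrono::steady_clock::now();
  char buf[4096];

  while (true) {
    auto now = std::chrono::steady_clock::now();
    if (now - windowStart >= std::chrono::seconds(1)) {
      windowStart = now;
      bytesThisSecond = 0;  // start a new one-second window
    }

    fd_set rfd;
    FD_ZERO(&rfd);
    bool underBudget = bytesThisSecond < kBytesPerSecond;
    if (underBudget) {
      FD_SET(masterFd, &rfd);  // only watch the pty while under budget
    }
    timeval timeout{0, 100 * 1000};  // 100ms tick so the window can reset
    select(masterFd + 1, &rfd, nullptr, nullptr, &timeout);

    if (underBudget && FD_ISSET(masterFd, &rfd)) {
      ssize_t n = read(masterFd, buf, sizeof(buf));
      if (n <= 0) break;
      bytesThisSecond += static_cast<size_t>(n);
      // ...forward buf[0..n) toward the client connection here...
    }
  }
}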

@bmaurer
Author

bmaurer commented May 4, 2024

@MisterTea This also repros if you just run cat /dev/zero | base64, which has plenty of newlines.

@MisterTea
Owner

Oh, interesting! What happens if you compile with a much lower lines-per-second limit?

@bmaurer
Author

bmaurer commented May 7, 2024

Haven't had a chance to try out an end-to-end compile. This Python script might be helpful, though:

from datetime import datetime
while True:
  print(( datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f') + "\n") * 1000)

When run through ET, you can see that the time that gets printed out starts lagging real time. This suggests that ET is not putting back pressure on the stdout of the underlying process. OTOH with SSH directly, everything is in real time.

Intuitively, if you have a rate limit (ideally this would not be necessary; 1024 lines/second is rather slow if you cat a long file), you also need to put backpressure on the remote end. If you allow the output to buffer up, you are always going to be seeing old output, which means that a control-C can't take effect instantly.

@MisterTea
Owner

Interesting, I'll have to try that later this week. It should be blocking on the read() call, which does put backpressure on the process writing the output, but maybe there is something else buffering in the kernel? I'll explore later this week and report back.

@bmaurer
Author

bmaurer commented Aug 16, 2024

Any findings here?

@bmaurer
Author

bmaurer commented Sep 27, 2024

@MisterTea checking in on this again. We've gotten a few other complaints about it internally at Meta.

@jshort
Collaborator

jshort commented Nov 6, 2024

@MisterTea we instantiate UTH with noratelimit == true, so I don't think the 1024 lines/sec limit even comes into play:

if (FD_ISSET(masterFd, &rfd) && (noratelimit || outputPerSecond < 1024)) {

@MisterTea
Owner

@jshort @bmaurer Yes, the problem is that noratelimit is set to true because I never got around to tuning it. If 1024 lines/sec is good for most people, we can set that to false (or, even better, tie it to a command-line flag that defaults to false) and we get flow control. I verified locally that with that change I can cat /dev/zero | base64 and ctrl-c just fine.
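
If the flag route is taken, a minimal sketch of wiring it up might look like the following, using cxxopts purely as an illustration; the --ratelimit flag name and defaults here are hypothetical, not ET's actual options.

// Illustrative only: tie the rate limiter to a command-line flag.
#include <cxxopts.hpp>

int main(int argc, char** argv) {
  cxxopts::Options options("etterminal", "Eternal Terminal");
  options.add_options()(
      "ratelimit", "Throttle terminal output (enables flow control)",
      cxxopts::value<bool>()->default_value("true"));
  auto result = options.parse(argc, argv);

  // noratelimit is the value UserTerminalHandler already checks; defaulting
  // it to false means the 1024 lines/sec limit applies unless the user opts out.
  bool noratelimit = !result["ratelimit"].as<bool>();
  // ...pass noratelimit through to the UserTerminalHandler constructor...
  (void)noratelimit;
  return 0;
}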

@bmaurer
Author

bmaurer commented Nov 7, 2024

I don't have the full context here on how that rate limit works, but it seems a bit sus to me, since raw SSH doesn't have a rate limit of this form AFAIK. It feels like we're missing something that appropriately applies backpressure, so that we don't have to hard-code a magic constant like 1024 lines per second. That said, the current rate limit might be better than nothing.

@MisterTea
Owner

If ssh has something better, please copy that logic to et and bring it over!

@bmaurer
Author

bmaurer commented Nov 7, 2024

I think this is a question of building in flow control and back pressure. Consider the following two programs:

bmaurer@bmaurer-mbp ~ % cat print.py
from datetime import datetime
import time
while True:
  print(( datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f') + "\n") * 1000)
  time.sleep(0.01)
bmaurer@bmaurer-mbp ~ % cat read.py
import sys
import time

for line in sys.stdin:
  print(line.strip())
  time.sleep(0.02)

Here's what happens when you run `(python3 print.py | python3 read.py) | uniq -c`:
1000 2024-11-07 13:58:56.902309
   1
1000 2024-11-07 13:58:56.914864
   1
1000 2024-11-07 13:58:56.927435
   1
1000 2024-11-07 13:59:04.074071
   1
1000 2024-11-07 13:59:32.693990
   1
1000 2024-11-07 13:59:54.138414
   1
1000 2024-11-07 14:00:15.592981
   1
1000 2024-11-07 14:00:44.261028
   1
1000 2024-11-07 14:01:05.678402
   1
1000 2024-11-07 14:01:27.292877
   1
1000 2024-11-07 14:01:49.064790
   1

How does print.py know to pause for so long between the blocks of output so that read.py can keep up? read.py only reads a limited amount of data at a time from stdin. When print.py tries to write() to the file descriptor for stdout, there's only a limited pipe buffer size. This means that those write() calls stall, causing the gaps between the timestamps. Note that the first three blocks complete quite quickly (they're all within the second 13:58:56), but after that there is spacing between them.

There's no "magic" rate limit here. Just a fixed buffer size.

I think ET needs to do something similar. If the client is not reading from the socket at the rate the server is producing output, we must eventually apply backpressure on the server.
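
As an illustration of that idea, here is a minimal sketch (hypothetical, not ET's current design) of window-based backpressure: the server tracks how many bytes are in flight to the client and stops draining the pty once a cap is exceeded, so the child process eventually blocks on write() to a full pty buffer, the same mechanism that slows print.py in the pipe example above.

// Hypothetical window-based flow control state (a sketch, not ET's code).
#include <cstddef>
#include <cstdint>

constexpr size_t kMaxBytesInFlight = 256 * 1024;  // assumed window size

struct FlowControlState {
  uint64_t bytesSent = 0;   // bytes written toward the client
  uint64_t bytesAcked = 0;  // bytes the client has acknowledged consuming

  size_t inFlight() const { return static_cast<size_t>(bytesSent - bytesAcked); }

  // Server side: only read from the pty while the window has room.
  // Not reading is what makes the child process's write() eventually block.
  bool canReadFromTerminal() const { return inFlight() < kMaxBytesInFlight; }

  void onSendToClient(size_t n) { bytesSent += n; }

  // The client would periodically report how much it has consumed,
  // e.g. after writing to the local tty.
  void onAckFromClient(uint64_t totalConsumed) { bytesAcked = totalConsumed; }
};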

@MisterTea
Owner

That makes sense. The challenge is that there are many steps where we could be missing backpressure. The flow is something like this:

terminal output -> etterminal -> etserver -> et -> console

We do have blocking operations all over the place that will stall, causing backpressure. It's not clear where we have a buffer that is filling up.

@bmaurer
Author

bmaurer commented Nov 8, 2024

One approach to debugging this could be running et -c "python print.py" | sleep 100000. The goal of this setup is that print.py should experience backpressure, since nothing downstream consumes the output. A debugger or printfs should be able to reveal where in the chain above there's a buffer being filled.
