Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ext_proc] let ext_proc filter follow the grpc spec, wait for trailers before continue the filter chain upon last main stream end_stream=true event. #37088

Open
stevenzzzz opened this issue Nov 10, 2024 · 6 comments
Assignees
Labels
area/ext_proc bug no stalebot Disables stalebot from closing an issue

Comments

@stevenzzzz
Copy link
Contributor

stevenzzzz commented Nov 10, 2024

Title: ext_proc side grpc stream should follow grpc spec to wait for trailers to end a session.

Description:

In Grpc, the trailers carries the grpc-status header terminates a rpc stream. see https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md#responses for the spec and https://carlmastrangelo.com/blog/why-does-grpc-insist-on-trailers for why.

Our current grpc flow in ext_proc filter ignores the trailers by closeStream and resetStream all together, the later resetStream call would signal the remote server a CANCEL, while clean the sidestream and ignore any possible trailers that might have been sent by ext_proc server.

Instead, for the last de/encode{Headers,data} call, ext_proc filter should fire a half-close (closeStream), and wait until the trailers come back, then call continueProcessing().

In PR #37083 we moved one step towards this goal: split half-close from reset(), but we would like to move the half-close earlier to the filter callbacks, before onDestroy gets called.

Noteworthy that there are special case in GRPC: Headers only response carries grpc-status in response headers, in which case we should proactively close the side stream as well.

@stevenzzzz stevenzzzz added bug triage Issue requires triage labels Nov 10, 2024
@yanavlasov yanavlasov added area/ext_proc and removed triage Issue requires triage labels Nov 11, 2024
Copy link

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Dec 13, 2024
@stevenzzzz
Copy link
Contributor Author

hey @tyxia/ @yanavlasov could you put a
"no stalebot" mark on this issue?

@tyxia tyxia added no stalebot Disables stalebot from closing an issue and removed stale stalebot believes this issue/PR has not been touched recently labels Dec 13, 2024
@stevenzzzz stevenzzzz changed the title [ext_proc] let ext_proc filter follows the grpc spec, wait for trailers before continue the filter chain upon last main stream end_stream=true event. [ext_proc] let ext_proc filter follow the grpc spec, wait for trailers before continue the filter chain upon last main stream end_stream=true event. Dec 30, 2024
@yanjunxiang-google
Copy link
Contributor

Hi, @stevenzzzz ,
The below comments is problematic to me:

Instead, for the last de/encode{Headers,data} call, ext_proc filter should fire a half-close (closeStream), and wait until the trailers come back, then call continueProcessing().

If a HTTP request does not have trailers, i.e, if it only contains body, then after ext_proc filter receives the last chunk body response from the ext_proc server (end_of_stream = true in this case), it will terminate the external processing and call continueProcessing(). That's what's the ext_proc filter is designed today. I am a little bit confusing why ext_proc filter need to wait for a trailers coming back.

@yanjunxiang-google
Copy link
Contributor

yanjunxiang-google commented Jan 3, 2025

BTW, the gRPC channel between Envoy and the ext_proc server provides the transportation for the actual HTTP request/response in the main stream. The ext_proc filter state machine ties to the content of the mainstream HTTP request/response. It does not depend on the gRPC channel status, i.e, whether the gRPC channel had sent the trailers or not, for its state transitions.

@stevenzzzz
Copy link
Contributor Author

Thanks Yanjun.

If a HTTP request does not have trailers, i.e, if it only contains body, then after ext_proc filter receives the last chunk body response from the ext_proc server (end_of_stream = true in this case), it will terminate the external processing and call continueProcessing(). That's what's the ext_proc filter is designed today. I am a little bit confusing why ext_proc filter need to wait for a trailers coming back.

Noteworthy the trailers mentioned is not for the "upstream reqest", but the "side-stream" grpc request's trailers.
a well behaved Grpc server impl should send trailers to end the request from the Server side. OTOH, for streaming GRPC, it's client's duty to tell GRPC Server that there is no more data by sending a half-close (empty frame with end_stream=true).

@stevenzzzz
Copy link
Contributor Author

stevenzzzz commented Jan 3, 2025

BTW, the gRPC channel between Envoy and the ext_proc server provides the transportation for the actual HTTP request/response in the main stream. The ext_proc filter state machine ties to the content of the mainstream HTTP request/response. It does not depend on the gRPC channel status, i.e, whether the gRPC channel had sent the trailers or not, for its state transitions.

I am not sure I agree.
I would expect the grpc-client in ext_proc filter to follow the grpc-spec and be a "good client" as well.
as a matter of fact, Envoy code is ready for grpc comms.

See Router::Filter::onUpstreamTrailers for example: https://github.com/envoyproxy/envoy/blob/main/source/common/router/router.cc#L1743-L1759
Envoy only inc the success/error counter on trailers frame, if it's a grpc request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ext_proc bug no stalebot Disables stalebot from closing an issue
Projects
None yet
Development

No branches or pull requests

4 participants