-
Notifications
You must be signed in to change notification settings - Fork 74
Added the ability for plugins to receive the request headers and modify them #760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: shmuelk The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Welcome @shmuelk! |
Hi @shmuelk. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
✅ Deploy Preview for gateway-api-inference-extension ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
/ok-to-test |
@@ -70,7 +70,9 @@ func (s *StreamingServer) HandleRequestBody( | |||
Model: model, | |||
ResolvedTargetModel: modelName, | |||
Critical: modelObj.Spec.Criticality != nil && *modelObj.Spec.Criticality == v1alpha2.Critical, | |||
Headers: reqCtx.RequestHeaders, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lines 73 and 75 are duplicates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, fixed
pkg/epp/scheduling/types/types.go
Outdated
@@ -32,6 +32,7 @@ type LLMRequest struct { | |||
// Target models is a map of target model name to weight. | |||
TargetModels map[string]int | |||
Prompt string | |||
Headers map[string]string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment with description like the other fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I also added one for MutatedHeaders in the SchedulingContext
/assign |
@@ -32,6 +32,9 @@ type LLMRequest struct { | |||
// Target models is a map of target model name to weight. | |||
TargetModels map[string]int | |||
Prompt string | |||
// Headers during request processing contains all of the request headers. | |||
// During response processing it contains all of the response headers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove spaces
@@ -377,6 +380,15 @@ func (s *StreamingServer) populateRequestHeaderResponse(reqCtx *RequestContext, | |||
}, | |||
}) | |||
} | |||
// Add headers added by filters/scorers | |||
for key, value := range mutatedHeaders { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The headers is a list. If the plugins updated the value of an existing header key, should we update the header here instead of just appending?
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shmuelk there have been some changes in main that affect this PR.
in main - the headers are now passed through the reqCtx
arg to HandleRequestBody
using the Request
struct.
this is a step towards the direction you made in this PR.
please rebase to grap latest changes.
for _, header := range req.RequestHeaders.Headers.Headers { | ||
reqCtx.RequestHeaders[header.Key] = header.Value | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this part was already merged into main
@@ -86,6 +86,8 @@ type RequestContext struct { | |||
RequestState StreamRequestState | |||
modelServerStreaming bool | |||
|
|||
RequestHeaders map[string]string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is also available already in main but in different field - Request
, which contains body and headers.
@@ -117,7 +119,8 @@ func (s *StreamingServer) Process(srv extProcPb.ExternalProcessor_ProcessServer) | |||
// Create request context to share states during life time of an HTTP request. | |||
// See https://github.com/envoyproxy/envoy/issues/17540. | |||
reqCtx := &RequestContext{ | |||
RequestState: RequestReceived, | |||
RequestState: RequestReceived, | |||
RequestHeaders: make(map[string]string), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
already in main in Request
field
This PR adds support for the PreSchedule, Filter, Scorer, Picker, and PostSchedule plugins to receive all of the headers sent by the user in their request. This enables the writing of plugins that make use of additional headers.
In addition, this PR adds support for the above plugins to add and/or modify the headers in the user's request. This is needed in certain solutions of Disaggregated Prefill/Decode.
The unit tests added are very simplistic. A better unit test will be added later at the StreamingServer level