Access to Raw Serialized Data (Raw Data) in Latest gRPC Version Without Marshaling - Previous Workaround No Longer Available #7794
Apologies for the breakage! This field needed to be removed for performance reasons as we no longer create a full contiguous copy of the serialized data (except within the default proto codec, unfortunately, but if they ever improve the proto library to take advantage of split input buffers, we would start using it ASAP). It's possible we could add the data back as a Let me talk with the other gRPC leads (for Java/C++) and see what kind of options they might offer for something like this. It could be the case that it's just not possible without doing things more manually -- i.e. not using the generated code to register your method handlers, and have a custom codec that just can pass the bytes through instead. |
Thank you, @dfawley, for the insights and explanation regarding the changes. The removal of payload.Data does impact applications where digital signature verification is essential. In our scenario, we sign the serialized raw data, and upon receiving it, we verify the signature against the exact bytes to ensure the data’s authenticity and integrity before processing it. With the recent changes, we no longer have a straightforward way to access this raw data through *stats.InPayload, which makes it difficult to perform our verification step. |
I spoke with the leads from Java & C++. They don't have any easy way to do exactly what you want, either. I'd probably recommend using the generic interface, which is what you'd probably want to do in the other languages. Or if you want something that takes a bit of setting up and a lot of magical looking stuff, keep reading... Before that, can you explain your use case a little more? What are the reasons you are signing data and checking signatures at the application level, instead of using something like mTLS so that both sides know they are talking to a trusted party? It seems like that's really what you want to be using instead, or else you could validate specific fields within the payloads instead. Regardless, the stats handler feels like the wrong place to do what you're doing. It was never intended for this use case at all. And the gRPC-Go interceptors have a very strange design that unfortunately doesn't lend itself to this kind of use case. The other potential approach: registering the service by taking the generated E.g. using our route guide example:
Step 1: define a custom type to hold the message and the bytes:
type specialProto struct {
	msg      proto.Message
	rawBytes []byte
}
Step 2: implement a custom version of the "proto" codec to handle this type:
func init() {
	// Register a pointer: the Unmarshal override below is declared on *myProtoCodec.
	encoding.RegisterCodecV2(&myProtoCodec{CodecV2: encoding.GetCodecV2("proto")})
}
type myProtoCodec struct {
	encoding.CodecV2 // embed the real proto codec for simplicity
}
// Override Unmarshal to handle specialProto.
// Note: this could be optimized, but gets more complicated.
func (m *myProtoCodec) Unmarshal(data mem.BufferSlice, v any) error {
	sp, ok := v.(*specialProto)
	if !ok {
		// Not a special proto; fall back.
		return m.CodecV2.Unmarshal(data, v)
	}
	// Special proto: call the original codec on the message field, then set rawBytes.
	if err := m.CodecV2.Unmarshal(data, sp.msg); err != nil {
		return err
	}
	sp.rawBytes = data.Materialize()
	return nil
}
Step 3: create something that can alter the way RPCs are handled to pass this
// interceptingHandler creates a new method handler that intercepts the codec to provide a different type to it.
func interceptingHandler(origMethodHandler grpcMethodHandler) grpcMethodHandler {
	return func(srv any, ctx context.Context, dec func(any) error, interceptor grpc.UnaryServerInterceptor) (any, error) {
		// Substitute the decoding function. When the real handler calls it, it will pass the proper proto
		// message type in "in". We wrap that and pass it to the real decode function (the codec).
		wrappedDec := func(in any) error {
			msg, ok := in.(proto.Message)
			if !ok {
				return dec(in) // not a proto message; decode as usual.
			}
			spProto := &specialProto{msg: msg}
			return dec(spProto) // call the real codec on the special type.
		}
		// Now call the original handler -- in our example that's "routeguide._RouteGuide_GetFeature_Handler"
		// https://github.com/grpc/grpc-go/blob/d66fc3a1efa1dfb33dfedf9760528f1ac2b923b6/examples/route_guide/routeguide/route_guide_grpc.pb.go#L210C6-L210C36
		return origMethodHandler(srv, ctx, wrappedDec, interceptor)
	}
}
// grpc should likely be exporting this; please file a bug if you'd like.
type grpcMethodHandler func(srv any, ctx context.Context, dec func(any) error, interceptor grpc.UnaryServerInterceptor) (any, error)
Step 4: modify the service descriptor and register and implement the service:
func main() {
	// Modify ServiceDesc to intercept whatever methods you want (here we do all of them)
	sd := routeguide.RouteGuide_ServiceDesc
	sd.HandlerType = (*any)(nil) // unfortunate, but necessary to appease some internal checks we have.
	for i := range sd.Methods {
		sd.Methods[i].Handler = interceptingHandler(sd.Methods[i].Handler) // could be limited to only specific methods as needed.
	}
	// Intercepting streams is similar, but different.
	s := grpc.NewServer(...)
	s.RegisterService(&sd, myServiceImpl{}) // RegisterService takes a *grpc.ServiceDesc
}
type myServiceImpl struct{}
func (myServiceImpl) GetFeature(ctx context.Context, in *specialProto) (*routeguide.Feature, error) {
	req, ok := in.msg.(*routeguide.Point)
	if !ok {
		return nil, status.Errorf(codes.Internal, "error decoding request message") // should never happen, but probably better than a panic
	}
	// in.rawBytes contains the raw bytes
	_ = req // handle the request as usual from here
	return nil, status.Errorf(codes.Unimplemented, "not implemented")
}
That's a lot of hoops to jump through, though, so I'd just recommend directly creating your own ServiceDesc and implementing your own handler instead of making something that intercepts other handlers. |
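The decode-wrapping idea in the steps above can be modeled without gRPC. The sketch below is a standard-library stand-in, not the grpc-go API. Assumptions: `methodHandler`, `envelope`, `rawHolder`, `interceptRaw`, `rawFromContext`, and `serve` are hypothetical names, JSON stands in for the proto codec, and handing the captured bytes to the service method through a holder stored in the context is a design choice of this sketch rather than something proposed in the thread.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
)

// methodHandler mirrors the shape of a unary handler: it is given a decode
// function and calls it with the message it wants filled in.
type methodHandler func(ctx context.Context, dec func(any) error) (any, error)

// envelope plays the role of specialProto: a message plus its wire bytes.
type envelope struct {
	msg any
	raw []byte
}

// codecUnmarshal stands in for the custom codec (JSON instead of proto).
func codecUnmarshal(data []byte, v any) error {
	env, ok := v.(*envelope)
	if !ok {
		return json.Unmarshal(data, v) // ordinary message; decode as usual
	}
	if err := json.Unmarshal(data, env.msg); err != nil {
		return err
	}
	env.raw = append([]byte(nil), data...) // keep a private copy of the bytes
	return nil
}

type rawKey struct{}

// rawHolder is placed in the context and filled in once decoding has run.
type rawHolder struct{ bytes []byte }

// interceptRaw wraps a handler so the decoded call also exposes raw bytes.
func interceptRaw(orig methodHandler) methodHandler {
	return func(ctx context.Context, dec func(any) error) (any, error) {
		holder := &rawHolder{}
		ctx = context.WithValue(ctx, rawKey{}, holder)
		wrapped := func(in any) error {
			env := &envelope{msg: in}
			if err := dec(env); err != nil {
				return err
			}
			holder.bytes = env.raw
			return nil
		}
		return orig(ctx, wrapped)
	}
}

// rawFromContext returns the captured wire bytes, if the call was intercepted.
func rawFromContext(ctx context.Context) ([]byte, bool) {
	h, ok := ctx.Value(rawKey{}).(*rawHolder)
	if !ok {
		return nil, false
	}
	return h.bytes, true
}

type point struct{ Lat, Lng int }

// getFeature stands in for a generated handler plus the service method.
func getFeature(ctx context.Context, dec func(any) error) (any, error) {
	in := new(point)
	if err := dec(in); err != nil {
		return nil, err
	}
	raw, _ := rawFromContext(ctx)
	return fmt.Sprintf("lat=%d raw=%s", in.Lat, raw), nil
}

// serve simulates the transport: it binds the wire bytes into dec.
func serve(h methodHandler, wire []byte) (any, error) {
	return h(context.Background(), func(v any) error { return codecUnmarshal(wire, v) })
}

func main() {
	out, err := serve(interceptRaw(getFeature), []byte(`{"Lat":1,"Lng":2}`))
	fmt.Println(out, err)
}
```

In this model the service code keeps its normal typed signature and reads the bytes with `rawFromContext`, so signature verification can run before the request is processed.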
Thank you for the detailed guidance and potential solutions. Given the current limitations with the interceptor design, would it be possible to provide access to mem.BufferSlice directly within interceptors instead of proto.Message? This access would be invaluable for workflows that require raw serialized data, particularly in complex designs where exact byte-level accuracy is critical and additional marshaling may not be feasible. @dfawley |
@mohdjishin do you have a proposal for how we could change things to do that? |
@mohdjishin Are you still actively looking into this? |
This issue is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed. |
@mohdjishin are you still working on it? |
I'm not exactly sure how we can achieve this. The previous approach was working perfectly. Alternatively, we could consider using some grpc.DialOption function to register a custom type. Based on this, we could internally map the raw data in the context. |
In prior gRPC versions, the stats handler provided direct access to raw serialized data through payload.Data in *stats.InPayload. This functionality enabled certain applications to retrieve unmodified byte data directly. A workaround was previously available (leveraging TagRPC to add values to the context and HandleRPC to access raw bytes from *stats.InPayload), but this approach is no longer possible in the latest version.
Issue: With this removal, it is challenging to obtain raw request bytes directly without additional marshaling. The previous workaround, as discussed in @dfawley's comment on May 17, 2023 (Workaround), no longer works in the latest gRPC version.
The removal of payload.Data occurred in this commit