Skip to content

Commit

Permalink
Fix unfinished future in Python Thrift client
Browse files Browse the repository at this point in the history
Summary:
noideadog2

My team has a Python library that uses Thrift to talk to a WWW service that my team also maintains. This Python library is used in Dataswarm to process lots of data.

The problem that we are seeing is that, ocasionally, a Thrift deserialization error happens. When this error happens, what we also observe is that our entire Dataswarm pipeline freezes. It is as if something within the Thrift client becomes frozen and it ends up blocking our entire Dataswarm pipeline.

What do we see in our logs? What we see in our logs is that, ocasionally, these Dataswarm pipelines log the following error:

```
Traceback (most recent call last):
File "thrift.python/serializer.pyx", line 59, in thrift.python.serializer.deserialize
File "thrift.python/serializer.pyx", line 55, in thrift.python.serializer.deserialize_with_length
thrift.python.exceptions.Error: Encountered invalid field/element type (166) during skipping
Exception ignored in: 'thrift.python.client.async_client._async_client_send_request_callback'
```

After we see this message, the Dataswarm pipeline freezes, and we can see that our Python code does not move forward. So, it seems that something is frozen, somewhere. Given the above error message, we suspect it's somewhere in the Thrift infra.

After looking at that error message what I did is decided to look at the `_async_client_send_request_callback` function that the stack trace is mentioning. That function is here https://fburl.com/code/nwufgygd

According to that stack trace, what seems to be happening is that this call to deserialize here https://fburl.com/code/zajanbby is throwing an exception, and this exception is escaping the `_async_client_send_request_callback` call.

Now, this is ** JUST A GUESS **, but I'm wondering: could it be that, because the exception is escaping the `_async_client_send_request_callback` call, we're just leaving the pyfuture unfinalized, and thus everything depending on that future will freeze? This would explain what we're seeing on our end (that is, our Python code in Dataswarm not moving forward).

This diff is an attempt at fixing that.

Reviewed By: Filip-F

Differential Revision: D68503986

fbshipit-source-id: 7e8c2c5f3931fb0008b49ac2c99e6e2ab2a54d12
  • Loading branch information
ricardojuanpalmaduran authored and facebook-github-bot committed Jan 22, 2025
1 parent cc8868a commit cd975cf
Showing 1 changed file with 14 additions and 7 deletions.
21 changes: 14 additions & 7 deletions third-party/thrift/src/thrift/lib/python/client/async_client.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ from libcpp.utility cimport move as cmove
from thrift.python.client.omni_client cimport cOmniClientResponseWithHeaders, RpcKind, cOmniInteractionClient, createOmniInteractionClient, cData, FunctionQualifier, InteractionMethodPosition
from thrift.python.client.request_channel cimport RequestChannel
from thrift.python.exceptions cimport create_py_exception
from thrift.python.exceptions import ApplicationError, ApplicationErrorType
from thrift.python.exceptions import ApplicationError, ApplicationErrorType, ProtocolError, ProtocolErrorType
from thrift.python.serializer import serialize_iobuf, deserialize
from thrift.python.mutable_serializer import (
serialize_iobuf as serialize_iobuf_mutable,
Expand Down Expand Up @@ -255,12 +255,19 @@ cdef void _async_client_send_request_callback(
stream_cls,
protocol,
)
py_resp = (
deserialize(response_cls, response_iobuf, protocol=protocol)
if not is_mutable_types
else deserialize_mutable(response_cls, response_iobuf, protocol=protocol)
)
pyfuture.set_result(py_resp if py_stream is None else (py_resp, py_stream))
try:
py_resp = (
deserialize(response_cls, response_iobuf, protocol=protocol)
if not is_mutable_types
else deserialize_mutable(response_cls, response_iobuf, protocol=protocol)
)

pyfuture.set_result(py_resp if py_stream is None else (py_resp, py_stream))
except Exception as e:
pyfuture.set_exception(ProtocolError(
ProtocolErrorType.UNKNOWN,
"Deserialization failed, following exception thrown: " + str(e)
))

cdef void _interaction_client_callback(cFollyTry[unique_ptr[cOmniInteractionClient]]&& result,
PyObject* userData,) noexcept:
Expand Down

0 comments on commit cd975cf

Please sign in to comment.