Commit
Fix unfinished future in Python Thrift client
Summary: noideadog2 My team has a Python library that uses Thrift to talk to a WWW service that my team also maintains. This Python library is used in Dataswarm to process lots of data.

The problem we are seeing is that, occasionally, a Thrift deserialization error happens. When it does, our entire Dataswarm pipeline freezes. It is as if something within the Thrift client becomes stuck and ends up blocking the whole pipeline.

What do we see in our logs? Occasionally, these Dataswarm pipelines log the following error:

```
Traceback (most recent call last):
  File "thrift.python/serializer.pyx", line 59, in thrift.python.serializer.deserialize
  File "thrift.python/serializer.pyx", line 55, in thrift.python.serializer.deserialize_with_length
thrift.python.exceptions.Error: Encountered invalid field/element type (166) during skipping
Exception ignored in: 'thrift.python.client.async_client._async_client_send_request_callback'
```

After this message appears, the Dataswarm pipeline freezes and our Python code does not move forward. Something is stuck somewhere and, given the error message above, we suspect it is in the Thrift infra.

After looking at that error message, I decided to look at the `_async_client_send_request_callback` function that the stack trace mentions. That function is here: https://fburl.com/code/nwufgygd

According to the stack trace, the call to deserialize here https://fburl.com/code/zajanbby is throwing an exception, and that exception is escaping the `_async_client_send_request_callback` call.

Now, this is ** JUST A GUESS **, but I'm wondering: could it be that, because the exception escapes the `_async_client_send_request_callback` call, we leave the pyfuture unfinalized, and thus everything depending on that future freezes?
This would explain what we're seeing on our end (that is, our Python code in Dataswarm not moving forward).

This diff is an attempt at fixing that.

Reviewed By: Filip-F

Differential Revision: D68503986

fbshipit-source-id: 7e8c2c5f3931fb0008b49ac2c99e6e2ab2a54d12
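
The failure mode suspected in the summary can be illustrated in miniature with plain `asyncio`. This is only a sketch of the hypothesis, not the Thrift client code or the actual change in this diff: `DeserializeError`, `fake_deserialize`, `on_response_buggy`, `on_response_fixed`, and `call` are all hypothetical names invented for the example, and the swallowed exception in `call` merely imitates the "Exception ignored in" behavior seen in the log.

```python
import asyncio


class DeserializeError(Exception):
    """Hypothetical stand-in for thrift.python.exceptions.Error."""


def fake_deserialize(payload: bytes) -> str:
    # Hypothetical deserializer: rejects any payload that is not valid UTF-8.
    try:
        return payload.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise DeserializeError("invalid payload") from exc


def on_response_buggy(fut: asyncio.Future, payload: bytes) -> None:
    # If fake_deserialize raises, the exception escapes this callback
    # and fut is never completed.
    fut.set_result(fake_deserialize(payload))


def on_response_fixed(fut: asyncio.Future, payload: bytes) -> None:
    # Always complete the future: with the value on success,
    # with the exception on failure.
    try:
        result = fake_deserialize(payload)
    except Exception as exc:
        fut.set_exception(exc)
    else:
        fut.set_result(result)


async def call(callback, payload: bytes, timeout: float = 0.1) -> str:
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    try:
        callback(fut, payload)
    except DeserializeError:
        # Imitates "Exception ignored in: ...": whoever invokes the
        # callback drops the error instead of propagating it.
        pass
    try:
        # The timeout exists only so the demo terminates; without it,
        # awaiting a never-completed future would block forever.
        return await asyncio.wait_for(fut, timeout)
    except asyncio.TimeoutError:
        return "hung"
    except DeserializeError:
        return "error surfaced"


if __name__ == "__main__":
    bad = b"\xa6"  # not valid UTF-8, so fake_deserialize raises
    print(asyncio.run(call(on_response_buggy, bad)))
    print(asyncio.run(call(on_response_fixed, bad)))
    print(asyncio.run(call(on_response_fixed, b"ok")))
```

With the bad payload, the buggy callback leaves the awaiting code stuck until the demo timeout fires, while the fixed callback delivers the error to the awaiter, which can then handle it or fail loudly instead of freezing.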