smbprotocol.transport.Tcp._recv can end up in a dead lock #263

Open
adiroiban opened this issue Jan 26, 2024 · 4 comments
adiroiban (Contributor) commented Jan 26, 2024

Hi. Thanks again for working on smbprotocol.

I have recently started seeing, in my Windows tests, smbprotocol.transport.Tcp.close() overlapping with smbprotocol.transport.Tcp.recv(). The issue is that recv() only releases Tcp._close_lock after a read timeout.

I am trying to see how this can be fixed.

One idea is to lock only after the select() call.

def _recv(self, length, timeout):
    buffer = bytearray(length)
    offset = 0
    while offset < length:
        read_len = length - offset
        log.debug(f"Socket recv({read_len}) (total {length})")

        start_time = timeit.default_timer()

-        with self._close_lock:
-            if not self.connected:
-                # The socket was closed - need the no cover to avoid CI failing on race condition differences
-                return None, timeout  # pragma: no cover
-
-            read = select.select([self._sock], [], [], max(timeout, 1))[0]
-            timeout = timeout - (timeit.default_timer() - start_time)
-            if not read:
-                log.debug("Socket recv(%s) timed out")
-                raise TimeoutError()

+        read = select.select([self._sock], [], [], max(timeout, 1))[0]
+        timeout = timeout - (timeit.default_timer() - start_time)
+        if not read:
+            log.debug("Socket recv(%s) timed out")
+            raise TimeoutError()
+
+        with self._close_lock:
+            if not self.connected:
+                # The socket was closed - need the no cover to avoid CI failing on race condition differences
+                return None, timeout  # pragma: no cover
            try:
                b_data = self._sock.recv(read_len)
            except OSError as e:
                # Windows will raise this error if the socket has been shutdown, Linux returns an empty byte
                # string so we just replicate that.
                if e.errno not in [errno.ESHUTDOWN, errno.ECONNRESET]:
                    # Avoid collecting coverage here to avoid CI failing due to race condition differences
                    raise  # pragma: no cover
                b_data = b""

Originally posted by @adiroiban in #80 (comment)

adiroiban (Contributor, Author)

I have created the issue as a reminder. I will see how I can write an automated test for this.

jborean93 (Owner)

I've been meaning to revamp the whole transport mechanism to work in a more reliable and plugin-like way. This would enable transports like QUIC and allow using epoll on platforms that support it. It's been a while since I last played around with it, but the WIP code is in the socket-refactor branch.
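For context, Python's standard-library selectors module already abstracts over the platform pollers (epoll on Linux, kqueue on BSD/macOS, select as a fallback). A rough sketch of a timed exact-length read built on it, purely illustrative and not taken from the socket-refactor branch:

import selectors
import socket


def recv_exact(sock: socket.socket, length: int, timeout: float) -> bytes:
    # DefaultSelector resolves to EpollSelector on Linux, KqueueSelector on
    # macOS/BSD, and SelectSelector where nothing better is available.
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_READ)
    buffer = bytearray()
    try:
        while len(buffer) < length:
            if not sel.select(timeout):
                raise TimeoutError("recv timed out")
            chunk = sock.recv(length - len(buffer))
            if not chunk:
                raise ConnectionError("socket closed by peer")
            buffer += chunk
    finally:
        sel.unregister(sock)
        sel.close()
    return bytes(buffer)

A transport written against selectors gets epoll for free wherever the platform supports it, without any platform-specific code.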

adiroiban (Contributor, Author)

Removing the need for threads would be nice.
I would love to see an async API, as I am using this code from Twisted with some wrappers to work around the threads.

I think that going with asyncio is an excellent first step, and later it should be easy for people to adapt the code for uvloop or Twisted.
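To illustrate why the loop implementation would be swappable, here is a rough sketch (hypothetical helper names, not smbprotocol's API) of an asyncio-based exact-length read; the same code runs unchanged under uvloop:

import asyncio


async def recv_exact(reader: asyncio.StreamReader, length: int, timeout: float) -> bytes:
    # readexactly() keeps reading until `length` bytes arrive or EOF is hit;
    # wait_for() bounds the whole operation with a timeout.
    return await asyncio.wait_for(reader.readexactly(length), timeout)


async def main() -> None:
    reader, writer = await asyncio.open_connection("127.0.0.1", 445)
    data = await recv_exact(reader, 4, timeout=10)
    writer.close()
    await writer.wait_closed()

# Swapping the loop implementation is a one-liner, e.g. with uvloop:
#   import uvloop; uvloop.install()
#   asyncio.run(main())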


At the same time, I think that it is worth trying to write an automated test for this case.
It might help with the future refactoring.

I have observed this issue with my automated tests.

The issue is that we have this code,

with self._close_lock:
    b_data = self._sock.recv(read_len)

This blocks on self._sock.recv(read_len) while holding the close lock.
If you try to close the connection, you have to wait for the recv timeout before the lock is released.
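As a self-contained sketch of that pattern (plain sockets and threading, mirroring the snippet above rather than smbprotocol itself), this shows close() blocking until the recv timeout expires, and could be the starting point for an automated regression test:

import socket
import threading
import time

# A listener that accepts the connection but never sends anything, so recv() blocks.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)

sock = socket.create_connection(listener.getsockname())
sock.settimeout(5)  # stands in for the read timeout in Tcp._recv
close_lock = threading.Lock()


def reader():
    with close_lock:  # same shape as `with self._close_lock: self._sock.recv(read_len)`
        try:
            sock.recv(4)
        except (socket.timeout, OSError):
            pass


def close():
    with close_lock:  # close() needs the same lock
        sock.shutdown(socket.SHUT_RDWR)
        sock.close()


t = threading.Thread(target=reader)
t.start()
time.sleep(0.5)  # make sure the reader is blocked inside recv()

start = time.monotonic()
close()
print(f"close() took {time.monotonic() - start:.1f}s")  # ~4.5s: blocked until the recv timeout
t.join()
listener.close()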


I have this patch applied to my copy of smbprotocol, so I would be happy to send a PR to fix it, so that I can use vanilla smbprotocol.

I will try to find some time to work on this.

jborean93 (Owner)

> Removing the need for threads would be nice. I would love to see an async API, as I am using this code from Twisted with some wrappers to work around the threads.

That's another use case. Unfortunately, async is a lot more difficult, as you would essentially need to duplicate the rest of the APIs to expose an async one :(

> I have this patch applied to my copy of smbprotocol, so I would be happy to send a PR to fix it, so that I can use vanilla smbprotocol.

If you are coming across a deadlock and have a fix, then yes please, a PR would be great.
