
No retries on ConnectTimeoutError #951

Closed
arogozhnikov opened this issue Mar 17, 2025 · 3 comments

Comments

@arogozhnikov
Contributor

I'm occasionally observing ConnectTimeoutError, which is likely caused by the network, and I'm searching for the right way to handle timeouts.

I see s3fs has retries with exponential backoff, but it ignores ConnectTimeoutError.

How this can be tested/reproduced:

```shell
sudo apt install iproute2    # likely you already have it
sudo tc qdisc add dev eth0 root netem delay 3500ms    # add delay on the network interface
# sudo tc qdisc del dev eth0 root    # remove the delay when done testing
```

now in python:

```python
import s3fs
s3fs.S3FileSystem().exists('s3://example-bucker/example-folder')
```

This crashes with

```
ConnectTimeoutError: Connect timeout on endpoint URL: "https://example-bucker.s3.us-west-2.amazonaws.com/example-folder"
```

This failure is expected, BUT when I insert logging into s3fs, I see that it does not retry on this exception (even though the default is retries=5). Should this exception be retried?
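As an interim workaround until this is resolved in s3fs itself, retries can be applied at the application level. Below is a minimal sketch of a generic exponential-backoff wrapper; the helper name `retry_with_backoff` and its parameters are illustrative, not part of s3fs. In practice you would pass `botocore.exceptions.ConnectTimeoutError` in `retryable`:

```python
import time

def retry_with_backoff(func, retryable=(OSError,), retries=5, base_delay=0.5):
    """Call func(), retrying the given exceptions with exponential backoff.

    A minimal application-level sketch; for the case in this issue, pass
    retryable=(botocore.exceptions.ConnectTimeoutError,).
    """
    for attempt in range(retries):
        try:
            return func()
        except retryable:
            if attempt == retries - 1:
                raise  # out of attempts, let the error propagate
            time.sleep(base_delay * (2 ** attempt))
```

Usage would look like `retry_with_backoff(lambda: fs.exists('s3://example-bucker/example-folder'), retryable=(ConnectTimeoutError,))`.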

@martindurant
Member

This failure is expected, BUT when I insert logging in s3fs, it does not retry this exception (though default retries=5). Should this exception be retried?

Although networks can be unreliable, I don't think ConnectionTimeout should in general be retried at the s3fs level. The timeout itself is rather long, and I believe the network stack does retries at a lower level. With 5 retries, it could take a very long time for a genuine error to propagate to the user.

@arogozhnikov
Contributor Author

Thanks, makes sense.

I believe the network stack itself does retries at a lower level

That's correct, aiohttp does retry internally; I just did not find a way to control the number of retries or the timeouts in a call to s3fs. It would be great to have this knob, as I need to deal with a shaky network.
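One knob that does exist: s3fs accepts a `config_kwargs` argument that is forwarded to botocore's client `Config`, which exposes `connect_timeout`, `read_timeout`, and a `retries` policy. Whether these botocore-level retries cover this particular ConnectTimeoutError in aiobotocore is exactly what is unclear in this thread, so treat this as a sketch, not a confirmed fix. The specific values below are illustrative:

```python
# Sketch: tune botocore's timeouts/retry policy through s3fs's config_kwargs.
# These settings are real botocore Config options; whether the retry policy
# applies to the ConnectTimeoutError seen here is not confirmed.
config_kwargs = {
    "connect_timeout": 5,    # seconds to wait when establishing a connection
    "read_timeout": 60,      # seconds to wait for a response on an open connection
    "retries": {"max_attempts": 10, "mode": "adaptive"},
}
# fs = s3fs.S3FileSystem(config_kwargs=config_kwargs)  # requires s3fs installed
```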

@martindurant
Member

Sorry, I don't know how to pipe connection-level arguments down the stack in aiobotocore. Perhaps ask on their tracker?
