2025.3.0 vs 2024.12.0 #952

Open
hovanhuang opened this issue Mar 18, 2025 · 6 comments

Comments

@hovanhuang

When we read a large dataset (10TB+) using 2025.3.0, we get the error below, while 2024.12.0 has no such issue.

  File "/opt/conda/lib/python3.10/site-packages/fsspec/caching.py", line 249, in _fetch
    self.cache = self.fetcher(start, end)  # new block replaces old
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 2380, in _fetch_range
    return _fetch_range(
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 2552, in _fetch_range
    return sync(fs.loop, _inner_fetch, fs, bucket, key, version_id, start, end, req_kw)
  File "/opt/conda/lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File "/opt/conda/lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 2570, in _inner_fetch
    return await _error_wrapper(_call_and_read, retries=fs.retries)
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 146, in _error_wrapper
    raise err
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 114, in _error_wrapper
    return await func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 2557, in _call_and_read
    resp = await fs._call_s3(
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 371, in _call_s3
    return await _error_wrapper(
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 146, in _error_wrapper
    raise err
OSError: [Errno 121] We encountered an internal error. Please try

@martindurant
Member

Are you certain that the same workflow would work today with v2024.12, or could it be a change elsewhere (botocore or the server itself)? The traceback is not at all helpful ("We encountered an error" is not coming from s3fs but deeper down).

Can you turn on logging for "s3fs", so you can at least see what was being called at the time?
e.g.,

fsspec.utils.setup_logging(logger_name="s3fs")
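
One way to check whether something other than s3fs changed between the two environments is to record the installed package versions in each and compare them. The following is a minimal sketch using only the standard library; the package list is an assumption about which dependencies are relevant, not something stated in this thread.

```python
from importlib import metadata

# Packages that plausibly affect S3 reads. This list is an assumption;
# adjust it to match the environment being compared.
PACKAGES = ("s3fs", "fsspec", "aiobotocore", "botocore", "aiohttp")


def installed_versions(packages=PACKAGES):
    """Return a dict mapping each package name to its version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

Running this once in the working environment and once in the failing one shows whether only s3fs differs or whether other packages moved with it.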

@martindurant
Member

PS: is "large dataset" a single file or many?

@hovanhuang
Author

> PS: is "large dataset" a single file or many?

5000 files, with a total size of 10TB+.

> Are you certain that the same workflow would work today with v2024.12, or could it be a change elsewhere (botocore or the server itself)? The traceback is not at all helpful ("We encountered an error" is not coming from s3fs but deeper down).
>
> Can you turn on logging for "s3fs", so you can at least see what was being called at the time? e.g.,
>
> fsspec.utils.setup_logging(logger_name="s3fs")

We are using NeMo-Curator with the Dask framework, so turning on this level of logging might not be possible. The workflow is exactly the same, except that I tried installing s3fs==2024.12.0, which makes the difference.
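
For reference, a minimal sketch of how "s3fs" debug logging might be switched on inside worker processes, using only the standard `logging` module. The function name `enable_s3fs_debug_logging` is hypothetical, and whether NeMo-Curator exposes the underlying Dask client is an assumption; only the "s3fs" logger name comes from this thread.

```python
import logging


def enable_s3fs_debug_logging():
    """Attach a stderr handler to the "s3fs" logger and set it to DEBUG."""
    logger = logging.getLogger("s3fs")
    logger.setLevel(logging.DEBUG)
    # Avoid stacking duplicate handlers if this runs more than once per process.
    if not any(getattr(h, "_s3fs_debug", False) for h in logger.handlers):
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        handler._s3fs_debug = True  # marker used only by the check above
        logger.addHandler(handler)
    return logger.name


# Hypothetical usage, assuming an existing dask.distributed Client named `client`:
#   client.run(enable_s3fs_debug_logging)  # runs the function on every worker
```

With a handler attached this way, the s3fs debug messages would appear in each worker's stderr, which Dask normally collects in the worker logs.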

@martindurant
Member

> except that I tried installing s3fs==2024.12.0, which makes the difference

You mean, if you install the older version, the error goes away?

Is this intermittent or occurring consistently?

@hovanhuang
Author

hovanhuang commented Mar 18, 2025

> > except that I tried installing s3fs==2024.12.0, which makes the difference
>
> You mean, if you install the older version, the error goes away?
>
> Is this intermittent or occurring consistently?

With the same copy of the data and the same workflow, this is consistent. We tried twice and got the same error both times. After switching to 2024.12.0, the error goes away.

@martindurant
Member

Is there any chance you can do a git bisect to see which commit broke the workflow for you?
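
A bisect like this can be automated with `git bisect run`, which treats exit code 0 as good, 125 as "skip this commit", and other codes from 1 to 127 as bad. Below is a minimal sketch of a check script; the function names and the S3 path are hypothetical placeholders, and the real workload would need to be a read that reliably reproduces the error.

```python
import sys

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def bisect_exit_code(workload):
    """Run `workload` and map its outcome to a `git bisect run` exit code."""
    try:
        workload()
    except OSError:
        return BAD  # the failure reported in this issue surfaces as OSError
    except Exception:
        return SKIP  # unrelated breakage: this commit cannot be tested
    return GOOD


def read_sample():
    # Placeholder workload: replace with a read that reproduces the error.
    # The bucket and key below are hypothetical.
    import fsspec

    with fsspec.open("s3://my-bucket/path/to/file", "rb") as f:
        f.read(1 << 20)


# In the actual check script, finish with:
#   sys.exit(bisect_exit_code(read_sample))
```

Assuming s3fs is installed in editable mode from a local clone and the release tags match the version numbers, the run would look roughly like `git bisect start`, `git bisect bad 2025.3.0`, `git bisect good 2024.12.0`, then `git bisect run python check.py`. Older s3fs commits may also need a matching fsspec version installed.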
