2025.3.0 vs 2024.12.0 #952
Comments
Are you certain that the same workflow would work today with v2024.12, or could it be a change elsewhere (botocore or the server itself)? The traceback is not at all helpful ("We encountered an error" is not coming from s3fs but deeper down). Can you turn on logging for "s3fs", so you can at least see what was being called at the time?
PS: is "large dataset" a single file or many?
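For reference, a minimal sketch of turning on debug logging for the "s3fs" logger using only the standard `logging` module. The helper name `enable_s3fs_debug_logging` and the log format are illustrative assumptions, not part of s3fs.

```python
import logging
import sys


def enable_s3fs_debug_logging(stream=sys.stderr):
    """Attach a DEBUG-level stream handler to the "s3fs" logger.

    Hypothetical helper: the name and the format string are illustrative.
    """
    logger = logging.getLogger("s3fs")
    logger.setLevel(logging.DEBUG)
    # Avoid stacking duplicate handlers if the helper is called more than once.
    if not any(getattr(h, "_s3fs_debug", False) for h in logger.handlers):
        handler = logging.StreamHandler(stream)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        handler._s3fs_debug = True  # marker so this handler can be recognised later
        logger.addHandler(handler)
    return logger


s3fs_logger = enable_s3fs_debug_logging()
```

In a Dask deployment the helper would have to run inside each worker process, for example via `client.run(enable_s3fs_debug_logging)` on a `distributed.Client`. Whether that is practical in a NeMo-Curator pipeline is an open question here.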
5000 files, with a total size of 10TB+.
We are using NeMo-Curator with the Dask framework, so turning on this level of logging might not be possible. The workflow is exactly the same; the only thing that makes a difference is installing s3fs==2024.12.0.
You mean, if you install the older version, the error goes away? Is this intermittent or occurring consistently?
With the same copy of the data and the same workflow, this is consistent. We tried twice and both runs showed the same error. After switching to 2024.12.0, the error goes away.
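To help rule out a change elsewhere (botocore, for instance), the installed versions can be recorded in both the working and the failing environment and compared. This is a rough sketch using `importlib.metadata`. The function name `installed_versions` and the default package list are assumptions chosen for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("s3fs", "fsspec", "aiobotocore", "botocore")):
    """Return {package name: installed version, or None if not installed}.

    Hypothetical helper: the default package list is only a guess at what matters.
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Print one line per package so the two environments can be diffed.
for name, ver in installed_versions().items():
    print(f"{name}=={ver}")
```

If only the s3fs version differs between the two environments, that points at s3fs itself. If botocore or aiobotocore also moved, the regression could lie there.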
Is there any chance you can do a git bisect to see which commit broke the workflow for you? |
When we read a large dataset (10TB+) using 2025.3.0, we get the error below, while 2024.12.0 has no such issue.
  File "/opt/conda/lib/python3.10/site-packages/fsspec/caching.py", line 249, in _fetch
    self.cache = self.fetcher(start, end)  # new block replaces old
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 2380, in _fetch_range
    return _fetch_range(
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 2552, in _fetch_range
    return sync(fs.loop, _inner_fetch, fs, bucket, key, version_id, start, end, req_kw)
  File "/opt/conda/lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File "/opt/conda/lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 2570, in _inner_fetch
    return await _error_wrapper(_call_and_read, retries=fs.retries)
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 146, in _error_wrapper
    raise err
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 114, in _error_wrapper
    return await func(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 2557, in _call_and_read
    resp = await fs._call_s3(
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 371, in _call_s3
    return await _error_wrapper(
  File "/opt/conda/lib/python3.10/site-packages/s3fs/core.py", line 146, in _error_wrapper
    raise err
OSError: [Errno 121] We encountered an internal error. Please try