Description
What happened?
While trying to load a .h5 file using load_dataset, I accidentally specified the wrong path. Instead of getting a "no such file or directory" error, however, I got a "did not find a match in any of xarray's currently installed IO backends ['netcdf4']" error. It took some time to figure out that the problem was actually the path, not my installed software libraries.
What did you expect to happen?
I would expect load_dataset to inform me with a "no such file or directory" error, and not with something referring to the IO backends, if I attempt to open a file that clearly does not exist. For .nc files this seems to work, see below.
Minimal Complete Verifiable Example
```python
import xarray
xarray.load_dataset('not-existing-file.h5')
```
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
Relevant log output
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[12], line 1
----> 1 xarray.load_dataset('not-existing-file.h5')
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/api.py:279, in load_dataset(filename_or_obj, **kwargs)
276 if "cache" in kwargs:
277 raise TypeError("cache has no effect in this context")
--> 279 with open_dataset(filename_or_obj, **kwargs) as ds:
280 return ds.load()
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/api.py:524, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, backend_kwargs, **kwargs)
521 kwargs.update(backend_kwargs)
523 if engine is None:
--> 524 engine = plugins.guess_engine(filename_or_obj)
526 backend = plugins.get_backend(engine)
528 decoders = _resolve_decoders_kwargs(
529 decode_cf,
530 open_backend_dataset_parameters=backend.open_dataset_parameters,
(...)
536 decode_coords=decode_coords,
537 )
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/plugins.py:177, in guess_engine(store_spec)
169 else:
170 error_msg = (
171 "found the following matches with the input file in xarray's IO "
172 f"backends: {compatible_engines}. But their dependencies may not be installed, see:\n"
173 "https://docs.xarray.dev/en/stable/user-guide/io.html \n"
174 "https://docs.xarray.dev/en/stable/getting-started-guide/installing.html"
175 )
--> 177 raise ValueError(error_msg)
ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4']. Consider explicitly selecting one of the installed engines via the ``engine`` parameter, or installing additional IO dependencies, see:
https://docs.xarray.dev/en/stable/getting-started-guide/installing.html
https://docs.xarray.dev/en/stable/user-guide/io.html
Anything else we need to know?
It should be noted that the .h5 file is a working netCDF file that can be loaded and used without installing further libraries if the path is correctly specified. Interestingly, when attempting to load a non-existing .nc file, the load_dataset error message correctly says "FileNotFoundError: [Errno 2] No such file or directory: b'/home/jovyan/my_materials/not-existing-file.nc'".
Example code:
```python
import xarray
xarray.load_dataset('not-existing-file.nc')
```
Error message:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/file_manager.py:209, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
208 try:
--> 209 file = self._cache[self._key]
210 except KeyError:
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/lru_cache.py:55, in LRUCache.__getitem__(self, key)
54 with self._lock:
---> 55 value = self._cache[key]
56 self._cache.move_to_end(key)
KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('/home/jovyan/my_materials/not-existing-file.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False)), 'ae3bbd85-042b-46e1-97ae-f8d523bb578a']
During handling of the above exception, another exception occurred:
FileNotFoundError Traceback (most recent call last)
Cell In[11], line 1
----> 1 xarray.load_dataset('not-existing-file.nc')
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/api.py:279, in load_dataset(filename_or_obj, **kwargs)
276 if "cache" in kwargs:
277 raise TypeError("cache has no effect in this context")
--> 279 with open_dataset(filename_or_obj, **kwargs) as ds:
280 return ds.load()
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/api.py:540, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, backend_kwargs, **kwargs)
528 decoders = _resolve_decoders_kwargs(
529 decode_cf,
530 open_backend_dataset_parameters=backend.open_dataset_parameters,
(...)
536 decode_coords=decode_coords,
537 )
539 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 540 backend_ds = backend.open_dataset(
541 filename_or_obj,
542 drop_variables=drop_variables,
543 **decoders,
544 **kwargs,
545 )
546 ds = _dataset_from_backend_dataset(
547 backend_ds,
548 filename_or_obj,
(...)
556 **kwargs,
557 )
558 return ds
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:572, in NetCDF4BackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, format, clobber, diskless, persist, lock, autoclose)
551 def open_dataset(
552 self,
553 filename_or_obj,
(...)
568 autoclose=False,
569 ):
571 filename_or_obj = _normalize_path(filename_or_obj)
--> 572 store = NetCDF4DataStore.open(
573 filename_or_obj,
574 mode=mode,
575 format=format,
576 group=group,
577 clobber=clobber,
578 diskless=diskless,
579 persist=persist,
580 lock=lock,
581 autoclose=autoclose,
582 )
584 store_entrypoint = StoreBackendEntrypoint()
585 with close_on_error(store):
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:376, in NetCDF4DataStore.open(cls, filename, mode, format, group, clobber, diskless, persist, lock, lock_maker, autoclose)
370 kwargs = dict(
371 clobber=clobber, diskless=diskless, persist=persist, format=format
372 )
373 manager = CachingFileManager(
374 netCDF4.Dataset, filename, mode=mode, kwargs=kwargs
375 )
--> 376 return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:323, in NetCDF4DataStore.__init__(self, manager, group, mode, lock, autoclose)
321 self._group = group
322 self._mode = mode
--> 323 self.format = self.ds.data_model
324 self._filename = self.ds.filepath()
325 self.is_remote = is_remote_uri(self._filename)
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:385, in NetCDF4DataStore.ds(self)
383 @property
384 def ds(self):
--> 385 return self._acquire()
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/netCDF4_.py:379, in NetCDF4DataStore._acquire(self, needs_lock)
378 def _acquire(self, needs_lock=True):
--> 379 with self._manager.acquire_context(needs_lock) as root:
380 ds = _nc4_require_group(root, self._group, self._mode)
381 return ds
File ~/my-pykernel/lib/python3.11/contextlib.py:137, in _GeneratorContextManager.__enter__(self)
135 del self.args, self.kwds, self.func
136 try:
--> 137 return next(self.gen)
138 except StopIteration:
139 raise RuntimeError("generator didn't yield") from None
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/file_manager.py:197, in CachingFileManager.acquire_context(self, needs_lock)
194 @contextlib.contextmanager
195 def acquire_context(self, needs_lock=True):
196 """Context manager for acquiring a file."""
--> 197 file, cached = self._acquire_with_cache_info(needs_lock)
198 try:
199 yield file
File ~/my-pykernel/lib/python3.11/site-packages/xarray/backends/file_manager.py:215, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
213 kwargs = kwargs.copy()
214 kwargs["mode"] = self._mode
--> 215 file = self._opener(*self._args, **kwargs)
216 if self._mode == "w":
217 # ensure file doesn't get overridden when opened again
218 self._mode = "a"
File src/netCDF4/_netCDF4.pyx:2463, in netCDF4._netCDF4.Dataset.__init__()
File src/netCDF4/_netCDF4.pyx:2026, in netCDF4._netCDF4._ensure_nc_success()
FileNotFoundError: [Errno 2] No such file or directory: b'/home/jovyan/my_materials/not-existing-file.nc'
Environment
INSTALLED VERSIONS
commit: None
python: 3.11.0 | packaged by conda-forge | (main, Oct 25 2022, 06:24:40) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 5.4.0-136-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1
xarray: 2022.12.0
pandas: 1.5.2
numpy: 1.24.1
scipy: None
netCDF4: 1.6.2
pydap: None
h5netcdf: None
h5py: None
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: None
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: None
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.6.3
pip: 22.3.1
conda: None
pytest: None
mypy: None
IPython: 8.8.0
sphinx: None
Activity
slevang commented on Jan 23, 2023
You do get a FileNotFoundError if you explicitly specify an engine with xarray.load_dataset('not-existing-file.h5', engine='h5netcdf'). It looks like neither NetCDF4BackendEntrypoint nor H5netcdfBackendEntrypoint includes .h5 in the set of openable extensions they handle in guess_can_open, but they will work if they can peer into the file and detect a valid .h5. Probably some good reason for this.
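To illustrate the extension-matching point: below is a simplified sketch of the kind of check guess_can_open performs, reconstructed from memory of the xarray source around this version (the helper name, magic numbers, and exact extension set may differ from the real implementation). For a path that does not exist, no magic number can be read, so the decision falls back to the extension check, and .h5 is not in the set:

```python
import os

def try_read_magic_number_from_path(path, count=8):
    # Minimal stand-in for xarray's internal helper: return the first
    # bytes of the file, or None if the path cannot be read.
    if isinstance(path, (str, os.PathLike)) and os.path.isfile(path):
        with open(path, "rb") as f:
            return f.read(count)
    return None

def guess_can_open(filename_or_obj):
    # When the file exists, detect netCDF/HDF5 content via its magic number.
    magic_number = try_read_magic_number_from_path(filename_or_obj)
    if magic_number is not None:
        return magic_number.startswith((b"CDF", b"\211HDF\r\n\032\n"))
    # For a non-existing path we fall through to the extension check,
    # and '.h5' is not in the set, so no backend claims the file.
    _, ext = os.path.splitext(filename_or_obj)
    return ext in {".nc", ".nc4", ".cdf"}

print(guess_can_open('not-existing-file.h5'))  # False -> "did not find a match"
print(guess_can_open('not-existing-file.nc'))  # True  -> netcdf4 is tried, raises FileNotFoundError
```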
dcherian commented on Jan 23, 2023
I think we should update the error to suggest trying with an explicit engine kwarg.
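One possible shape for such a fix (a hypothetical sketch, not the actual xarray implementation): have guess_engine check whether a local path exists before concluding that no backend matches, so the user sees the familiar error when the path itself is the problem:

```python
import errno
import os

def guess_engine(filename_or_obj):
    # Hypothetical sketch: if the input is a local path that does not
    # exist, fail early with the standard error instead of blaming the
    # installed IO backends. A real patch would also need to skip
    # remote URIs and already-open file-like objects.
    if isinstance(filename_or_obj, (str, os.PathLike)) and not os.path.exists(
        filename_or_obj
    ):
        raise FileNotFoundError(
            errno.ENOENT, os.strerror(errno.ENOENT), os.fspath(filename_or_obj)
        )
    # ... existing backend-matching logic would continue here; its
    # ValueError could additionally suggest an explicit engine= kwarg.
```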