Add HDF5 Section to read/write docs page (#8012)
* add HDF5 section to read/write docs

* change wording for hdf5 vs. netcdf4

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add write section for hdf5 docs

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Tom Nicholas <[email protected]>
3 people authored Jul 24, 2023
1 parent ba26410 commit 88315fd
Showing 1 changed file with 61 additions and 0 deletions.
61 changes: 61 additions & 0 deletions doc/user-guide/io.rst
@@ -559,6 +559,67 @@ and currently raises a warning unless ``invalid_netcdf=True`` is set:
Note that this produces a file that is likely to be not readable by other netCDF
libraries!

.. _io.hdf5:

HDF5
----
`HDF5`_ is both a file format and a data model for storing information. HDF5 stores
data hierarchically, using groups to create a nested structure. HDF5 is a more
general version of the netCDF4 data model, so the nested structure is one of many
similarities between the two data formats.

Reading HDF5 files in xarray requires the ``h5netcdf`` engine, which can be installed
with ``conda install h5netcdf``. Once installed we can use xarray to open HDF5 files:

.. code:: python

    xr.open_dataset("/path/to/my/file.h5")

The similarities between HDF5 and netCDF4 mean that HDF5 data can be written with the
same :py:meth:`Dataset.to_netcdf` method as used for netCDF4 data:

.. ipython:: python

    ds = xr.Dataset(
        {"foo": (("x", "y"), np.random.rand(4, 5))},
        coords={
            "x": [10, 20, 30, 40],
            "y": pd.date_range("2000-01-01", periods=5),
            "z": ("x", list("abcd")),
        },
    )

    ds.to_netcdf("saved_on_disk.h5")

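
As a minimal sketch of a full round trip, the example below writes a small dataset
and reads it back. Passing ``engine="h5netcdf"`` explicitly on both calls is an
assumption made here so that the choice of backend is unambiguous. The temporary
directory and the name ``roundtripped`` are illustrative only:

.. code:: python

    import os
    import tempfile

    import numpy as np
    import xarray as xr

    ds = xr.Dataset(
        {"foo": (("x", "y"), np.random.rand(4, 5))},
        coords={"x": [10, 20, 30, 40]},
    )

    # Write into a throwaway directory so the example leaves no files behind
    # in the working directory.
    path = os.path.join(tempfile.mkdtemp(), "saved_on_disk.h5")
    ds.to_netcdf(path, engine="h5netcdf")

    # Load the data into memory before the file is closed.
    with xr.open_dataset(path, engine="h5netcdf") as loaded:
        roundtripped = loaded.load()
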
Groups
~~~~~~

If you have multiple or highly nested groups, xarray by default may not read the group
that you want. A particular group of an HDF5 file can be specified using the ``group``
argument:

.. code:: python

    xr.open_dataset("/path/to/my/file.h5", group="/my/group")

While xarray cannot interrogate an HDF5 file to determine which groups are available,
the HDF5 Python reader `h5py`_ can be used instead.
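
One possible way to discover the group names with ``h5py`` is sketched below. The
helper ``list_groups`` is hypothetical and not part of xarray or h5py. It relies on
``h5py.File.visititems``, which walks every object in the file:

.. code:: python

    import h5py


    def list_groups(path):
        """Return the names of all groups in an HDF5 file, including the root."""
        groups = ["/"]

        def visit(name, obj):
            # visititems reports names relative to the root, without a leading slash.
            if isinstance(obj, h5py.Group):
                groups.append("/" + name)

        with h5py.File(path, "r") as f:
            f.visititems(visit)
        return groups

Any of the returned names can then be passed as the ``group`` argument of
``xr.open_dataset``.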

Natively, the xarray data structures can only handle one level of nesting, organized as
DataArrays inside of Datasets. If your HDF5 file has additional levels of hierarchy, you
can only access one group at a time and will need to specify group names.
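
Opening several groups one at a time could look like the following sketch. The
helper ``open_groups`` is hypothetical. It assumes the group names are already known
and that the ``h5netcdf`` engine is installed:

.. code:: python

    import xarray as xr


    def open_groups(path, group_names):
        """Open each named group as its own Dataset, keyed by group name."""
        return {
            name: xr.open_dataset(path, group=name, engine="h5netcdf")
            for name in group_names
        }

Each value in the returned dictionary is an independent ``Dataset``. Relationships
between groups are not represented, and each dataset should be closed when it is no
longer needed.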

.. note::

    For native handling of multiple HDF5 groups with xarray, including I/O, you might be
    interested in the experimental
    `xarray-datatree <https://github.com/xarray-contrib/datatree>`_ package.


.. _HDF5: https://hdfgroup.github.io/hdf5/index.html
.. _h5py: https://www.h5py.org/


.. _io.zarr:

Zarr