Commit 031d308

Merge pull request #346 from zmoon/doc-obs-nc
Document surface obs xarray/nc format
2 parents fe28122 + 18c108a commit 031d308

1 file changed

+59
-1
lines changed

docs/develop/datasets.rst

@@ -11,6 +11,64 @@ Use these examples as reference in order to add new surface observational datase

 Instructions for reading in aircraft and satellite observations are under development.

+If you are interested in converting a new observational dataset to our netCDF format,
+please see the notes below.
+
+* The dataset should have these dimensions (in this order):
+
+  - ``time``
+  - ``y`` (an optional singleton dimension, included for consistency with
+    model surface datasets)
+  - ``x`` (the site dimension)
+
+* The dataset should have these coordinate variables:
+
+  - ``time`` (UTC time, as timezone-naive ``datetime64`` format in xarray; ``time`` dim)
+  - ``siteid`` (unique site identifier, as string; ``x`` dim)
+  - ``latitude`` (site latitude, in degrees; ``x`` dim)
+  - ``longitude`` (site longitude, in degrees; ``x`` dim)
+
+* This variable is required for regulatory metrics,
+  and can be optionally used for time series plots.
+  Otherwise, you might omit it:
+
+  - ``time_local`` (local time,
+    usually local standard time, not including daylight savings,
+    as timezone-naive ``datetime64`` format in xarray;
+    note that this varies in both the ``time`` and ``x`` dimensions)
+
+* It's good practice to include ``units`` attributes for your data variables,
+  though this is not strictly required.
+  Similarly, you may wish to include ``long_name``\ s.
+
+* Site metadata variables (e.g. site name, site elevation, EPA region, etc.)
+  should ideally be stored as varying only in the ``x`` dimension, to save space.
+
+* If you have sub-hourly data, you may want to aggregate it to hourly,
+  especially if different sites have different time resolutions.
+
+Example abbreviated xarray representation for AirNow
+demonstrating these qualities:
+
+.. code-block:: text
+
+   <xarray.Dataset>
+   Dimensions:     (time: 289, y: 1, x: 2231)
+   Coordinates:
+     * time        (time) datetime64[ns] 2023-04-04 ... 2023-04-16
+       siteid      (x) <U12 ...
+       latitude    (x) float64 ...
+       longitude   (x) float64 ...
+   Dimensions without coordinates: y, x
+   Data variables:
+       NO2         (time, y, x) float64 ...
+       time_local  (time, y, x) datetime64[ns] ...
+       epa_region  (y, x) <U5 ...
+
+You can examine the ``get_*`` functions in the :doc:`/cli`
+(``melodies_monet/_cli.py``) for examples of converting observational datasets
+in pandas DataFrame format to xarray Dataset format.
+
 Models
 ------
 Examples for reading model datasets can be
@@ -75,4 +133,4 @@ Standard variables are required to be computed in each model reader for each cap
 | Mid-level temperature in kelvin (K)
 | Layer thickness in meters (m)
 | Surface pressure in pascals (Pa)
-- | Provide vertical model data.
+- | Provide vertical model data.
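
As a rough illustration of the format documented in the diff above, the sketch below converts a small long-format pandas DataFrame into an xarray Dataset with ``(time, y, x)`` dimensions. It is not taken from ``melodies_monet/_cli.py``: the site IDs, values, the ``utc_offset_h`` column, and the ``units`` strings are all made up for the example, and a fixed per-site UTC offset is assumed as a stand-in for local standard time.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical long-format observations: one row per (time, site).
df = pd.DataFrame(
    {
        "time": pd.to_datetime(
            ["2023-04-04 00:00", "2023-04-04 00:00",
             "2023-04-04 01:00", "2023-04-04 01:00"]
        ),  # UTC, timezone-naive
        "siteid": ["A001", "B002", "A001", "B002"],
        "latitude": [40.0, 35.5, 40.0, 35.5],
        "longitude": [-105.3, -97.5, -105.3, -97.5],
        "utc_offset_h": [-7, -6, -7, -6],  # assumed standard-time offset per site
        "NO2": [12.1, 8.4, 11.7, np.nan],
    }
)

# Site metadata vary only along the site dimension (x).
sites = df.groupby("siteid")[["latitude", "longitude", "utc_offset_h"]].first()

# Pivot the data variable to a (time, site) table, columns in the same site order.
no2 = df.pivot(index="time", columns="siteid", values="NO2").reindex(columns=sites.index)

times = no2.index.to_numpy()  # timezone-naive datetime64 values
offsets = sites["utc_offset_h"].to_numpy().astype("timedelta64[h]")
time_local = times[:, np.newaxis] + offsets[np.newaxis, :]  # varies in time and x

ds = xr.Dataset(
    data_vars={
        # Insert the singleton y dimension between time and x.
        "NO2": (("time", "y", "x"), no2.to_numpy()[:, np.newaxis, :], {"units": "ppb"}),
        "time_local": (("time", "y", "x"), time_local[:, np.newaxis, :]),
    },
    coords={
        "time": ("time", times),
        "siteid": ("x", sites.index.to_numpy().astype(str)),
        "latitude": ("x", sites["latitude"].to_numpy(), {"units": "degrees_north"}),
        "longitude": ("x", sites["longitude"].to_numpy(), {"units": "degrees_east"}),
    },
)
```

Calling ``ds.to_netcdf(...)`` on the result would then write it to a netCDF file; the file name and any encoding options are left to the reader.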

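The requirements listed in the added documentation can also be turned into a quick self-check before writing a file. The helper below is a hypothetical sketch, not part of MELODIES MONET; the function name ``check_surface_obs_format`` and the exact set of checks are assumptions based only on the bullet points in the diff.

```python
import numpy as np
import xarray as xr


def check_surface_obs_format(ds):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []

    # Required dimensions; y is optional but must be a singleton if present.
    for dim in ("time", "x"):
        if dim not in ds.dims:
            problems.append(f"missing dimension {dim!r}")
    if "y" in ds.dims and ds.sizes["y"] != 1:
        problems.append("dimension 'y' should have size 1")

    # Required coordinate variables and the dimension each should use.
    required = {"time": "time", "siteid": "x", "latitude": "x", "longitude": "x"}
    for name, dim in required.items():
        if name not in ds.coords:
            problems.append(f"missing coordinate {name!r}")
        elif ds[name].dims != (dim,):
            problems.append(f"coordinate {name!r} should have dims ({dim!r},)")

    if "time" in ds.coords and ds["time"].dtype.kind != "M":
        problems.append("coordinate 'time' should be datetime64")
    if "siteid" in ds.coords:
        if ds["siteid"].dtype.kind not in ("U", "S", "O"):
            problems.append("coordinate 'siteid' should be a string")
        if np.unique(ds["siteid"].values).size != ds["siteid"].size:
            problems.append("coordinate 'siteid' should be unique")

    # Data variables should order their dimensions as time, y, x.
    for name, var in ds.data_vars.items():
        expected = tuple(d for d in ("time", "y", "x") if d in var.dims)
        if var.dims != expected:
            problems.append(f"variable {name!r} has dims {var.dims}, expected {expected}")

    return problems


# Small made-up dataset that follows the documented layout.
good = xr.Dataset(
    {"NO2": (("time", "y", "x"), np.zeros((2, 1, 3)))},
    coords={
        "time": ("time", np.array(["2023-04-04T00", "2023-04-04T01"], dtype="datetime64[ns]")),
        "siteid": ("x", np.array(["A", "B", "C"])),
        "latitude": ("x", [40.0, 35.5, 30.0]),
        "longitude": ("x", [-105.0, -97.5, -90.0]),
    },
)
problems = check_surface_obs_format(good)
```

For the dataset above, ``problems`` is an empty list. Dropping ``siteid`` or transposing the data variables would produce one message per violated requirement.
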