Commit 727fd4f

Merge pull request #39 from Jena-Earth-Observation-School/docs/data_access
Improve 01_00_Data_Access.md
2 parents 780bafd + 8c579fa commit 727fd4f

1 file changed

+42
-26
lines changed

docs/content/02_Getting_Started/01_00_Data_Access.md

Lines changed: 42 additions & 26 deletions
@@ -7,6 +7,7 @@ information that might be useful for working with them.
77
```{tableofcontents}
88
```
99

10+
(load_product-intro)=
1011
## Using the `load_product`-function
1112

1213
This function is the recommended main entry point for loading data from the SDC.
@@ -41,33 +42,38 @@ The basic usage is to specify the following parameters:
4142

4243
- `product`: The name of the data product to load. The following strings are
4344
supported at the moment:
44-
- `"s1_rtc"`: Sentinel-1 Radiometric Terrain Corrected (RTC)
45-
- `"s1_surfmi"`: Sentinel-1 Surface Moisture Index (SurfMI)
46-
- `"s1_coh"`: Sentinel-1 Coherence VV-pol, ascending
47-
- `"s2_l2a"`: Sentinel-2 Level 2A (L2A)
48-
- `"sanlc"`: South African National Land Cover (SANLC) 2020
49-
- `"mswep"`: Multi-Source Weighted-Ensemble Precipitation (MSWEP) daily
50-
- `"cop_dem"`: Copernicus Digital Elevation Model GLO-30
45+
- _"s1_rtc"_: Sentinel-1 Radiometric Terrain Corrected (RTC)
46+
- _"s1_surfmi"_: Sentinel-1 Surface Moisture Index (SurfMI)
47+
- _"s1_coh"_: Sentinel-1 Coherence VV-pol, ascending
48+
- _"s2_l2a"_: Sentinel-2 Level 2A (L2A)
49+
- _"sanlc"_: South African National Land Cover (SANLC) 2020
50+
- _"mswep"_: Multi-Source Weighted-Ensemble Precipitation (MSWEP) daily
51+
- _"cop_dem"_: Copernicus Digital Elevation Model GLO-30
5152
- `vec`: Filter the returned data spatially by either providing the name of a
52-
SALDi site in the format `"siteXX"`, where XX is the site number (e.g.
53-
`"site06"`), or a path to a vector file (any format [fiona](https://github.com/Toblerity/Fiona)
54-
can handle, e.g. `.geojson`, `.shp`, `.gpkg`) that defines an area of interest
55-
as a subset of a SALDi site. Providing a vector file outside the spatial extent
56-
of the SALDi sites will result in an empty dataset. Please note, that always the
57-
bounding box of the provided geometry will be used to load the data.
53+
SALDi site in the format _"siteXX"_, where XX is the site number (e.g.
54+
_"site06"_), or a path to a vector file (any format [fiona](https://github.com/Toblerity/Fiona)
55+
can handle, e.g. GeoJSON, Shapefile or GeoPackage) that defines an area of
56+
interest as a subset of a SALDi site. Providing a vector file outside the
57+
spatial extent of the SALDi sites will result in an empty dataset. Please note
58+
that the bounding box of the provided geometry is always used to load the
59+
data.
5860
- `time_range`: Filter the returned data temporally by providing a tuple of
59-
strings in the format `("YY-MM-dd", "YY-MM-dd")`, or `None` to return all
60-
available data.
61+
strings in the format _("YY-MM-dd", "YY-MM-dd")_, or _None_ to return all
62+
available data. If you want to use a different date format, you can also provide
63+
the parameter `time_pattern` with a string that specifies the format of the
64+
provided time strings.
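
As a rough illustration of how a `time_range` tuple and a `time_pattern` relate, the sketch below parses both date strings with Python's `datetime.strptime`. It assumes that `time_pattern` follows the `strptime` directive syntax (e.g. `%Y-%m-%d`), which this section does not state explicitly. `parse_time_range` is a hypothetical helper written for this example and is not part of the package.

```python
from datetime import datetime


def parse_time_range(time_range, time_pattern="%Y-%m-%d"):
    """Hypothetical helper: parse a (start, end) tuple of date strings.

    Assumes `time_pattern` uses strptime directives. Returns None if no
    time range is given, mirroring the "return all available data" case.
    """
    if time_range is None:
        return None  # no temporal filter requested
    start, end = (datetime.strptime(t, time_pattern) for t in time_range)
    if start > end:
        raise ValueError("start of time_range must not be after its end")
    return start, end


# The same period written in two different date formats
default_style = parse_time_range(("2021-01-01", "2021-12-31"))
custom_style = parse_time_range(("01.01.2021", "31.12.2021"),
                                time_pattern="%d.%m.%Y")
```

Both calls describe the same period; only the pattern used to read the strings differs.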
65+
66+
The following additional parameters are product-specific:
67+
6168
- `s2_apply_mask`: Apply a quality mask to the Sentinel-2 L2A product by using
62-
its `SCL`-band. The default value is `True`. As the name already suggests, this
63-
is only relevant for Sentinel-2 L2A data.
69+
its SCL-band. The default value is _True_.
6470
- `sanlc_year`: Select a specific year of the SANLC product by providing an
65-
integer in the format `YYYY`. The default value is `None`, which will return the
71+
integer in the format _YYYY_. The default value is _None_, which will return the
6672
product for all available years: 2018 & 2020.
6773

6874
```{warning}
6975
While it is possible to load data for an entire SALDi site by providing the site
70-
name (e.g. `"site06"`), please be aware that this will result in a large dataset
76+
name (e.g. _"site06"_), please be aware that this will result in a large dataset
7177
and will very likely result in performance issues if your workflow is not
7278
optimized.
7379
@@ -90,10 +96,10 @@ All data products except for the MSWEP product are loaded internally using the
9096
-function. As mentioned above, some loading parameters are set to default values
9197
to make this package beginner-friendly and easier to use. To be more precise,
9298
the following defaults are used:
93-
- `crs='EPSG:4326'`
94-
- `resolution=0.0002`
95-
- `resampling='bilinear'`
96-
- `chunks={'time': -1, 'latitude': 'auto', 'longitude': 'auto'}`
99+
- `crs='EPSG:4326'`
100+
- `resolution=0.0002`
101+
- `resampling='bilinear'`
102+
- `chunks={'time': -1, 'latitude': 'auto', 'longitude': 'auto'}`
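
To get a feel for what the default `resolution` of 0.0002 degrees means in terms of data volume, the following back-of-the-envelope sketch counts the pixels of a latitude/longitude box. The 1 by 1 degree box and the 4 bytes per value (float32) are assumptions chosen for illustration, and `estimate_pixels` is a hypothetical helper, not a function of the package.

```python
def estimate_pixels(width_deg, height_deg, resolution=0.0002):
    """Return (columns, rows) of a lat/lon box at a given pixel spacing in degrees."""
    cols = round(width_deg / resolution)
    rows = round(height_deg / resolution)
    return cols, rows


# Hypothetical area of interest: 1 degree by 1 degree
cols, rows = estimate_pixels(1.0, 1.0)

# Assumption: each value is stored as float32, i.e. 4 bytes
bytes_per_layer = cols * rows * 4

print(f"{cols} x {rows} pixels, "
      f"about {bytes_per_layer / 1e6:.0f} MB per band and time step")
```

Multiplying this figure by the number of bands and time steps gives a first impression of how quickly a dataset can grow.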
97103

98104
The default values for `crs` and `resolution`, for example, are the native CRS
99105
and resolution of the Sentinel-1 RTC and the Sentinel-2 L2A products (most bands
@@ -109,9 +115,9 @@ Dask's default).
109115

110116
If you want to override these defaults or add additional parameters that
111117
influence the loading process, you can do so by providing the
112-
`override_defaults`-parameter to the `load_product`-function. This parameter
113-
should be a dictionary with keys corresponding to parameter names of the
114-
[`odc.stac.load`](https://odc-stac.readthedocs.io/en/latest/_api/odc.stac.load.html#odc-stac-load)
118+
`override_defaults`-parameter to the [`load_product`](load_product-intro)
119+
-function. This parameter should be a dictionary with keys corresponding to
120+
parameter names of the [`odc.stac.load`](https://odc-stac.readthedocs.io/en/latest/_api/odc.stac.load.html#odc-stac-load)
115121
-function and values corresponding to the desired values. It is also possible to
116122
partially override the defaults while keeping the rest unchanged. The following
117123
is a simple example of how to override only the default `resolution`-parameter
@@ -127,6 +133,16 @@ s1_data = load_product(product="s1_rtc",
127133
override_defaults=override_defaults)
128134
```
129135

136+
```{note}
137+
The above example might be a bit misleading, especially for beginners, as we
138+
can't magically increase the spatial resolution of Earth Observation data by
139+
simply changing a parameter called `resolution`. What we do here instead is
140+
change the pixel spacing of the loaded data. Both terms are often used
141+
interchangeably, but they are not the same. Please keep this in mind! In the
142+
example we need to use the term `resolution` as this is the name of the
143+
corresponding parameter of the `odc.stac.load`-function.
144+
```
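
The partial override described above can be pictured as a plain dictionary merge. The sketch below is not the package's actual implementation; `merge_load_params` is a hypothetical helper, and the override value of 0.001 is arbitrary. Only the default values themselves are taken from the list in this section.

```python
# Defaults as listed in this section
DEFAULTS = {
    "crs": "EPSG:4326",
    "resolution": 0.0002,
    "resampling": "bilinear",
    "chunks": {"time": -1, "latitude": "auto", "longitude": "auto"},
}


def merge_load_params(override_defaults=None):
    """Hypothetical helper: combine the defaults with user-provided overrides.

    Keys present in `override_defaults` replace the default value,
    all other defaults are kept unchanged.
    """
    params = dict(DEFAULTS)  # shallow copy so DEFAULTS itself stays untouched
    params.update(override_defaults or {})
    return params


# Override only the pixel spacing; crs, resampling and chunks keep their defaults
params = merge_load_params({"resolution": 0.001})
```

A merge like this is one way to let beginners rely on the defaults while more experienced users change individual parameters.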
145+
130146
(xarray-dask-intro)=
131147
## Xarray, Dask and lazy loading
132148
