Commit 5b0d4f8

update croptype notebooks with new querying public extractions flow
1 parent 6350015 commit 5b0d4f8

2 files changed: +130 -116 lines changed

notebooks/worldcereal_v1_demo_custom_croptype.ipynb

Lines changed: 62 additions & 55 deletions
@@ -24,12 +24,13 @@
 " \n",
 "- [Before you start](###-Before-you-start)\n",
 "- [1. Define your region of interest](#1.-Define-your-region-of-interest)\n",
-"- [2. Extract public reference data](#2.-Extract-public-reference-data)\n",
-"- [3. Select your desired crop types](#3.-Select-your-desired-crop-types)\n",
-"- [4. Prepare training features](#4.-Prepare-training-features)\n",
-"- [5. Train custom classification model](#5.-Train-custom-classification-model)\n",
-"- [6. Deploy your custom model](#6.-Deploy-your-custom-model)\n",
-"- [7. Generate a map](#7.-Generate-a-map)\n"
+"- [2. Define your temporal extent](#2.-Define-your-temporal-extent)\n",
+"- [3. Extract public reference data](#3-extract-public-reference-data)\n",
+"- [4. Select your desired crop types](#4.-Select-your-desired-crop-types)\n",
+"- [5. Prepare training features](#5.-Prepare-training-features)\n",
+"- [6. Train custom classification model](#6.-Train-custom-classification-model)\n",
+"- [7. Deploy your custom model](#7.-Deploy-your-custom-model)\n",
+"- [8. Generate a map](#8.-Generate-a-map)\n"
 ]
 },
 {
@@ -81,7 +82,48 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### 2. Extract public reference data\n",
+"### 2. Define your temporal extent\n",
+"\n",
+"To determine your season of interest, you can consult the WorldCereal crop calendars (by executing the next cell), or check out the [USDA crop calendars](https://ipad.fas.usda.gov/ogamaps/cropcalendar.aspx)."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"from utils import retrieve_worldcereal_seasons\n",
+"\n",
+"spatial_extent = map.get_processing_extent()\n",
+"seasons = retrieve_worldcereal_seasons(spatial_extent)"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Now use the slider to select your processing period. Note that the length of the period is always fixed to a year.\n",
+"Just make sure your season of interest is fully captured within the period you select."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"from utils import date_slider\n",
+"\n",
+"slider = date_slider()\n",
+"slider.show_slider()"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"### 3. Extract public reference data\n",
 "\n",
 "Here we query existing reference data that have already been processed by WorldCereal and are ready to use.\n",
 "To increase the number of hits, we expand the search area by 250 km in all directions.\n",
@@ -97,19 +139,22 @@
 "source": [
 "from worldcereal.utils.refdata import query_public_extractions\n",
 "\n",
-"# retrieve the polygon you just drew\n",
+"# Retrieve the polygon you just drew\n",
 "polygon = map.get_polygon_latlon()\n",
 "\n",
+"# Retrieve the date range you just selected\n",
+"processing_period = slider.get_processing_period()\n",
+"\n",
 "# Query our public database of training data\n",
-"public_df = query_public_extractions(polygon)\n",
+"public_df = query_public_extractions(polygon, processing_period=processing_period)\n",
 "public_df.year.value_counts()"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### 3. Select your desired crop types\n",
+"### 4. Select your desired crop types\n",
 "\n",
 "Run the next cell and select all crop types you wish to include in your model. All the crops that are not selected will be grouped under the \"other\" category."
 ]
@@ -150,9 +195,9 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### 4. Prepare training features\n",
+"### 5. Prepare training features\n",
 "\n",
-"Using a deep learning framework (Presto), we derive classification features for each sample. The resulting `encodings` and `targets` will be used for model training."
+"Using a deep learning framework (Presto), we derive classification features for each sample in the dataframe resulting from your query. Presto was pre-trained on millions of unlabeled samples around the world and finetuned on global labelled land cover and crop type data from the WorldCereal reference database. The resulting *embeddings* and the *target* labels to train on will be returned as a training dataframe which we will use for downstream model training."
 ]
 },
 {
@@ -170,8 +215,8 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### 5. Train custom classification model\n",
-"We train a catboost model for the selected crop types. Class weights are automatically determined to balance the individual classes."
+"### 6. Train custom classification model\n",
+"We train a catboost model for the selected crop types. By default, no class weighting is done. You could opt to enable this by setting `balance_classes=True`, however, depending on the class distribution this may lead to undesired results. There is no golden rule here."
 ]
 },
 {
@@ -182,7 +227,7 @@
 "source": [
 "from utils import train_classifier\n",
 "\n",
-"custom_model, report, confusion_matrix = train_classifier(training_dataframe)"
+"custom_model, report, confusion_matrix = train_classifier(training_dataframe, balance_classes=False)"
 ]
 },
 {
@@ -206,7 +251,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### 6. Deploy your custom model\n",
+"### 7. Deploy your custom model\n",
 "\n",
 "Once trained, we have to upload our model to the cloud so it can be used by OpenEO for inference. Note that these models are only kept in cloud storage for a limited amount of time.\n"
 ]
@@ -230,48 +275,10 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### 7. Generate a map\n",
+"### 8. Generate a map\n",
 "\n",
 "Using our custom model, we generate a map for our region and season of interest.\n",
-"To determine your season of interest, you can consult the WorldCereal crop calendars (by executing the next cell), or check out the [USDA crop calendars](https://ipad.fas.usda.gov/ogamaps/cropcalendar.aspx)."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"from utils import retrieve_worldcereal_seasons\n",
-"\n",
-"spatial_extent = map.get_processing_extent()\n",
-"seasons = retrieve_worldcereal_seasons(spatial_extent)"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"Now use the slider to select your processing period. Note that the length of the period is always fixed to a year.\n",
-"Just make sure your season of interest is fully captured within the period you select."
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"from utils import date_slider\n",
 "\n",
-"slider = date_slider()\n",
-"slider.show_slider()"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
 "Set some other customization options:"
 ]
 },
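
Note on `balance_classes`: this commit turns class weighting off by default and warns that enabling it may give undesired results depending on the class distribution. The diff does not show how `train_classifier` computes weights, so the following is only a minimal sketch of one common scheme (inverse-frequency weights); the helper name `balanced_class_weights` and the sample labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Illustrative only: this is one common inverse-frequency scheme,
    not necessarily what train_classifier does internally.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical, imbalanced label set: 8 maize samples, 2 wheat samples.
labels = ["maize"] * 8 + ["wheat"] * 2
weights = balanced_class_weights(labels)
print(weights)
```

With a distribution like this, the rare class is weighted four times as heavily as the common one, which illustrates why a strongly skewed dataset can push the model toward over-predicting rare crops when balancing is enabled.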

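Note on the processing period: the notebook text states that the period is always fixed to a year and that the season of interest must fall fully inside it. The return type of `slider.get_processing_period()` is not visible in this diff, so the sketch below assumes plain `datetime.date` pairs; `one_year_period` and `season_within_period` are hypothetical helpers, not part of the WorldCereal utilities.

```python
from datetime import date, timedelta


def one_year_period(start):
    """Return (start, end) covering exactly one year from `start`."""
    try:
        next_start = start.replace(year=start.year + 1)
    except ValueError:
        # start is 29 February; the following year has no such day
        next_start = date(start.year + 1, 3, 1)
    return start, next_start - timedelta(days=1)


def season_within_period(season, period):
    """True if the (start, end) season lies fully inside the period."""
    return period[0] <= season[0] and season[1] <= period[1]


# Hypothetical dates, chosen only for illustration.
period = one_year_period(date(2021, 10, 1))
winter_season = (date(2021, 11, 15), date(2022, 7, 31))
late_season = (date(2022, 6, 1), date(2022, 11, 30))

print(season_within_period(winter_season, period))
print(season_within_period(late_season, period))
```

A check of this kind could be run before querying reference data, since after this commit the selected period also filters the public extractions.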
0 commit comments
