Commit bcd0f35: Update hyperparameter tuning section for dollar street dataset
svenvanderburg committed May 20, 2024 (1 parent: 4d3ae69)
Showing 1 changed file with 55 additions and 48 deletions: episodes/4-advanced-layer-types.Rmd
This is called hyperparameter tuning.
## Hyperparameter tuning
::: instructor
## Do a live demo instead of live coding
You might want to demonstrate this section on hyperparameter tuning instead of doing live coding.
The goal is to show that hyperparameter tuning can be done easily with `keras_tuner`, not to memorize the exact syntax of how to do it. This will probably save you half an hour of participants typing over code that they already know from before. In addition, on really slow machines, running the grid search could take more than 10 minutes.
:::

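For reference, a minimal sketch of what `create_nn_with_hp` could look like (the exact architecture used in the episode may differ; `train_images` is the training set loaded earlier):

```python
from tensorflow import keras

def create_nn_with_hp(dropout_rate, n_layers):
    """Create a CNN whose number of conv blocks and dropout rate are tunable."""
    inputs = keras.Input(shape=train_images.shape[1:])
    x = inputs
    for _ in range(n_layers):
        x = keras.layers.Conv2D(32, (3, 3), activation='relu')(x)
        x = keras.layers.MaxPooling2D((2, 2))(x)
    x = keras.layers.Flatten()(x)
    x = keras.layers.Dropout(dropout_rate)(x)
    x = keras.layers.Dense(10, activation='softmax')(x)  # 10 image classes
    return keras.Model(inputs=inputs, outputs=x, name="dollar_street_model")
```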
Now, let's find the best combination of hyperparameters using grid search.
Grid search is the simplest hyperparameter tuning strategy: you test all combinations of predefined values for the hyperparameters that you want to vary.
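
To make "all combinations" concrete, here is a plain-Python sketch of what grid search enumerates for the values we will use below:

```python
from itertools import product

n_layers_options = [1, 2]
dropout_rate_options = [0.2, 0.5, 0.8]

# Grid search trains one model per combination: 2 x 3 = 6 models in total.
for n_layers, dropout_rate in product(n_layers_options, dropout_rate_options):
    print(f"n_layers={n_layers}, dropout_rate={dropout_rate}")
```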

For this we will make use of the package `keras_tuner`, which we can install by typing in the command line:
```bash
pip install keras_tuner
```
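
If you want to check that the installation succeeded, you can import the package and print its version (a quick sanity check):

```python
import keras_tuner
print(keras_tuner.__version__)
```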

Note that this can take some time to train (around 5 minutes or longer).

```python
import keras_tuner
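
# A minimal reconstruction of the hypermodel, assuming the search space
# shown in the results below (the exact definition may differ):
def build_model(hp):
    n_layers = hp.Int("n_layers", min_value=1, max_value=2)
    dropout_rate = hp.Choice("dropout_rate", values=[0.2, 0.5, 0.8])
    model = create_nn_with_hp(dropout_rate, n_layers)
    compile_model(model)
    return model

tuner = keras_tuner.GridSearch(build_model, objective='val_loss')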

tuner.search(train_images, train_labels, epochs=20,
             validation_data=(val_images, val_labels))
```
```output
Trial 6 Complete [00h 00m 19s]
val_loss: 2.086069345474243

Best val_loss So Far: 2.086069345474243
Total elapsed time: 00h 01m 28s
```
Let's have a look at the results:

```python
tuner.results_summary()
```
```output
Results summary
Results in ./untitled_project
Showing 10 best trials
Objective(name="val_loss", direction="min")
Trial 0005 summary
Hyperparameters:
n_layers: 2
dropout_rate: 0.8
Score: 2.086069345474243
Trial 0000 summary
Hyperparameters:
n_layers: 1
dropout_rate: 0.2
Score: 2.101102352142334
Trial 0001 summary
Hyperparameters:
n_layers: 1
dropout_rate: 0.5
Score: 2.1184325218200684
Trial 0003 summary
Hyperparameters:
n_layers: 2
dropout_rate: 0.2
Score: 2.1233835220336914
Trial 0002 summary
Hyperparameters:
n_layers: 1
dropout_rate: 0.8
Score: 2.1370232105255127
Trial 0004 summary
Hyperparameters:
n_layers: 2
dropout_rate: 0.5
Score: 2.143627882003784
```
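
Besides the printed summary, the tuner can also return the best hyperparameters programmatically, for example:

```python
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hps.values)  # e.g. {'n_layers': 2, 'dropout_rate': 0.8}
```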

::: challenge
1: Looking at the grid search results, select all correct statements:

- A. 6 different models were trained in this grid search run, because there are 6 possible combinations for the defined hyperparameter values
- B. 2 different models were trained, 1 for each hyperparameter that we want to change
- C. 1 model is trained with 6 different hyperparameter combinations
- D. The model with 2 layers and a dropout rate of 0.5 is the best model with a validation loss of 2.144
- E. The model with 2 layers and a dropout rate of 0.8 is the best model with a validation loss of 2.086
- F. We found the model with the best possible combination of dropout rate and number of layers

2 (Optional): Perform a grid search to find the best combination of the following hyperparameters: 2 different activation functions ('relu' and 'tanh') and 2 different values for the kernel size (3 and 4). Which combination works best?

:::: solution

1: The correct statements are A and E.

2: A sketch of the hypermodel, assuming a hypothetical helper `create_nn_with_kernel_and_activation` in place of the episode's exact model-construction code:

```python
def build_model(hp):
    # Search space: kernel sizes 3 and 4, activations relu and tanh
    kernel_size = hp.Choice("kernel_size", values=[3, 4])
    activation = hp.Choice("activation", values=["relu", "tanh"])
    # create_nn_with_kernel_and_activation is hypothetical; the episode's
    # own model-building code would go here
    model = create_nn_with_kernel_and_activation(kernel_size, activation)
    compile_model(model)
    return model

tuner = keras_tuner.GridSearch(build_model, objective='val_loss', project_name='new_project')
tuner.search(train_images, train_labels, epochs=20,
             validation_data=(val_images, val_labels))
```
```output
Trial 4 Complete [00h 00m 25s]
val_loss: 2.0591845512390137

Best val_loss So Far: 2.0277602672576904
Total elapsed time: 00h 01m 30s
```
Let's look at the results:
```python
tuner.results_summary()
```
```output
Results summary
Results in ./new_project
Showing 10 best trials
Objective(name="val_loss", direction="min")
Trial 0001 summary
Hyperparameters:
kernel_size: 3
activation: tanh
Score: 2.0277602672576904
Trial 0003 summary
Hyperparameters:
kernel_size: 4
activation: tanh
Score: 2.0591845512390137
Trial 0000 summary
Hyperparameters:
kernel_size: 3
activation: relu
Score: 2.123767614364624
Trial 0002 summary
Hyperparameters:
kernel_size: 4
activation: relu
Score: 2.150160551071167
```
A kernel size of 3 and `tanh` as activation function is the best tested combination.

::::
:::
Let's save our model:

```python
model.save('cnn_model')
```
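
To use the saved model later, it can be loaded back, for example:

```python
from tensorflow import keras

model = keras.models.load_model('cnn_model')
```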

## Conclusion and next steps
How successful were we with creating a model here?
With ten image classes, and assuming that we would not ask the model to classify an image that contains none of the given classes of object, a model working on complete guesswork would be correct 10% of the time.
Against this baseline accuracy of 10%, and considering the diversity and relatively low resolution of the example images, perhaps our last model's validation accuracy of ~30% is not too bad.