Commit

Updated readme
To describe how to use invoke commands
PuchatekwSzortach committed Dec 16, 2022
1 parent ca27abb commit 9fd5405
Showing 1 changed file with 32 additions and 12 deletions.
44 changes: 32 additions & 12 deletions README.md
@@ -14,7 +14,7 @@ A short description of each change follows.

### Data analysis
The original paper defined the sizes and prediction layers of the default boxes used in the network without any reference to the sizes of the objects it was trying to detect.
In this work we provide a script to analyze the sizes and aspect ratios of objects in a dataset, which can be used to guide network architecture design.
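
As a rough illustration of this kind of analysis (not the script shipped with the project), the sketch below computes widths, heights and aspect ratios for a handful of boxes. The function name `box_stats` and the sample annotations are hypothetical, and boxes are assumed to be in `[min_x, min_y, max_x, max_y]` format.

```python
import statistics


def box_stats(boxes):
    """Return a (width, height, aspect_ratio) tuple for each box.

    Boxes are assumed to be in [min_x, min_y, max_x, max_y] format.
    Degenerate boxes with non-positive width or height are skipped.
    """
    stats = []
    for min_x, min_y, max_x, max_y in boxes:
        width = max_x - min_x
        height = max_y - min_y
        if width <= 0 or height <= 0:
            continue
        stats.append((width, height, width / height))
    return stats


# Hypothetical annotations, for illustration only
annotations = [(10, 20, 110, 70), (0, 0, 50, 100), (30, 30, 90, 90)]

stats = box_stats(annotations)
median_width = statistics.median(width for width, _, _ in stats)
median_aspect_ratio = statistics.median(ratio for _, _, ratio in stats)
```

Summaries like these, computed over a whole dataset, suggest which default box sizes and aspect ratios are worth configuring.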

### Maximum theoretical network recall analysis
Given the sizes and arrangement of default boxes in SSD, it's possible to compute the maximum theoretical recall of the network against a given dataset.
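
A minimal sketch of the idea follows. It assumes an annotation counts as detectable when at least one default box overlaps it with an IoU of 0.5 or more; that criterion, the function names and the sample boxes are all assumptions made for illustration and may differ from the project's own analysis code.

```python
def iou(box_a, box_b):
    """Intersection over union of two [min_x, min_y, max_x, max_y] boxes."""
    overlap_x = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    overlap_y = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = overlap_x * overlap_y
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def max_theoretical_recall(annotations, default_boxes, iou_threshold=0.5):
    """Fraction of annotations that at least one default box can match."""
    if not annotations:
        return 0.0
    matched = sum(
        1
        for annotation in annotations
        if any(iou(annotation, box) >= iou_threshold for box in default_boxes)
    )
    return matched / len(annotations)


# Hypothetical default boxes and annotations, for illustration only
default_boxes = [(0, 0, 100, 100), (100, 0, 200, 100)]
annotations = [(10, 10, 100, 100), (0, 0, 20, 20), (110, 0, 200, 90)]

recall_ceiling = max_theoretical_recall(annotations, default_boxes)
```

Here the small `(0, 0, 20, 20)` annotation overlaps no default box well enough, so no amount of training could recover it with this box layout.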
@@ -23,17 +23,17 @@ Knowing that ceiling helps to inform both training and architecture design decis
### Only blocks 2 to 5 of VGG backbone are used for predictions
From data analysis one can compute a set of optimal default boxes and their placement for a given dataset.
It turns out that for the PASCAL VOC 2012 dataset, using layers above block 5 is not necessary: all objects in the dataset
can be detected with boxes placed on earlier layers.
The analysis also shows that over 70% of annotations can be matched only by default boxes placed on blocks 2 and 3.

### Configurable network architecture
The provided configuration file controls which of the major VGG blocks' outputs are used to construct prediction
heads, as well as the sizes and aspect ratios of the default boxes placed on them.
No code changes are necessary to adjust the network to the configuration optimal for a given dataset.

### Use of corner boxes instead of center boxes
The original SSD network defines default boxes in `[center_x, center_y, width, height]` format.
This work uses an alternative `[min_x, min_y, max_x, max_y]` format.
The two formats are interchangeable, but the latter is far more popular among computer vision frameworks and easier to work with.
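
The conversion between the two formats is simple arithmetic. The helper names below are hypothetical and are not taken from the project's code.

```python
def center_to_corner(box):
    """Convert [center_x, center_y, width, height] to [min_x, min_y, max_x, max_y]."""
    center_x, center_y, width, height = box
    return [
        center_x - width / 2,
        center_y - height / 2,
        center_x + width / 2,
        center_y + height / 2,
    ]


def corner_to_center(box):
    """Convert [min_x, min_y, max_x, max_y] to [center_x, center_y, width, height]."""
    min_x, min_y, max_x, max_y = box
    return [
        (min_x + max_x) / 2,
        (min_y + max_y) / 2,
        max_x - min_x,
        max_y - min_y,
    ]


corner_box = center_to_corner([50, 40, 20, 10])
round_trip = corner_to_center(corner_box)
```

Converting a box to corner format and back returns the original values, which is what makes the two representations interchangeable.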

@@ -43,7 +43,7 @@ Offsets losses are just square error losses scaled by boxes sizes, and computed
### Network operates on original image resolution
The original SSD scales images to 300x300 or 500x500 resolution.
This has several disadvantages, especially for the VOC dataset:
- object aspect ratios might be distorted, and the distortion factor varies across images
- data analysis becomes more difficult, which in turn makes finding an optimal network configuration more difficult
- for most VOC images this rescaling decreases image resolution, making small objects, which are already particularly hard to detect, even smaller
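
The aspect ratio distortion can be seen with a little arithmetic. The sketch below assumes a square 300x300 target and uses illustrative image and object sizes; the function name is hypothetical.

```python
def rescaled_aspect_ratio(box_width, box_height, image_width, image_height, target_size=300):
    """Aspect ratio of a box after its image is resized to target_size x target_size."""
    scale_x = target_size / image_width
    scale_y = target_size / image_height
    return (box_width * scale_x) / (box_height * scale_y)


# The same square 100x100 object in a landscape and in a portrait image
landscape_ratio = rescaled_aspect_ratio(100, 100, image_width=500, image_height=375)
portrait_ratio = rescaled_aspect_ratio(100, 100, image_width=375, image_height=500)
```

The square object ends up with an aspect ratio of about 0.75 in the landscape image and about 1.33 in the portrait one, so the distortion depends on the shape of each image.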

@@ -62,21 +62,42 @@ Following scripts are provided in the `scripts` directory
The locations of data and models, training hyperparameters, and other inputs for all scripts are controlled through a configuration file parameter.
`config.yaml` provides a sample configuration.

### Running code
A Dockerfile for building the container in which the project can be run is provided at `./docker/app.Dockerfile`.
The helper command `invoke host.build-app-container` builds the container.

You can use `invoke host.run-app-container` to start the container. It wraps the `docker run` command, setting useful options such as enabling GPU support and mounting the logs directory.

Once inside the Docker container, you can use [invoke](https://www.pyinvoke.org/) to run the provided commands.
Use `invoke --list` to see all available commands.
The most frequently used commands are:
- `train.train-object-detection-model` - for training the model
- `analyze.analyze-objects-detections-predictions` - for analyzing the accuracy of predictions
- `visualize.log-predictions` - for visualizing predictions

### Using with PASCAL VOC 2012 dataset

The dataset is not included with this repository. Please download it from the [official webpage](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit).
Once downloaded, adjust `config.yaml` so that its relevant section points to the path with the data.

A few sample predictions on the VOC 2012 dataset made with a trained model are shown below.

### Accuracy
```
Dataset used: VOC Pascal 2012
Confidence threshold used: 0.5
Recall: 0.441
Precision: 0.764
```
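
For reference, recall and precision are derived from counts of true positives, false positives and false negatives at the chosen confidence threshold. The sketch below shows the standard formulas; the counts are made up for illustration and are not the ones behind the figures above.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall), using 0.0 when a denominator is zero."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, for illustration only
precision, recall = precision_recall(true_positives=3, false_positives=1, false_negatives=2)
```

Raising the confidence threshold typically trades recall for precision, since fewer predictions are kept.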

#### Good prediction
![alt text](./images/good_prediction.png)

#### Typical prediction - many objects are correctly detected, but a few are off
![alt text](./images/typical_prediction.png)

#### Bad prediction
![alt text](./images/bad_prediction.png)

### Using with other datasets

@@ -85,7 +106,6 @@ In most cases the only changes you would need to do are:
- implement a data loader - look at `net.data.VOCSamplesDataLoader` for reference
- adjust configuration file to load data from appropriate path

Of course, I would then advise using the tools this project provides to define an optimal network configuration for your dataset, going through the
`data analysis -> network configuration adjustments -> theoretical network performance analysis loop -> training -> model performance analysis`
loop.
