This directory provides examples and best practices for building object detection systems. Our goal is to enable users to bring their own datasets and train a high-accuracy model easily and quickly. To this end, we provide example notebooks with preset default parameters shown to work well on a variety of datasets, along with extensive documentation of common pitfalls and best practices.
The examples cover three scenarios: object detection, object detection and segmentation, and object detection and keypoint localization.
Object Detection is one of the main problems in Computer Vision. Traditionally, this required expert knowledge to identify and implement so-called “features” that highlight the position of objects in the image. Starting with the famous AlexNet paper in 2012, followed by the Fast(er) R-CNN papers in 2015, Deep Neural Networks have been used to find these features automatically. This led to a huge improvement in the field for a large range of problems.
This repository uses torchvision's Faster R-CNN implementation, which has been shown to work well on a wide variety of Computer Vision problems. See the FAQ for an explanation of the underlying data science aspects.
We recommend running these samples on a machine with a GPU, on either Linux or Windows (~20% slower). While a GPU is technically not required, training becomes prohibitively slow even with only a few dozen images.
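To check whether a GPU is visible before starting a notebook, something like the following can help. It is a minimal sketch that assumes PyTorch is the framework in use; the helper name `describe_device` is made up for this example.

```python
def describe_device() -> str:
    """Return a short, human-readable summary of the available compute device."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed"

    if torch.cuda.is_available():
        # Index 0 is the first CUDA device visible to this process.
        return f"GPU available: {torch.cuda.get_device_name(0)}"
    return "No GPU found; training will run on the CPU and may be very slow"


print(describe_device())
```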
Answers to frequently asked questions such as "How does the technology work?" can be found in the FAQ located in this folder. For generic questions such as "How many training examples do I need?" or "How to monitor GPU usage during training?" see the FAQ.md in the classification folder.
We provide several notebooks to show how object detection algorithms can be designed and evaluated:
Notebook name | Description |
---|---|
00_webcam.ipynb | Quick-start notebook which demonstrates how to build an object detection system using a single image or webcam as input. |
01_training_introduction.ipynb | Notebook which explains the basic concepts around model training and evaluation. |
02_mask_rcnn.ipynb | Notebook which, in addition to detecting objects, also finds their precise pixel masks in an image. |
03_keypoint_rcnn.ipynb | Notebook which shows how to (i) run a pre-trained model for human pose estimation; and (ii) train a custom keypoint detection model. |
11_exploring_hyperparameters_on_azureml.ipynb | Performs highly parallel parameter sweeping using AzureML's HyperDrive. |
12_hard_negative_sampling.ipynb | Demonstrates how to sample hard negatives to improve model performance. |
20_deployment_on_kubernetes.ipynb | Deploys a trained model on Kubernetes using AzureML. |
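The idea behind hard negative sampling can be sketched in a few lines. The example below assumes that a trained detector has already been run on images known to contain no objects of interest, so every detection is a false positive; it then picks the images with the most confident false positives as candidates to add to the training set. The function name, the file names, and the score threshold are hypothetical and are not taken from the notebook.

```python
from typing import Dict, List


def select_hard_negatives(
    scores_per_image: Dict[str, List[float]],
    top_k: int,
    min_score: float = 0.5,
) -> List[str]:
    """Return up to top_k negative images, ranked by their highest false-positive score."""
    max_scores = {
        name: max(scores)
        for name, scores in scores_per_image.items()
        if scores and max(scores) >= min_score
    }
    # Highest-scoring false positives first: these confuse the model the most.
    return sorted(max_scores, key=max_scores.get, reverse=True)[:top_k]


# Hypothetical detection scores on four negative images.
detections = {
    "neg_001.jpg": [0.91, 0.40],
    "neg_002.jpg": [],
    "neg_003.jpg": [0.62],
    "neg_004.jpg": [0.30],
}
hard_negatives = select_hard_negatives(detections, top_k=2)
print(hard_negatives)  # ['neg_001.jpg', 'neg_003.jpg']
```

Retraining with the selected images included as background-only examples is one common way to reduce such false positives.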
See the contribution guidelines in the root folder.