xirui-li/Interpretation_DETR

PyTorch implementation of Visualization of Deformable Attention in Deformable DETR, based on the following work:

  • Deformable DETR: Deformable Transformers for End-to-End Object Detection (link)

  • End-to-End Object Detection with Transformers (link)

  • Visualization demo (link)

This repository is an official implementation of the thesis: Visualization of Deformable Attention in Deformable DETR.

Introduction

TL;DR. Deformable DETR is an efficient and fast-converging end-to-end object detector. It mitigates the high complexity and slow convergence issues of DETR via a novel sampling-based efficient attention mechanism.

Abstract. The Deformable DEtection TRansformer (Deformable DETR), which combines a transformer architecture with deformable attention mechanisms, achieves high performance on object detection tasks. In scenarios with high safety standards, such as autonomous driving, it is crucial to be able to examine the decision-making procedure of detection algorithms. This thesis addresses the interpretability of Deformable DETR by visualizing its deformable attention mechanism, a sampling procedure that weights selected points from the image feature space. First, Deformable DETR is a more intuitive detection approach for humans because of the similarity between the deformable attention mechanism and the human visual mechanism. The thesis then investigates the observability of the sampling method at the module level, the layer level, and the inner-layer level of the network architecture; this observability provides panoramic interpretability from macro to micro. In addition, the observability of the multi-head and multi-scale mechanisms further enriches the interpretability, enabling task-oriented interventions in the spirit of the human-in-the-loop concept for better task results. Through a large number of experiments, this thesis demonstrates that Deformable DETR is an interpretable approach with potential for explainable artificial intelligence.
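
In a deformable attention module, each query attends to only a small set of K sampling points around a reference point, rather than to all spatial locations; the sampled features are combined using learned attention weights. The following is a minimal single-head, single-scale sketch of that computation in PyTorch (an illustration only; the repository's compiled operator implements the full multi-head, multi-scale version):

import torch
import torch.nn.functional as F

def deformable_attention_sketch(feat, ref_points, offsets, weights):
    # feat:       (1, C, H, W) image feature map
    # ref_points: (Q, 2) reference points in [0, 1], as (x, y)
    # offsets:    (Q, K, 2) learned sampling offsets, in normalized units
    # weights:    (Q, K) attention weights (softmax over the K points)
    loc = ref_points[:, None, :] + offsets              # (Q, K, 2) sampling locations
    grid = (loc * 2 - 1).unsqueeze(0)                   # grid_sample expects [-1, 1]
    sampled = F.grid_sample(feat, grid, align_corners=False)  # (1, C, Q, K)
    out = (sampled * weights[None, None]).sum(-1)       # weight and sum the K points
    return out.squeeze(0).transpose(0, 1)               # (Q, C), one vector per query

feat = torch.randn(1, 256, 32, 32)                      # toy single-scale feature map
ref = torch.rand(100, 2)                                # 100 queries
off = 0.05 * torch.randn(100, 4, 2)                     # K=4 sampling points per query
w = torch.softmax(torch.randn(100, 4), dim=-1)
print(deformable_attention_sketch(feat, ref, off, w).shape)  # torch.Size([100, 256])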

License

This project is released under the Apache 2.0 license.

Installation

Requirements

  • Linux, CUDA>=9.2, GCC>=5.4

  • Python>=3.7

We recommend using Anaconda to create a conda environment:

    conda create -n deformable_detr python=3.7 pip

    Then, activate the environment:

    conda activate deformable_detr
  • PyTorch>=1.5.1, torchvision>=0.6.1 (following instructions here)

For example, if your CUDA version is 9.2, you could install PyTorch and torchvision as follows:

    conda install pytorch=1.5.1 torchvision=0.6.1 cudatoolkit=9.2 -c pytorch
  • Other requirements

    pip install -r requirements.txt

Compiling CUDA operators

cd ./models/ops
sh ./make.sh
# unit test (should see all checking is True)
python test.py

Usage

Dataset preparation

Please download the COCO 2017 dataset and organize it as follows:

code_root/
└── data/
    └── coco/
        ├── train2017/
        ├── val2017/
        └── annotations/
            ├── instances_train2017.json
            └── instances_val2017.json
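
If you do not have COCO 2017 locally, the archives can be fetched from the official mirrors, e.g.:

# Download and unpack COCO 2017 into ./data/coco.
mkdir -p data/coco && cd data/coco
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
unzip -q train2017.zip && unzip -q val2017.zip && unzip -q annotations_trainval2017.zip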

Training

Training on single node

For example, the command for training Deformable DETR on 8 GPUs is as follows:

GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/r50_deformable_detr.sh

Training on multiple nodes

For example, the command for training Deformable DETR on 2 nodes, each with 8 GPUs, is as follows:

On node 1:

MASTER_ADDR=<IP address of node 1> NODE_RANK=0 GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 16 ./configs/r50_deformable_detr.sh

On node 2:

MASTER_ADDR=<IP address of node 1> NODE_RANK=1 GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 16 ./configs/r50_deformable_detr.sh

Training on slurm cluster

If you are using a slurm cluster, you can simply run the following command to train on 1 node with 8 GPUs:

GPUS_PER_NODE=8 ./tools/run_dist_slurm.sh <partition> deformable_detr 8 configs/r50_deformable_detr.sh

Or on 2 nodes, each with 8 GPUs:

GPUS_PER_NODE=8 ./tools/run_dist_slurm.sh <partition> deformable_detr 16 configs/r50_deformable_detr.sh

Some tips to speed up training

  • If your file system is slow at reading images, consider enabling the '--cache_mode' option to load the whole dataset into memory at the beginning of training.
  • You may increase the batch size to maximize GPU utilization, according to your GPU memory, e.g., set '--batch_size 3' or '--batch_size 4' (see the example command after this list).
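
For example, assuming the config script forwards extra flags to main.py as in the upstream Deformable DETR repository, both options can be combined:

GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 ./configs/r50_deformable_detr.sh --batch_size 4 --cache_mode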

Evaluation

You can get the config file and pretrained model of Deformable DETR (the link is in the "Main Results" section), then run the following command to evaluate it on the COCO 2017 validation set:

<path to config file> --resume <path to pre-trained model> --eval

You can also run distributed evaluation by using ./tools/run_dist_launch.sh or ./tools/run_dist_slurm.sh.
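
For example, distributed evaluation on 8 GPUs would look like this (same placeholders as above):

GPUS_PER_NODE=8 ./tools/run_dist_launch.sh 8 <path to config file> --resume <path to pre-trained model> --eval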

Visualization

Visualization of images

You can plot an input image from the dataset with its ground-truth bounding boxes and labels by running:

python main_training_data_vis.py
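
As an illustration of what such a plot involves, here is a minimal sketch using pycocotools and matplotlib (names and interface are illustrative, not the script's actual API):

import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image
from pycocotools.coco import COCO

def show_ground_truth(img_id, coco_root='data/coco'):
    coco = COCO(f'{coco_root}/annotations/instances_train2017.json')
    info = coco.loadImgs(img_id)[0]
    image = Image.open(f"{coco_root}/train2017/{info['file_name']}")
    plt.imshow(image)
    ax = plt.gca()
    for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_id)):
        x, y, w, h = ann['bbox']   # COCO format: (x, y, width, height)
        ax.add_patch(patches.Rectangle((x, y), w, h, fill=False, edgecolor='lime'))
        ax.text(x, y, coco.loadCats(ann['category_id'])[0]['name'], color='lime')
    plt.axis('off')
    plt.show()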

Visualization of the evolution of AP

You can parse a model's output log file to visualize the evolution of loss or AP over epochs by running:

python main_plot.py
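
Upstream Deformable DETR appends one JSON record per epoch to log.txt in the experiment's output directory; a minimal sketch of plotting AP from such a log (the field name and default path below follow the upstream convention and are assumptions about this repository's logs):

import json
import matplotlib.pyplot as plt

def plot_ap(log_path='exps/r50_deformable_detr/log.txt'):
    # Each line of log.txt is a JSON dict of one epoch's metrics.
    with open(log_path) as f:
        epochs = [json.loads(line) for line in f]
    # 'test_coco_eval_bbox' holds the 12 COCO metrics; index 0 is AP@[0.5:0.95].
    ap = [e['test_coco_eval_bbox'][0] for e in epochs]
    plt.plot(range(len(ap)), ap)
    plt.xlabel('epoch')
    plt.ylabel('COCO AP')
    plt.show()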

Visualization of deformable attention

You can plot the sampling strategy of a query (at the encoder/decoder, head, level, and layer granularity) by running:

python main_visual.py
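
The kind of figure this produces overlays a query's reference point and its weighted sampling points on the input image. A minimal matplotlib sketch of such an overlay (function and argument names here are illustrative, not the script's actual API):

import matplotlib.pyplot as plt

def plot_sampling_points(image, ref_point, sampling_locs, weights):
    # image:         (H, W, 3) numpy array in [0, 1]
    # ref_point:     (2,) normalized (x, y) reference point of one query
    # sampling_locs: (K, 2) normalized (x, y) sampling locations
    # weights:       (K,) attention weights for the K sampled points
    h, w = image.shape[:2]
    plt.imshow(image)
    plt.scatter(sampling_locs[:, 0] * w, sampling_locs[:, 1] * h,
                s=200 * weights, c=weights, cmap='viridis')  # size/color ~ weight
    plt.scatter([ref_point[0] * w], [ref_point[1] * h],
                marker='+', color='red', s=120)              # the reference point
    plt.axis('off')
    plt.show()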
