This is an unofficial re-implementation of Pix2Seq. It is mainly developed based on Pretrained-Pix2Seq and Pix2Seq.
If you have any ideas, please feel free to let us know.
Install PyTorch 1.5+ and torchvision 0.6+ (recommended: torch 1.8.1, torchvision 0.8.0).
Install pycocotools (for evaluation on COCO):
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
That's it; you should now be able to train and evaluate detection models.
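As a quick sanity check of the installation, the minimum versions stated above can be compared against what is installed. The following is a minimal sketch, not part of this repo: the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and the only assumption about the libraries is that `torch` and `torchvision` expose the usual `__version__` string.

```python
import re


def version_tuple(version: str) -> tuple:
    """Turn a string such as '1.8.1+cu111' into (1, 8, 1), dropping build suffixes."""
    numbers = re.findall(r"\d+", version.split("+")[0])
    return tuple(int(n) for n in numbers[:3])


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment() -> None:
    # Minimum versions taken from the install notes above.
    requirements = {"torch": "1.5", "torchvision": "0.6"}
    for name, minimum in requirements.items():
        try:
            module = __import__(name)
        except ImportError:
            print(f"{name}: not installed (need >= {minimum})")
            continue
        if meets_minimum(module.__version__, minimum):
            print(f"{name} {module.__version__}: OK")
        else:
            print(f"{name} {module.__version__}: too old, need >= {minimum}")


if __name__ == "__main__":
    check_environment()
```

Comparing integer tuples rather than raw strings avoids the common pitfall where "1.10" sorts before "1.5" lexicographically.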
Download and extract COCO 2017 train and val images with annotations from http://cocodataset.org. We expect the directory structure to be the following:
path/to/coco/
  annotations/  # annotation json files
  train2017/    # train images
  val2017/      # val images
Please link the COCO dataset into the project folder:
ln -s /path/to/coco ./coco
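Before launching a run, it can help to confirm that the linked folder has the expected layout. The sketch below is only an illustration and not a script shipped with this repo; the function name `missing_coco_dirs` is hypothetical, and it checks only the three sub-directories listed above, not the annotation files inside them.

```python
from pathlib import Path

# Sub-directories listed in the expected layout above.
EXPECTED_SUBDIRS = ("annotations", "train2017", "val2017")


def missing_coco_dirs(root) -> list:
    """Return the expected sub-directories that are absent under root."""
    root = Path(root)
    # is_dir() follows symlinks, so a linked ./coco folder is handled too.
    return [name for name in EXPECTED_SUBDIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_coco_dirs("./coco")
    if missing:
        print("Missing under ./coco:", ", ".join(missing))
    else:
        print("COCO layout looks complete.")
```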
Not Ready.
top_k and top_p are tunable parameters for evaluation.
bash scripts/resnet50_pretrained.sh 8 --eval --resume /path/to/checkpoint/file
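For readers unfamiliar with the two sampling parameters, the sketch below shows the common meaning of top-k and top-p (nucleus) filtering on a single token distribution. It is a generic pure-Python illustration under stated assumptions, not this repo's decoding code: the function names `filter_top_k_top_p` and `sample_token` are hypothetical, the input is assumed to be a valid probability list with a positive sum, and how the repo actually applies `top_k` and `top_p` may differ.

```python
import random


def filter_top_k_top_p(probs, top_k=0, top_p=1.0):
    """Zero out tokens outside the top-k / nucleus set, then renormalize.

    top_k <= 0 disables the top-k filter; top_p >= 1.0 effectively disables
    the nucleus filter.
    """
    # Token indices sorted from most to least likely.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        # Stop once the kept tokens cover at least top_p of the probability mass.
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / total
    return filtered


def sample_token(probs, top_k=0, top_p=1.0, rng=random):
    """Draw one token index from the filtered distribution."""
    filtered = filter_top_k_top_p(probs, top_k, top_p)
    return rng.choices(range(len(filtered)), weights=filtered, k=1)[0]
```

Smaller values of either parameter restrict sampling to the most likely tokens, while `top_k=1` reduces to greedy decoding.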
We provide AP results below:
| Backbone | Input Size | Epoch | Batch Size | AP | Weights | Comments |
|---|---|---|---|---|---|---|
| R50 | 640 | - | - | 39.3 | Weight | Official |
Convert the official model with scripts/convert_official.py.
This repo borrows a lot from Pix2Seq, Pretrained-Pix2Seq and DETR. Thanks a lot!