Compact Image Captioning (CoCa) is an open source image captioning project to promote Green Computer Vision, as well as to make image captioning research accessible to universities, research labs and individual practitioners with limited financial resources.

CISiPLab/cisip-CoCa

CISiP-CoCa - Compact Image Captioning Project

(Released August 2021)

Introduction

Compact Image Captioning (CoCa) is an open source image captioning project released by the Center of Image and Signal Processing Lab (CISiP Lab), Universiti Malaya. The project aims to promote Green Computer Vision and reduce carbon footprint, and to make computer vision research (here, image captioning research) accessible to universities, research labs and individual practitioners with limited financial resources.

Get Started

Please refer to the documentation.

Features

Pre-trained Sparse and ACORT Models

The checkpoints are available at this repo.

Soft-attention models implemented in TensorFlow 1.9 are available at this repo.

CIDEr score of pruning methods (on MS-COCO dataset)

Up-Down (UD)

| Sparsity | NNZ | Dense Baseline | SMP | Lottery ticket (class-blind) | Lottery ticket (class-uniform) | Lottery ticket (gradual) | Gradual pruning | Hard pruning (class-blind) | Hard pruning (class-distribution) | Hard pruning (class-uniform) | SNIP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.950 | 2.7 M | 111.3 | 112.5 | - | 107.7 | 109.5 | 109.7 | - | 110.0 | 110.2 | 38.2 |
| 0.975 | 1.3 M | 111.3 | 110.6 | - | 103.8 | 106.6 | 107.0 | - | 105.9 | 105.4 | 34.7 |
| 0.988 | 0.7 M | 111.3 | 109.0 | - | 99.3 | 102.2 | 103.4 | - | 101.3 | 100.5 | 32.6 |
| 0.991 | 0.5 M | 111.3 | 107.8 | | | | | | | | |

Object Relation Transformer (ORT)

| Sparsity | NNZ | Dense Baseline | SMP | Lottery ticket (gradual) | Gradual pruning | Hard pruning (class-blind) | Hard pruning (class-distribution) | Hard pruning (class-uniform) | SNIP |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0.950 | 2.8 M | 114.7 | 113.7 | 115.7 | 115.3 | 4.1 | 112.5 | 113.0 | 47.2 |
| 0.975 | 1.4 M | 114.7 | 113.7 | 112.9 | 113.2 | 0.7 | 106.6 | 106.9 | 44.0 |
| 0.988 | 0.7 M | 114.7 | 110.7 | 109.8 | 110.0 | 0.9 | 96.9 | 59.8 | 37.3 |
| 0.991 | 0.5 M | 114.7 | 109.3 | 107.1 | 107.0 | | | | |
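
As a reading aid for the Sparsity and NNZ columns in the tables above, the sketch below shows how the two quantities relate, assuming the usual definitions: sparsity is the fraction of weights set to zero, and NNZ is the number of non-zero weights that remain. The function names and the layer size of 1,000,000 are hypothetical and are not taken from this project's code or models.

```python
from typing import Sequence


def nnz_from_sparsity(total_weights: int, sparsity: float) -> int:
    """Number of non-zero weights left after pruning a fraction `sparsity`."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    return round(total_weights * (1.0 - sparsity))


def sparsity_from_mask(mask: Sequence[int]) -> float:
    """Fraction of zero entries in a binary keep-mask (1 = kept, 0 = pruned)."""
    if not mask:
        raise ValueError("mask must not be empty")
    return 1.0 - sum(mask) / len(mask)


if __name__ == "__main__":
    total = 1_000_000  # hypothetical layer size, not a figure from this project
    for s in (0.950, 0.975, 0.988, 0.991):
        kept = nnz_from_sparsity(total, s)
        print(f"sparsity={s:.3f}  NNZ={kept:,}  compression={total / kept:.1f}x")
```

Under these assumptions, each step up the sparsity levels in the tables cuts the number of stored weights further, which is why NNZ shrinks from row to row while the dense baseline score stays fixed.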

Acknowledgements

Citation

If you find this work useful for your research, please cite

@article{tan2021end,
  title={End-to-End Supermask Pruning: Learning to Prune Image Captioning Models},
  author={Tan, Jia Huei and Chan, Chee Seng and Chuah, Joon Huang},
  journal={Pattern Recognition},
  pages={108366},
  year={2021},
  publisher={Elsevier},
  doi={10.1016/j.patcog.2021.108366}
}

Contribution

We welcome contributions that improve this project, particularly support for other datasets such as Flickr8k, Flickr30k, and InstaPIC-1.1M. Please share suggestions or report problems by opening an issue, or send us a pull request with your changes, improvements, features, or fixes.

License and Copyright

The project is open source under the BSD-3 license (see the LICENSE file).

©2021 Universiti Malaya.
