Vision-CAIR/ChatCaptioner

Interactive ChatCaptioner for image and video

Official repository of ChatCaptioner and Video ChatCaptioner.

ChatCaptioner paper: ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions (arXiv:2303.06594)

Video ChatCaptioner paper: Video ChatCaptioner: Towards the Enriched Spatiotemporal Descriptions (arXiv:2304.04227)

Acknowledgement

If you use ChatCaptioner or Video ChatCaptioner, please cite them with the following BibTeX entries:

@article{zhu2023chatgpt,
  title={ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions},
  author={Deyao Zhu and Jun Chen and Kilichbek Haydarov and Xiaoqian Shen and Wenxuan Zhang and Mohamed Elhoseiny},
  journal={arXiv preprint arXiv:2303.06594},
  year={2023}
}
@article{chen2023video,
  title={Video ChatCaptioner: Towards the Enriched Spatiotemporal Descriptions},
  author={Jun Chen and Deyao Zhu and Kilichbek Haydarov and Xiang Li and Mohamed Elhoseiny},
  journal={arXiv preprint arXiv:2304.04227},
  year={2023}
}

License

ChatCaptioner and Video ChatCaptioner are released under the MIT license.