How to perform the task of generating video captions #4

Great work! How to perform the task of generating video captions?

Comments
Hello, thanks for the appreciation. Here we follow CapDec to perform text-only training for image captioning.
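For context, here is a rough, self-contained sketch of what CapDec-style text-only training looks like. This is my reading of the CapDec paper, not this repo's actual code; the noise scale, the one-token projection, and the single-caption loop are all illustrative simplifications:

```python
# Sketch of CapDec-style text-only training (illustrative, not this repo's code).
# Core idea: embed a caption with the CLIP text encoder, add Gaussian noise to
# close the modality gap to image embeddings, and train a language-model decoder
# to reconstruct the caption from the noised embedding.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer, CLIPModel, CLIPTokenizer

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
clip_tok = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")
gpt = GPT2LMHeadModel.from_pretrained("gpt2")
gpt_tok = GPT2Tokenizer.from_pretrained("gpt2")
# Illustrative single-token prefix projection (CapDec uses a larger mapper).
proj = torch.nn.Linear(512, gpt.config.n_embd)

optimizer = torch.optim.AdamW(
    list(gpt.parameters()) + list(proj.parameters()), lr=2e-5
)

caption = "a surgeon sutures an incision"        # any training caption
with torch.no_grad():
    t = clip_tok(caption, return_tensors="pt")
    text_embed = clip.get_text_features(**t)     # (1, 512)
# Noise injection is the key CapDec trick; the scale here is illustrative.
noisy = text_embed + 0.1 * torch.randn_like(text_embed)

ids = gpt_tok(caption, return_tensors="pt").input_ids       # (1, T)
prefix = proj(noisy).unsqueeze(1)                           # (1, 1, 768)
tok_embeds = gpt.transformer.wte(ids)                       # (1, T, 768)
inputs = torch.cat([prefix, tok_embeds], dim=1)
# Predict the caption tokens; -100 masks the loss at the prefix position.
labels = torch.cat([torch.full((1, 1), -100), ids], dim=1)
loss = gpt(inputs_embeds=inputs, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
```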
Thank you for your answer. What I mean is: how can I use this code to infer captions directly? My requirement is to use this model to generate captions for my surgical video dataset (i.e., image-to-text). However, the README only lists the three inference modes 'video', 'text', and 'all', none of which seems to support image-to-text generation.
Oh, OK. The current model is a CLIP-like architecture, which only includes the visual and text encoders. Caption generation requires a text decoder trained as in CapDec. We do not have this in the current repo, but let me check whether we can integrate it. I will get back to you soon.
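For reference, a minimal sketch of how a CapDec-style prefix decoder could be attached to a CLIP-like encoder for caption generation. This is not the repo's code: the `PrefixMapper` shape and prefix length are assumptions, and in practice both the mapper and GPT-2 would be loaded from CapDec-trained checkpoints rather than initialized fresh:

```python
# Sketch (not this repo's code): wire a CLIP image embedding through a mapping
# network into GPT-2 prefix embeddings, then decode a caption autoregressively.
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel, GPT2Tokenizer, CLIPModel, CLIPProcessor
from PIL import Image

class PrefixMapper(nn.Module):
    """Maps a CLIP image embedding to a sequence of GPT-2 prefix embeddings."""
    def __init__(self, clip_dim=512, gpt_dim=768, prefix_len=10):
        super().__init__()
        self.prefix_len, self.gpt_dim = prefix_len, gpt_dim
        self.mlp = nn.Sequential(
            nn.Linear(clip_dim, gpt_dim * prefix_len),
            nn.Tanh(),
            nn.Linear(gpt_dim * prefix_len, gpt_dim * prefix_len),
        )

    def forward(self, clip_embed):                   # (B, clip_dim)
        prefix = self.mlp(clip_embed)                # (B, gpt_dim * prefix_len)
        return prefix.view(-1, self.prefix_len, self.gpt_dim)

@torch.no_grad()
def generate_caption(image_path, clip, processor, mapper, gpt, tokenizer,
                     max_new_tokens=30):
    image = Image.open(image_path).convert("RGB")
    pixels = processor(images=image, return_tensors="pt")["pixel_values"]
    clip_embed = clip.get_image_features(pixel_values=pixels)  # (1, 512)
    generated = mapper(clip_embed)                             # (1, L, 768)
    token_ids = []
    # Greedy decoding conditioned on the prefix embeddings.
    for _ in range(max_new_tokens):
        logits = gpt(inputs_embeds=generated).logits[:, -1, :]
        next_id = logits.argmax(dim=-1)
        if next_id.item() == tokenizer.eos_token_id:
            break
        token_ids.append(next_id.item())
        next_embed = gpt.transformer.wte(next_id).unsqueeze(1)
        generated = torch.cat([generated, next_embed], dim=1)
    return tokenizer.decode(token_ids)

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
gpt = GPT2LMHeadModel.from_pretrained("gpt2").eval()
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
mapper = PrefixMapper()  # in practice, load CapDec-trained weights here

print(generate_caption("frame.jpg", clip, processor, mapper, gpt, tokenizer))
```

With an untrained mapper this will emit gibberish; the point is only the wiring from image embedding, to prefix, to autoregressive decoding.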
Thank you very much. I really need this feature for my graduation thesis.