Great project, and a step in the right direction: getting about the same results with less compute.
In your paper, you mention that you discard the text head and use only the visual backbone, and that future research could leverage the text head. Do you think it could reach performance comparable to BERT-like models trained on text corpora?
Also, I’m very interested in using VirTex for classification problems such as Conceptual Captions. Did you try, or would you expect, performance improvements from using VirTex’s visual backbone + BERT, or VirTex’s visual and textual components together (not sure if the latter would work), over ViLBERT or VisualBERT?