Fix mathvista Idefics2 #393
Thanks, we will re-evaluate and update the results of Idefics2 on MathVista.
BTW, a community contributor is also trying to add support for Idefics3. Do you have time to take a look (for example, at evalset-specific prompts)?
Actually, I haven't tested on Idefics2, only on Idefics2-large, which we are going to release soon (maybe this week). I think there's not much to change. There are some (hopefully small) discrepancies between generating with our internal repo and with the Transformers integration.
@HugoLaurencon
Okay, thanks for the evaluation! Maybe it's because the Idefics2 integration was broken in recent versions of Transformers. Could you tell me your version? I will try to investigate a bit more.
The results were obtained with transformers=4.44.0, torch=2.0.1+cu118.
Thanks, I'll have a look when I find time! Also, if you still have the detailed MMMU evaluation scores for Idefics2 across all categories in your cache, would it be possible to copy-paste the whole VLMEvalKit output here, so I can compare it with what I get using slightly different prompts?
Hi, @HugoLaurencon
Very nice feature!
A very small change to the MathVista prompt.
This can change performance slightly (by up to 1 point).
Idefics2 was fine-tuned with a specific prompt for multiple-choice questions (MCQ).
In this PR, I add a sentence that was always present during fine-tuning whenever the model was expected to answer an MCQ with a letter.
Feel free to merge directly if you think this modification makes sense.
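
To illustrate the kind of change described above, here is a minimal sketch of a prompt builder that appends a fixed instruction only for multiple-choice questions. This is not the code from the PR: the function name `build_mathvista_prompt`, the choice layout, and the wording of `MCQ_INSTRUCTION` are all assumptions made for illustration, since the exact sentence used during fine-tuning is not quoted in this thread.

```python
from typing import List, Optional

# Assumption: placeholder wording. The actual sentence seen during
# Idefics2 fine-tuning is not quoted in this thread.
MCQ_INSTRUCTION = "Answer with the letter."


def build_mathvista_prompt(question: str, choices: Optional[List[str]] = None) -> str:
    """Hypothetical helper: build a text prompt for one sample.

    For free-form questions the question is returned unchanged (stripped).
    For multiple-choice questions the lettered choices are listed and the
    MCQ instruction is appended as the final line.
    """
    question = question.strip()
    if not choices:
        # Free-form question: no MCQ instruction is added.
        return question

    lines = [question, "Choices:"]
    for idx, choice in enumerate(choices):
        letter = chr(ord("A") + idx)  # 0 -> "A", 1 -> "B", ...
        lines.append(f"{letter}. {choice}")
    lines.append(MCQ_INSTRUCTION)
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_mathvista_prompt("What is 2 + 2?", ["3", "4", "5"]))
```

Keeping the instruction in a single constant makes it easy to compare scores with and without the extra sentence, which matters when the expected difference is on the order of one point.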