[Improvement] Align evaluation results with paper #563

Merged
merged 6 commits into from
Nov 4, 2024

white2018
Contributor

@white2018 white2018 commented Oct 31, 2024

The current version of minimonkey.py produces evaluation results that differ from those reported in the paper. The paper is available at https://arxiv.org/pdf/2408.02034

@white2018 white2018 changed the title align evaluation results with paper [Improvement] align evaluation results with paper Nov 1, 2024
@white2018 white2018 changed the title [Improvement] align evaluation results with paper [Improvement] Align evaluation results with paper Nov 1, 2024
@kennymckormick
Member

I will re-evaluate with this code to see whether the results improve.

@white2018
Contributor Author

white2018 commented Nov 1, 2024 via email

@kennymckormick
Member

Thanks a lot!

------------------ Original message ------------------
From: "open-compass/VLMEvalKit"
Sent: Friday, November 1, 2024, 5:20 PM
Subject: Re: [open-compass/VLMEvalKit] [Improvement] Align evaluation results with paper (PR #563)

Will re-evaluate w. this piece of codes to see if results improve

Can you confirm that the modified code runs correctly on the benchmarks we support, at least the 8 benchmarks on our main leaderboard? I ran MiniMonkey on MMMU_DEV_VAL on an 80 GB A800 and hit an out-of-memory (OOM) error.

[image attachment]

@white2018
Contributor Author

I will re-evaluate on the MMMU_DEV_VAL dataset to see what happens. Thanks.

@kennymckormick
Member

The evaluation results are updated.

@kennymckormick kennymckormick merged commit 0c44cd2 into open-compass:main Nov 4, 2024
1 check passed