Can we access the results of other tools in the competition?
It's best to have them in CSV files (e.g., like those we generated in our pre-competition experiments) so that we can cherry-pick benchmarks according to Legion's compatibility and compare only those scores.
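A minimal sketch of what that cherry-picking could look like, assuming we do get per-tool CSVs; the file layout and the `task`/`score` column names are my assumptions, not the competition's actual export format:

```python
# Hedged sketch: restrict each tool's result CSV to a set of
# Legion-compatible benchmark tasks and compare the totals.
# File names and column names ("task", "score") are assumptions.
import csv
from pathlib import Path

def load_scores(csv_path):
    """Map task name -> score for one tool's result CSV."""
    scores = {}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            scores[row["task"]] = float(row["score"])
    return scores

def compare_on_subset(result_dir, compatible_tasks):
    """Sum each tool's score over only the Legion-compatible tasks."""
    totals = {}
    for csv_path in Path(result_dir).glob("*.csv"):
        scores = load_scores(csv_path)
        totals[csv_path.stem] = sum(
            scores.get(task, 0.0) for task in compatible_tasks
        )
    return totals

# Example usage with hypothetical paths and task names:
# compatible = {"array-examples/sanfoundry_24-1.yml", "..."}
# print(compare_on_subset("results/", compatible))
```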
I failed to reproduce the final score of each tool in the competition from its per-category scores using the formula from the Google Sheets of our pre-competition experiments. This matters because we want to compute the scores of our new experiments the same way. Specifically:

- How are the final scores computed from the per-category scores?
- Did they remove the results of some benchmarks? For example, SQLite-MemSafety has only 1 task, on which every tool scored 0, and some benchmarks in other sets have the same problem. How did they deal with these?
- By "normalisation", do they mean simply taking averages (i.e., as we did in our pre-competition experiments)? A sketch contrasting the two interpretations follows this list.
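To make the "averages vs. normalisation" question concrete, here is a minimal sketch of the two aggregations we could check against the published totals. The count-weighted variant is my reading of the normalization described in the competition reports (normalized category score = score / task count; overall score = sum of normalized scores × average task count); treat the formula and all numbers below as assumptions to verify, not the competition's definitive rule.

```python
# Hedged sketch of two candidate aggregations of per-category scores.
# Category names and numbers are made up for illustration only.

categories = {  # category -> (tool's score, number of tasks)
    "ReachSafety": (1200.0, 3000),
    "MemSafety":   (150.0, 300),
    "Termination": (400.0, 800),
}

# Interpretation 1: plain average of the raw category scores
# (what we did in the pre-competition experiments).
plain_average = sum(s for s, _ in categories.values()) / len(categories)

# Interpretation 2 (assumption): divide each category score by its
# task count, sum these normalized scores, then multiply by the
# average number of tasks per category.
normalized = [s / n for s, n in categories.values()]
avg_tasks = sum(n for _, n in categories.values()) / len(categories)
count_weighted = sum(normalized) * avg_tasks

print(f"plain average:          {plain_average:.1f}")
print(f"count-weighted (guess): {count_weighted:.1f}")
```

If one of the two reproduces the published final scores, that would also tell us how categories like SQLite-MemSafety (a single task where everyone scored 0) were weighted.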