Add normalized game pair Elo to stats #2134
Comments
I'd obviously be happy if the excruciating math I had to deal with in order to make the WDL Contempt work for Leela and align it to Elo were put to some wider use, especially in regard to the challenges stemming from mixed data sources due to testing with different opening biases and different TC. I will think about whether it makes sense to try replacing nElo in fishtest, as using UHO openings for small expected rating differences already deals with the two main issues of regular Elo, with the only remaining major issue being that nElo makes wrong assumptions about the draw distribution.
Key properties
The formula
Background
The reason it is necessary to redefine Elo for the high levels of modern chess revolves around the combination of these issues:
Origin of the formula
The full derivations behind Leela's WDL Contempt etc. are beautiful, but this margin isn't wide enough ;) The relevant parts, however, can be summarized as
Relationship with regular Elo
ngpElo is designed for the upper range of playing strength, where WW and LL results are much less frequent than WD and LD. This is the case from roughly an 80% expected draw rate upward, whether from the regular start position or from balanced openings. In the human range this isn't the case, leading to a factor of 1.5-2x discrepancy between Elo differences and ngpElo differences (at the 2000 level, +100 regular Elo is equivalent to +50 ngpElo). An approximate conversion can be found in LeelaChessZero/lc0#1941 (comment); it is used for converting regular Elo into ngpElo internally in the Lc0 Contempt implementation.
I am just seeing this.
This is incorrect. The motivation for expressing bounds in
Do I understand correctly that you claim you can prove this for some reasonable model? Is there some write-up of this?
Glad you joined the discussion :-) I think the neat thing about this proposal is that it is based on game pairs, contrary to nElo. I think using game pairs from the beginning is very important nowadays.
I don't understand. nElo also uses game pairs (it is computed from the pentanomial frequencies).
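For readers unfamiliar with the pentanomial form, here is a minimal sketch of how a normalized Elo can be computed from game-pair counts. It assumes the commonly used definition (mean pair score minus 0.5, divided by the per-game standard deviation, scaled by 800/ln 10). The function name `normalized_elo` and the ordering of the counts are assumptions made for illustration; this is not fishtest's actual code, and it is not the ngpElo formula.

```python
import math

def normalized_elo(pair_counts):
    """Normalized Elo from pentanomial game-pair counts (illustrative sketch).

    pair_counts: five counts of game pairs with total pair score
    0, 0.5, 1, 1.5 and 2 points, in that order (assumed ordering).
    """
    n = sum(pair_counts)
    if n == 0:
        raise ValueError("no game pairs")
    # Pair scores rescaled to [0, 1], i.e. the average score per game in the pair.
    scores = [0.0, 0.25, 0.5, 0.75, 1.0]
    mu = sum(c * s for c, s in zip(pair_counts, scores)) / n
    var_pair = sum(c * (s - mu) ** 2 for c, s in zip(pair_counts, scores)) / n
    if var_pair == 0:
        raise ValueError("zero variance: normalized Elo is undefined")
    # A pair averages two games, so the per-game variance is twice the pair variance.
    sigma_per_game = math.sqrt(2 * var_pair)
    return (mu - 0.5) / sigma_per_game * 800 / math.log(10)
```

Symmetric counts such as `[10, 20, 40, 20, 10]` give 0, and reversing the counts flips the sign of the result.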
Ah. So now I don't understand.
We're observing some nice results with the normalized game pair Elo (ngpElo), which seems to be a good way to derive an Elo number that is largely book-independent, working with game pairs. See some data (and the formula) here:
https://github.com/official-stockfish/Stockfish/wiki/Useful-data#equivalent-time-odds-and-normalized-game-pair-elo
It would be nice to put it in the stats, and possibly even see whether it makes sense to replace nElo with it in fishtest, even though that's a larger undertaking.
The ngpElo concept is from @Naphthalin; he might be able to explain the properties a bit better.