It seems that there are two related bugs affecting compare_performance() and performance_score().
- Bug 1: log-loss is ranked in the wrong direction:
Currently the best model (lowest log-loss) receives the lowest Performance-Score and the worst model receives 100%.
Log-loss is a lower-is-better metric, so it should enter the score in reversed order (like RMSE).
reprex:
library(performance)
m1 <- glm(vs ~ 1 , data = mtcars, family = "binomial")
m2 <- glm(vs ~ wt , data = mtcars, family = "binomial")
m3 <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial")
#correct order:
performance::compare_performance(m1,m2,m3, metrics="RMSE",rank = TRUE)
#> # Comparison of Model Performance Indices
#>
#> Name | Model | RMSE | Performance-Score
#> ----------------------------------------
#> m3 | glm | 0.359 | 100.00%
#> m2 | glm | 0.410 | 62.75%
#> m1 | glm | 0.496 | 0.00%
performance::compare_performance(m1,m2,m3, metrics="R2",rank = TRUE)
#> # Comparison of Model Performance Indices
#>
#> Name | Model | Tjur's R2 | Performance-Score
#> --------------------------------------------
#> m3 | glm | 0.478 | 100.00%
#> m2 | glm | 0.328 | 68.59%
#> m1 | glm | 0.000 | 0.00%
#wrong order:
performance::compare_performance(m1,m2,m3, metrics="LOGLOSS",rank = TRUE)
#> # Comparison of Model Performance Indices
#>
#> Name | Model | Log_loss | Performance-Score
#> -------------------------------------------
#> m1 | glm | 0.685 | 100.00%
#> m2 | glm | 0.490 | 32.69%
#> m3 | glm | 0.395 | 0.00%
Created on 2026-06-22 with reprex v2.1.1
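To make the expected behaviour concrete, here is a minimal sketch, assuming the ranking min-max rescales each metric to [0, 1] before averaging. `rescale_score()` is a hypothetical helper written for this issue, not a function from the package, and the log-loss values are the rounded ones printed in the reprex above.

```r
# Hypothetical helper, NOT part of the performance package:
# min-max rescale a metric to [0, 1], flipping it when lower values are better.
rescale_score <- function(x, lower_is_better = FALSE) {
  scaled <- (x - min(x)) / (max(x) - min(x))
  if (lower_is_better) 1 - scaled else scaled
}

# Rounded log-loss values printed in the reprex for m1, m2, m3
log_loss <- c(m1 = 0.685, m2 = 0.490, m3 = 0.395)

# Treating log-loss as higher-is-better matches the reported ranking (m1 on top)
score_current <- rescale_score(log_loss)

# Treating log-loss as lower-is-better puts m3 on top, as it is for RMSE
score_expected <- rescale_score(log_loss, lower_is_better = TRUE)

round(100 * score_current, 2)
round(100 * score_expected, 2)
```

With the flag unset, m2 comes out at roughly 33%, close to the 32.69% in the table (the small gap is due to the rounded inputs), which suggests the reversal step is simply being skipped for log-loss.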
- Bug 2: scale and order of Score_log/Score_spherical:
Again the worst model gets a Performance-Score of 100%. The values themselves also seem off, most likely because of line 175 of performance_score() (quadrat_p <- sum(p_y^2)), which should probably be an average. Because quadrat_p is a sum over all observations, in
quadratic = mean(2 * p_y + quadrat_p)
spherical = mean(p_y / sqrt(quadrat_p))
quadratic grows with the sample size, while spherical goes to zero. The second half of the reprex shows this: stacking mtcars three times leaves the fitted probabilities unchanged, yet Score_log and Score_spherical both change.
reprex:
library(performance)
# again m1 is ranked best, and the score values look implausible
m1 <- glm(vs ~ 1 , data = mtcars, family = "binomial")
m2 <- glm(vs ~ wt , data = mtcars, family = "binomial")
m3 <- glm(vs ~ wt + mpg, data = mtcars, family = "binomial")
performance::compare_performance(m1,m2,m3, metrics="SCORE",rank = TRUE)
#> # Comparison of Model Performance Indices
#>
#> Name | Model | Score_log | Score_spherical | Performance-Score
#> --------------------------------------------------------------
#> m1 | glm | -7.010 | 0.130 | 100.00%
#> m2 | glm | -9.834 | 0.067 | 32.11%
#> m3 | glm | -14.903 | 0.095 | 21.71%
# stack the data three times: same fitted probabilities, larger n
mtcars2 <- rbind(mtcars, mtcars, mtcars)
m1 <- glm(vs ~ 1 , data = mtcars2, family = "binomial")
m2 <- glm(vs ~ wt , data = mtcars2, family = "binomial")
m3 <- glm(vs ~ wt + mpg, data = mtcars2, family = "binomial")
performance::compare_performance(m1,m2,m3, metrics="SCORE",rank = TRUE)
#> # Comparison of Model Performance Indices
#>
#> Name | Model | Score_log | Score_spherical | Performance-Score
#> --------------------------------------------------------------
#> m1 | glm | -22.640 | 0.070 | 100.00%
#> m2 | glm | -31.966 | 0.031 | 31.72%
#> m3 | glm | -48.156 | 0.037 | 7.16%
Created on 2026-06-22 with reprex v2.1.1
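The sample-size dependence can be isolated without fitting any model. The sketch below uses made-up probabilities `p_y` and the spherical expression as quoted above; `spherical_sum()` and `spherical_mean()` are hypothetical names, and the mean-based variant is only the change suggested in this issue, not a verified definition of the spherical score.

```r
# Made-up probabilities assigned to the observed outcomes (illustration only)
p_y <- c(0.9, 0.7, 0.6, 0.8, 0.4)
p_y3 <- rep(p_y, times = 3)  # identical predictions, data stacked three times

# Expression as quoted in the issue: quadrat_p is a sum over observations
spherical_sum <- function(p) mean(p / sqrt(sum(p^2)))

# Suggested variant (an assumption, not a confirmed fix): average instead of sum
spherical_mean <- function(p) mean(p / sqrt(mean(p^2)))

# Ratio of the score on stacked data to the score on the original data
ratio_sum <- spherical_sum(p_y3) / spherical_sum(p_y)     # equals 1 / sqrt(3)
ratio_mean <- spherical_mean(p_y3) / spherical_mean(p_y)  # equals 1

c(sum_based = ratio_sum, mean_based = ratio_mean)
```

The sum-based version shrinks by a factor of 1 / sqrt(3) when the data are tripled, which is roughly what happens to Score_spherical for m1 in the reprex (0.130 to 0.070), while the mean-based version is unaffected by stacking.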