You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Hi! Congratulations on the great package! I've noticed an unexpected behaviour: r2() for hurdle models is different depending on the package used. If one uses r2(), hurdle models fitted with pscl default to r2_zeroinflated(), while glmmTMB ones don't. Moreover, r2_zeroinflated() do not return the same values, perhaps due to small differences in fitted values? None of this is particularly worrisome, and possibly it is not even related to the performance package (but rather to pscl and glmmTMB) but I thought it best to report it here. Cheers!
library(glmmTMB)
#> Warning: package 'glmmTMB' was built under R version 4.2.3#> Warning in checkMatrixPackageVersion(): Package version inconsistency detected.#> TMB was built with Matrix version 1.6.1#> Current Matrix version is 1.4.1#> Please re-install 'TMB' from source using install.packages('TMB', type = 'source') or ask CRAN for a binary version of 'TMB' matching CRAN's 'Matrix' package
library(performance)
#> Warning: package 'performance' was built under R version 4.2.3
library(pscl)
#> Warning: package 'pscl' was built under R version 4.2.3#> Classes and Methods for R developed in the#> Political Science Computational Laboratory#> Department of Political Science#> Stanford University#> Simon Jackman#> hurdle and zeroinfl functions by Achim Zeileis
data("bioChemists", package="pscl")
summary(bioChemists)
#> art fem mar kid5 phd #> Min. : 0.000 Men :494 Single :309 Min. :0.0000 Min. :0.755 #> 1st Qu.: 0.000 Women:421 Married:606 1st Qu.:0.0000 1st Qu.:2.260 #> Median : 1.000 Median :0.0000 Median :3.150 #> Mean : 1.693 Mean :0.4951 Mean :3.103 #> 3rd Qu.: 2.000 3rd Qu.:1.0000 3rd Qu.:3.920 #> Max. :19.000 Max. :3.0000 Max. :4.620 #> ment #> Min. : 0.000 #> 1st Qu.: 3.000 #> Median : 6.000 #> Mean : 8.767 #> 3rd Qu.:12.000 #> Max. :77.000list.mod<-list(
hurdle=pscl::hurdle(art~fem+mar, data=bioChemists, dist="poisson", zero.dist="binomial", link="logit"),
glmmtmb=glmmTMB::glmmTMB(art~fem+mar, data=bioChemists, family= truncated_poisson(), ziformula=~fem+mar)
)
lapply(list.mod, summary)
#> $hurdle#> #> Call:#> pscl::hurdle(formula = art ~ fem + mar, data = bioChemists, dist = "poisson", #> zero.dist = "binomial", link = "logit")#> #> Pearson residuals:#> Min 1Q Median 3Q Max #> -1.1392 -1.0178 -0.3446 0.3905 10.2843 #> #> Count model coefficients (truncated poisson with log link):#> Estimate Std. Error z value Pr(>|z|) #> (Intercept) 0.847598 0.064084 13.226 < 2e-16 ***#> femWomen -0.237351 0.064199 -3.697 0.000218 ***#> marMarried 0.008846 0.066944 0.132 0.894867 #> Zero hurdle model coefficients (binomial with logit link):#> Estimate Std. Error z value Pr(>|z|) #> (Intercept) 0.90794 0.15640 5.805 6.42e-09 ***#> femWomen -0.24195 0.14923 -1.621 0.105 #> marMarried 0.07802 0.15636 0.499 0.618 #> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 #> #> Number of iterations in BFGS optimization: 9 #> Log-likelihood: -1670 on 6 Df#> #> $glmmtmb#> Family: truncated_poisson ( log )#> Formula: art ~ fem + mar#> Zero inflation: ~fem + mar#> Data: bioChemists#> #> AIC BIC logLik deviance df.resid #> 3352.2 3381.1 -1670.1 3340.2 909 #> #> #> Conditional model:#> Estimate Std. Error z value Pr(>|z|) #> (Intercept) 0.847585 0.064084 13.226 < 2e-16 ***#> femWomen -0.237339 0.064199 -3.697 0.000218 ***#> marMarried 0.008862 0.066944 0.132 0.894684 #> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> #> Zero-inflation model:#> Estimate Std. Error z value Pr(>|z|) #> (Intercept) -0.90794 0.15640 -5.805 6.42e-09 ***#> femWomen 0.24195 0.14923 1.621 0.105 #> marMarried -0.07802 0.15636 -0.499 0.618 #> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
cor(fitted(list.mod$hurdle), fitted(list.mod$glmmtmb))
#> [1] 1
all.equal(fitted(list.mod$hurdle), fitted(list.mod$glmmtmb))
#> [1] "names for target but not for current" #> [2] "Mean relative difference: 4.620395e-06"
lapply(list.mod, performance::r2)
#> Warning: mu of 2.1 is too close to zero, estimate of random effect variances may#> be unreliable.#> Random effect variances not available. Returned R2 does not account for random effects.#> $hurdle#> # R2 for Zero-Inflated and Hurdle Regression#> R2: 0.028#> adj. R2: 0.025#> #> $glmmtmb#> # R2 for Mixed Models#> #> Conditional R2: NA#> Marginal R2: 0.095
lapply(list.mod, performance::r2_zeroinflated)
#> $hurdle#> # R2 for Zero-Inflated and Hurdle Regression#> R2: 0.028#> adj. R2: 0.025#> #> $glmmtmb#> # R2 for Zero-Inflated and Hurdle Regression#> R2: 0.031#> adj. R2: 0.027
sessionInfo()
#> R version 4.2.0 (2022-04-22 ucrt)#> Platform: x86_64-w64-mingw32/x64 (64-bit)#> Running under: Windows 10 x64 (build 22000)#> #> Matrix products: default#> #> locale:#> [1] LC_COLLATE=Spanish_Spain.utf8 LC_CTYPE=Spanish_Spain.utf8 #> [3] LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C #> [5] LC_TIME=Spanish_Spain.utf8 #> #> attached base packages:#> [1] stats graphics grDevices utils datasets methods base #> #> other attached packages:#> [1] pscl_1.5.5.1 performance_0.10.4 glmmTMB_1.1.7 #> #> loaded via a namespace (and not attached):#> [1] Rcpp_1.0.10 compiler_4.2.0 nloptr_2.0.1 #> [4] highr_0.9 TMB_1.9.6 tools_4.2.0 #> [7] boot_1.3-28 digest_0.6.29 lme4_1.1-29 #> [10] evaluate_0.15 lifecycle_1.0.3 nlme_3.1-157 #> [13] lattice_0.20-45 rlang_1.1.1 reprex_2.0.2 #> [16] Matrix_1.4-1 cli_3.6.1 rstudioapi_0.13 #> [19] yaml_2.3.5 mvtnorm_1.1-3 xfun_0.40 #> [22] fastmap_1.1.0 coda_0.19-4 withr_2.5.0 #> [25] stringr_1.5.0 knitr_1.39 fs_1.5.2 #> [28] vctrs_0.6.3 grid_4.2.0 glue_1.6.2 #> [31] survival_3.3-1 rmarkdown_2.14 multcomp_1.4-19 #> [34] TH.data_1.1-1 minqa_1.2.4 magrittr_2.0.3 #> [37] codetools_0.2-18 htmltools_0.5.6 emmeans_1.8.8 #> [40] splines_4.2.0 MASS_7.3-57 insight_0.19.1.3 #> [43] xtable_1.8-4 numDeriv_2016.8-1.1 sandwich_3.0-1 #> [46] stringi_1.7.6 estimability_1.4.1 zoo_1.8-10
Hi! Congratulations on the great package! I've noticed an unexpected behaviour:
r2()
for hurdle models is different depending on the package used. If one usesr2()
, hurdle models fitted withpscl
default tor2_zeroinflated()
, whileglmmTMB
ones don't. Moreover,r2_zeroinflated()
do not return the same values, perhaps due to small differences in fitted values? None of this is particularly worrisome, and possibly it is not even related to theperformance
package (but rather topscl
andglmmTMB
) but I thought it best to report it here. Cheers!Created on 2023-09-12 with reprex v2.0.2
The text was updated successfully, but these errors were encountered: