r2 for hurdle models: different values between pscl and glmmTMB

Hi! Congratulations on the great package! I've noticed an unexpected behaviour: `r2()` for hurdle models is different depending on the package used. If one uses `r2()`, hurdle models fitted with `pscl` default to `r2_zeroinflated()`, while `glmmTMB` ones don't. Moreover, `r2_zeroinflated()` do not return the same values, perhaps due to small differences in fitted values? None of this is particularly worrisome, and possibly it is not even related to the `performance` package (but rather to `pscl` and `glmmTMB`) but I thought it best to report it here. Cheers!

``` r
library(glmmTMB)
#> Warning: package 'glmmTMB' was built under R version 4.2.3
#> Warning in checkMatrixPackageVersion(): Package version inconsistency detected.
#> TMB was built with Matrix version 1.6.1
#> Current Matrix version is 1.4.1
#> Please re-install 'TMB' from source using install.packages('TMB', type = 'source') or ask CRAN for a binary version of 'TMB' matching CRAN's 'Matrix' package
library(performance)
#> Warning: package 'performance' was built under R version 4.2.3
library(pscl)
#> Warning: package 'pscl' was built under R version 4.2.3
#> Classes and Methods for R developed in the
#> Political Science Computational Laboratory
#> Department of Political Science
#> Stanford University
#> Simon Jackman
#> hurdle and zeroinfl functions by Achim Zeileis


data("bioChemists", package = "pscl")
summary(bioChemists)
#>       art            fem           mar           kid5             phd       
#>  Min.   : 0.000   Men  :494   Single :309   Min.   :0.0000   Min.   :0.755  
#>  1st Qu.: 0.000   Women:421   Married:606   1st Qu.:0.0000   1st Qu.:2.260  
#>  Median : 1.000                             Median :0.0000   Median :3.150  
#>  Mean   : 1.693                             Mean   :0.4951   Mean   :3.103  
#>  3rd Qu.: 2.000                             3rd Qu.:1.0000   3rd Qu.:3.920  
#>  Max.   :19.000                             Max.   :3.0000   Max.   :4.620  
#>       ment       
#>  Min.   : 0.000  
#>  1st Qu.: 3.000  
#>  Median : 6.000  
#>  Mean   : 8.767  
#>  3rd Qu.:12.000  
#>  Max.   :77.000


list.mod <- list(
  hurdle = pscl::hurdle(art ~ fem + mar, data = bioChemists, dist = "poisson", zero.dist = "binomial", link = "logit"),
  glmmtmb = glmmTMB::glmmTMB(art ~ fem + mar, data = bioChemists, family = truncated_poisson(), ziformula = ~ fem + mar)
)


lapply(list.mod, summary)
#> $hurdle
#> 
#> Call:
#> pscl::hurdle(formula = art ~ fem + mar, data = bioChemists, dist = "poisson", 
#>     zero.dist = "binomial", link = "logit")
#> 
#> Pearson residuals:
#>     Min      1Q  Median      3Q     Max 
#> -1.1392 -1.0178 -0.3446  0.3905 10.2843 
#> 
#> Count model coefficients (truncated poisson with log link):
#>              Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  0.847598   0.064084  13.226  < 2e-16 ***
#> femWomen    -0.237351   0.064199  -3.697 0.000218 ***
#> marMarried   0.008846   0.066944   0.132 0.894867    
#> Zero hurdle model coefficients (binomial with logit link):
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  0.90794    0.15640   5.805 6.42e-09 ***
#> femWomen    -0.24195    0.14923  -1.621    0.105    
#> marMarried   0.07802    0.15636   0.499    0.618    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
#> 
#> Number of iterations in BFGS optimization: 9 
#> Log-likelihood: -1670 on 6 Df
#> 
#> $glmmtmb
#>  Family: truncated_poisson  ( log )
#> Formula:          art ~ fem + mar
#> Zero inflation:       ~fem + mar
#> Data: bioChemists
#> 
#>      AIC      BIC   logLik deviance df.resid 
#>   3352.2   3381.1  -1670.1   3340.2      909 
#> 
#> 
#> Conditional model:
#>              Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  0.847585   0.064084  13.226  < 2e-16 ***
#> femWomen    -0.237339   0.064199  -3.697 0.000218 ***
#> marMarried   0.008862   0.066944   0.132 0.894684    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Zero-inflation model:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept) -0.90794    0.15640  -5.805 6.42e-09 ***
#> femWomen     0.24195    0.14923   1.621    0.105    
#> marMarried  -0.07802    0.15636  -0.499    0.618    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


cor(fitted(list.mod$hurdle), fitted(list.mod$glmmtmb))
#> [1] 1


all.equal(fitted(list.mod$hurdle), fitted(list.mod$glmmtmb))
#> [1] "names for target but not for current"  
#> [2] "Mean relative difference: 4.620395e-06"

lapply(list.mod, performance::r2)
#> Warning: mu of 2.1 is too close to zero, estimate of random effect variances may
#>   be unreliable.
#> Random effect variances not available. Returned R2 does not account for random effects.
#> $hurdle
#> # R2 for Zero-Inflated and Hurdle Regression
#>        R2: 0.028
#>   adj. R2: 0.025
#> 
#> $glmmtmb
#> # R2 for Mixed Models
#> 
#>   Conditional R2: NA
#>      Marginal R2: 0.095


lapply(list.mod, performance::r2_zeroinflated)
#> $hurdle
#> # R2 for Zero-Inflated and Hurdle Regression
#>        R2: 0.028
#>   adj. R2: 0.025
#> 
#> $glmmtmb
#> # R2 for Zero-Inflated and Hurdle Regression
#>        R2: 0.031
#>   adj. R2: 0.027


sessionInfo()
#> R version 4.2.0 (2022-04-22 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 22000)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=Spanish_Spain.utf8  LC_CTYPE=Spanish_Spain.utf8   
#> [3] LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C                  
#> [5] LC_TIME=Spanish_Spain.utf8    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] pscl_1.5.5.1       performance_0.10.4 glmmTMB_1.1.7     
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.10         compiler_4.2.0      nloptr_2.0.1       
#>  [4] highr_0.9           TMB_1.9.6           tools_4.2.0        
#>  [7] boot_1.3-28         digest_0.6.29       lme4_1.1-29        
#> [10] evaluate_0.15       lifecycle_1.0.3     nlme_3.1-157       
#> [13] lattice_0.20-45     rlang_1.1.1         reprex_2.0.2       
#> [16] Matrix_1.4-1        cli_3.6.1           rstudioapi_0.13    
#> [19] yaml_2.3.5          mvtnorm_1.1-3       xfun_0.40          
#> [22] fastmap_1.1.0       coda_0.19-4         withr_2.5.0        
#> [25] stringr_1.5.0       knitr_1.39          fs_1.5.2           
#> [28] vctrs_0.6.3         grid_4.2.0          glue_1.6.2         
#> [31] survival_3.3-1      rmarkdown_2.14      multcomp_1.4-19    
#> [34] TH.data_1.1-1       minqa_1.2.4         magrittr_2.0.3     
#> [37] codetools_0.2-18    htmltools_0.5.6     emmeans_1.8.8      
#> [40] splines_4.2.0       MASS_7.3-57         insight_0.19.1.3   
#> [43] xtable_1.8-4        numDeriv_2016.8-1.1 sandwich_3.0-1     
#> [46] stringi_1.7.6       estimability_1.4.1  zoo_1.8-10
```

<sup>Created on 2023-09-12 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

r2 for hurdle models: different values between pscl and glmmTMB #612

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

r2 for hurdle models: different values between pscl and glmmTMB #612

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions