Error and Incomplete Output Using performance::check_collinearity with Cox Models #714

Open
zhaohongxin0 opened this issue Apr 19, 2024 · 1 comment
Labels
3 investigators ❔❓ Need to look further into this issue Bug 🐛 Something isn't working

Comments

@zhaohongxin0

Hello,

I encountered an issue when using the performance package to assess multicollinearity in Cox proportional hazards models. When I specify a model with exactly two predictors, check_collinearity throws an error. With more than two predictors it runs without error, but the output omits the first predictor. Here are the details:

Code to Reproduce:

library(survival)
library(performance)

# Cox model with two predictors
cox_model_2vars <- coxph(Surv(time, status) ~ age + sex, data = lung)
performance::check_collinearity(cox_model_2vars)
# Error: 'V' is not a square numeric matrix

# Cox model with three predictors
cox_model_3vars <- coxph(Surv(time, status) ~ age + sex + ph.ecog, data = lung)
performance::check_collinearity(cox_model_3vars)

Expected Behavior:
The function check_collinearity should provide the multicollinearity diagnostics for models regardless of the number of predictors.

Actual Behavior:
With two predictors, the call fails with: Error in stats::cov2cor(v) : 'V' is not a square numeric matrix. With three or more predictors, the function runs and prints multicollinearity statistics, but the row for the first predictor (age in this case) is missing. Here are the results for the model with three predictors:

# Check for Multicollinearity

Low Correlation

    Term  VIF  VIF 95% CI Increased SE Tolerance Tolerance 95% CI
     sex 1.00 [1.00, Inf]         1.00      1.00     [0.00, 1.00]
 ph.ecog 1.00 [1.00, Inf]         1.00      1.00     [0.00, 1.00]

I am using R version 4.3.3 and performance package version 0.11.0. It would be helpful to understand whether this is a bug or if there's a recommended workaround for models with only two predictors.
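Until this is confirmed, one possible stopgap is to derive the VIFs directly from the coefficient covariance matrix. The sketch below is only an illustration under stated assumptions: it uses the common approach of taking the diagonal of the inverse of the coefficient correlation matrix, which applies to terms with a single coefficient each, and it is not claimed to reproduce the internals of check_collinearity. The name vif_manual is made up for this example.

```r
library(survival)

# Same two-predictor model as in the reproduction above
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Correlation matrix of the coefficient estimates.
# A coxph fit has no intercept row or column, so nothing is dropped here.
r <- stats::cov2cor(vcov(fit))

# For single-coefficient terms, the VIF is the matching diagonal
# element of the inverse correlation matrix.
vif_manual <- diag(solve(r))
names(vif_manual) <- colnames(r)

vif_manual
```

With exactly two single-coefficient predictors, both values are equal by construction, since each is 1 / (1 - rho^2) for the one correlation rho between the two estimates.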

Thank you for your assistance!

@zhaohongxin0
Author

I suspect the issue is related to how the intercept term is handled for Cox models. A coxph fit has no intercept coefficient, but check_collinearity appears to drop what it assumes is the intercept, which would remove the first predictor instead. With three predictors that would explain the missing row for age; with two predictors only a single value would remain, which could explain the cov2cor error. Could this be the cause of both symptoms? I would appreciate any insights or confirmation on this hypothesis. Thank you!
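To illustrate the hypothesis, here is a minimal sketch in base R plus survival. The "drop the first row and column" step is my assumption about what happens inside check_collinearity, not something confirmed from the package source; v_dropped and res are names invented for this example.

```r
library(survival)

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Covariance matrix of the coefficients: 2 x 2 (age, sex), no intercept
v <- vcov(fit)
dim(v)

# Assumed step: remove the first row/column as if it were an intercept.
# On a 2 x 2 matrix this leaves a bare number, not a matrix.
v_dropped <- v[-1, -1]
is.matrix(v_dropped)

# cov2cor() requires a square numeric matrix, so it fails on the scalar;
# the error message is captured instead of stopping the script.
res <- tryCatch(
  stats::cov2cor(v_dropped),
  error = function(e) conditionMessage(e)
)
res
```

If this is what happens, the same step on the three-predictor model would leave a valid 2 x 2 matrix for sex and ph.ecog only, which matches the output shown above.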

@strengejacke strengejacke added Bug 🐛 Something isn't working 3 investigators ❔❓ Need to look further into this issue labels Apr 19, 2024