Implement feature_importances_ in sksurv.ensemble.RandomSurvivalForest #140

Open
mtomaszewski95 opened this issue Sep 30, 2020 · 6 comments

Comments

@mtomaszewski95

Implement feature_importances_ in sksurv.ensemble.RandomSurvivalForest.
Examples:
https://cran.r-project.org/web/packages/randomForestSRC/randomForestSRC.pdf
https://square.github.io/pysurvival/models/random_survival_forest.html
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6364686/

@sebp
Owner

sebp commented Oct 2, 2020

Feature importances based on node/split statistics are rather flawed (see e.g. this paper). Therefore, I'm hesitant to implement this feature. In particular, you can already compute permutation-based feature importance via ELI5. It is more expensive to compute, but has better properties.

@funnell

funnell commented Jun 14, 2023

My vote would be for adding the feature, at the very least for compatibility with scikit-learn.

@sebp
Owner

sebp commented Jun 15, 2023

@funnell

funnell commented Jun 16, 2023

Yes, thanks! I understand your point of view, and that there are alternative ways to compute importance.
Still, even if the algorithm isn't ideal, it would be nice to have. Some tools presume feature_importances_ is available (e.g. RFECV), and not having it might add a little friction for new scikit-survival users already familiar with scikit-learn. It's also a lot faster, which can be helpful during early iteration.

Thanks for the package and thanks for considering! :)

@anwurl

anwurl commented Jan 17, 2024

I also have a use case where I am only interested in which features are used or not used. For that, feature importances based on node/split statistics would do the job and would be quick to calculate. In contrast, calculating permutation feature importances takes much longer.

Thanks a lot for this package and your work.

@sebp
Owner

sebp commented Jan 17, 2024 via email

4 participants