ShapRFECV min number of features to keep per feature group #75
Comments
The more parameters we add to a class, the more intimidating it becomes for a new user to learn it. I prefer a simple knife as a tool. My point: If a user really wants this, they could write some code that runs What I'm getting at is also called the Unix philosophy:
So, even though I see the added value of the feature, I would advocate against it.
I agree that this would complicate the API. That is why a way to implement it could be to make a class that inherits from What do you think about this approach? Does the use case justify implementing an extra class just for this purpose? I had a similar idea for tackling the issue: just running ShapRFECV separately for each group. The main drawback of that is that we don't take into consideration the predictive power of relations between features in different groups.
A separate class for this makes sense. However, does the problem of having to keep at least one feature for every group occur often enough to outweigh the costs of having to maintain more code? (I'm trying to push for keeping things as simple as possible ;) )
After offline discussion, I understand there are teams that could benefit from this use-case. Separating it from |
Problem Description
Features can be grouped, e.g. demographic, social, etc. In that case you might want to end up with a combination of features from each group at the end of the feature elimination process.
Desired Outcome
We can add two parameters:
Thanks to this, even if one type of features is not predictive, at least a couple of them will be kept.
This way the model can be explained better to the business in terms of different characteristics of the sample.
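To make the request concrete, here is a minimal sketch of how one elimination step could respect a per-group minimum. It is plain Python and not part of probatus: the function `features_to_remove` and the arguments `groups` and `min_per_group` are hypothetical names chosen for illustration, and they are not necessarily the two parameters proposed above. The importance values are made up; in practice they would come from SHAP values computed in each round.

```python
from collections import Counter


def features_to_remove(importances, groups, n_remove, min_per_group):
    """Pick up to n_remove least important features for one elimination step,
    never dropping a group below its required minimum (hypothetical helper)."""
    # How many features each group still has before this step.
    remaining = Counter(groups[f] for f in importances)
    removed = []
    # Visit features from least to most important.
    for feature in sorted(importances, key=importances.get):
        if len(removed) == n_remove:
            break
        group = groups[feature]
        # Only remove if the group stays at or above its minimum afterwards.
        if remaining[group] > min_per_group.get(group, 0):
            removed.append(feature)
            remaining[group] -= 1
    return removed


# Illustrative (made-up) importances and feature groups.
importances = {"age": 0.30, "income": 0.25, "region": 0.10,
               "n_friends": 0.02, "n_posts": 0.01}
groups = {"age": "demographic", "income": "demographic", "region": "demographic",
          "n_friends": "social", "n_posts": "social"}

removed = features_to_remove(importances, groups, n_remove=2,
                             min_per_group={"demographic": 1, "social": 1})
unconstrained = features_to_remove(importances, groups, n_remove=2,
                                   min_per_group={})
```

With the constraint, the weak `social` group keeps one feature and the next least important `demographic` feature is dropped instead; without it, both `social` features would be eliminated. Because all features are ranked together, this keeps cross-group interactions in the model, unlike running the elimination separately per group.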