LabelEncoder + Imputer + LabelBinarizer error #96
Comments
My recommendation here is to create a subclass of If you come up with that implementation, please post it here in a PR, as it would be suitable for inclusion in sklearn-pandas.
@paubelsan @dukebody If the proposal for this enhancement is still relevant and nobody is working on it right now, I could give it a try. However, I am not sure how to reproduce the issue: I am getting a different exception when trying to apply sequence
And, not on
@devforfu I guess that @paubelsan might have different versions of numpy/pandas, but the issue looks the same to me: I kind of remember some conversations about creating a transformer (
Since many people are likely still running into this problem, it may be worth noting here that in the dev-0.20 version of sklearn, OneHotEncoder directly supports categorical inputs without using
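For anyone wanting to check this, here is a minimal sketch, assuming scikit-learn 0.20 or later is installed. The `colors` column and its values are made up for illustration and do not come from this thread.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical column with string values, shaped (n_samples, 1).
colors = np.array([["red"], ["green"], ["red"], ["blue"]], dtype=object)

# OneHotEncoder is fitted on the strings directly, with no integer-encoding step first.
encoder = OneHotEncoder(handle_unknown="ignore")

# fit_transform returns a sparse matrix by default, so convert it for inspection.
encoded = encoder.fit_transform(colors).toarray()

# The learned categories are stored in sorted order, one array per input column.
categories = list(encoder.categories_[0])

print(categories)
print(encoded.shape)
```

Each row of `encoded` has a single 1 in the column matching its category, so the first row ("red") is 1 in the last column.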
I'm not sure if anyone is still experiencing this problem in light of recent updates to sklearn but if you have a list of categorical variable keys you can do something like
Hopefully this can be helpful to somebody.
Hi,
I'm getting an error when using LabelEncoder + Imputer + LabelBinarizer in a mapper. The LabelEncoder output is a vector of shape (n_samples,). Imputer calls the sklearn function check_array, which in turn calls the numpy function atleast_2d and transforms the vector to shape (1, n_samples), so LabelBinarizer crashes:
ValueError: Multioutput target data is not supported with label binarization
How can I fix this issue?
Many thanks!
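The shape change described above can be reproduced with numpy alone. This is only a sketch of the mechanism: the `labels` array is a made-up stand-in for LabelEncoder output, and the explicit reshape is shown for comparison, not as a confirmed fix for the mapper.

```python
import numpy as np

# Stand-in for LabelEncoder output: a 1-D vector of shape (n_samples,).
labels = np.array([0, 2, 1, 0])

# np.atleast_2d adds the new axis in front, giving one row of n_samples
# columns, i.e. one sample with four features rather than four samples.
as_row = np.atleast_2d(labels)

# An explicit reshape keeps one sample per row: shape (n_samples, 1).
as_column = labels.reshape(-1, 1)

# Flattening the column recovers the original 1-D vector.
restored = as_column.ravel()

print(labels.shape, as_row.shape, as_column.shape)
```

The `(1, n_samples)` result is what a downstream step would read as multi-output data, which matches the error message above.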