Preserve input data types #149

hacktuarial · 2018-04-16T20:08:19Z

As discussed in #138, when DataFrameMapper is used with default=None, the current behavior is to build a single np.array from the unselected columns. This has the undesired side effect of casting them to a common data type. This PR preserves the data types of unselected columns when default=None, input_df=True, and df_out=True.
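To illustrate the casting described above, here is a minimal sketch using only pandas and NumPy. It is not the DataFrameMapper implementation; the frame and its column names are hypothetical and exist only to show how a single array forces one dtype while a column selection does not.

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the column names are illustrative only.
df = pd.DataFrame({
    "count": [1, 2, 3],        # integer column
    "price": [0.5, 1.5, 2.5],  # float column
})

# Collapsing both columns into one ndarray forces a single common dtype,
# so the integers are promoted to float.
numeric_block = df[["count", "price"]].values

# Rebuilding a DataFrame from that array keeps the promoted dtype.
rebuilt = pd.DataFrame(numeric_block, columns=["count", "price"])

# Selecting the columns directly keeps each column's own dtype.
preserved = df[["count", "price"]]

print(rebuilt.dtypes)
print(preserved.dtypes)
```

Comparing the two printed dtype listings shows the difference: the rebuilt frame reports a float dtype for both columns, while the direct selection still reports an integer dtype for count.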

output heterogeneous data types

hacktuarial · 2018-04-16T20:48:51Z

Tests failed on README.rst, line 187, for two reasons:

1. Columns are out of order. I will fix this.
2. Data types have changed; when df_out=True, it seems unnecessary to promote int values to float, since a pd.DataFrame can accommodate heterogeneous types.
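The second point can be sketched as follows. This is an illustration under assumptions, not the library's code: the two arrays are hand-written stand-ins for an integer 0/1 indicator output and a float feature, and the column names are made up.

```python
import numpy as np
import pandas as pd

# Stand-in for a binarizer-style 0/1 indicator output (integer dtype).
indicators = np.array([[1, 0], [0, 1], [1, 0]])
# Stand-in for a scaled numeric feature (float dtype).
scaled = np.array([[0.1], [0.7], [0.3]])

# Stacking into one ndarray promotes the integers to float.
stacked = np.hstack([indicators, scaled])

# Concatenating per-transformer DataFrames column-wise keeps each dtype.
combined = pd.concat(
    [
        pd.DataFrame(indicators, columns=["pet_cat", "pet_dog"]),
        pd.DataFrame(scaled, columns=["children"]),
    ],
    axis=1,
)

print(stacked.dtype)
print(combined.dtypes)
```

With a DataFrame as the output container, the indicator columns can stay integer while the scaled column stays float, which is the behavior argued for here.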

hacktuarial · 2018-04-16T20:58:28Z

The only failing check now is number 2. With permission, I would like to edit the test so that the output columns of LabelBinarizer() are of type int, not float.

dukebody

Thanks for your contribution! I was going to comment requesting changes in your PR, but I did them myself in https://github.com/scikit-learn-contrib/sklearn-pandas/pull/153/files. If everything looks good I'm gonna merge that one.

dukebody · 2018-04-20T18:10:47Z

README.rst

@@ -504,3 +509,4 @@ Other contributors:
 * Ritesh Agrawal (@ragrawal)
 * Vitaley Zaretskey (@vzaretsk)
 * Zac Stewart (@zacstewart)
+* Timothy Sweetser (@hacktuarial)


Please preserve the alphabetic order. :)

hacktuarial · 2018-05-05T20:28:38Z

Looks great! Thanks for reviewing.

hacktuarial and others added 3 commits April 16, 2018 12:33

output heterogeneous data types

bc40765

bump version

9b82edf

Merge pull request #1 from stitchfix/mixed-output-types

c736465

output heterogeneous data types

partial fix for doctests

3ec3b48

dukebody mentioned this pull request May 5, 2018

Output heterogeneous data types #153

Merged

dukebody reviewed May 5, 2018

hacktuarial closed this May 5, 2018
