
More efficient alternative in 04_ApplyStudents_Alcohol_Consumption #78

Open
@rahimnathwani


In step 10, we want to multiply all numerical values by 10.

The provided solution is:
df.applymap(times10).head(10)

But this is very slow, because it calls a regular Python function on every element in the dataframe.

A better approach is to check each column's dtype and then use pandas' built-in multiplication on the whole column at once:

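# Only int64 columns are matched; add other dtype names (e.g. 'float64') if needed
for colname, coltype in df.dtypes.items():
    if coltype.name in ['int64']:
        # Vectorized: multiplies the whole column in one operation
        df[colname] = df[colname] * 10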
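
An alternative sketch of the same idea, separate from the loop timed in this issue: `select_dtypes(include="number")` picks out every integer and float column in one call, so no dtype names need to be listed by hand. The small dataframe and its column names here are made-up stand-ins, not the exercise's data.

```python
import pandas as pd

# Hypothetical stand-in for the exercise's student data
df = pd.DataFrame({
    "school": ["GP", "GP", "MS"],
    "age": [18, 17, 15],
    "Medu": [4, 1, 1],
})

# Names of all numeric columns (ints and floats; strings and bools are excluded)
num_cols = df.select_dtypes(include="number").columns

# Work on a copy so the original dataframe is left untouched
result = df.copy()

# Multiply all numeric columns in a single vectorized operation
result[num_cols] = result[num_cols] * 10

print(result)
```

Working on a copy mirrors `applymap`, which also returns a new dataframe rather than modifying `df` in place.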

I used %%timeit to compare the two solutions. On this small dataset, my solution is about 5x faster (1.1 ms vs 5.8 ms). The gap should widen as the dataset grows.
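
For anyone who wants to reproduce the comparison outside a notebook, here is a rough sketch using the standard `timeit` module. The data is synthetic (an assumption, not the exercise's dataset), `times10` is a guess at what the exercise's function does, and the helper names `elementwise` and `vectorized` are hypothetical. Timings will vary by machine and pandas version, so no expected numbers are shown.

```python
import numbers
import timeit

import pandas as pd

# Synthetic data (an assumption, not the exercise's dataset)
df = pd.DataFrame({
    "school": ["GP", "MS"] * 500,
    "age": list(range(15, 20)) * 200,
    "absences": list(range(10)) * 100,
})

def times10(x):
    # Multiply integers by 10; leave everything else unchanged
    return x * 10 if isinstance(x, numbers.Integral) else x

def elementwise(frame):
    # DataFrame.map replaced applymap in pandas 2.1; fall back on older versions
    apply_fn = frame.map if hasattr(pd.DataFrame, "map") else frame.applymap
    return apply_fn(times10)

def vectorized(frame):
    # Multiply all numeric columns at once, on a copy of the input
    out = frame.copy()
    num_cols = out.select_dtypes(include="number").columns
    out[num_cols] = out[num_cols] * 10
    return out

# Total time for 20 runs of each approach
t_elem = timeit.timeit(lambda: elementwise(df), number=20)
t_vec = timeit.timeit(lambda: vectorized(df), number=20)
print(f"element-wise: {t_elem:.4f}s, vectorized: {t_vec:.4f}s")
```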
