[BUG] len after fillna operation uses way more memory than expected #2283

@ayushdg

Description

I have a DataFrame that occupies ~700-800 MB when persisted. When I fill all the nulls in the DataFrame using fillna and then call len on the result, I see an explosion in memory usage.

Reproducer:

# Create a dataframe with nulls in every column and write it to Parquet
import pandas as pd
import dask.dataframe

pdf = pd.DataFrame()
for i in range(80):
    pdf[str(i)] = pd.Series([12, None] * 100000)

ddf = dask.dataframe.from_pandas(pdf, npartitions=1)
ddf.to_parquet('temp_data.parquet')

# Read the dataframe back from the Parquet files with cudf,
# one delayed read per part file
import os
import dask
import dask_cudf
import cudf

path = 'temp_data.parquet/'
files = [fn for fn in os.listdir(path) if fn.endswith('.parquet')]
parts = [dask.delayed(cudf.io.parquet.read_parquet)(path=path + fn)
         for fn in files]

temp = dask_cudf.from_delayed(parts)
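
(As an aside, a possibly simpler way to build the same collection would be dask_cudf's own Parquet reader, assuming dask_cudf.read_parquet exists in this build; I have not verified it on these commits.)

# Hypothetical alternative: let dask_cudf discover the part files itself.
# Assumes dask_cudf.read_parquet is available in the installed version.
temp = dask_cudf.read_parquet('temp_data.parquet/')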

Now when I call len(temp), nvidia-smi shows memory usage peaking at:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   47C    P0    28W /  70W |    685MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla T4            Off  | 00000000:5E:00.0 Off |                    0 |
| N/A   33C    P8    10W /  70W |     10MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla T4            Off  | 00000000:AF:00.0 Off |                    0 |
| N/A   32C    P8    10W /  70W |     10MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla T4            Off  | 00000000:D8:00.0 Off |                    0 |
| N/A   32C    P8     9W /  70W |     10MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0    404162      C   /conda/envs/rapids/bin/python                841MiB |
+-----------------------------------------------------------------------------+

Now for the fillna operation:

%%time
for col in temp.columns:
    temp[col] = temp[col].fillna(-1)

CPU times: user 35.6 s, sys: 1.26 s, total: 36.8 s
Wall time: 38.7 s (which is slow)

(There is no change in memory usage, which leads me to believe the fillna calls are only being recorded lazily in the task graph and have not yet been executed on the full data.)
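
As a sanity check, here is a sketch (assuming DataFrame-level fillna and persist work with cudf-backed partitions the same way they do in plain dask.dataframe) that fills every column in a single call and forces execution immediately, so the real memory cost would show up at this step instead of at the later len():

# Sketch only: single graph-level fillna instead of a per-column loop.
filled = temp.fillna(-1)
# persist() forces the computation now; any memory blow-up should appear here.
filled = filled.persist()
print(len(filled))  # should be cheap once the data is materialized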

Finally:
len(temp)

Nvidia-smi usage

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   46C    P0    28W /  70W |  13681MiB / 15079MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla T4            Off  | 00000000:5E:00.0 Off |                    0 |
| N/A   33C    P8    10W /  70W |     10MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  Tesla T4            Off  | 00000000:AF:00.0 Off |                    0 |
| N/A   32C    P8     9W /  70W |     10MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  Tesla T4            Off  | 00000000:D8:00.0 Off |                    0 |
| N/A   32C    P8     9W /  70W |     10MiB / 15079MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0    404162      C   /conda/envs/rapids/bin/python              13755MiB |
+-----------------------------------------------------------------------------+

This is more than a 16x spike in memory usage (the Python process goes from 841 MiB to 13755 MiB). I'm not sure if my approach is wrong or if there is some other underlying issue.

Environment Info
cudf: built from source at commit 79af3a8806bbe01a
dask-cudf: built from source at commit 24798dd8cf9502
