q02 is particularly memory-heavy: it selects a subset of columns and applies a WHERE clause. The intent is to minimize the size of the partitions that will be shuffled, so reading the extra columns is likely very expensive.
One example: in the dask-cudf query we drop columns right after a merge: https://github.com/rapidsai/gpu-bdb/blob/main/gpu_bdb/queries/q05/gpu_bdb_query_05.py#L57
The dask-sql version runs the full SQL query, but we need to investigate whether unneeded columns are carried to the end of the query and only then dropped: https://github.com/rapidsai/gpu-bdb/blob/main/gpu_bdb/queries/q05/gpu_bdb_query_05_dask_sql.py#L34-L70
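For reference, here is a minimal sketch of the column-pruning pattern described above: select only the needed columns before the merge, then drop the join key right after it. It uses pandas as a CPU stand-in for dask-cudf, and the table and column names are hypothetical, not taken from the gpu-bdb queries.

```python
import pandas as pd

# Hypothetical tables; names and values are illustrative only.
clicks = pd.DataFrame({
    "item_sk": [1, 2, 2, 3],
    "user_sk": [10, 10, 11, 12],
    "click_date_sk": [100, 101, 102, 103],  # not needed downstream
})
items = pd.DataFrame({
    "item_sk": [1, 2, 3],
    "category_id": [7, 7, 8],
    "item_desc": ["a", "b", "c"],  # wide column, not needed downstream
})

# Keep only the columns the rest of the query uses, so the merge
# (a shuffle in the distributed case) moves less data.
clicks_needed = clicks[["item_sk", "user_sk"]]
items_needed = items[["item_sk", "category_id"]]

merged = clicks_needed.merge(items_needed, on="item_sk", how="inner")

# Drop the join key right after the merge, once it is no longer needed.
merged = merged.drop(columns=["item_sk"])

# Downstream work only ever sees the pruned frame.
per_user = merged.groupby("user_sk")["category_id"].nunique()
```

The question for the dask-sql version is whether its query plan ends up equivalent to this, or whether the extra columns survive through the join.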