Write parquet with decimal columns in load_test #217

ChrisJar · 2021-05-21T15:51:01Z

This enables load_test to create parquet files with decimal columns and to read parquet files with decimal columns when running queries. Because the csv reader/writer doesn't support decimal yet, the decimal values have to be read in as str and cast to decimal before they're written to parquet.
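As a rough illustration of the read-as-str-then-cast step, here is a minimal sketch using only the Python standard library. It is not the load_test or cudf code from this PR: the sample data, the column names, and the helper names `read_rows_as_str` and `cast_column_to_decimal` are all hypothetical, and the parquet write itself is left out.

```python
import csv
import io
from decimal import Decimal

# Hypothetical sample data; the column names are made up for illustration.
CSV_TEXT = "id,price\n1,19.99\n2,0.10\n3,1234.50\n"


def read_rows_as_str(text):
    """Read every field as a plain string, with no numeric inference."""
    return list(csv.DictReader(io.StringIO(text)))


def cast_column_to_decimal(rows, column, scale):
    """Cast one string column to Decimal, quantized to a fixed scale."""
    quantum = Decimal(1).scaleb(-scale)  # scale=2 gives Decimal("0.01")
    for row in rows:
        # Parsing the string directly avoids a lossy trip through float.
        row[column] = Decimal(row[column]).quantize(quantum)
    return rows


rows = cast_column_to_decimal(read_rows_as_str(CSV_TEXT), "price", scale=2)
prices = [row["price"] for row in rows]
```

Going from str straight to decimal keeps values such as 0.10 exact, which would not be guaranteed if the column were first parsed as a float.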

Due to rapidsai/cudf#8311, the decimal columns appear to be of type object. The underlying type is still decimal; the columns only show up as object when calling df.dtypes because of a metadata error.

Due to rapidsai/cudf#8354, the original precision of each decimal column is lost, but this shouldn't affect the underlying values.

CC: @randerzander

Chris Jarrett added 3 commits May 21, 2021 08:45

Write parquet with decimal columns in load_test

0317afb

Read in as string

caad91c

Cleanup

515e4dc

ChrisJar marked this pull request as ready for review May 25, 2021 19:10
