-
Notifications
You must be signed in to change notification settings - Fork 239
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support trunc
and date_trunc
SQL function
#11833
Conversation
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
trunc
and date_trunc
SQL functiontrunc
and date_trunc
SQL function
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM overall, left minor comments/questions on the doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a follow on issue to support the format as a Scalar in the kernel?
Performance looks great I ran
GPU time was around 880 ms CPU time was around 165,600 ms or about 188 times faster. I think this is what happens when we have to deal with time zones on the CPU. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this change the tests fail in other time zones, like with TZ='Africa/Casablanca'
because the code falls back to the CPU, but the tests don't expect that.
trunc_timestamp_format_gen = StringGen('(?i:YEAR|YYYY|YY|QUARTER|MONTH|MM|MON|WEEK|DAY|DD|HOUR|MINUTE|SECOND|MILLISECOND|MICROSECOND)') \ | ||
.with_special_pattern('invalid', weight=50) | ||
|
||
@pytest.mark.parametrize('data_gen', [date_gen], ids=idfn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I missed this before all of the tests need to be marked with
@allow_non_gpu(*non_utc_tz_allow)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should apply only for the timestamp test, right? Date tests should not care about timezone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is only needed for timestamp. But you might want to test it to be sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested the latest commit with TZ=Asia/Shanghai ./integration_tests/run_pyspark_from_build.sh ...
and all passed.
I also filed #11860 I don't see it as a big deal, but I wanted to capture it. |
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
I've pushed a new JNI PR (NVIDIA/spark-rapids-jni#2687) to optimize the kernel when string format is given as a scalar. |
Signed-off-by: Nghia Truong <[email protected]>
build |
This change updates operatorsScore.csv and supportedExprs.csv to include `TruncDate` and `TruncTimestamp`, for all shim versions. This seems to have been left out of NVIDIA#11833. Signed-off-by: MithunR <[email protected]>
…11890) This change updates `operatorsScore.csv` and `supportedExprs.csv` to include `TruncDate` and `TruncTimestamp`, for all shim versions. This seems to have been left out of #11833. Signed-off-by: MithunR <[email protected]>
This implements override for
TruncDate
andTruncTimestamp
to support the SQL functionstrunc
anddate_trunc
.Closes #11804 and #11860.
Depends on: