-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HIVE-27847: Casting NUMERIC types to TIMESTAMP is prohibited #4851
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please create a test around it? The ticket is well described what it is about. But would be great to have test(s) to check what are the new results. Depend of the input
Kudos, SonarCloud Quality Gate passed! 0 Bugs No Coverage information The version of Java (11.0.8) you have used to run this analysis is deprecated and we will stop accepting it soon. Please update to at least Java 17. |
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
@basapuram-kumar @deniskuzZ Hi, we are also hitting this problem, and I found the issue happens when both of the following conditions are satisfied.
I guess the current implementation partially works but it could not be perfect as This is the minimal reproduction in my environment.
|
Also, GenericUDFTimestamp.java could not be the only file that accesses runtime configurations. We should review all UDFs accessing HiveConf. |
intToTimestampInSeconds = SessionState.get() != null ? SessionState.get().getConf() | ||
.getBoolVar(ConfVars.HIVE_INT_TIMESTAMP_CONVERSION_IN_SECONDS) : new HiveConf() | ||
.getBoolVar(ConfVars.HIVE_INT_TIMESTAMP_CONVERSION_IN_SECONDS); | ||
|
||
if (strict) { | ||
SessionState ss = SessionState.get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use more convenient names like sessionState here.
@@ -0,0 +1,12 @@ | |||
set hive.strict.timestamp.conversion=false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can not see the q.out file for this test in this commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check my comments!
if (strict) { | ||
SessionState ss = SessionState.get(); | ||
|
||
if (ss != null && ss.getConf().getBoolVar(ConfVars.HIVE_STRICT_TIMESTAMP_CONVERSION)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The root cause is SessionState is not available at runtime. So, we need to inject the proper config from anywhere else.
For example, we implement GenericUDF#configure
for some UDFs in case that SessionState is unavailable.
hive/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToUnixTimeStamp.java
Lines 125 to 132 in a1420ed
@Override | |
public void configure(MapredContext context) { | |
if (context != null) { | |
formatter = InstantFormatter.ofConfiguration(context.getJobConf()); | |
String timeZoneStr = HiveConf.getVar(context.getJobConf(), HiveConf.ConfVars.HIVE_LOCAL_TIME_ZONE); | |
timeZone = TimestampTZUtil.parseTimeZone(timeZoneStr); | |
} | |
} |
hive/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFRegExp.java
Lines 64 to 70 in a1420ed
@Override | |
public void configure(MapredContext context) { | |
if (context != null) { | |
if(HiveConf.getBoolVar(context.getJobConf(), HiveConf.ConfVars.HIVE_USE_GOOGLE_REGEX_ENGINE)){ | |
this.useGoogleRegexEngine=true; | |
} | |
} |
|
||
@Override | ||
public ObjectInspector initialize(ObjectInspector[] arguments) throws UDFArgumentException { | ||
checkArgsSize(arguments, 1, 1); | ||
checkArgPrimitive(arguments, 0); | ||
checkArgGroups(arguments, 0, tsInputTypes, STRING_GROUP, DATE_GROUP, NUMERIC_GROUP, VOID_GROUP, BOOLEAN_GROUP); | ||
|
||
strict = SessionState.get() != null ? SessionState.get().getConf() | ||
.getBoolVar(ConfVars.HIVE_STRICT_TIMESTAMP_CONVERSION) : new HiveConf() | ||
.getBoolVar(ConfVars.HIVE_STRICT_TIMESTAMP_CONVERSION); | ||
intToTimestampInSeconds = SessionState.get() != null ? SessionState.get().getConf() | ||
.getBoolVar(ConfVars.HIVE_INT_TIMESTAMP_CONVERSION_IN_SECONDS) : new HiveConf() | ||
.getBoolVar(ConfVars.HIVE_INT_TIMESTAMP_CONVERSION_IN_SECONDS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess this also needs to be correctly configured.
Hello @basapuram-kumar, thank you for the contribution, could you please address the suggested comments if have cycles? Thank you in advance! |
So, let me take a glance. |
@dengzhhu653 I created a new PR because I have no permission to directly update @basapuram-kumar 's branch. |
What changes were proposed in this pull request?
As per this jira HIVE-27847 , this fix will prevent query failures for Casting NUMERIC types to TIMESTAMP.
Why are the changes needed?
To avoid query failures on master branch, 4.0.0-alpha-1 & 2 branches.
Does this PR introduce any user-facing change?
No
Is the change a dependency upgrade?
No
How was this patch tested?
Reproducible steps are added in the jira HIVE-27847, after applying this fix, the failed query gets succeded.