Add GraalVM Reachability Metadata and corresponding nativeTest for Iceberg table in HiveServer2 #31526
- Let's continue to wait for the resolution of shardingsphere-logging-core should not set the scope of logback-classic to compile #32377.
- By the way, using the HiveServer2 JDBC Driver under GraalVM Native Image does involve a lot of external discussions. 🤗 Perhaps we will have to wait for those external discussions to be completed. 🦒
- https://issues.apache.org/jira/browse/HIVE-28295
- https://issues.apache.org/jira/browse/HIVE-28308
- https://issues.apache.org/jira/browse/HIVE-28316
- https://issues.apache.org/jira/browse/HIVE-28317
- https://issues.apache.org/jira/browse/HIVE-28322
- https://issues.apache.org/jira/browse/HIVE-28417
- https://issues.apache.org/jira/browse/HIVE-28418
- https://issues.apache.org/jira/browse/HIVE-28420
- https://issues.apache.org/jira/browse/HIVE-28423
- https://issues.apache.org/jira/browse/HIVE-28424
- https://issues.apache.org/jira/browse/HIVE-28429
- https://issues.apache.org/jira/browse/HIVE-28430
- https://issues.apache.org/jira/browse/HIVE-28432
- shardingsphere-logging-core should not set the scope of logback-classic to compile #32377
- [QUESTION] Should we collect GraalVM Reachability Metadata for shaded packages? oracle/graalvm-reachability-metadata#377
- grpc-netty-shaded cannot be successfully compiled under GraalVM grpc/grpc-java#10601
- Add GraalVM native compilation support logging-log4j2#1539
- [SPARK-33343][BUILD] Fix the build with sbt to copy hadoop-client-runtime.jar spark#30250
- HIVE-28315: Missing classes while using hive jdbc standalone jar hive#5313
- HIVE-28191. Upgrade Hadoop Version to 3.4.0. hive#5187
- Fix handling of onMatch and onMismatch attributes in the properties configuration format logging-log4j2#2791
- When bump from 2.23.0 to 2.23.1, the status of the .properties configuration file no longer works logging-log4j2#2794
- Dbeaver's Snapcraft package is missing font files under WSL dbeaver/dbeaver#35028
- Hive Connection Error with ZooKeeper dbeaver/dbeaver#22777
- Fixes the connection of Remote HiveServer2 through HiveServer2 JDBC Driver #31573
- Bump the minimum GraalVM CE version required to compile ShardingSphere's GraalVM Native Image artifacts to JDK22 #31630
- Add documentation for using XA distributed transactions in GraalVM Native Image #31975
- Uses GraalVM CE For JDK 22.0.2 in CI to prevent OutOfMemoryError #32246
- Fixes HotSpot JDK 22 CI failure due to GR-54293 #32267
- Let's import https://github.com/linghengqian/hive-server2-jdbc-driver to shield thousands of lines of dependency management.
- The parts that are not the focus of the current PR have been moved to the following external issues or PRs.
- https://issues.apache.org/jira/browse/HIVE-28295
- https://issues.apache.org/jira/browse/HIVE-28308
- https://issues.apache.org/jira/browse/HIVE-28316
- https://issues.apache.org/jira/browse/HIVE-28317
- https://issues.apache.org/jira/browse/HIVE-28322
- https://issues.apache.org/jira/browse/HIVE-28417
- https://issues.apache.org/jira/browse/HIVE-28418
- https://issues.apache.org/jira/browse/HIVE-28420
- https://issues.apache.org/jira/browse/HIVE-28423
- https://issues.apache.org/jira/browse/HIVE-28424
- https://issues.apache.org/jira/browse/HIVE-28429
- https://issues.apache.org/jira/browse/HIVE-28430
- https://issues.apache.org/jira/browse/HIVE-28432
- https://issues.apache.org/jira/browse/HIVE-28437
- https://issues.apache.org/jira/browse/HIVE-28444
- https://issues.apache.org/jira/browse/HIVE-28445
- [SPARK-33343][BUILD] Fix the build with sbt to copy hadoop-client-runtime.jar spark#30250
- Hive Connection Error with ZooKeeper dbeaver/dbeaver#22777
- Add GraalVM native compilation support logging-log4j2#1539
- grpc-netty-shaded cannot be successfully compiled under GraalVM grpc/grpc-java#10601
- [QUESTION] Should we collect GraalVM Reachability Metadata for shaded packages? oracle/graalvm-reachability-metadata#377
- HIVE-28315: Missing classes while using hive jdbc standalone jar hive#5313
- HIVE-28191. Upgrade Hadoop Version to 3.4.0. hive#5187
- When bump from 2.23.0 to 2.23.1, the status of the .properties configuration file no longer works logging-log4j2#2794
- Fix handling of onMatch and onMismatch attributes in the properties configuration format logging-log4j2#2791
- HIVE-28417: Bump Log4j2 to 2.24.1 to facilitate compilation of GraalVM Native Image hive#5375
- Improve HPL/SQL tests hive#5381
- We needed a transactional HiveServer2 for our unit tests, but the options provided by apache/hive were a bit overkill. Compared to manually creating ACID tables,
set metastore.compactor.initiator.on=true;
set metastore.compactor.cleaner.on=true;
set metastore.compactor.worker.threads=5;
set hive.support.concurrency=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
CREATE TABLE IF NOT EXISTS t_order
(
    order_id BIGINT,
    order_type INT,
    user_id INT NOT NULL,
    address_id BIGINT NOT NULL,
    status VARCHAR(50),
    PRIMARY KEY (order_id) disable novalidate
) CLUSTERED BY (order_id) INTO 2 BUCKETS STORED AS ORC TBLPROPERTIES ('transactional' = 'true');
- I prefer to create Iceberg tables directly. Whether the table is an ACID table or an Iceberg table, the apache/shardingsphere parser only supports DML statements for it, and transaction rollback operations are not supported.
set iceberg.mr.schema.auto.conversion=true;
CREATE TABLE IF NOT EXISTS t_order
(
    order_id BIGINT,
    order_type INT,
    user_id INT NOT NULL,
    address_id BIGINT NOT NULL,
    status VARCHAR(50),
    PRIMARY KEY (order_id) disable novalidate
) STORED BY ICEBERG STORED AS ORC TBLPROPERTIES ('format-version' = '2');
- It is worth mentioning that apache/iceberg seems to have dropped support for JDK8, while apache/hive still supports JDK8, as apache/shardingsphere does. See Core: Drop support for Java 8 iceberg#10518 and Remove Hadoop 2 iceberg#10940.
However, apache/orc dropped support for JDK8 long ago, and apache/hive seems to be almost unaffected. See ORC-1512: Drop Java 8/11 and make Java 17 by default orc#1627. If apache/hive does not make any changes, apache/shardingsphere does not actually need to consider upgrading the JDK runtime to JDK17.
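The session setting and the Iceberg DDL from this comment could be issued from Java through plain JDBC, roughly as in the sketch below. It assumes a java.sql.Connection to a remote HiveServer2 has already been obtained elsewhere; the class name IcebergTableSketch and the method createOrderTable are hypothetical and are not part of ShardingSphere or Hive.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, shown only to illustrate the order of the statements.
final class IcebergTableSketch {

    // Session-level setting first, then the DDL, in the same order as in the comment above.
    static final List<String> STATEMENTS = Arrays.asList(
            "set iceberg.mr.schema.auto.conversion=true",
            "CREATE TABLE IF NOT EXISTS t_order ("
                    + "order_id BIGINT, order_type INT, user_id INT NOT NULL, "
                    + "address_id BIGINT NOT NULL, status VARCHAR(50), "
                    + "PRIMARY KEY (order_id) disable novalidate) "
                    + "STORED BY ICEBERG STORED AS ORC TBLPROPERTIES ('format-version' = '2')");

    // Executes every statement on one Statement so that the session setting
    // applies to the same Hive session as the DDL.
    static void createOrderTable(final Connection connection) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            for (String each : STATEMENTS) {
                statement.execute(each);
            }
        }
    }
}
```

Both statements run on the same connection on purpose: the set command only affects the current Hive session, so running it on a different connection would not influence the CREATE TABLE.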
For #29052.
Changes proposed in this pull request:
- shardingsphere-logging-core from passing logback-classic dependencies to downstream modules #32389
- DefaultLoggingRuleConfigurationBuilder to reduce usage of logback-classic #32418
- Fixes the issue where HiveMetaDataLoader creates an additional embedded HiveServer2 when using embedded Hive Metastore Server #32552
- The shardingsphere-parser-sql-hive module does not support CREATE, SET, TRUNCATE, DROP statements yet. This is marked with a TODO.
- To execute DELETE FROM t_address WHERE address_id=?, we always need to execute some Hive session-level SQL in the current javax.sql.DataSource. However, since the SQL parsing of set key=value is not supported, these SQLs are currently located in test-native/sql/test-native-databases-hive.sql and are directly passed through to the actual Hive database for processing.
- For https://issues.apache.org/jira/browse/HIVE-28295 and https://issues.apache.org/jira/browse/HIVE-28418 . The current PR is using Remote HiveServer2 in Docker instead of Embedded HiveServer2. According to the discussion at https://issues.apache.org/jira/browse/HIVE-28418, the use of embedded HiveServer2 is no longer encouraged and the related documents will be deleted.
- The initFile parameter of HiveServer2 JDBC Driver must be an absolute path. Maybe new properties can be added on the Hive side.
- For https://issues.apache.org/jira/browse/hive-28308 . Because of an unknown commit on the Hive side, org.apache.hive:hive-jdbc:4.0.0 does not contain any compile-scope dependencies. The Hive contributors said this was intentional and related to a bug with incorrect display on Maven Central.
- For https://issues.apache.org/jira/browse/HIVE-28429 , [SPARK-33343][BUILD] Fix the build with sbt to copy hadoop-client-runtime.jar spark#30250 and HIVE-28315: Missing classes while using hive jdbc standalone jar hive#5313 . The shaded class org.apache.hadoop.shaded.com.ctc.wstx.io.InputBootstrapper from org.apache.hadoop:hadoop-client-api should not be excluded internally within org.apache.hive:hive-service:4.0.0, which necessitates excluding its internal dependency on org.apache.hadoop:hadoop-client-api and manually adding org.apache.hadoop:hadoop-client-api:3.3.5. See Fixes the issue where HiveMetaDataLoader creates an additional embedded HiveServer2 when using embedded Hive Metastore Server #32552 .
- t_order and t_order_item tables because org.apache.hive.jdbc.HiveStatement does not implement java.sql.Statement#getGeneratedKeys().
- For [QUESTION] Should we collect GraalVM Reachability Metadata for shaded packages? oracle/graalvm-reachability-metadata#377 and grpc-netty-shaded cannot be successfully compiled under GraalVM grpc/grpc-java#10601 . Since io.grpc:grpc-netty-shaded:1.58.0 in org.apache.hive:hive-service:4.0.0 shades Netty, this shaded package requires additional native-image.properties. If the HiveServer2 JDBC Driver can remove the use of grpc-netty-shaded, it will obviously make the current PR less difficult to understand. It seems that the use of grpc-netty can lead to a larger range of dependencies.
- For https://issues.apache.org/jira/browse/HIVE-28417 . HiveServer2 JDBC Driver uses a log4j2 version that is incompatible with the GraalVM Native Image. Resolved by Add GraalVM native compilation support logging-log4j2#1539 (comment).

Before committing this PR, I'm sure that I have checked the following options:
- ./mvnw clean install -B -T1C -Dmaven.javadoc.skip -Dmaven.jacoco.skip -e
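The remark in the description that org.apache.hive.jdbc.HiveStatement does not implement java.sql.Statement#getGeneratedKeys() can be illustrated with a small defensive helper. This is only a sketch: GeneratedKeysSketch and tryGeneratedKey are hypothetical names, and it assumes that a driver without generated-key support signals this by throwing java.sql.SQLFeatureNotSupportedException, which the JDBC API permits. It is not how ShardingSphere itself handles the case.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.sql.Statement;
import java.util.Optional;

// Hypothetical helper, not part of ShardingSphere or Hive.
final class GeneratedKeysSketch {

    // Returns the first generated key, or Optional.empty() when the driver
    // cannot report generated keys or reported none.
    static Optional<Long> tryGeneratedKey(final Statement statement) throws SQLException {
        try (ResultSet resultSet = statement.getGeneratedKeys()) {
            return resultSet.next() ? Optional.of(resultSet.getLong(1)) : Optional.empty();
        } catch (final SQLFeatureNotSupportedException ignored) {
            // Assumption: an unsupported driver throws this exception type.
            return Optional.empty();
        }
    }
}
```

With a helper like this, a caller that gets an empty result would have to rely on keys assigned on the application side rather than on keys reported by the driver.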