-
Notifications
You must be signed in to change notification settings - Fork 4.8k
HIVE-29233: Iceberg: Validate HiveRESTCatalogClient with external RESTCatalogs like Gravitino #6108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi @difin, Can we use the standalone HMS instead Gravitino as the Iceberg REST server for E2E testing? |
@zhangbutao, we already have a test for HMS RestCatalog (iceberg_rest_catalog_hms.q). |
.../test/java/org/apache/hadoop/hive/cli/TestIcebergRESTCatalogGravitinoLlapLocalCliDriver.java
Show resolved
Hide resolved
c949a61
to
f3abd30
Compare
f3abd30
to
0497faa
Compare
0497faa
to
b604394
Compare
b604394
to
8656b64
Compare
8656b64
to
54bb487
Compare
I checked out this branch and ran
|
4390c5b
to
8865c8c
Compare
8865c8c
to
c58ec82
Compare
itests/qtest-iceberg/pom.xml
Outdated
<artifactItem> | ||
<groupId>com.h2database</groupId> | ||
<artifactId>h2</artifactId> | ||
<version>2.2.224</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
${h2database.version}, property is defined in a root pom
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
--! qt:replace:/(\S\"iceberg-version\\\":\\\")(\w+\s\w+\s\d+\.\d+\.\d+\s\(\w+\s\w+\))(\\\")/$1#Masked#$3/ | ||
|
||
set hive.stats.autogather=false; | ||
set metastore.client.impl=org.apache.iceberg.hive.client.HiveRESTCatalogClient; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we set these configs in the Driver, similar to iceberg.catalog.ice01.uri
?
ice01
is hardcoded there anyways: https://github.com/apache/hive/pull/6108/files#diff-398be9698c49244815af06140fe75eaaeef75ec6e497a74b8953cbda334bf122R122
- metastore.client.impl
- metastore.catalog.default
- iceberg.catalog.ice01.type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These configs are set in the driver too.
I added them here in q-test just for clarity to make them visible in the q-test itself.
Do you think it is better to keep them only in the driver?
|
||
if (warehouseDir != null && Files.exists(warehouseDir)) { | ||
try (var paths = Files.walk(warehouseDir)) { | ||
paths.sorted(Comparator.reverseOrder()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use FileUtils.cleanDirectory(warehouseDir) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done using .deleteDirectory()
.
cleanDirectory(dir) → empties the folder but leaves the folder in place.
deleteDirectory(dir) → deletes the folder and everything under it.
@@ -0,0 +1,14 @@ | |||
gravitino.iceberg-rest.httpPort = 9001 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we read this config in Driver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, made it configurable like the warehouseDir.
* to avoid overwriting container data.</p> | ||
*/ | ||
private void startWarehouseDirSync() { | ||
fileSyncExecutor.scheduleAtFixedRate(() -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we refactor
fileSyncExecutor.scheduleAtFixedRate(this::syncWarehouseDir, 0, 5, TimeUnit.SECONDS);
private void syncWarehouseDir() {
if (!gravitinoContainer.isRunning()) {
return;
}
// The try-with-resources block handles the core logic
try (/*... setup TarArchiveInputStream ...*/) {
// ... (rest of the sync logic)
} catch (Exception e) {
LOG.error("Warehouse folder sync failed: {}", e.getMessage());
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
* "License"); you may not use this file except in compliance | ||
* with the License. You may obtain a copy of the License at | ||
* | ||
* http://www.apache.org/licenses/LICENSE-2.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You have to copy checkstyle/asf.header
or change the template
I'm trying to modify existing files. So far, we need to add exact headers to new files.
https://issues.apache.org/jira/browse/HIVE-29245
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. Please modify the files where SonarQube reported minor issues.
https://sonarcloud.io/project/issues?id=apache_hive&pullRequest=6108&issueStatuses=OPEN,CONFIRMED&sinceLeakPeriod=true
|
What changes were proposed in this pull request?
Implemented a new qtest-driver
TestIcebergRESTCatalogGravitinoLlapLocalCliDriver
for testingHiveRESTCatalogClient
with Gravitino Iceberg Rest Server and a q-test for it.Why are the changes needed?
To validate support for external RestCatalogs like Gravitino.
Does this PR introduce any user-facing change?
No
How was this patch tested?
New q-test.