[docs] add docs to integrate Fluss + Iceberg via Flink with AWS Glue and Hive #3424
Open
qzyu999 wants to merge 1 commit into
…Data Lake Catalogs section which includes Lakekeeper
Purpose
Linked issue: close #2616
This pull request introduces comprehensive integration guides for using AWS Glue and Hive Metastore catalogs when tiering Fluss streaming data to Apache Iceberg. This completes the Iceberg Data Lake Catalogs documentation suite under `docs/streaming-lakehouse/integrate-data-lakes/catalogs/`.
Brief change log
- `website/docs/streaming-lakehouse/integrate-data-lakes/catalogs/glue.md`: Documents the AWS IAM policy template, required catalog runtime JAR dependencies, `server.yaml` cluster configurations, Flink tiering service commands, and Amazon Athena query verification.
- `website/docs/streaming-lakehouse/integrate-data-lakes/catalogs/hive.md`: Documents Hive Metastore Thrift connection options, required Hadoop client and Hive runtime classpath dependencies, `HADOOP_CLASSPATH` configuration, Flink tiering commands, and Spark SQL query verification.
- `website/docs/streaming-lakehouse/integrate-data-lakes/formats/iceberg.md`: Added catalog-specific cross-links for `hive` (linking to Hive Metastore), `glue` (linking to AWS Glue), and `rest` (linking to Lakekeeper).
Note: Changes are based on the existing `lakekeeper.md` as a template, and references were based on existing code and online/offline documentation. The actual AWS Glue/HMS implementations have not yet been tested by the developer.
Tests
- `npm run build` to verify the page output and ensure there are no broken links (meeting Docusaurus build validation).
- `http://localhost:3000/docs/next/streaming-lakehouse/integrate-data-lakes/formats/iceberg/`.
API and Format
This is a documentation-only change. It does not affect any public API or storage formats.
Documentation
This pull request introduces new documentation guides under the Docusaurus website subfolder. No changes were made to code-level Javadocs.