[HUDI-8780][RFC-83] Incremental Table Service #12514
base: master
Conversation
rfc/rfc-83/rfc-83.md
### Work Flow for Incremental Table Service

Table Service Planner
Currently we are using completion time; should we indicate here whether instant refers to request time or completion time?
Request time, I believe. If using completion time, it may miss some instants with multi-writer.
Also, this getEarliestCommitToRetain can reference CleanPlanner#getEarliestCommitToRetain.
rfc/rfc-83/rfc-83.md
Table Service Planner
1. Retrieve the instant recorded in the last completed table service commit as **INSTANT 1**.
2. Calculate the current instant to be processed as **INSTANT 2**.
3. Obtain all partitions involved from **INSTANT 1** to **INSTANT 2** as incremental partitions and perform the table service plan operation.
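The three planner steps above could be sketched roughly as follows. This is an illustrative stand-in, not actual Hudi code: instants are plain strings, the "timeline" is an in-memory sorted map, and the (INSTANT 1, INSTANT 2] range semantics are an assumption.

```java
import java.util.*;
import java.util.stream.Collectors;

// Hypothetical sketch of the three planner steps: given the instant recorded by the
// last completed table service (INSTANT 1) and the current instant (INSTANT 2),
// collect the distinct partitions written in between.
public class IncrementalPlannerSketch {

    // commits: instantTime -> partitions touched by that commit
    public static List<String> incrementalPartitions(SortedMap<String, List<String>> commits,
                                                     String lastServiceInstant,   // INSTANT 1
                                                     String currentInstant) {     // INSTANT 2
        // Step 3: collect partitions written in (INSTANT 1, INSTANT 2],
        // assuming an exclusive-inclusive range.
        return commits.entrySet().stream()
                .filter(e -> e.getKey().compareTo(lastServiceInstant) > 0
                          && e.getKey().compareTo(currentInstant) <= 0)
                .flatMap(e -> e.getValue().stream())
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        SortedMap<String, List<String>> commits = new TreeMap<>();
        commits.put("001", Arrays.asList("p_1", "p_2"));
        commits.put("002", Arrays.asList("p_2"));
        commits.put("003", Arrays.asList("p_3"));
        // Last table service completed at 001; planning now at 003.
        System.out.println(incrementalPartitions(commits, "001", "003")); // [p_2, p_3]
    }
}
```

Only partitions touched after INSTANT 1 are scanned, which is the whole point for tables with very many cold historical partitions.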
If we turn on the incremental table service mode, are the various flexible partition selection mechanisms now unavailable? Consider the following scenario:
- at ts_0, write to two partitions: p_1 and p_2
- at ts_1, schedule a compaction with a partition-selection strategy that only compacts p_2
- at ts_2, write to p_2 again
- at ts_3, compaction will only process partitions written between ts_1 and ts_3, so it will still only merge p_2. When can a compaction occur that compacts p_1?
For common strategies, the various flexible partition selection mechanisms still work.
For an IncrementalxxxxStrategy, the flexible partition selection mechanisms will apply to the incrementally fetched partitions.
Also, in an IncrementalxxxxStrategy, maybe we can record the missing partitions in the plan and process them together with the newly fetched incremental partitions next time.
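The "record missing partitions" idea above could look roughly like this. Everything here is a hypothetical sketch (names and the selected/missing plan fields are assumptions, not Hudi code):

```java
import java.util.*;
import java.util.function.Predicate;

// Hypothetical sketch: the strategy filters the candidate partitions, records the
// ones it skipped as "missing" in the plan, and the next round retries the recorded
// missing partitions together with the newly fetched incremental partitions.
public class MissingPartitionsSketch {

    // Split candidates into (selected, missing) according to the strategy's filter.
    public static Map<String, List<String>> split(List<String> candidates,
                                                  Predicate<String> strategyFilter) {
        Map<String, List<String>> result = new HashMap<>();
        result.put("selected", new ArrayList<>());
        result.put("missing", new ArrayList<>());
        for (String p : candidates) {
            result.get(strategyFilter.test(p) ? "selected" : "missing").add(p);
        }
        return result;
    }

    // Next round: previously missing partitions plus the new incremental ones.
    public static List<String> nextCandidates(List<String> missingFromLastPlan,
                                              List<String> newIncremental) {
        LinkedHashSet<String> union = new LinkedHashSet<>(missingFromLastPlan);
        union.addAll(newIncremental);
        return new ArrayList<>(union);
    }

    public static void main(String[] args) {
        // Round 1: the strategy only wants p_2, so p_1 is recorded as missing.
        Map<String, List<String>> round1 = split(Arrays.asList("p_1", "p_2"), "p_2"::equals);
        System.out.println(round1.get("selected")); // [p_2]
        System.out.println(round1.get("missing"));  // [p_1]
        // Round 2: p_1 is retried alongside the new incremental partition p_3.
        System.out.println(nextCandidates(round1.get("missing"), Arrays.asList("p_3"))); // [p_1, p_3]
    }
}
```

This addresses the scenario raised above: p_1, skipped at ts_1, would eventually be picked up once it is carried forward as a missing partition.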
rfc/rfc-83/rfc-83.md
`EarliestCommitToReta` in clean commit meta
retain
Changed.
rfc/rfc-83/rfc-83.md
Add `EarliestCommitToReta` in the HoodieCommitMetadata extra meta map for clustering and compaction operations, which are all written commits:
{"name": "earliestCommitToRetain", "type": "string"}
earliestInstantToRetain or earliestCommitToRetain ?
Maybe we can record this earliestInstantToRetain in the clustering/compaction plan request meta file, so this change would no longer be needed.
rfc/rfc-83/rfc-83.md
{"name": "earliestCommitToRetain", "type": "string"}

We also need a unified interface/abstract-class to control the Plan behavior and Commit behavior of the TableService.
Can you elaborate why this is needed?
Use PartitionBaseTableServicePlanStrategy to control the behavior of getting partitions, filtering partitions, generating the table service plan, etc.
Because we want to control the logic of partition acquisition, partition filtering, and plan generation through different strategies, the first step is an abstraction that converges that logic into the base strategy.
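As a rough illustration of what such a base abstraction could converge (the class name follows the discussion, but the shape is an assumption, not the actual RFC code):

```java
import java.util.*;
import java.util.stream.Collectors;

// Hypothetical sketch of the base strategy: one place for partition acquisition,
// partition filtering, and plan generation, so concrete strategies only override
// the pieces they care about. P stands in for the concrete plan type.
public abstract class PartitionBaseTableServicePlanStrategySketch<P> {

    // Partition acquisition: incremental strategies would override this to fetch
    // only the partitions touched since the last table service.
    protected abstract List<String> getPartitionPaths();

    // Partition filtering: identity by default; day-based or size-based strategies narrow it.
    protected List<String> filterPartitionPaths(List<String> partitions) {
        return partitions;
    }

    // Plan generation from the filtered partitions.
    protected abstract P generatePlan(List<String> partitions);

    // The converged pipeline: acquire -> filter -> plan.
    public final P plan() {
        return generatePlan(filterPartitionPaths(getPartitionPaths()));
    }

    // Minimal demo: a strategy that only plans over partitions under "2024".
    public static void main(String[] args) {
        PartitionBaseTableServicePlanStrategySketch<List<String>> s =
            new PartitionBaseTableServicePlanStrategySketch<List<String>>() {
                protected List<String> getPartitionPaths() {
                    return Arrays.asList("2023/12/31", "2024/01/01", "2024/01/02");
                }
                protected List<String> filterPartitionPaths(List<String> p) {
                    return p.stream().filter(x -> x.startsWith("2024")).collect(Collectors.toList());
                }
                protected List<String> generatePlan(List<String> p) {
                    return p; // a real plan would bundle file groups per partition
                }
            };
        System.out.println(s.plan()); // [2024/01/01, 2024/01/02]
    }
}
```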
rfc/rfc-83/rfc-83.md
private List<String> getIncrementalPartitionPaths(Option<HoodieInstant> instantToRetain, TableServiceType type) {
Shouldn't each table service executor already distinguish the service type? Maybe `type` can be eliminated.
removed.
rfc/rfc-83/rfc-83.md
public R buildCommitMeta() {
Not sure we need this.
No need actually. Removed.
rfc/rfc-83/rfc-83.md
### Work Flow for Incremental Table Service

Table Service Planner
1. Retrieve the instant recorded in the last completed table service commit as **INSTANT 1**.
Do we want to take care of the case where INSTANT 1 has already been archived?
We record EarliestCommitToRetain in the TableService request metadata file and use it as the basis for retrieving incremental partitions.
Therefore, when Incremental Table Service is enabled, we should always ensure that there is a Clustering/Compaction request metadata file in the active timeline.
Also, we can use getAllPartitions as a cover-up plan.
Therefore, when Incremental Table Service is enabled, we should always ensure that there is a Clustering/Compaction request metadata in the active timeline.
Not really. For cleaning this is true, because there could be data quality issues if the wrong files are cleaned, but compaction and clustering are just rewrites.
@zhangyue19921010 Thanks for the contribution; let's eliminate the unnecessary refactoring first and focus on the core logic.
I kind of think we should expose the incremental partitions to the specific XXXStrategy class, because the strategy class can decide the partition filtering itself, which is closely related.
Let's clarify the behavior for archived table service commits.
Thanks for your review @danny0405 @yuzhaojing and @TheR1sing3un.
Totally agree. We need to implement different behaviors such as partition acquisition, partition filtering, and plan construction through different strategies, so that it is more flexible and controllable. But one of the prerequisites for doing so is a unified abstraction of the above API, which is why a base abstraction is needed first.
Maybe we can make some changes in the archive service, such as: when Incremental Table Service is enabled, we should always ensure that there is a Clustering/Compaction request metadata in the active timeline. Also, we can use getAllPartitions as a cover-up plan.
rfc/rfc-83/rfc-83.md
### Abstraction

Use `PartitionBaseTableServicePlanStrategy` to control the behavior of getting partitions, filter partitions and generate table service plan etc.
Maybe we name it IncrementalPartitionAwareStrategy to emphasize it is "incremental".
changed
rfc/rfc-83/rfc-83.md
 * Returns the earliest commit to retain from instant meta
 */
public Option<HoodieInstant> getEarliestCommitToRetain() {
  throw new UnsupportedOperationException("Not support yet");
The IncrementalPartitionAwareStrategy should be a user interface IMO; the only API we expose to the user is the incremental partitions since the last table service. So the following logic should be removed:
- generate plan (should be the responsibility of the planner)
- getEarliestCommitToRetain (should be the responsibility of the planner within the plan executor)
And because the implementations of compaction and clustering are quite different, maybe we just add two new interfaces: IncrementalPartitionAwareCompactionStrategy and IncrementalPartitionAwareClusteringStrategy.
- generatePlan and getEarliestCommitToRetain are removed.
- As for the base abstraction: although the implementations of compaction and clustering are quite different, partition-aware compaction and clustering share the same partition processing logic, that is, first obtain the partitions and then filter them, so maybe we can use one interface for both to control partition-related operations. What do you think? :)
In addition, Danny, what's your opinion on the logic of incremental partition acquisition?
Option 1: Record a metadata field in the commit to indicate where the last processing was done. The partition acquisition behavior under Option 1 is more flexible.
Option 2: Directly use the last completed table service commit time as the new starting point. Option 2 is simpler and does not require modifying and processing commit metadata fields.
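The two options can be contrasted on a toy timeline. This is purely an illustrative sketch: the `Commit` class and method names are stand-ins, and the `earliestCommitToRetain` key mirrors the field discussed above.

```java
import java.util.*;

// Hypothetical sketch: Option 1 reads a recorded metadata field from the last
// table service commit; Option 2 uses that commit's own instant time directly.
public class StartInstantSketch {

    static class Commit {
        final String instant;
        final boolean tableService;
        final Map<String, String> extraMetadata;
        Commit(String instant, boolean tableService, Map<String, String> extraMetadata) {
            this.instant = instant;
            this.tableService = tableService;
            this.extraMetadata = extraMetadata;
        }
    }

    // Option 1: flexible, but requires writing/reading an extra metadata field.
    public static Optional<String> option1(List<Commit> timeline) {
        return lastTableService(timeline).map(c -> c.extraMetadata.get("earliestCommitToRetain"));
    }

    // Option 2: simpler, no metadata changes; the commit time itself is the start point.
    public static Optional<String> option2(List<Commit> timeline) {
        return lastTableService(timeline).map(c -> c.instant);
    }

    private static Optional<Commit> lastTableService(List<Commit> timeline) {
        Commit last = null;
        for (Commit c : timeline) {
            if (c.tableService) last = c;
        }
        return Optional.ofNullable(last);
    }

    public static void main(String[] args) {
        List<Commit> timeline = Arrays.asList(
            new Commit("001", false, Collections.emptyMap()),
            new Commit("002", true, Collections.singletonMap("earliestCommitToRetain", "001")),
            new Commit("003", false, Collections.emptyMap()));
        System.out.println(option1(timeline).orElse("none")); // 001
        System.out.println(option2(timeline).orElse("none")); // 002
    }
}
```

Note the difference in the demo: Option 1 can point back before the table service commit (001), while Option 2 always starts at the table service commit itself (002).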
rfc/rfc-83/rfc-83.md
List<String> getIncrementalPartitionPaths(HoodieWriteConfig writeConfig, HoodieTableMetaClient metaClient);
Should we just add one interface, List<String> filterPartitionPaths(HoodieWriteConfig writeConfig, List<String> allPartitionPaths, List<String> incrementalPartitionPaths), so that the strategy can decide which partitions are chosen?
The getXXXPartitionPaths methods should belong to the scope of the executor/planner; let's move them out.
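The single-interface shape suggested above could be sketched like this. It is a simplified stand-in: the HoodieWriteConfig parameter is dropped for self-containment, and the planner is assumed to fetch both partition lists before calling the strategy.

```java
import java.util.*;

// Hypothetical sketch: the planner owns partition acquisition; the strategy only
// decides which partitions to keep, given both the full and incremental lists.
public class FilterPartitionsSketch {

    @FunctionalInterface
    interface IncrementalPartitionAwareStrategy {
        List<String> filterPartitionPaths(List<String> allPartitionPaths,
                                          List<String> incrementalPartitionPaths);
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("p_1", "p_2", "p_3");
        List<String> incremental = Arrays.asList("p_2", "p_3");

        // An incremental strategy keeps only the incrementally fetched partitions...
        IncrementalPartitionAwareStrategy incrementalOnly = (a, inc) -> inc;
        // ...while a "common" strategy can still ignore them and use the full list.
        IncrementalPartitionAwareStrategy fullScan = (a, inc) -> a;

        System.out.println(incrementalOnly.filterPartitionPaths(all, incremental)); // [p_2, p_3]
        System.out.println(fullScan.filterPartitionPaths(all, incremental));        // [p_1, p_2, p_3]
    }
}
```

This keeps the flexible partition-selection mechanisms available: a strategy is free to blend, narrow, or ignore the incremental list.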
Hi @danny0405, as we discussed offline, all comments are addressed. PTAL. Thanks for your patience!
Change Logs
In Hudi, when scheduling Compaction and Clustering, the default behavior is to scan all partitions under the current table. When there are many historical partitions, such as 640,000 in our production environment, this scanning and planning operation becomes very inefficient. For Flink, it often leads to checkpoint timeouts, resulting in data delays.
As for cleaning, we already have the ability to do cleaning for incremental partitions.
This RFC will draw on the design of Incremental Clean to generalize the capability of processing incremental partitions to all table services, such as Clustering and Compaction.
Impact
no
Risk level (write none, low medium or high below)
low
Documentation Update
Describe any necessary documentation update if there is any new feature, config, or user-facing change. If not, put "none".
Contributor's checklist