Add Admin API and sys procedure to describe bucket metadata #3436

@raoluSmile

Search before asking

  • I searched in the issues and found nothing similar.

Motivation

Currently, bucket metadata can only be inspected indirectly, through internal metadata or client-side metadata. There is no formal Admin API or SQL procedure that lets users and operators describe bucket distribution.

This makes it hard to inspect bucket-level metadata such as leader, leader epoch, replicas, in-sync replicas (ISR), and partition-specific bucket assignment through a stable public interface.

Solution

Add a formal Admin API and SQL procedure for describing bucket metadata.

The proposed Admin APIs are:

CompletableFuture<List<BucketInfo>> describeBuckets(TablePath tablePath);

CompletableFuture<List<BucketInfo>> describeBuckets(
        TablePath tablePath,
        PartitionSpec partitionSpec);

The returned BucketInfo should include:

  • table path
  • table id
  • partition id/name, if applicable
  • bucket id
  • leader id
  • leader epoch
  • replicas
  • ISR
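To make the field list above concrete, here is a minimal sketch of what such a DTO could look like and how a caller might consume the result of `describeBuckets`. Everything beyond the field list is an assumption for illustration: the field names and types, the use of a plain `String` in place of `TablePath`, the nullable partition fields, and the helpers `isUnderReplicated` and `BucketInfoExample` are all hypothetical and not part of the proposal.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Hypothetical shape of BucketInfo; field names and types are assumptions. */
record BucketInfo(
        String tablePath,      // stand-in for TablePath, e.g. "db.table"
        long tableId,
        Long partitionId,      // null for non-partitioned tables (assumed convention)
        String partitionName,  // null for non-partitioned tables (assumed convention)
        int bucketId,
        int leaderId,
        int leaderEpoch,
        List<Integer> replicas,
        List<Integer> isr) {

    BucketInfo {
        // Defensive copies keep the DTO immutable.
        replicas = List.copyOf(replicas);
        isr = List.copyOf(isr);
    }

    /** True when at least one assigned replica is missing from the ISR. */
    boolean isUnderReplicated() {
        return !isr.containsAll(replicas);
    }
}

final class BucketInfoExample {
    /** Reduces a describeBuckets-style result to the ids of under-replicated buckets. */
    static CompletableFuture<List<Integer>> underReplicatedBucketIds(
            CompletableFuture<List<BucketInfo>> described) {
        return described.thenApply(
                buckets -> buckets.stream()
                        .filter(BucketInfo::isUnderReplicated)
                        .map(BucketInfo::bucketId)
                        .toList());
    }
}
```

Carrying both `replicas` and `isr` in the DTO lets operators derive checks like this on the client side without extra RPCs.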

Expose the same functionality through a Flink SQL procedure:

 CALL sys.describe_buckets('db.table');

 CALL sys.describe_buckets('db.table', 'partition_key=partition_value');

For non-partitioned tables, the procedure returns one row per bucket. For partitioned tables, the overload that takes a partition spec returns bucket metadata for the specified partition only.
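The second procedure argument is a `key=value` string that would need to be turned into a partition spec before calling the Admin API. The sketch below shows one way that parsing might work. The class name `PartitionSpecStrings`, the comma-separated form for multiple keys, and the error handling are assumptions; the proposal only shows a single `partition_key=partition_value` pair.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; not part of the proposed API. */
final class PartitionSpecStrings {

    /**
     * Parses "k1=v1,k2=v2" into an ordered map.
     * Support for multiple comma-separated keys is an assumption.
     */
    static Map<String, String> parse(String spec) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : spec.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException(
                        "Expected key=value but got: '" + pair + "'");
            }
            String key = pair.substring(0, eq).trim();
            String value = pair.substring(eq + 1).trim();
            if (key.isEmpty() || value.isEmpty()) {
                throw new IllegalArgumentException(
                        "Empty key or value in: '" + pair + "'");
            }
            if (result.put(key, value) != null) {
                throw new IllegalArgumentException("Duplicate partition key: " + key);
            }
        }
        return result;
    }
}
```

Rejecting malformed input early would let the procedure fail with a clear message before any RPC is issued.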

Scope

This issue focuses on:

  • Admin API
  • BucketInfo DTO
  • server RPC implementation
  • partition-specific bucket description
  • CALL sys.describe_buckets
  • Admin IT and Flink procedure IT

CLI integration is intentionally left out of scope and can be handled in a follow-up issue/PR.

Relation to existing work

This is related to #3360, but focuses on a different layer.

#3360 adds CLI table/database commands and displays bucket distribution from client metadata. This issue proposes a formal Admin/RPC API and SQL procedure for describing bucket metadata, including partition-scoped queries.

CLI support can be added later, preferably on top of the CLI framework introduced by #3360 if it is merged.

Anything else?

No response

Willingness to contribute

  • I'm willing to submit a PR!
