Skip to content

HIVE-29413: Avoid code duplication by updating getPartCols method for iceberg tables#6337

Closed
ramitg254 wants to merge 2 commits intoapache:masterfrom
ramitg254:HIVE-29413
Closed

HIVE-29413: Avoid code duplication by updating getPartCols method for iceberg tables#6337
ramitg254 wants to merge 2 commits intoapache:masterfrom
ramitg254:HIVE-29413

Conversation

@ramitg254
Copy link
Copy Markdown
Contributor

@ramitg254 ramitg254 commented Feb 24, 2026

What changes were proposed in this pull request?

updated getPartCols method and introduced a boolean parameter in it to handle partition keys needed for non native tables and to get rid of this kind of code redundancy in some places:

table.hasNonNativePartitionSupport() ?
            table.getStorageHandler().getPartitionKeys(table) :
            table.getPartCols();

Why are the changes needed?

to avoid code redundancy and have a common method to retrieve keys for both native and non native tables.

Does this PR introduce any user-facing change?

No

How was this patch tested?

locally

org.apache.hadoop.hive.ql.metadata.Table tbl = hive.getTable(database, table);
FetchWork work;
if (!tbl.getPartCols().isEmpty()) {
if (!tbl.getSupportedPartCols().isEmpty()) {
Copy link
Copy Markdown
Member

@deniskuzZ deniskuzZ Mar 17, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please keep the old method signature

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

allColumns.addAll(table.getCols());
allColumns.addAll(table.getPartCols());
return allColumns;
return new ArrayList<>(table.getAllCols());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need to clone it?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was already cloning it so I just replaced

List<FieldSchema> allColumns = new ArrayList<>();
allColumns.addAll(table.getCols());
allColumns.addAll(table.getPartCols());

with
new ArrayList<>(table.getAllCols());


table.setFields(oldTable.getCols());
table.setPartCols(oldTable.getPartCols());
table.setPartCols(oldTable.getNativePartCols());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is super confusing!!!

 table.setPartCols <- oldTable.getNativePartCols

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kept the old signature

partitionColumns = table.hasNonNativePartitionSupport() ?
table.getStorageHandler().getPartitionKeys(table) :
table.getPartCols();
partitionColumns = table.getSupportedPartCols();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not make it like partitionColumns = table.getPartCols() ?

Copy link
Copy Markdown
Contributor Author

@ramitg254 ramitg254 Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please check #6337 (comment)

table.getCols() : partition.getCols());
if (!desc.isFormatted()) {
cols.addAll(table.getPartCols());
cols.addAll(table.getNativePartCols());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep the old signature

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

public List<FieldSchema> getPartCols() {
public List<FieldSchema> getSupportedPartCols() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should have generic API - getPartCols

Copy link
Copy Markdown
Contributor Author

@ramitg254 ramitg254 Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updating getPartCols has direct effect on implementations related to alter statements and stats autogather implementation in which some of them are limited to native ones only ( like getPartCols() should return empty list for iceberg tables even if they have keys which is required for those implementations) which will introduce more ifs at those places and even if after all that if we get a green run we can't be sure that everything would be unaffected as we can't be sure that all possible scenarios affected from it are covered by tests.

so IMHO, I think we should go for this separate api for generic use cases and older one for native use cases.

I tried to explain my thinking and one such scenario more elaboratively in this comment: #6337 (comment)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deniskuzZ can you please validate this explanation for separate api seems reasonable or are we still leaning towards having a common api with updated implementation which can lead to further if else blocks introduction on those kind of areas mentioned above

f_list.addAll(getCols());
f_list.addAll(getPartCols());
return f_list;
ArrayList<FieldSchema> allCols = new ArrayList<>(getCols());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code to Interface

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Map<String, String> colTypes = new HashMap<>();
List<FieldSchema> partitionKeys = table.hasNonNativePartitionSupport() ?
table.getStorageHandler().getPartitionKeys(table) : table.getPartitionKeys();
List<FieldSchema> partitionKeys = table.getSupportedPartCols();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prev used api is table.getPartitionKeys()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes but it should be better to use the current one as table.getPartitionKeys() throws null when no keys are there which is different from table.getStorageHandler().getPartitionKeys(table) used here which gives empty arraylist for no keys.
so with current usage it will return empty list in both scenarios and rule out null pointer exception case

}
} else {
partColSchema.addAll(tbl.getPartCols());
partColSchema.addAll(tbl.getNativePartCols());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep the method signature unchanged

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@sonarqubecloud
Copy link
Copy Markdown

@ramitg254
Copy link
Copy Markdown
Contributor Author

closing this pull request for now as we should have single method and other places where it is getting used should be updated in such a way that it should be inclusive of updated logic

@deniskuzZ
Copy link
Copy Markdown
Member

hi @ramitg254, since you put this activity on-hold, would you mind if I take it over?

@ramitg254
Copy link
Copy Markdown
Contributor Author

Hi, I am still working on it on this pr: #6413, it is in wip right now as I am resolving test failures corresponding to that.
although it's in wip, I'll appreciate your review on it

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants