[spark] Support nested fields in SparkFilterConverter by Mesut-Doner · Pull Request #8399 · apache/paimon

Mesut-Doner · 2026-06-30T15:37:46Z

Purpose

Currently, SparkFilterConverter throws an UnsupportedOperationException when it encounters dot-separated nested field paths (e.g. a.b.c). This limits predicate pushdown capabilities in Spark when queries filter on nested Struct types.

This PR implements nested field support in SparkFilterConverter so that Spark V1 Filter objects on nested fields are correctly converted into Paimon Predicate structures:

Nested Schema Resolution: Added getNestedFieldType(...) and resolveField(...) to recursively walk the RowType schema along dot-separated path components to find the correct nested field's DataType.
Transform-based Predicate Conversion: Refactored the converter branches (EqualTo, In, IsNull, IsNotNull, GreaterThan, etc.) to use FieldTransform(FieldRef) and call the corresponding PredicateBuilder methods that accept Transform instead of index-based builders.
Literal Conversion: Updated convertLiteral(...) and convertString(...) to correctly resolve nested field paths and convert literals to their matching leaf data type.
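The schema walk described under Nested Schema Resolution can be sketched roughly as follows. This is a self-contained illustration, not the code in this PR: the `Type` class is a hypothetical stand-in for Paimon's `RowType`/`DataType`, and `resolve` only mirrors the idea behind `getNestedFieldType(...)`. It also assumes field names never contain a literal dot.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class NestedTypeSketch {

    /** Hypothetical schema node: a leaf type name, or a struct with named child fields. */
    static final class Type {
        final String name;
        final Map<String, Type> fields; // null for leaf types

        private Type(String name, Map<String, Type> fields) {
            this.name = name;
            this.fields = fields;
        }

        static Type leaf(String name) {
            return new Type(name, null);
        }

        static Type struct(Map<String, Type> fields) {
            return new Type("ROW", new LinkedHashMap<>(fields));
        }
    }

    /** Walks the schema one path component at a time and returns the leaf's type. */
    static Type resolve(Type root, String dottedPath) {
        Type current = root;
        for (String part : dottedPath.split("\\.")) {
            if (current.fields == null) {
                // The path continues, but the current type has no children.
                throw new IllegalArgumentException(
                        "'" + part + "' requested on non-struct type " + current.name);
            }
            Type next = current.fields.get(part);
            if (next == null) {
                throw new IllegalArgumentException(
                        "Unknown field '" + part + "' in path " + dottedPath);
            }
            current = next;
        }
        return current;
    }
}
```

Failing loudly on an unknown or non-struct component is one possible design choice; a converter could equally treat such a filter as unsupported and skip pushdown.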

Tests

Added a new test case testNestedField() in SparkFilterConverterTest to verify that nested struct field predicates are successfully converted to Paimon predicates.

JingsongLi · 2026-07-01T03:05:17Z

+            }
+        }
+
+        Transform transform = new FieldTransform(new FieldRef(topLevelIndex, field, fieldType));


This does not actually evaluate the nested field. FieldTransform only reads InternalRowUtils.get(row, fieldRef.index(), fieldRef.type()), so for a filter like a.b = 1 this FieldRef still reads the top-level column a (index 0), but with b's type. As a result, predicate.test(row) and statistics pruning can read the struct column as an int/string instead of traversing into b. The new test only checks toString(), so it misses this runtime behavior. Please add a real nested-field transform/path traversal, or keep these filters unsupported until evaluation and stats pruning can handle nested paths correctly.
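To make the difference concrete, here is a minimal sketch of the two read strategies. It models rows as plain `Object[]` arrays instead of Paimon's `InternalRow`, and the names `readTopLevel` and `readPath` are hypothetical, chosen only for illustration; they are not Paimon APIs.

```java
final class NestedReadSketch {

    /** Index-only read: for a filter on a.b this returns the whole struct stored in column a. */
    static Object readTopLevel(Object[] row, int topLevelIndex) {
        return row[topLevelIndex];
    }

    /** Path read: descends through nested structs, one index per path component. */
    static Object readPath(Object[] row, int[] path) {
        Object current = row;
        for (int index : path) {
            if (current == null) {
                return null; // a null parent struct makes the leaf null
            }
            current = ((Object[]) current)[index];
        }
        return current;
    }
}
```

With a row whose column a is a struct holding b = 1, `readTopLevel(row, 0)` yields the struct itself, while `readPath(row, new int[] {0, 0})` yields the value of b. A predicate that compares the first result against an int literal is comparing the wrong object, which is the runtime problem a `toString()`-only test cannot catch.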

[spark] Support nested fields in SparkFilterConverter

85017e3

Mesut-Doner force-pushed the spark_nested_fields branch from 990ec18 to 85017e3 on June 30, 2026 15:39

Mesut-Doner changed the title from "Spark nested fields" to "[spark] Support nested fields in SparkFilterConverter" on Jun 30, 2026

Mesut-Doner wants to merge 1 commit into apache:master from Mesut-Doner:spark_nested_fields
