[Spark] Restrict partition-like data filters to whitelist of known-good expressions #3872
Which Delta project/connector is this regarding?
Description
Currently, we try to rewrite arbitrary expressions as partition-like. To avoid having to repeatedly remove known-bad expressions, start instead with a whitelist (to be expanded over time) of known-good expressions that can safely be rewritten.
This change also fixes an existing issue where a partition-like filter could be generated for a column that isn't eligible for data skipping. Such a filter throws an analysis exception because the referenced column isn't present in the stats. The issue was originally missed (and is a difference in behavior vs. true partition filters) because partitioning isn't allowed on non-atomic types (or string types), so this additional type check was never added.
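The whitelist approach can be sketched as follows. This is a simplified, hypothetical model: the real change operates on Spark Catalyst expressions and Delta's stats schema, and all type and class names below are illustrative, not the actual Delta API.

```scala
// Hypothetical, simplified expression tree standing in for Catalyst expressions.
sealed trait Expr
case class AttributeRef(name: String, dataType: String) extends Expr
case class Literal(value: Any) extends Expr
case class EqualTo(left: Expr, right: Expr) extends Expr
case class IsNull(child: Expr) extends Expr
case class Concat(children: Seq[Expr]) extends Expr

object PartitionLikeWhitelist {
  // Illustrative stand-in for "skipping-eligible" atomic types: columns of
  // other types have no entries in the collected stats, so rewriting a filter
  // on them would reference missing stats columns.
  val skippingEligibleTypes: Set[String] = Set("int", "long", "date", "timestamp")

  // Whitelist check: only expressions built from known-good node types over
  // skipping-eligible columns may be rewritten as partition-like filters.
  def canRewriteAsPartitionLike(expr: Expr): Boolean = expr match {
    case AttributeRef(_, dt) => skippingEligibleTypes.contains(dt)
    case Literal(_)          => true
    case EqualTo(l, r)       => canRewriteAsPartitionLike(l) && canRewriteAsPartitionLike(r)
    case IsNull(c)           => canRewriteAsPartitionLike(c)
    case _                   => false // anything outside the whitelist is rejected
  }
}
```

The key design point is the default-deny final case: unknown expression types are rejected rather than rewritten, so new known-bad cases (like the non-atomic-column one fixed here) fail safe instead of throwing at analysis time.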
How was this patch tested?
See test changes.
Does this PR introduce any user-facing changes?
No.