[flink] Add missing predicate converters in PredicateConverter #8395
Mesut-Doner wants to merge 2 commits into
- Fix typo: child.get(0) -> children.get(0) in IS_NOT_TRUE branch
- Fix missing dot before toString() in SIMILAR branch (compile error)
- Fix BinaryString: pass BinaryString.fromString(pattern) to builder.like instead of a raw String
- Rewrite convertSimilarToRegex as convertSimilarToLike: produce a SQL LIKE pattern (not a Java regex) so that the downstream Like function processes it correctly; use backslash as the output escape char to match Like's default escape convention
- Fix escape-sequence handling: escaped _ and % become \_ / \% (literals), escaped escape char becomes the literal char itself
- Throw UnsupportedExpression for SIMILAR TO-only features (character classes [...], alternation |, quantifiers * + ?, grouping ()) that have no SQL LIKE equivalent
- Add unit tests: testConvertSimilarToLike covers pattern pass-through, escape sequences and unsupported-feature rejection; testSimilarExpression* tests cover the full predicate path including row-level filtering
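The conversion rules listed in this commit message can be sketched roughly as follows. This is a standalone illustration, not the code from the PR: the class name `SimilarToLikeSketch`, the method signature, and the use of `UnsupportedOperationException` in place of Paimon's `UnsupportedExpression` are all assumptions. The sketch also rejects `{` and `}`, and doubles a literal backslash in the output; the latter is a choice of this sketch rather than something stated in the commit message.

```java
/** Hypothetical standalone sketch; names and signature are illustrative only. */
final class SimilarToLikeSketch {

    // Metacharacters that exist in SIMILAR TO but have no LIKE equivalent.
    private static final String SIMILAR_ONLY = "[|()*+?{}";

    /**
     * Converts a SIMILAR TO pattern into a LIKE pattern that uses backslash
     * as its escape character. The escape argument may be null (no ESCAPE clause).
     */
    static String convertSimilarToLike(String similar, Character escape) {
        StringBuilder like = new StringBuilder();
        for (int i = 0; i < similar.length(); i++) {
            char c = similar.charAt(i);
            if (escape != null && c == escape) {
                if (i + 1 >= similar.length()) {
                    throw new UnsupportedOperationException("Dangling escape character");
                }
                // The character after the escape is always a literal.
                appendLiteral(like, similar.charAt(++i));
            } else if (c == '%' || c == '_') {
                // SIMILAR TO wildcards are the same as SQL LIKE wildcards.
                like.append(c);
            } else if (SIMILAR_ONLY.indexOf(c) >= 0) {
                throw new UnsupportedOperationException(
                        "SIMILAR TO feature without LIKE equivalent: " + c);
            } else {
                appendLiteral(like, c);
            }
        }
        return like.toString();
    }

    /** Appends c so that the LIKE matcher treats it as a literal character. */
    private static void appendLiteral(StringBuilder like, char c) {
        if (c == '%' || c == '_' || c == '\\') {
            like.append('\\');
        }
        like.append(c);
    }
}
```

With `#` as the escape character, `50#%` becomes `50\%` and `a##b` becomes `a#b`, while a pattern such as `a|b` is rejected instead of being pushed down.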
41af127 to
ac0069c
} else if (func == BuiltInFunctionDefinitions.IS_NOT_TRUE) {
    FieldReferenceExpression fieldRefExpr =
            extractFieldReference(children.get(0)).orElseThrow(UnsupportedExpression::new);
    return builder.notEqual(builder.indexOf(fieldRefExpr.getName()), Boolean.TRUE);
This changes the NULL semantics of IS NOT TRUE. In SQL, x IS NOT TRUE is true for both FALSE and NULL, but Paimon NotEqual returns false when the field value is null (PredicateTest.testNotEqual covers this). With this conversion, a filter like WHERE bool_col IS NOT TRUE can incorrectly prune rows where bool_col is NULL. This needs to be represented as isNull(field) OR equal(field, false) (or left unsupported) and should have a test for the NULL row case.
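The difference is easiest to see on the three possible values of a nullable boolean column. The following is a minimal standalone model of the semantics described above; it does not call the Paimon API, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Standalone model of the NULL semantics; it does not use the Paimon API. */
final class IsNotTrueSemantics {

    /** SQL "x IS NOT TRUE": true for FALSE and for NULL. */
    static boolean isNotTrue(Boolean x) {
        return x == null || !x;
    }

    /** Models a NotEqual(field, TRUE) predicate that evaluates to false on NULL input. */
    static boolean notEqualTrue(Boolean x) {
        return x != null && !x.equals(Boolean.TRUE);
    }

    /** Models the suggested rewrite: isNull(field) OR equal(field, FALSE). */
    static boolean isNullOrEqualFalse(Boolean x) {
        return x == null || x.equals(Boolean.FALSE);
    }

    public static void main(String[] args) {
        List<Boolean> rows = Arrays.asList(Boolean.TRUE, Boolean.FALSE, null);
        for (Boolean row : rows) {
            System.out.printf("%-5s isNotTrue=%-5b notEqual=%-5b isNullOrFalse=%-5b%n",
                    row, isNotTrue(row), notEqualTrue(row), isNullOrEqualFalse(row));
        }
    }
}
```

In this model the NotEqual variant disagrees with IS NOT TRUE only on the NULL row, which is exactly the row a pushed-down filter would drop by mistake.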
    // SIMILAR TO wildcards are the same as SQL LIKE wildcards
    like.append(c);
} else if (c == '[' || c == '|' || c == '(' || c == ')'
        || c == '*' || c == '+' || c == '?') {
This unsupported-feature check is incomplete for SIMILAR TO. Calcite/Flink treat { and } as SIMILAR special characters as well (quantifiers, e.g. a{2}), but this conversion lets them pass through as literal LIKE characters. That means a real SIMILAR predicate such as col SIMILAR TO a{2} can be pushed down as a LIKE pattern matching the literal string a{2} instead of aa, which is incorrect filtering. Please reject all SIMILAR-only metacharacters that cannot be represented by Paimon LIKE, including { and } (and add a regression test for such a pattern).
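A small standalone illustration of the mismatch, assuming that for this particular pattern SIMILAR TO behaves like a Java regex and that LIKE treats every character other than % and _ as a literal. Both helpers are hypothetical stand-ins, not Flink or Paimon code.

```java
import java.util.regex.Pattern;

/** Hypothetical helpers contrasting quantifier semantics with literal LIKE matching. */
final class SimilarBraceDemo {

    /** Approximates SIMILAR TO for simple patterns by treating the pattern as a regex. */
    static boolean similarTo(String value, String pattern) {
        return Pattern.matches(pattern, value);
    }

    /** Minimal LIKE matcher without escape support: % is any sequence, _ is any single char. */
    static boolean like(String value, String pattern) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                // Everything else, including '{' and '}', is matched literally.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(value).matches();
    }
}
```

Here `similarTo("aa", "a{2}")` holds while `like("aa", "a{2}")` does not, so a row containing `aa` would be filtered out if the pattern were pushed down unchanged.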
…ling
Address review feedback on apache#8395:
- IS_NOT_TRUE now returns true for NULL rows (isNull OR equal-false) instead of NotEqual, which incorrectly evaluated to false for NULL, causing rows to be wrongly pruned.
- convertSimilarToLike now rejects '{' and '}' as unsupported SIMILAR TO-only quantifier syntax (e.g. a{2}), preventing such patterns from being pushed down as literal LIKE matches.
Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
7eea7a0 to
1788a61
Purpose
Implement the missing Flink predicate converters that were marked as TODO.
Tests