Fix sharding routing dropping ?::int4 casts on INSERT VALUES sharding key#38698
Fix sharding routing dropping ?::int4 casts on INSERT VALUES sharding key#38698Phixsura wants to merge 12 commits into
Conversation
|
Workflow runs are still pending committer approval. Could a maintainer kindly approve them? |
terrymanu
left a comment
There was a problem hiding this comment.
Decision
Merge Verdict: Not Mergeable
Reviewed Scope: PR #38698 latest head 6e2955a, four changed files, issue #23764, CI/check-runs, PostgreSQL/openGauss TypeCast visitor paths, INSERT sharding condition path.
Not Reviewed Scope: full local Maven execution, exhaustive sharding algorithm behavior, live openGauss SQL E2E.
Need Expert Review: sharding routing and PostgreSQL/openGauss cast semantics.
Positive Feedback
The PR is aimed at the right root-cause area: TypeCastExpression hides the underlying parameter marker from InsertClauseShardingConditionEngine, which matches the issue’s empty sharding-condition symptom. Re-enabling the E2E INSERT case is also the right direction.
Major Issues
P1: Type-cast semantics are not preserved.
InsertClauseShardingConditionEngine.java:137-159 unwraps every TypeCastExpression and routes using the raw inner parameter/literal value. PostgreSQL documents expression::type as a run-time type conversion, so ?::int4 is not always semantically equal to the original Java parameter. A counterexample is a string parameter "1" cast to int4: PostgreSQL can evaluate it as integer, while routing still sees String. This can route to the wrong shard. Please either convert supported cast targets before creating ListShardingConditionValue, or narrow the fix to only cases whose semantics are proven safe, with tests for string/literal cast inputs.
P1: Broad unwrapping introduces a ClassCastException path.
SimpleExpressionSegment also includes SubqueryExpressionSegment, not only literal/parameter markers. After this PR, a PostgreSQL value like ((SELECT 1)::int4) can unwrap to a SubqueryExpressionSegment, enter the SimpleExpressionSegment branch, and then fail in getShardingValue when it casts non-parameter values to LiteralExpressionSegment at InsertClauseShardingConditionEngine.java:171-174. Please accept only ParameterMarkerExpressionSegment / LiteralExpressionSegment after unwrapping, or handle unsupported simple expressions without crashing, and add a counterexample test.
P2: Validation is still too narrow for the changed shared routing path.
The path is used by both JDBC and Proxy INSERT routing. CI reports 82 successful check-runs, including PostgreSQL SQL E2E, Spotless, Checkstyle, and License, but I did not find SQL E2E check-runs for openGauss even though the test case declares PostgreSQL,openGauss. The new unit tests also assert size/parameter index but not the actual routed sharding values. Please add direct assertions for extracted values and provide openGauss SQL E2E evidence or narrow the declared scope.
Newly Introduced Issues
The PR changes previously ignored casted non-literal expressions into a possible routing-time exception path. This should be fixed or rolled back before merge.
Next Steps
Restrict unwrapTypeCast handling to supported inner expression types, or add safe cast-aware conversion for known PostgreSQL/openGauss types such as int4.
Add regression tests for:
?::int4 with a string parameter.
'1'::int4 or equivalent literal cast with actual value assertion.
TypeCastExpression wrapping SubqueryExpressionSegment does not crash.
PostgreSQL and openGauss evidence for the enabled E2E case.
Re-run PR checks after changes.
terrymanu
left a comment
There was a problem hiding this comment.
Decision
- Merge Verdict: Not Mergeable
- Reviewed Scope: Latest PR head
62d1e1d8263fc40b005273227e63e6b5f8b01974;RELEASE-NOTES.md,features/sharding/core/src/main/java/org/apache/shardingsphere/sharding/route/engine/condition/engine/InsertClauseShardingConditionEngine.java,features/sharding/core/src/test/java/org/apache/shardingsphere/sharding/route/engine/condition/engine/InsertClauseShardingConditionEngineTest.java,test/e2e/sql/src/test/resources/cases/dml/e2e-dml-insert.xml, relatedConditionValue/WhereClauseShardingConditionEngine, PostgreSQL/openGaussTypeCastExpressionproducers, and official PostgreSQL/openGauss cast semantics docs. - Not Reviewed Scope: Full local Maven build, Docker E2E execution, non-sharding modules, and full CI logs beyond the GitHub check-run summary.
- Need Expert Review: Yes, sharding routing plus PostgreSQL/openGauss cast semantics after the routing behavior is corrected.
Positive Feedback
- This targets the real INSERT trigger:
TypeCastExpressionhides the underlying parameter marker from insert sharding condition extraction, and the PR adds useful regression coverage for the reported PostgreSQL integer-marker path.
Major Issues
-
[P1] Cast routing uses the raw Java value instead of the SQL cast result (
features/sharding/core/src/main/java/org/apache/shardingsphere/sharding/route/engine/condition/engine/InsertClauseShardingConditionEngine.java:162)- Symptom:
unwrapTypeCastForRoutingreturns the innermost marker/literal at line 223, thengetShardingValuereads the original parameter/literal at lines 259-262. Thevalue instanceof Numberbranch at lines 235-236 treats every numeric target cast as safe without applying the target cast. - Risk: PostgreSQL documents
expression::typeas a runtime conversion, and its numeric docs show conversions can round values. For example,?::int4bound as a decimal value can be routed using the uncast Java value while PostgreSQL stores the cast integer result, causing a wrong single-shard route. - Action: Please either compute the exact casted value for a small documented set of supported PostgreSQL/openGauss casts before creating
ListShardingConditionValue, or restrict unwrapping to exact value-preserving cases with explicit tests for lossy numeric conversions and overflow.
- Symptom:
-
[P1] Valid casted string/literal cases are still codified as unrouteable (
features/sharding/core/src/test/java/org/apache/shardingsphere/sharding/route/engine/condition/engine/InsertClauseShardingConditionEngineTest.java:252)- Symptom: The new tests expect no sharding condition for
?::int4when the bound value is"1"and for'1'::int4at lines 288-295. PostgreSQL documents that valid string literals can be assigned a target type by cast, and openGauss documentscast(x as y)as convertingxintoy. - Risk: These are adjacent valid inputs for the same root cause: the sharding key is present but wrapped in a cast. Leaving the condition empty reintroduces the original multi-node INSERT failure instead of routing by the database-visible value.
- Action: Please add regression tests that validate supported string/literal casts route by the cast result, not by the raw Java/string value, and avoid asserting valid database casts as "does not route" unless ShardingSphere explicitly rejects them before routing with a documented error.
- Symptom: The new tests expect no sharding condition for
-
[P1] openGauss and WHERE-clause blast radius remain unresolved (
test/e2e/sql/src/test/resources/cases/dml/e2e-dml-insert.xml:34)- Symptom: The original issue and parser fixtures cover PostgreSQL and openGauss, and both PostgreSQL/openGauss visitors create
TypeCastExpression; however the enabled E2E case is narrowed to PostgreSQL only. The PR body also acknowledges the same TypeCast gap inConditionValue/WhereClauseShardingConditionEngine, wherefeatures/sharding/core/src/main/java/org/apache/shardingsphere/sharding/route/engine/condition/generator/ConditionValue.java:47returnsnullfor non-marker/non-literal expressions. - Risk: This shared sharding-condition path affects Proxy/JDBC routing. Without openGauss validation and a decision for WHERE predicates such as
order_id = ?::int4, the same hidden-parameter root cause remains open for a related dialect and adjacent DML/DQL paths. - Action: Please restore or add openGauss validation, or provide direct evidence that openGauss is not applicable. Please also fix the shared extraction path or add scoped validation/issue linkage showing why WHERE-clause cast routing is intentionally out of scope without changing this PR's behavior.
- Symptom: The original issue and parser fixtures cover PostgreSQL and openGauss, and both PostgreSQL/openGauss visitors create
Next Steps
- Please replace the category-only cast check with cast-result-aware routing, or narrow the supported cases to exact value-preserving casts only.
- Please add counterexample tests for valid string/literal casts, lossy numeric casts, and overflow/error cases.
- Please restore openGauss coverage or provide direct non-applicability evidence.
- Please cover the adjacent WHERE-clause TypeCast path or clearly isolate it with a tracked follow-up and no behavior regression in this PR.
- Please clean the added whitespace reported by
git diff --check, then rerun the scoped unit tests, relevant SQL E2E case, Spotless, and Checkstyle.
terrymanu
left a comment
There was a problem hiding this comment.
Decision
- Merge Verdict: Not Mergeable
- Reviewed Scope: Latest PR head
3e59e09f0e32041bdfb8c91570230ce57f298e17;#23764; sharding core INSERT VALUES and WHERE-clause condition extraction; PostgreSQL/openGaussTypeCastExpressionpaths;RELEASE-NOTES.md;test/e2e/sql/src/test/resources/cases/dml/e2e-dml-insert.xml; PostgreSQL official docs for type casts, numeric types, and character types. - Not Reviewed Scope: Full Maven build, Docker E2E execution, CI/check-runs, exhaustive PostgreSQL/openGauss cast catalog behavior, and non-sharding modules.
- Need Expert Review: Yes, sharding routing plus PostgreSQL/openGauss cast semantics after the blockers below are fixed.
Positive Feedback
- The PR is moving in the right direction: it now targets the real
TypeCastExpressiontrigger, restores the openGauss E2E declaration, adds subquery non-crash coverage, and attempts to cover the adjacent WHERE-clause path.
Major Issues
-
[P1] Numeric-to-integer casts use the wrong tie-breaking rule (
features/sharding/core/src/main/java/org/apache/shardingsphere/sharding/route/engine/condition/engine/PostgreSQLCastEvaluator.java:41)- Symptom: The evaluator documents and implements
ROUND_HALF_EVENfor integer casts at lines 134-146, and the test expects2.5to cast to2atfeatures/sharding/core/src/test/java/org/apache/shardingsphere/sharding/route/engine/condition/engine/PostgreSQLCastEvaluatorTest.java:99. - Risk: PostgreSQL documents
numerictie rounding away from zero, with2.5rounding to3. A sharding key like2.5::int4can therefore route to shard2while PostgreSQL evaluates the cast result as3. - Action: Please align integer cast evaluation with the documented database-visible result, or reject these lossy numeric casts instead of routing them. Please add counterexamples for
2.5::int4and-2.5::int4through both INSERT and WHERE routing paths.
- Symptom: The evaluator documents and implements
-
[P1] Cast target modifiers are stripped before routing (
features/sharding/core/src/main/java/org/apache/shardingsphere/sharding/route/engine/condition/engine/PostgreSQLCastEvaluator.java:89)- Symptom:
normalizeremoves everything after(, sovarchar(1),char(2), andnumeric(3,1)are treated as plainvarchar,char, andnumeric. The text conversion then returns the raw string at line 170, and numeric conversion returns the rawBigDecimalat line 149. - Risk: PostgreSQL documents explicit
character varying(n)/character(n)casts as truncating over-length values, andnumeric(p,s)as rounding to the declared scale with precision overflow checks. A sharding key such as?::varchar(1)with value"ab"should route by"a", but this PR routes by"ab". - Action: Please either parse and apply supported type modifiers exactly, or treat modifier-bearing targets as unsupported so routing does not use a value different from the database-visible cast result. Please add counterexample tests for text and numeric typmods.
- Symptom:
-
[P2] WHERE-clause validation still bypasses the production extraction path (
features/sharding/core/src/main/java/org/apache/shardingsphere/sharding/route/engine/condition/generator/ConditionValue.java:45)- Symptom: The WHERE fix is tested mainly by directly constructing
ConditionValue, but production WHERE routing reaches it through compare / IN / BETWEEN generators andWhereClauseShardingConditionEngine. - Risk: The previous WHERE-clause blast-radius concern is only partially closed; generator merging, parameter marker propagation, and unsupported-cast fallback can still regress outside the direct fixture path.
- Action: After fixing cast semantics, please add production-path tests for
WHERE order_id = ?::int4,IN (?::int4), andBETWEEN ?::int4 AND ?::int4, including at least one unsupported or failing cast case.
- Symptom: The WHERE fix is tested mainly by directly constructing
Newly Introduced Issues
- The latest revision introduces
PostgreSQLCastEvaluator, which is a new shared routing behavior surface. The numeric rounding and typmod issues above are newly introduced semantic divergences from PostgreSQL cast behavior.
Next Steps
- Please replace the current Java approximation with documented cast-result-aware handling, or narrow support to exact safe cases needed for
#23764. - Please add one-to-one regression tests for the counterexamples above in INSERT and WHERE production paths.
- Please provide openGauss evidence for the same supported cast set, or explicitly narrow the supported semantics to PostgreSQL until openGauss behavior is proven.
- Please rerun scoped unit tests with
-am, then Spotless and Checkstyle according to the repo gates.
Multi-Round Comparison
- Fixed: The subquery cast crash path is covered by the new INSERT test.
- Fixed: Basic string/literal
?::int4and'1'::int4cases now route by a casted value. - Partially fixed: Cast semantics are no longer raw-value-only, but numeric rounding and target modifiers still do not match PostgreSQL behavior.
- Partially fixed: openGauss is restored in the E2E case and WHERE support was added, but production-path WHERE validation and openGauss semantic evidence are still incomplete.
299639c to
8c938b2
Compare
|
Pushed For the numeric-to-int rounding: switched For typmod: went with the "treat modifier-bearing targets as unsupported" option. For the WHERE production path: added four cases in
On openGauss: the evaluator itself is dialect-agnostic Java semantics, so PG and openGauss take the same path through it. I did not add a separate openGauss E2E this round; happy to add one if you want it. |
761ea1a to
8c938b2
Compare
terrymanu
left a comment
There was a problem hiding this comment.
Decision
- Merge Verdict: Not Mergeable
- Reviewed Scope: Latest PR head
8c938b294dd4e700dc34ffc3642e3533336ba6f6; local merge-basea9ac265ad4adcc247a8af27f1febca026e6cc8d4; GitHub/pulls/38698/filesmatched the local triple-dot file list exactly: 9 files infeatures/sharding/core,test/e2e/sql, andRELEASE-NOTES.md. Reviewed PostgreSQL/openGaussTypeCastExpressionrouting through INSERT VALUES and WHERE compare/IN/BETWEEN paths. Checked related parser producers: PostgreSQL and openGauss produceTypeCastExpression; MySQL/MariaDB/Doris were not affected by this expression type. Used PostgreSQL official docs for type casts, numeric types, character types, boolean type, and PostgreSQL sourcepg_cast.dat/float.c; used openGauss type-conversion docs for the openGauss cast baseline. - Not Reviewed Scope: Full local Maven build, Docker SQL E2E execution, live PostgreSQL/openGauss verification, exhaustive cast catalog behavior, and GitHub Actions/check-runs.
- Need Expert Review: Yes, sharding routing plus PostgreSQL/openGauss cast semantics after the blockers below are fixed.
Positive Feedback
- The PR now targets the real root-cause path:
TypeCastExpressionhides the sharding-key marker/literal from condition extraction. - Several previous blockers are addressed in the right direction: raw-value routing for basic
?::int4, subquery non-crash behavior, typmod fallback, BigDecimal numeric tie rounding, WHERE production-path tests, and restored PostgreSQL/openGauss E2E declaration.
Major Issues
-
[P1]
char/characterwithout typmod routes by the wrong value (features/sharding/core/src/main/java/org/apache/shardingsphere/sharding/route/engine/condition/engine/PostgreSQLCastEvaluator.java:122)- Symptom:
CHARandCHARACTERare categorized asTEXT, andcastToTextreturnsvalue.toString()at line 181. PostgreSQL documentscharacterwithout a length specifier as equivalent tocharacter(1), and explicit casts tocharacter(n)truncate over-length values. So?::charbound as"ab"should be routed by the database-visible cast result, not by"ab". - Risk: INSERT and WHERE both call this evaluator (
InsertClauseShardingConditionEngine.java:188,ConditionValue.java:98). A sharding key cast such as?::charcan be routed to the shard for"ab"while PostgreSQL stores/compares the cast result ascharacter(1), causing wrong-shard routing. - Action: Please either implement PostgreSQL/openGauss
char/characterno-typmod semantics exactly, or treat those targets as unsupported likechar(2)until exact semantics are implemented. Please add INSERT and WHERE counterexample tests for?::charor'ab'::char.
- Symptom:
-
[P1] Float/double-to-integer casts use numeric tie rounding (
features/sharding/core/src/main/java/org/apache/shardingsphere/sharding/route/engine/condition/engine/PostgreSQLCastEvaluator.java:143)- Symptom:
castToShort,castToInteger, andcastToLongalways convert throughBigDecimalandRoundingMode.HALF_UP;toBigDecimalalso converts JavaFloat/Doubleviavalue.toString()at lines 214-215. PostgreSQL separatesnumericcasts fromreal/double precisioncasts: the docs state floating tie rounding is commonly nearest-even, and PostgreSQL source mapsfloat8 -> int4/float4 -> int4to cast functions that userint(). - Risk: For
?::int4bound as JavaDouble 2.5, this evaluator routes by3, while PostgreSQL’s float cast path can produce2on the normal nearest-even path. That can insert or query using the wrong shard for the database-visible value. - Action: Please distinguish source value type. For Java
BigDecimal/ numeric inputs, keep the PostgreSQL numeric rule; for JavaFloat/Double, either emulate PostgreSQL’s float cast behavior safely or returnOptional.empty(). Please add evaluator plus INSERT/WHERE production-path tests for2.5Dand-2.5D.
- Symptom:
Newly Introduced Issues
- The newly added evaluator broadens routing from the originally reported
?::int4integer-marker path into more cast families. The two issues above are new wrong-value routing risks in that broadened surface.
Next Steps
- Fix or reject unsupported
char/characterno-length casts before routing. - Fix or reject Java
Float/Doubleto integer casts before routing. - Add one-to-one regression tests for the two counterexamples in both INSERT VALUES and WHERE paths.
- Re-run scoped sharding-core tests with dependent modules, then the repository Spotless and Checkstyle gates.
Multi-Round Comparison
- Fixed: Basic
?::int4parameter and literal casts now route by a casted value instead of the raw value. - Fixed: The casted subquery path no longer crashes and now extracts no condition.
- Fixed: BigDecimal numeric tie rounding now follows PostgreSQL numeric round-away-from-zero behavior.
- Fixed: Modifier-bearing targets such as
varchar(1)/numeric(3,1)are treated as unsupported instead of routing by an unmodified value. - Fixed: WHERE compare, IN, BETWEEN, and unsupported-typmod cases now use the production condition extraction path.
- Newly introduced / still unresolved:
char/characterno-length casts and JavaFloat/Doubleinteger casts can still route by a value different from PostgreSQL’s database-visible cast result.
…values so ?::int4 parameters route correctly
…parameter markers and literals so subquery casts no longer crash the sharding router
…ot run openGauss SQL E2E and the sharding code path is dialect agnostic
… value type does not match the cast target so a String parameter for ?::int4 no longer routes to the wrong shard
…eSQL/openGauss cast result via a dedicated cast evaluator so that string-to-int, lossy numeric and boolean casts pick the database-visible value while overflow and parse failures fall through
… round-away-from-zero semantics and reject typmod-bearing cast targets so 2.5::int4 routes to 3 -2.5::int4 routes to -3 and varchar(1) numeric(3,1) char(2) targets fall through to broadcast, with WHERE production-path tests for compare IN BETWEEN and unsupported typmod
…matrix so each (Java source, PG target) cell binds to its pg_cast equivalent function, dispatching numeric source through HALF_UP and float source through HALF_EVEN for integer targets, rejecting bool numeric and float-to-bool casts that PG itself errors on, truncating naked char and character to a single codepoint and name to 63 UTF-8 bytes, and stripping trailing zeros plus 15-significant-digit precision on float-to-numeric to match PG 16 ground truth verified against postgres:16-alpine for the full matrix
… and the ConditionValue cast-unwrap entry plus document evaluator scope so Double float HALF_EVEN integer routing char no typmod truncation bool numeric reject double bool reject and fractional string int reject all run through WhereClauseShardingConditionEngine and ConditionValue end to end with the openGauss pg_cast inheritance and the cross dialect TypeCastExpression isolation called out explicitly in javadoc
… so the PG postgres openGauss containers verify that Double and Float parameters cast to int4 route through the float HALF_EVEN path landing in shard 2 while a numeric intermediate cast chain routes through the numeric HALF_UP path landing in shard 3 and a float8 intermediate cast chain still picks the float HALF_EVEN result landing in shard 2
…m numeric_int4 dtoi4 ftoi4 bpchar and namein OIDs match PostgreSQL exactly while explicitly calling out the openGauss only bool to numeric and numeric to bool cast entries 6433 6434 that the evaluator deliberately treats as conservative narrowing since aligning with PostgreSQL rejection means routing falls through to broadcast on openGauss rather than picking the casted 0 or 1 value
8c938b2 to
fc325ee
Compare
|
Pushed For the For the Java While restructuring the matrix I cross-checked the rest of openGauss verification: rather than spinning up a container I cross-referenced Scope check on the broadened routing surface: E2E: 4 additional
|
|
Please fix the CI |
…lines the bound Double or Float as a numeric literal at the backend so PostgreSQL evaluates numeric_int4 HALF_UP while routing predicts ftoi4 HALF_EVEN, keeping the explicit 2.5::numeric::int4 and 2.5::float8::int4 cells where source type is preserved through both paths and documenting the wire-protocol gap in the evaluator javadoc
|
Please resolve conflicts first |
|
Hi @Phixsura, thanks again for the continued updates on this PR. The fix direction looks reasonable because it targets the However, this PR is currently not ready for review or merge because it has conflicts with the target branch, and the latest maintainer feedback asked to resolve conflicts first. Could you please rebase this PR and resolve the conflicts so we can continue the focused review? To keep the review queue clear, we may close this PR soon due to inactivity if there is still no response or update. Thanks again for your contribution. |
Fixes #23764.
Changes proposed in this pull request:
TypeCastExpressionto its underlying expression at the top ofInsertClauseShardingConditionEngine.createShardingCondition. The existinginstanceof SimpleExpressionSegment / CommonExpressionSegment / isNowExpressionchain has no branch forTypeCastExpression(aComplexExpressionSegment), so a PostgreSQL?::int4cast on the sharding key was silently dropped and the INSERT broadcast to every shard.InsertClauseShardingConditionEngineTest: add three regression cases covering aTypeCastover a parameter marker, over a literal, and nested casts (e.g.,?::int4::text).e2e-dml-insert.xml: enable the?::int4INSERT case that was previously commented out withTODO Fix #23764.ConditionValue/WhereClauseShardingConditionEnginehave the same shape of gap for WHERE-clause type casts, but the reported scenario only exercises INSERT VALUES; sending that as a separate follow-up.Before committing this PR, I'm sure that I have checked the following options: