[SPARK-57472][SQL] Make FileTable.mergedOptions merge table and relation options case-insensitively #56520
Draft
matthewbayer wants to merge 1 commit into
What changes were proposed in this pull request?
FileTable.mergedOptions merges a FileTable's own options with the options carried by the table operation (the relation), with the operation's options taking precedence. This PR makes that merge case-insensitive so the operation value deterministically wins on a key that differs only in case.

Previously the merge used a case-sensitive ++. If the table and the operation set the same option with different key casing (e.g. lineSep vs linesep), both entries survive the ++, and CaseInsensitiveStringMap's constructor then collapses them by HashMap iteration order, picking an arbitrary winner and silently dropping the other (logging "Converting duplicated key ... into CaseInsensitiveStringMap").

The fix drops any table option the operation already sets (case-insensitively, via CaseInsensitiveStringMap.containsKey) before merging.

Why are the changes needed?
The documented "operation options take precedence" behavior (asserted by the existing FileTableSuite test added in SPARK-49519 / SPARK-50287) is not honored when the two option maps use different casing for the same key. The winner is determined by HashMap iteration order rather than by precedence, so it is arbitrary and the intended value can be silently dropped.

Does this PR introduce any user-facing change?
No behavior is intended to change for correctly-cased options. For options that differ only in case between the table and the operation, the operation value now deterministically wins (previously the winner was arbitrary). This only affects unreleased master.

How was this patch tested?
Extended the existing SPARK-49519 / SPARK-50287 FileTableSuite test with a case-variant scenario (table lineSep vs operation linesep) across all file-based data sources. The test asserts that the operation value wins for both read (newScanBuilder) and write (newWriteBuilder), and that the colliding table key does not survive as a separate entry.

Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.8)
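
For readers who want to see the collision described under "What changes were proposed in this pull request?" in isolation, here is a minimal, self-contained sketch in plain Scala. It is not the code from this PR: `MergedOptionsSketch` and `mergeCaseInsensitively` are hypothetical names, ordinary Scala `Map`s stand in for the table and operation options, and a lower-cased key set stands in for the `CaseInsensitiveStringMap.containsKey` check that the description mentions.

```scala
import java.util.Locale

// Illustrative stand-in only; names here are hypothetical and not part of Spark.
object MergedOptionsSketch {

  // Drop every table option whose key matches an operation key ignoring case,
  // then append the operation options so they take precedence.
  def mergeCaseInsensitively(
      tableOptions: Map[String, String],
      operationOptions: Map[String, String]): Map[String, String] = {
    val operationKeys = operationOptions.keySet.map(_.toLowerCase(Locale.ROOT))
    val survivingTableOptions = tableOptions.filter { case (key, _) =>
      !operationKeys.contains(key.toLowerCase(Locale.ROOT))
    }
    survivingTableOptions ++ operationOptions
  }

  def main(args: Array[String]): Unit = {
    val tableOptions = Map("lineSep" -> "\n", "header" -> "true")
    val operationOptions = Map("linesep" -> "\r\n")

    // A case-sensitive ++ treats lineSep and linesep as different keys,
    // so both spellings survive the merge.
    val caseSensitive = tableOptions ++ operationOptions
    assert(caseSensitive.contains("lineSep") && caseSensitive.contains("linesep"))

    // The case-insensitive merge keeps only the operation's entry for that option.
    val merged = mergeCaseInsensitively(tableOptions, operationOptions)
    assert(merged == Map("header" -> "true", "linesep" -> "\r\n"))
  }
}
```

Filtering the table options before the merge, rather than deduplicating afterwards, means the result never contains two spellings of the same key, so nothing downstream has to pick a winner.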