fix: preserve clustering on managed Iceberg incremental models with partition_by#1496
Merged
sd-db merged 6 commits intoJun 5, 2026
Merged
Conversation
…artition_by Managed Iceberg stores partition_by as liquid clustering keys server-side, so the incremental reconciler computed an empty desired clustering (it only read liquid_clustered_by) while reading the existing clustering from the live table, producing a phantom diff that issued ALTER TABLE ... CLUSTER BY NONE after the first incremental MERGE and silently wiped the clustering. Treat partition_by as the desired clustering for managed Iceberg so desired matches what the server stores and no spurious wipe occurs. Closes #1495
7a791e4 to
0b75124
Compare
Coverage reportClick to see where and how coverage changed
This report was generated by python-coverage-comment-action |
||||||||||||||||||||||||
jprakash-db
approved these changes
Jun 4, 2026
Collaborator
Author
|
/integration-test |
|
Integration tests dispatched for PR #1496 by @sd-db. Track progress in the Actions tab. |
The create path casefolds table_format (parse_model._get), so a model written as "Iceberg"/"ICEBERG" was resolved as Iceberg at create time but missed by the partition_by-as-clustering detection, re-opening the CLUSTER BY NONE wipe. Casefold before comparing so detection matches the create path.
|
Integration results for PR #1496 — UC cluster ✅ success · SQL warehouse ❌ failure · All-purpose cluster ✅ success · Shard coverage ✅ success |
|
Integration results for PR #1496 — UC cluster ✅ success · SQL warehouse ✅ success · All-purpose cluster ✅ success · Shard coverage ✅ success |
…g regression The regression test already verifies clustering survival via DESCRIBE DETAIL on the final table state, which is the actual server-side behavior. The negative log-string assertion checked an implementation detail of the emitted SQL and is redundant, so replace run_dbt_and_capture with a plain run_dbt.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Resolves #1495
Description
Managed Iceberg incremental models configured with
partition_bysilently lost all clustering after the first incremental run — no error, just degraded queries.Root cause. Managed Iceberg normalizes
PARTITION BYinto liquid-clustering keys server-side, so the live table reportsclusteringColumns=[business_date]. ButLiquidClusteringProcessor.from_relation_configread onlyliquid_clustered_by(→ desired[]), so the reconciler saw a phantom removal and issuedALTER TABLE … CLUSTER BY NONE. SameCLUSTER BY NONEmechanism as #805 (Delta, errored loudly); this is managed Iceberg and fails silently.Fix. For managed Iceberg with no
liquid_clustered_by, treatpartition_byas the desired clustering so it matches what the server stores — no phantom diff, no wipe. Detection mirrorstblproperties.py.Testing
test_liquid_clustering.py): managed-Icebergpartition_by→ clustering, plus Delta/UniForm negatives.TestIcebergIncrementalPartitionClustering): clustering survives the incremental MERGE, noCLUSTER BY NONE. Fails pre-fix, passes post-fix.Checklist
CHANGELOG.mdand added information about my change to the "dbt-databricks next" section.