# Bug Report: TABLE_OR_VIEW_NOT_FOUND on constraint DROP during V2 incremental materialization

### Describe the bug
When using use_materialization_v2: true with incremental models that have primary key constraints defined in schema YAML, every incremental run issues ALTER TABLE ... DROP CONSTRAINT followed by ALTER TABLE ... ADD CONSTRAINT — even when nothing has changed. The V2 constraint diff (get_changeset()) always detects a difference between the existing constraint (as reported by Databricks) and the desired constraint (from YAML config), regardless of whether the constraint is explicitly named or unnamed. This results in a DROP + ADD cycle on every single incremental run.
When running with multiple threads (e.g. threads: 8), multiple models issue concurrent ALTER TABLE ... DROP CONSTRAINT statements against tables in the same schema. This triggers a transient Unity Catalog metadata resolution failure, causing TABLE_OR_VIEW_NOT_FOUND on a table that demonstrably exists.
The model successfully completes the MERGE step but then fails during the process_config_changes phase when trying to drop the "old" constraint.
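The spurious diff can be illustrated with a minimal sketch. The `PrimaryKeyConstraint` dataclass below is a hypothetical stand-in for the adapter's constraint type, not its actual code: when constraints are compared by full-field equality, a desired constraint with `name=None` never matches an existing constraint that Databricks has auto-named, so a set-difference diff reports both a constraint to add and one to drop:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class PrimaryKeyConstraint:
    # Hypothetical stand-in for the adapter's constraint type.
    name: Optional[str]
    columns: Tuple[str, ...]

# Desired constraint from YAML (no explicit name).
desired = {PrimaryKeyConstraint(name=None, columns=("my_model_key",))}
# Existing constraint introspected from Databricks (auto-assigned name).
existing = {PrimaryKeyConstraint(name="my_model_pk", columns=("my_model_key",))}

# Naive set difference: full-field equality treats these as different,
# producing both an ADD (set) and a DROP (unset) on every run.
set_constraints = desired - existing
unset_constraints = existing - desired

print(set_constraints)    # the unnamed desired PK is scheduled for ADD
print(unset_constraints)  # the named existing PK is scheduled for DROP
```

This mirrors the debug log below: same columns on both sides, yet both `set_constraints` and `unset_constraints` are non-empty.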
### Steps To Reproduce

- Define an incremental model with a primary key constraint in schema YAML (named or unnamed; both reproduce):

  ```yaml
  # __schema_my_model.yml
  models:
    - name: my_model
      constraints:
        - type: primary_key
          columns:
            - my_model_key
          warn_unenforced: false
      config:
        unique_key: ['my_model_key']
  ```

- Set `use_materialization_v2: true` in `dbt_project.yml`:

  ```yaml
  flags:
    use_materialization_v2: true
  ```

- Run `dbt run` with `threads: 8` (or any value > 1) on a project with multiple incremental models that have constraints defined, targeting a Databricks Unity Catalog workspace.

- On the second run (the table already exists with constraints), the V2 materialization detects a constraint diff and issues `DROP CONSTRAINT` + `ADD CONSTRAINT` on every incremental run. Under concurrent DDL load, some models fail with `TABLE_OR_VIEW_NOT_FOUND`.
### Expected behavior

- If the constraint definition has not changed between runs (same type, same columns), no DDL should be issued; the constraint diff should recognize semantic equivalence.
- Alternatively, the `DROP CONSTRAINT` statement should use `IF EXISTS` to tolerate transient catalog metadata inconsistencies during concurrent DDL.
### Screenshots and log output

**Constraint diff log (model starts incremental run):**

The V2 introspection detects a diff between the existing PK and the desired PK, even though both define the same columns:

```
[debug] [Thread-N ]: Applying constraints to relation
set_constraints={PrimaryKeyConstraint(name=None, columns=['my_model_key'])}
unset_constraints={PrimaryKeyConstraint(name='my_model_pk', columns=['my_model_key'])}
```

Both `set_constraints` and `unset_constraints` reference the same columns, and the diff fires on every run. Explicitly naming the constraint in YAML (to match the Databricks-assigned name) does not prevent the diff; the DROP + ADD cycle persists regardless.
**Concurrent DROP CONSTRAINT from multiple threads:**

At 12:43:28, multiple threads issue `ALTER TABLE ... DROP CONSTRAINT` simultaneously:

```
12:43:28.160 [Thread-10]: ALTER TABLE `my_catalog`.`my_schema`.`model_a` DROP CONSTRAINT model_a_pk CASCADE;
12:43:28.197 [Thread-13]: ALTER TABLE `my_catalog`.`my_schema`.`model_b` DROP CONSTRAINT model_b_pk CASCADE;
```
**Failure: TABLE_OR_VIEW_NOT_FOUND**

```
12:43:28.939 [Thread-13]: Database Error in model my_model (models/my_model.sql)
  [TABLE_OR_VIEW_NOT_FOUND] The table or view `my_catalog`.`my_schema`.`my_model` cannot be found.
  Verify the spelling and correctness of the schema and catalog.
  SQLSTATE: 42P01
```

The table exists and is accessible before and after the run; this is a transient failure during concurrent DDL operations.
**Full timeline for the failing model:**

| Time | Thread | Action | Result |
|---|---|---|---|
| 12:43:16.215 | Thread-13 | `CREATE OR REPLACE TEMPORARY VIEW my_model__dbt_tmp` | OK |
| 12:43:17.230 | Thread-13 | `DESCRIBE TABLE EXTENDED my_model__dbt_tmp AS JSON` | OK |
| 12:43:17.417 | Thread-13 | `DESCRIBE TABLE EXTENDED my_model AS JSON` (V2 introspection) | OK |
| 12:43:17–24 | Thread-13 | V2 introspection: tags, constraints, tblproperties, comments | OK |
| 12:43:24.451 | Thread-13 | `MERGE INTO my_model USING my_model__dbt_tmp` | OK (3.33s) |
| 12:43:28.195 | Thread-13 | Constraint diff detected: `unset_constraints={PK name='my_model_pk'}` | — |
| 12:43:28.197 | Thread-13 | `ALTER TABLE my_model DROP CONSTRAINT my_model_pk CASCADE` | TABLE_OR_VIEW_NOT_FOUND |

The MERGE succeeds. The failure occurs during the post-MERGE `process_config_changes` phase, specifically in `apply_config_changeset` when it issues the DROP CONSTRAINT DDL.
### System information

The output of `dbt --version`:

```
Core:
  - installed: 1.11.6
  - latest:    1.11.6 - Up to date!

Plugins:
  - databricks: 1.11.5 - Up to date!
  - spark:      1.10.1 - Up to date!
```

The operating system you're using:
Databricks-hosted runtime (Linux, via Databricks Workflows dbt task). Also reproduced from local dev on Windows 11.

The output of `python --version`:

```
Python 3.12.3
```
dbt packages:

```
- dbt-utils: 1.3.3
- dbt-date: 0.17.1
- dbt_artifacts: 2.10.0
```

### Additional context
**Root cause analysis:**

The issue has two layers:

1. **Constraint diff always detects a change (causes unnecessary DDL on every run):** The V2 `get_changeset()` comparison always treats the existing constraint (as introspected from Databricks) and the desired constraint (from YAML config) as different, even when the constraint type, columns, and name are identical. This produces a DROP + ADD on every single incremental run with no actual change, regardless of whether the constraint is explicitly named in YAML or left unnamed.

2. **Concurrent DDL + transient Unity Catalog failure (causes the error):** With `threads: 8`, multiple models run their `process_config_changes` phase concurrently, all issuing `ALTER TABLE ... DROP CONSTRAINT` against different tables in the same schema. This DDL concurrency causes transient Unity Catalog metadata resolution failures where the catalog briefly cannot resolve a table that exists.
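A name-insensitive comparison would suppress the spurious diff for layer 1. The sketch below is my own illustration (hypothetical `PrimaryKeyConstraint` class, not the adapter's code): it treats an unnamed desired constraint as equivalent to an existing one whenever the columns match, so no DDL would be generated in the unchanged case:

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class PrimaryKeyConstraint:
    # Hypothetical stand-in for the adapter's constraint type.
    name: Optional[str]
    columns: Tuple[str, ...]

def semantically_equal(desired: PrimaryKeyConstraint,
                       existing: PrimaryKeyConstraint) -> bool:
    """Equivalent when the columns match and the desired name is either
    unspecified (None) or identical to the existing name."""
    if desired.columns != existing.columns:
        return False
    return desired.name is None or desired.name == existing.name

desired = PrimaryKeyConstraint(name=None, columns=("my_model_key",))
existing = PrimaryKeyConstraint(name="my_model_pk", columns=("my_model_key",))

# Unchanged constraint: no DROP/ADD DDL should be issued.
print(semantically_equal(desired, existing))  # True
```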
**Relevant source code:**

The V2 incremental path calls `process_config_changes(target_relation)` at `incremental.sql` line 62, which calls `apply_config_changeset()`, which issues the `ALTER TABLE ... DROP CONSTRAINT` / `ADD CONSTRAINT` statements.
**Configuration at time of failure:**

```yaml
# dbt_project.yml
flags:
  use_materialization_v2: true
```

```yaml
# profiles.yml
threads: 8
```

The project has ~1,791 models, running a full `dbt run` against the entire project.
**Workaround:**

Setting `incremental_apply_config_changes: false` in `dbt_project.yml` (or per-model in schema YAML) prevents the V2 path from running `process_config_changes` entirely, avoiding the constraint DROP/ADD cycle:

```yaml
# dbt_project.yml
models:
  my_project:
    +incremental_apply_config_changes: false
```

This works because `process_config_changes` (lines 236-244 of `incremental.sql`) checks this config before proceeding:

```jinja
{% macro process_config_changes(target_relation) %}
  {% set apply_config_changes = config.get('incremental_apply_config_changes', True) | as_bool %}
  {% if apply_config_changes %}
    ...
  {% endif %}
{% endmacro %}
```

**Trade-off:** This also disables automatic syncing of Databricks tags, table properties, and comments during incremental runs; those will only be applied during `--full-refresh`.
**Note:** Explicitly naming the constraint in YAML (to match the Databricks auto-generated name) does not resolve the issue; the diff still fires and the DROP + ADD cycle persists.