Skip to content

Commit

Permalink
Add unique_combination_of_columns test (#177)
Browse files Browse the repository at this point in the history
  • Loading branch information
Claire Carroll authored Dec 11, 2019
1 parent 18ab2fb commit 21a0b96
Show file tree
Hide file tree
Showing 4 changed files with 60 additions and 0 deletions.
24 changes: 24 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -346,6 +346,30 @@ the lower bound of the next record (common for date ranges).
| 0 | 1 |
| 2 | 3 |
| 4 | 5 |

#### unique_combination_of_columns ([source](macros/schema_tests/unique_combination_of_columns.sql))
This test confirms that the combination of columns is unique. For example, the
combination of month and product is unique, however neither column is unique
in isolation.

We generally recommend testing this uniqueness condition by either:
* generating a [surrogate_key](#surrogate_key-source) for your model and testing
the uniqueness of said key, OR
* passing the `unique` test a coalesce of the columns (as discussed [here](https://docs.getdbt.com/docs/testing#section-testing-expressions)).

However, these approaches can become non-perfomant on large data sets, in which
case we recommend using this test instead.

**Usage:**
```yaml
- name: revenue_by_product_by_month
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- month
- product
```

---
### SQL helpers
#### get_query_results_as_dict ([source](macros/sql/get_query_results_as_dict.sql))
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
month,product,revenue
2019-01-01,jaffle,500
2019-01-01,lamington,100
2019-01-01,pavlova,600
2019-02-01,jaffle,300
2019-02-01,lamington,300
2019-02-01,pavlova,400
7 changes: 7 additions & 0 deletions integration_tests/models/schema_tests/schema.yml
Original file line number Diff line number Diff line change
Expand Up @@ -80,3 +80,10 @@ models:
upper_bound_column: coalesce(valid_to, '2099-01-01')
partition_by: subscription_id
gaps: required

- name: data_unique_combination_of_columns
tests:
- dbt_utils.unique_combination_of_columns:
combination_of_columns:
- month
- product
22 changes: 22 additions & 0 deletions macros/schema_tests/unique_combination_of_columns.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
{% macro test_unique_combination_of_columns(model) %}

{%- set columns = kwargs.get('combination_of_columns', kwargs.get('arg')) %}

{%- set columns_csv=columns | join(', ') %}

with validation_errors as (

select
{{ columns_csv }}
from {{ model }}

group by {{ columns_csv }}
having count(*) > 1

)

select count(*)
from validation_errors


{% endmacro %}

0 comments on commit 21a0b96

Please sign in to comment.