dbt integration front page edits (#14937)
## Summary & Motivation

This is part of a larger set of software + docs changes for making
Dagster more accessible to dbt users.

This makes some edits to the front page for the dbt integration, with a
few higher-level aims:
- Get the value prop across ASAP
- Limit Dagster-specific terminology early on that might feel
overwhelming
- Support different learning styles

Specific changes:
- Near the top, added an image & code sample of a dbt graph loaded into
Dagster.
- Made some tweaks to the intro text based on my understanding of what
language will get across best for learning Dagster + dbt users.
- Present the tutorial as one of a few different options for getting
started with Dagster & dbt. When the dbt-focused Cloud NUX is ready, we
can add that as another option.
- Added a link to a Dagster+dbt example project.
- Moved the section on how Dagster assets relate to dbt models farther
down.

## How I Tested These Changes
sryza authored Jun 27, 2023
1 parent 5dbd4e6 commit 959a4dd
Showing 4 changed files with 105 additions and 53 deletions.
93 changes: 40 additions & 53 deletions docs/content/integrations/dbt.mdx
@@ -10,69 +10,56 @@ description: Dagster can orchestrate dbt alongside other technologies.
<a href="/integrations/dbt-cloud">dbt Cloud with Dagster guide</a>!
</Note>

Dagster orchestrates dbt alongside other technologies, so you can combine dbt with Spark, Python, etc. in a single workflow. Dagster's [software-defined asset](/concepts/assets/software-defined-assets) abstractions make it simple to define data assets that depend on specific dbt models, or to define the computation required to compute the sources that your dbt models depend on. You could, for example:
Dagster orchestrates dbt alongside other technologies, so you can schedule dbt with Spark, Python, etc. in a single data pipeline.

- Run your dbt models after ingesting data into your data warehouse
- Selectively materialize dbt models and their dependencies
Dagster's [Software-defined Asset](/concepts/assets/software-defined-assets) approach allows Dagster to understand dbt at the level of individual dbt models. This means that you can:

Dagster has built-in support for loading dbt models, seeds, and snapshots as software-defined assets, enabling you to:
- Use Dagster's UI or APIs to run subsets of your dbt models, seeds, and snapshots.
- Track failures, logs, and run history for individual dbt models, seeds, and snapshots.
- Define dependencies between individual dbt models and other data assets. For example, put dbt models after the Fivetran-ingested table that they read from, or put a machine learning model after the dbt models that it's trained on.

- Visualize and orchestrate a graph of dbt assets, and execute them with a single dbt invocation
- Version your dbt models by their defining SQL code, allowing Dagster to indicate when a model has changed
- View detailed historical metadata and logs for each asset
- Define Python computations that depend directly on tables updated using dbt
- Track data lineage through dbt and your other tools
An asset graph like this:

---
<!-- ![Dagster graph with dbt, Fivetran, and TensorFlow](/images/integrations/dbt/dagster-dbt-fivetran-tensorflow.png) -->

## Using dbt with Dagster
<Image
alt="Dagster graph with dbt, Fivetran, and TensorFlow"
src="/images/integrations/dbt/dagster-dbt-fivetran-tensorflow.png"
width={1834}
height={1220}
/>

Can be produced from code like this:

```python file=/integrations/dbt/potemkin_dag_for_cover_image.py startafter=start endbefore=end
fivetran_assets = dagster_fivetran.build_fivetran_assets(
    connector_id="postgres",
    table_names=["users", "orders"],
)

dbt_assets = dagster_dbt.load_assets_from_dbt_manifest("manifest.json")

<DbtModelAssetExplanation />

To learn how to load dbt models into Dagster as assets, check out the tutorial below or the quick version in the [dagster-dbt reference](/integrations/dbt/reference#loading-dbt-models-from-a-dbt-project).
@asset(compute_kind="tensorflow", non_argument_deps={"daily_order_summary"})
def predicted_orders():
    ...
```
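
To make the dependency structure behind the snippet above concrete, here is a minimal standard-library sketch of the same asset graph. It is not how Dagster implements asset selection; it only illustrates what "run a subset of models together with their upstream dependencies" means. The asset names and edges are copied from the stub file added in this commit, while the helper names `upstream_closure` and `execution_order` are hypothetical.

```python
from graphlib import TopologicalSorter

# Asset name -> set of upstream asset names, mirroring the stub
# definitions in this commit (Fivetran tables, dbt models, ML asset).
ASSET_DEPS = {
    "users": set(),
    "orders": set(),
    "stg_users": {"users"},
    "stg_orders": {"orders"},
    "daily_order_summary": {"stg_users", "stg_orders"},
    "predicted_orders": {"daily_order_summary"},
}


def upstream_closure(selection, deps):
    """Return the selected assets plus everything they transitively depend on."""
    seen = set()
    stack = list(selection)
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(deps[name])
    return seen


def execution_order(selection, deps):
    """Order a subset of assets so that each one runs after its selected upstreams."""
    # Keep only edges that stay inside the selection.
    subgraph = {name: deps[name] & selection for name in selection}
    return list(TopologicalSorter(subgraph).static_order())


# Select one dbt model and pull in whatever it needs upstream.
subset = upstream_closure({"daily_order_summary"}, ASSET_DEPS)
order = execution_order(subset, ASSET_DEPS)
```

Here `order` contains the two Fivetran tables, the two staging models, and `daily_order_summary`, with every asset placed after its upstreams; `predicted_orders` is left out because it is downstream of the selection.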

---

## dbt and Dagster software-defined assets tutorial

In this tutorial, we'll walk you through integrating dbt with Dagster using dbt's example [jaffle shop project](https://github.com/dbt-labs/jaffle_shop), the [dagster-dbt library](/\_apidocs/libraries/dagster-dbt), and a data warehouse, such as [DuckDB](https://duckdb.org/).

By the end of the tutorial, you'll have a working dbt and Dagster integration and a handful of materialized Dagster assets, including a plotly chart powered by data computed from your dbt models.

<ArticleList>
<ArticleListItem
title="Overview and prerequisites"
href="/integrations/dbt/using-dbt-with-dagster"
></ArticleListItem>
<ArticleListItem
title="Part one: Set up a dbt project"
href="/integrations/dbt/using-dbt-with-dagster/part-one"
></ArticleListItem>
<ArticleListItem
title="Part two: Load dbt models as Dagster assets"
href="/integrations/dbt/using-dbt-with-dagster/part-two"
></ArticleListItem>
<ArticleListItem
title="Part three: Create and materialize upstream Dagster assets"
href="/integrations/dbt/using-dbt-with-dagster/part-three"
></ArticleListItem>
<ArticleListItem
title="Part four: Create and materialize a downstream asset"
href="/integrations/dbt/using-dbt-with-dagster/part-four"
></ArticleListItem>
</ArticleList>
## Getting started

There are a few ways to get started with Dagster and dbt:

- Take the [tutorial](/integrations/dbt/using-dbt-with-dagster). We'll walk you through setting up dbt and Dagster together on your computer, using dbt's example [jaffle shop project](https://github.com/dbt-labs/jaffle_shop), the [dagster-dbt library](/\_apidocs/libraries/dagster-dbt), and a data warehouse, such as [DuckDB](https://duckdb.org/). By the end, you'll have a working dbt and Dagster project and a handful of materialized Dagster assets, including a chart powered by data from your dbt models.
- Play around with a [working dbt + Dagster project](https://github.com/dagster-io/dagster/tree/master/examples/assets_dbt_python).
- Browse the [dagster-dbt integration reference](/integrations/dbt/reference) for short lessons on Dagster + dbt topics.
- Review the [API docs](/\_apidocs/libraries/dagster-dbt) for the dagster-dbt library.

---

## References

<ArticleList>
<ArticleListItem
title="dagster-dbt integration reference"
href="/integrations/dbt/reference"
></ArticleListItem>
<ArticleListItem
title="dagster-dbt API reference"
href="/\_apidocs/libraries/dagster-dbt"
></ArticleListItem>
</ArticleList>
## Understanding how dbt models relate to Dagster Software-defined assets

<DbtModelAssetExplanation />

To learn how to load dbt models into Dagster as assets, check out the [tutorial](/integrations/dbt/using-dbt-with-dagster) or the quick version in the [dagster-dbt reference](/integrations/dbt/reference#loading-dbt-models-from-a-dbt-project).
6 changes: 6 additions & 0 deletions examples/assets_dbt_python/README.md
@@ -14,6 +14,12 @@ Check out [Dagster Cloud](https://dagster.io/cloud) to get started.

### Option 2: Running it locally

To download this example into your working directory, run:

```bash
dagster project from-example --example assets_dbt_python --name assets_dbt_python
```

To install this example and its Python dependencies, run:

```bash
@@ -0,0 +1,59 @@
"""This is used to generate the image and code snippet on the dbt front page.
We pull off some dark magic so that generating the screenshot doesn't involve a whole setup with
Fivetran and a database.
"""

from dagster import asset


class dagster_fivetran:
    @staticmethod
    def build_fivetran_assets(connector_id, table_names):
        @asset(compute_kind="fivetran")
        def users():
            ...

        @asset(compute_kind="fivetran")
        def orders():
            ...

        return [users, orders]


class dagster_dbt:
    @staticmethod
    def load_assets_from_dbt_manifest(manifest):
        @asset(non_argument_deps={"users"}, compute_kind="dbt")
        def stg_users():
            """Users with test accounts removed."""
            ...

        @asset(non_argument_deps={"orders"}, compute_kind="dbt")
        def stg_orders():
            """Cleaned orders table."""
            ...

        @asset(non_argument_deps={"stg_users", "stg_orders"}, compute_kind="dbt")
        def daily_order_summary():
            """Summary of daily orders, by user."""
            raise ValueError()

        return [stg_users, stg_orders, daily_order_summary]


# start
fivetran_assets = dagster_fivetran.build_fivetran_assets(
    connector_id="postgres",
    table_names=["users", "orders"],
)

dbt_assets = dagster_dbt.load_assets_from_dbt_manifest("manifest.json")


@asset(compute_kind="tensorflow", non_argument_deps={"daily_order_summary"})
def predicted_orders():
    ...


# end
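
The stand-in trick this file relies on can be shown in isolation. In the rough sketch below, a class is named like a module and exposes a static method, so a call site reads as if it were calling a real library without importing one. The names `fake_warehouse` and `load_tables` are hypothetical and exist only for this illustration.

```python
class fake_warehouse:  # hypothetical: a class posing as a module namespace
    @staticmethod
    def load_tables(names):
        # Return lightweight placeholders instead of talking to a real service.
        return [f"table:{name}" for name in names]


# The call site looks like an ordinary module-level function call.
tables = fake_warehouse.load_tables(["users", "orders"])
```

This keeps the displayed snippet short and realistic-looking, at the cost that the stub's behavior can drift from the real library it imitates.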

1 comment on commit 959a4dd

@github-actions

Deploy preview for dagster ready!

✅ Preview
https://dagster-igy27i73l-elementl.vercel.app

Built with commit 959a4dd.
This pull request is being automatically deployed with vercel-action
