[components][docs] Add components doc on moving definitions into components (#27721)

## Summary

Adds brief, tested docs on migrating existing Python modules
containing assets, jobs, schedules, etc. to components subfolders.

https://dagster-docs-1c66510ji-elementl.vercel.app/guides/labs/components/migrating-definitions

## Test Plan

See rendered docs, new test.
benpankow authored Feb 14, 2025
1 parent e5ff88a commit f6d779a
Showing 37 changed files with 1,123 additions and 7 deletions.
1 change: 1 addition & 0 deletions docs/docs/guides/labs/components/existing-code-location.md
Original file line number Diff line number Diff line change
@@ -81,5 +81,6 @@ Now, your code location is ready to use components! `dg` can be used to scaffold

## Next steps

- [Migrate existing definitions to components](./migrating-definitions)
- [Add a new component to your code location](./using-a-component)
- [Create a new component type](./creating-a-component)
67 changes: 67 additions & 0 deletions docs/docs/guides/labs/components/migrating-definitions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
---
title: 'Migrating existing Definitions to components'
sidebar_position: 350
---

:::note
This guide covers migrating existing Python `Definitions` to components. It assumes a components-enabled project; see the [getting started guide](./) or the [Making an existing code location components-compatible](./existing-code-location) guide for setup instructions.
:::

When adding components to an existing Dagster code location, it is often useful to restructure your definitions into component folders, which makes it easier to eventually migrate them entirely to components.

## Example project

Let's walk through an example of how to migrate existing definitions to components, with a project that has the following structure:

<CliInvocationExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/1-tree.txt" />

The root `Definitions` object combines definitions from various nested modules:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/2-definitions-before.py" title="my_existing_project/definitions.py" />

Each of these modules contains a variety of Dagster definitions, including assets, jobs, and schedules.

Let's migrate the `elt` module to a component.

## Create a Definitions component

We'll start by creating a `Definitions` component for the `elt` module:

<CliInvocationExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/3-scaffold.txt" />

This creates a new folder at `my_existing_project/components/elt-definitions` containing a `component.yaml` file. This component requires a `definitions_path` parameter, which points to a file that contains a `Definitions` object.
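
To make the idea of a `definitions_path` concrete, here is a minimal, stdlib-only sketch of loading a Python file by path and reading a module-level `defs` attribute. It is an illustration only, not how `dagster_components` is implemented: `load_module_from_path` and `find_defs` are hypothetical helper names, and this simple loader ignores package context, so a file that uses relative imports would need a package-aware loader instead.

```python
import importlib.util
from pathlib import Path
from types import ModuleType


def load_module_from_path(path: Path, module_name: str = "loaded_definitions") -> ModuleType:
    """Import a single Python file by its filesystem path (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load a module from {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # Runs the file's top-level code.
    return module


def find_defs(component_dir: Path, definitions_path: str = "definitions.py", attr: str = "defs"):
    """Resolve definitions_path relative to the component folder and return its `defs`."""
    module = load_module_from_path(component_dir / definitions_path)
    if not hasattr(module, attr):
        raise AttributeError(f"{definitions_path} does not define `{attr}`")
    return getattr(module, attr)
```

The point of the sketch is only that the path is resolved relative to the component folder and that the target file is expected to expose one top-level object.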

Let's begin by moving the `elt` module's contents to the new component folder:

<CliInvocationExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/4-mv.txt" />
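
If you would rather script this step than run it in a shell, the move could be sketched in Python as follows. This is a hedged, stdlib-only sketch: `move_module_into_component` is a hypothetical helper name, and unlike the shell glob `*` it also moves hidden entries and refuses to overwrite existing files.

```python
import shutil
from pathlib import Path


def move_module_into_component(module_dir: Path, component_dir: Path) -> list[str]:
    """Move every entry in module_dir into component_dir, then remove module_dir."""
    component_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for entry in sorted(module_dir.iterdir()):
        target = component_dir / entry.name
        if target.exists():
            raise FileExistsError(f"{target} already exists; refusing to overwrite")
        shutil.move(str(entry), str(target))
        moved.append(entry.name)
    module_dir.rmdir()  # Succeeds only because the directory is now empty.
    return moved


# Example call, assuming it is run from the project root:
# package = Path("my_existing_project")
# move_module_into_component(package / "elt", package / "components" / "elt-definitions")
```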

Next, let's create a new `definitions.py` file in the component folder, which will collect all of the `elt` module's definitions into a single `Definitions` object:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/5-elt-nested-definitions.py" title="my_existing_project/components/elt-definitions/definitions.py" />

Finally, we can update the `component.yaml` file to point to the new `definitions.py` file:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/6-component-yaml.txt" title="my_existing_project/components/elt-definitions/component.yaml" />

Now that our component is defined, we can update the root `definitions.py` file to no longer explicitly load the `elt` module's `Definitions`:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/7-definitions-after.py" title="my_existing_project/definitions.py" />

Now, our project structure looks like this:

<CliInvocationExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/8-tree-after.txt" />

We can repeat the same process for our other modules.
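
When repeating the process across several modules, a quick check like the one below may help confirm that each component folder contains the two files this guide creates. It is a minimal sketch using only the standard library; `find_incomplete_components` and `EXPECTED_FILES` are hypothetical names, and the check looks only at file presence, not at file contents.

```python
from pathlib import Path

# Files this guide places in every migrated component folder.
EXPECTED_FILES = ("component.yaml", "definitions.py")


def find_incomplete_components(components_root: Path) -> dict[str, list[str]]:
    """Map each component folder name to the expected files it is missing."""
    problems = {}
    for folder in sorted(components_root.iterdir()):
        # Skip plain files and folders such as __pycache__ or hidden directories.
        if not folder.is_dir() or folder.name.startswith(("_", ".")):
            continue
        missing = [name for name in EXPECTED_FILES if not (folder / name).is_file()]
        if missing:
            problems[folder.name] = missing
    return problems


# Example call, assuming it is run from the project root:
# find_incomplete_components(Path("my_existing_project/components"))
```

An empty result means every component folder has both files.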

## Fully migrated project

Once each of our definitions modules is migrated to a component, our project is left with a standardized structure and minimal imports at the project root:

<CliInvocationExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/9-tree-after-all.txt" />

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/migrating-definitions/10-definitions-after-all.py" title="my_existing_project/definitions.py" />

## Next steps

- [Add a new component to your code location](./using-a-component)
- [Create a new component type](./creating-a-component)
593 changes: 593 additions & 0 deletions docs/src/code-examples-content.js

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -6,9 +6,6 @@ tree
 │   ├── __init__.py
 │   ├── assets.py
 │   └── definitions.py
-├── my_existing_project_tests
-│   ├── __init__.py
-│   └── test_assets.py
 └── pyproject.toml

-3 directories, 7 files
+2 directories, 5 files
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
tree

.
├── README.md
├── my_existing_project
│   ├── __init__.py
│   ├── analytics
│   │   ├── __init__.py
│   │   ├── assets.py
│   │   └── jobs.py
│   ├── components
│   ├── definitions.py
│   └── elt
│       ├── __init__.py
│       ├── assets.py
│       └── jobs.py
└── pyproject.toml

5 directories, 10 files
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
from pathlib import Path

import dagster_components as dg_components

defs = dg_components.build_component_defs(Path(__file__).parent / "components")
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
from pathlib import Path

import dagster_components as dg_components

import dagster as dg
from my_existing_project.analytics import assets as analytics_assets
from my_existing_project.analytics.jobs import (
    regenerate_analytics_hourly_schedule,
    regenerate_analytics_job,
)
from my_existing_project.elt import assets as elt_assets
from my_existing_project.elt.jobs import sync_tables_daily_schedule, sync_tables_job

defs = dg.Definitions.merge(
    dg.Definitions(
        assets=dg.load_assets_from_modules([elt_assets, analytics_assets]),
        jobs=[sync_tables_job, regenerate_analytics_job],
        schedules=[sync_tables_daily_schedule, regenerate_analytics_hourly_schedule],
    ),
    dg_components.build_component_defs(Path(__file__).parent / "components"),
)
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
dg component scaffold 'definitions@dagster_components' elt-definitions

Using /.../my-existing-project/.venv/bin/dagster-components
Creating a Dagster component instance folder at /.../my-existing-project/my_existing_project/components/elt-definitions.
Using /.../my-existing-project/.venv/bin/dagster-components
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
mv my_existing_project/elt/* my_existing_project/components/elt-definitions && rm -rf my_existing_project/elt
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
import dagster as dg

from . import assets
from .jobs import sync_tables_daily_schedule, sync_tables_job

defs = dg.Definitions(
    assets=dg.load_assets_from_modules([assets]),
    jobs=[sync_tables_job],
    schedules=[sync_tables_daily_schedule],
)
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
type: definitions@dagster_components

params:
definitions_path: definitions.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
from pathlib import Path

import dagster_components as dg_components
from my_existing_project.analytics import assets as analytics_assets
from my_existing_project.analytics.jobs import (
    regenerate_analytics_hourly_schedule,
    regenerate_analytics_job,
)

import dagster as dg

defs = dg.Definitions.merge(
    dg.Definitions(
        assets=dg.load_assets_from_modules([analytics_assets]),
        jobs=[regenerate_analytics_job],
        schedules=[regenerate_analytics_hourly_schedule],
    ),
    dg_components.build_component_defs(Path(__file__).parent / "components"),
)
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
tree

.
├── README.md
├── my_existing_project
│   ├── __init__.py
│   ├── analytics
│   │   ├── __init__.py
│   │   ├── assets.py
│   │   └── jobs.py
│   ├── components
│   │   └── elt-definitions
│   │       ├── __init__.py
│   │       ├── assets.py
│   │       ├── component.yaml
│   │       ├── definitions.py
│   │       └── jobs.py
│   └── definitions.py
├── pyproject.toml
└── uv.lock

5 directories, 13 files
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
tree

.
├── README.md
├── my_existing_project
│   ├── __init__.py
│   ├── components
│   │   ├── analytics-definitions
│   │   │   ├── __init__.py
│   │   │   ├── assets.py
│   │   │   ├── component.yaml
│   │   │   ├── definitions.py
│   │   │   └── jobs.py
│   │   └── elt-definitions
│   │       ├── __init__.py
│   │       ├── assets.py
│   │       ├── component.yaml
│   │       ├── definitions.py
│   │       └── jobs.py
│   └── definitions.py
├── pyproject.toml
└── uv.lock

5 directories, 15 files

This file was deleted.

Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Sample existing project for testing docs for the "Making an existing code location components-compatible" guide.
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
from dagster import asset


@asset
def my_analytics_asset():
    pass
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
import dagster as dg

from .assets import my_analytics_asset

regenerate_analytics_job = dg.define_asset_job(
    "regenerate_analytics_job",
    selection=[my_analytics_asset],
)

regenerate_analytics_hourly_schedule = dg.ScheduleDefinition(
    job=regenerate_analytics_job,
    cron_schedule="0 * * * *",
)
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
from pathlib import Path

import dagster_components as dg_components

import dagster as dg
from my_existing_project.analytics import assets as analytics_assets
from my_existing_project.analytics.jobs import (
    regenerate_analytics_hourly_schedule,
    regenerate_analytics_job,
)
from my_existing_project.elt import assets as elt_assets
from my_existing_project.elt.jobs import sync_tables_daily_schedule, sync_tables_job

defs = dg.Definitions.merge(
    dg.Definitions(
        assets=dg.load_assets_from_modules([elt_assets, analytics_assets]),
        jobs=[sync_tables_job, regenerate_analytics_job],
        schedules=[sync_tables_daily_schedule, regenerate_analytics_hourly_schedule],
    ),
    dg_components.build_component_defs(Path(__file__).parent / "components"),
)
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
from dagster import asset


@asset
def customers_table(): ...


@asset
def orders_table(): ...


@asset
def products_table(): ...
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
import dagster as dg

from .assets import customers_table, orders_table, products_table

sync_tables_job = dg.define_asset_job(
    "sync_tables_job",
    selection=[customers_table, orders_table, products_table],
)

sync_tables_daily_schedule = dg.ScheduleDefinition(
    job=sync_tables_job,
    cron_schedule="0 0 * * *",
)
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
[project]
name = "my_existing_project"
version = "0.1.0"
description = "Add your description here"
readme = "README.md"
requires-python = ">=3.9,<3.13"
dependencies = [
    "dagster",
    "dagster-components",
]

[project.optional-dependencies]
dev = [
    "dagster-webserver",
    "pytest>8",
]

[build-system]
requires = ["setuptools"]
build-backend = "setuptools.build_meta"

[tool.dg]
is_code_location = true

[tool.dagster]
module_name = "my_existing_project.definitions"
code_location_name = "my_existing_project"

[tool.setuptools.packages.find]
exclude=["my_existing_project_tests"]

1 comment on commit f6d779a

@github-actions github-actions bot commented on f6d779a Feb 14, 2025

Deploy preview for dagster-docs ready!

✅ Preview
https://dagster-docs-7y6lmrcq9-elementl.vercel.app

Built with commit f6d779a.
This pull request is being automatically deployed with vercel-action