DOC-878 Reorganize and clean up Components docs #27848

Merged
merged 27 commits into from
Feb 19, 2025
387 changes: 0 additions & 387 deletions docs/docs-beta/src/code-examples-content.js

This file was deleted.

@@ -0,0 +1,25 @@
---
title: 'Adding components to your project with Python'
sidebar_position: 300
---

:::info

This feature is still in development and might change in patch releases. It’s not production ready, and the documentation may also evolve. Stay tuned for updates.

:::

In some cases, you may want to add a component to your project with Python rather than a `component.yaml` file.

:::note Prerequisites

Before adding a component with Python, you must either [create a project with components](/guides/labs/components/building-pipelines-with-components/creating-a-code-location-with-components) or [migrate an existing code location to components](/guides/labs/components/incrementally-adopting-components/existing-code-location).

:::

1. First, create a new subdirectory in your `components/` directory to contain the component definition.
2. In the subdirectory, create a `component.py` file to define your component instance. In this file, you will define a single `@component`-decorated function that instantiates the component type that you're interested in:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/python-components/component.py" language="python" />

This function must return an instance of your desired component type. In the example above, we've used this functionality to customize the `translator` argument of the `DbtProjectComponent` class.
@@ -0,0 +1,77 @@
---
title: "Adding components to your project"
sidebar_position: 200
---

:::info

This feature is still in development and might change in patch releases. It’s not production ready, and the documentation may also evolve. Stay tuned for updates.

:::

To add components to your project, you can instantiate them from the command line, which will create a new directory inside your `components/` folder that contains a `component.yaml` file.

If you want to use Python to add components to your project instead, see "[Adding components to your project with Python](adding-components-python)".

:::note Prerequisites

Before adding a component, you must either [create a project with components](/guides/labs/components/building-pipelines-with-components/creating-a-code-location-with-components) or [migrate an existing code location to components](/guides/labs/components/incrementally-adopting-components/existing-code-location).

:::

## Finding a component

You can view the available component types in your environment by running the following command:

```bash
dg component-type list
```

This will display a list of all the component types that are available in your project. To see more information about a specific component type, you can run:

```bash
dg component-type docs <component-type>
```

This will display a webpage containing documentation for the specified component type.

## Instantiating a component

Once you've decided on the component type that you'd like to use, you can instantiate it by running:

```bash
dg component generate <component-type> <component-name>
```

This will create a new directory inside your `components/` folder that contains a `component.yaml` file. Some component types may also generate additional files as needed.

## Configuration

### Basic configuration

The `component.yaml` is the primary configuration file for a component. It contains two top-level fields:

- `type`: The type of the component defined in this directory
- `params`: A dictionary of parameters that are specific to this component type. The schema for these parameters is defined by the `get_schema` method on the component class.

To see a sample `component.yaml` file for your specific component, you can run:

```bash
dg component-type docs <component-type>
```
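
To make the two top-level fields concrete, here is a minimal sketch that checks an already-parsed `component.yaml`, represented as a plain dictionary. `check_component_config` is a hypothetical helper, not part of `dg`. Validation of the contents of `params` against the component's schema is out of scope here.

```python
def check_component_config(config: dict) -> list:
    """Return a list of problems found in a parsed component.yaml (hypothetical helper)."""
    problems = []
    if not isinstance(config.get("type"), str):
        problems.append("'type' must be a string naming the component type")
    # Assumption: a missing 'params' is treated as an empty dictionary.
    if not isinstance(config.get("params", {}), dict):
        problems.append("'params' must be a dictionary")
    return problems


good = {"type": "my_snowflake_component", "params": {"account": "acme"}}
bad = {"params": ["account", "acme"]}

print(check_component_config(good))
print(check_component_config(bad))
```

The first call reports no problems. The second reports both a missing `type` and a `params` value that is not a dictionary.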

### Component templating

Each `component.yaml` file supports a rich templating syntax, powered by `jinja2`.

#### Templating environment variables

A common use case for templating is to avoid exposing environment variables (particularly secrets) in your YAML files. The Jinja scope for a `component.yaml` file contains an `env` function that can be used to insert environment variables into the template:

```yaml
type: my_snowflake_component

params:
  account: "{{ env('SNOWFLAKE_ACCOUNT') }}"
  password: "{{ env('SNOWFLAKE_PASSWORD') }}"
```
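
To illustrate what the `env` function does at render time, the sketch below substitutes `env('NAME')` template calls with values from the process environment. It is a simplified stand-in built on Python's `re` module, not Jinja and not the actual rendering code, and `render_env` is a hypothetical name.

```python
import os
import re

# Matches {{ env('NAME') }} or {{ env("NAME") }} with optional whitespace.
ENV_CALL = re.compile(
    r"""\{\{\s*env\(\s*(['"])([A-Za-z_][A-Za-z0-9_]*)\1\s*\)\s*\}\}"""
)


def render_env(value: str) -> str:
    """Replace each env('NAME') template call with the variable's value."""

    def lookup(match):
        # Raises KeyError when the variable is unset, so a missing secret
        # fails loudly instead of rendering as an empty string.
        return os.environ[match.group(2)]

    return ENV_CALL.sub(lookup, value)


os.environ["SNOWFLAKE_ACCOUNT"] = "acme"
print(render_env("{{ env('SNOWFLAKE_ACCOUNT') }}"))
```

The secret itself never appears in the YAML file. Only the variable name does, and the value is read when the template is rendered.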
@@ -0,0 +1,81 @@
---
title: 'Creating a code location with components'
sidebar_position: 100
---

:::info

This feature is still in development and might change in patch releases. It’s not production ready, and the documentation may also evolve. Stay tuned for updates.

:::

:::note Prerequisites

Before creating a project with components, you must follow the [steps to install `uv` and `dg`](/guides/labs/components/index.md#installation).

:::

After [installing dependencies](/guides/labs/components/index.md#installation), you can scaffold a components-ready code location for your project. In the example below, we scaffold a code location called `jaffle-platform`:

<CliInvocationExample path="docs_beta_snippets/docs_beta_snippets/guides/components/index/2-scaffold.txt" />

This command builds a code location and initializes a new Python virtual environment inside of it. When using `dg`'s default environment management behavior, you won't need to worry about activating this virtual environment yourself.

## Overview of files and directories

Let's have a look at the scaffolded files:

<CliInvocationExample path="docs_beta_snippets/docs_beta_snippets/guides/components/index/3-tree.txt" />

You can see that we have a fairly standard Python project structure. The following files and directories are included:

- A Python package, `jaffle_platform`. The name is an underscored inflection of the project root directory (`jaffle-platform`).
- An (empty) `jaffle_platform_tests` test package
- A `uv.lock` file
- A `pyproject.toml` file
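
As a rough illustration of the package naming, the sketch below derives an importable package name from a project directory name. It assumes hyphens are the only characters that need rewriting, and `package_name_for` is a hypothetical helper rather than a `dg` function.

```python
def package_name_for(project_dir: str) -> str:
    """Turn a project directory name into an importable package name."""
    # Assumption: replacing hyphens with underscores is sufficient.
    return project_dir.replace("-", "_")


print(package_name_for("jaffle-platform"))  # jaffle_platform
```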

### pyproject.toml

The `pyproject.toml` file contains `tool.dagster` and `tool.dg` sections that look like this:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/index/4-pyproject.toml" language="TOML" title="jaffle-platform/pyproject.toml" />

#### tool.dagster section

The `tool.dagster` section of `pyproject.toml` is not `dg`-specific. This section specifies that a set of definitions can be loaded from the `jaffle_platform.definitions` module.

#### tool.dg section

The `tool.dg` section contains two settings requiring more explanation: `is_code_location` and `is_component_lib`.

##### is_code_location setting

`is_code_location = true` specifies that this project is a `dg`-managed Dagster code location. Code locations created with components are regular Dagster code locations with a particular structure.

To understand the structure, let's look at the content of `jaffle_platform/definitions.py`:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/index/5-definitions.py" language="Python" title="jaffle-platform/jaffle_platform/definitions.py" />

This call to `build_component_defs` will:

- discover the set of components defined in the project
- compute a set of `Definitions` from each component
- merge the component-specific definitions into a single `Definitions` object

`is_code_location` tells `dg` that the project is structured in this way and therefore contains component instances. In the current project, component instances will be placed in the default location at `jaffle_platform/components`.
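
The discover, compute, and merge steps can be sketched in plain Python. This is a conceptual stand-in and not the implementation of `build_component_defs`: definitions are modeled as dictionaries, every function name is hypothetical, and discovery is assumed to mean "any subdirectory containing a `component.yaml` or `component.py` file".

```python
import tempfile
from pathlib import Path


def discover_components(components_dir: Path) -> list:
    """Return subdirectories that look like component instances."""
    return sorted(
        p
        for p in components_dir.iterdir()
        if p.is_dir()
        and ((p / "component.yaml").exists() or (p / "component.py").exists())
    )


def compute_definitions(component_dir: Path) -> dict:
    """Stand-in for a component building its definitions: one fake asset each."""
    return {"assets": [component_dir.name]}


def merge_definitions(per_component: list) -> dict:
    """Combine the per-component dictionaries into a single one."""
    merged = {"assets": []}
    for defs in per_component:
        merged["assets"].extend(defs["assets"])
    return merged


def build_defs_sketch(components_dir: Path) -> dict:
    found = discover_components(components_dir)
    return merge_definitions([compute_definitions(d) for d in found])


# Build a throwaway components/ folder with two components and one unrelated folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("ingest", "transform"):
        (root / name).mkdir()
        (root / name / "component.yaml").write_text("type: example\n")
    (root / "not_a_component").mkdir()
    result = build_defs_sketch(root)

print(result)
```

The folder without a component file is skipped, and the two remaining components contribute to one merged result.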

##### is_component_lib setting

`is_component_lib = true` specifies that the project is a component library. This means that the project may contain component types that can be referenced when generating component instances.

In a typical code location, most components are likely to be instances of types defined in external libraries (e.g. `dagster-components`), but you can also define custom component types scoped to your project. That is why `is_component_lib` is set to `true` by default. Any scaffolded component types in `jaffle_platform` will be placed in the default location at `jaffle_platform/lib`.

You can also see that this module is registered under the `dagster.components` entry point in `pyproject.toml`. This is what makes the components discoverable to `dg`:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/index/6-pyproject.toml" language="TOML" title="jaffle-platform/pyproject.toml" />
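
Entry points are a standard Python packaging mechanism, so you can inspect what is registered under a group with the standard library alone. The sketch below lists everything installed under `dagster.components`. How `dg` consumes these entries internally is not shown here, and `list_entry_points` is a hypothetical helper.

```python
from importlib.metadata import entry_points


def list_entry_points(group: str) -> list:
    """Return (name, value) pairs for every installed entry point in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: entry points are selectable by group.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: entry_points() returns a dict keyed by group name.
        selected = eps.get(group, [])
    return sorted((ep.name, ep.value) for ep in selected)


for name, value in list_entry_points("dagster.components"):
    print(f"{name} -> {value}")
```

Run this inside the project's virtual environment. If the project is installed there, its registered module should appear in the output, and an environment with nothing registered prints nothing.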

## Next steps

After scaffolding your code location with components, you can [add more components](adding-components) to complete your pipeline.
@@ -0,0 +1,58 @@
---
title: 'Customizing components'
sidebar_position: 400
---

:::info

This feature is still in development and might change in patch releases. It’s not production ready, and the documentation may also evolve. Stay tuned for updates.

:::

You can customize the behavior of a component beyond what is available in the `component.yaml` file.

To do so, you can create a subclass of your desired component in a file named `component.py` in the same directory as your `component.yaml` file. This subclass should be annotated with the `@component_type` decorator, which will define a local name for this component:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/custom-subclass/basic-subclass.py" language="python" />

You can then update the `type:` field in your `component.yaml` file to reference this new component type. The new type name will be `.<component-name>`, where the leading `.` indicates that this is a local component type:

```yaml
type: .custom_subclass

params:
  ...
```

## Customizing execution

By convention, most library components have an `execute()` method that defines the core runtime behavior of the component. This can be overridden by subclasses of the component to customize this behavior.

For example, we can create a subclass of the `SlingReplicationCollectionComponent` that adds a debug log message during execution:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/custom-subclass/debug-mode.py" language="python" />
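
The override pattern itself is ordinary Python subclassing. The sketch below uses plain stand-in classes rather than Dagster classes, so `ReplicationComponent`, its `execute()` signature, and the `context` dictionary are all hypothetical.

```python
import logging

logger = logging.getLogger("components_sketch")


class ReplicationComponent:
    """Hypothetical stand-in for a library component; not a Dagster class."""

    def execute(self, context: dict) -> str:
        return f"replicated {context['stream']}"


class DebugReplicationComponent(ReplicationComponent):
    """Adds a debug log line, then defers to the parent's behavior."""

    def execute(self, context: dict) -> str:
        logger.debug("executing with context: %s", context)
        return super().execute(context)


print(DebugReplicationComponent().execute({"stream": "orders"}))
```

Calling `super().execute(...)` keeps the library's behavior intact, so the subclass only adds to it.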

## Adding component-level templating scope

By default, the scopes available for use in the template are:

- `env`: A function that allows you to access environment variables.
- `automation_condition`: A scope allowing you to access all static constructors of the `AutomationCondition` class.

However, it can be useful to add additional scope options to your component type. For example, you may have a custom automation condition that you'd like to use in your component.

To do so, you can define a function that returns an `AutomationCondition` and define a `get_additional_scope` method on your subclass:

<CodeExample path="docs_beta_snippets/docs_beta_snippets/guides/components/custom-subclass/custom-scope.py" language="python" />

This can then be used in your `component.yaml` file:

```yaml
type: .custom_subclass

params:
  ...
  transforms:
    - attributes:
        automation_condition: "{{ custom_cron('@daily') }}"
```
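
Conceptually, the additional scope is merged with the default scope before the template is rendered. The sketch below models that merge with plain dictionaries. It makes several assumptions: `BaseComponent` and `template_scope` are hypothetical, `get_additional_scope` is written as a classmethod, only `env` is modeled in the default scope, and `custom_cron` returns a string instead of an `AutomationCondition`.

```python
import os


def default_scope() -> dict:
    # Only `env` is modeled here; `automation_condition` is omitted for brevity.
    return {"env": lambda name: os.environ[name]}


class BaseComponent:
    """Hypothetical base class; not a Dagster class."""

    @classmethod
    def get_additional_scope(cls) -> dict:
        return {}

    @classmethod
    def template_scope(cls) -> dict:
        # Names from the subclass are layered on top of the defaults.
        return {**default_scope(), **cls.get_additional_scope()}


def custom_cron(schedule: str) -> str:
    # Stand-in return value; a real implementation would build a condition object.
    return f"cron:{schedule}"


class CustomSubclass(BaseComponent):
    @classmethod
    def get_additional_scope(cls) -> dict:
        return {"custom_cron": custom_cron}


scope = CustomSubclass.template_scope()
print(sorted(scope))
print(scope["custom_cron"]("@daily"))
```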
@@ -0,0 +1,14 @@
---
title: 'Building pipelines with components'
sidebar_position: 20
---

:::info

This feature is still in development and might change in patch releases. It’s not production ready, and the documentation may also evolve. Stay tuned for updates.

:::

import DocCardList from '@theme/DocCardList';

<DocCardList />
@@ -0,0 +1,11 @@
---
title: 'Troubleshooting Components'
sidebar_position: 500
unlisted: true
---

:::info

This feature is still in development and might change in patch releases. It’s not production ready, and the documentation may also evolve. Stay tuned for updates.

:::