Example Project for Using Dagster in Data Operations

This repository contains an example project that accompanies a blog post outlining how to effectively use Dagster for data operations, focusing on reading and writing data with the proper use of resources and IO managers.

Project Overview

This project serves as a practical example demonstrating how to set up and utilize Dagster for data processing tasks, including:

Structuring clean and maintainable Dagster pipelines.
Reading and writing data efficiently.
Utilizing resources and IO managers for flexible data operations.

The Makefile included simplifies the setup and execution of common tasks within the project.

Installation

To get started with this project, follow these steps:

Clone the repository:

git clone https://yourrepo.com/example-dagster-project.git
cd dagster-blogpost-repo

Create a virtual environment:

make venv
source venv/bin/activate

Install dependencies:

make install-pip-tools

make compile-dependencies

make install-dependencies-base

Usage

After installation, you can use the Makefile commands to interact with the project:

Get a proper .env file Rename .env.example to .env or copy its content to a new file named .env.
Create a dagster_home folder
Create a new directory in root named dagster_home via

mkdir dagster_home

In this folder, dagster stores logs, historical runs, etc. Copy the absolute path to this folder into the .env file and put it next to DAGSTER_HOME=/your/path/to/dagster_home.

Start the locally hosted MinIO S3 instance via

docker compose up -d

In the CLI output you should see a section like

dagster-blogpost-repo-minio-1 | Status: 1 Online, 0 Offline.

dagster-blogpost-repo-minio-1 | S3-API: http://111.11.0.1:port http://111.1.1.1:port

The second S3-API link will open MINIO in your browser. Go ahead and fill in the MINIO_ROOT_USER and the MINIO_ROOT_PASSWORD stored in your .env file.

Do not forget to tear the MinIO S3 Docker container down when you are done via

docker compose down

Start the Dagster UI for development:

make dagster_ui

Once the Dagster UI is set-up you can access it in your browser and start jobs interactively. Of course you can start all of the jobs as well via the command line as follows

Execute the preprocessing job:

make preprocessing_job

Execute the training job:

make training_job

For a detailed explanation of each step and how to use Dagster for data operations, refer to the accompanying blog post.

Contributing

We welcome contributions! Please open an issue or submit a pull request for any enhancements, bug fixes, or improvements.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

For any questions or comments about this project, please open an issue in the repository, and we'll get back to you as soon as possible.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.docker/minio		.docker/minio
core		core
core_tests		core_tests
requirements		requirements
.env.example		.env.example
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md
docker-compose.yml		docker-compose.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Example Project for Using Dagster in Data Operations

Table of Contents

Project Overview

Installation

Usage

Contributing

License

Contact

About

Releases

Packages

Languages

codecentric/dagster-blog-post-working-with-io-manager

Folders and files

Latest commit

History

Repository files navigation

Example Project for Using Dagster in Data Operations

Table of Contents

Project Overview

Installation

Usage

Contributing

License

Contact

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages