
E-commerce-Data-Pipeline

[Flow chart of the pipeline]

This project follows the Airbyte tutorial "Building a seamless and efficient data pipeline for e-commerce analytics".

The tutorial walks through the practical implementation of a data workflow using Airbyte for data integration, dbt for transformation, Dagster for orchestration, and BigQuery for data warehousing.

Tech Stack

  • Python
  • Terraform (Infrastructure as Code)
  • Airbyte (Data ingestion)
  • dbt (Data transformation)
  • Dagster (Pipeline orchestration)
  • BigQuery (Data warehouse)

Requirements

Python

Use version 3.10 or 3.11 (avoid 3.12, since some of the required libraries are not yet compatible with it).

Install the Python libraries with pip:

```shell
pip install dbt-core dbt-bigquery
pip install dagster dagster-webserver dagster-dbt dagster-airbyte
```

Setting environment variables

  • For dbt to authenticate with BigQuery (via a GCP service account key):

    ```shell
    export DBT_BIGQUERY_KEYFILE_PATH=path/to/credentials.json
    ```

  • For Dagster to parse the dbt project on load:

    ```shell
    export DAGSTER_DBT_PARSE_PROJECT_ON_LOAD=1
    ```

  • For Dagster to connect to Airbyte:

    ```shell
    export AIRBYTE_PASSWORD=password
    ```
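dbt reads its BigQuery connection settings from a `profiles.yml`. A minimal sketch that picks up the keyfile path from the environment variable above might look like the following — the profile name `ecommerce_analytics`, the GCP project ID, and the dataset name are placeholders, so match them to your own `dbt_project.yml` and GCP setup:

```yaml
# ~/.dbt/profiles.yml (sketch; names are placeholders)
ecommerce_analytics:
  target: dev
  outputs:
    dev:
      type: bigquery
      method: service-account
      # Resolves to the path exported in DBT_BIGQUERY_KEYFILE_PATH above
      keyfile: "{{ env_var('DBT_BIGQUERY_KEYFILE_PATH') }}"
      project: my-gcp-project    # your GCP project ID
      dataset: ecommerce         # target BigQuery dataset
      threads: 4
```

With this in place, `dbt debug` should confirm that dbt can reach BigQuery using the service account credentials.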

Reference
