An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
python airflow spark apache-spark scheduler s3 data-engineering data-lake warehouse redshift data-migration livy etl-framework apache-airflow emr-cluster etl-pipeline etl-job data-engineering-pipeline airflow-dag goodreads-data-pipeline
-
Updated
Mar 9, 2020 - Python