FANNG Stock Metrics Pipeline Project

Overview and Problem Statement

This project analyzes and visualizes FANNG (Facebook, Amazon, Apple, Netflix, Google) stock data using a data pipeline that processes historical stock data sourced from Kaggle. Apache Spark ingests the third-party data, applies initial processing, and loads it into a data lake. Transformation and metric calculation steps then run in dbt, orchestrated by Apache Airflow. These steps check data sanity and the accuracy of calculated metrics such as MACD and EMA20. The goal is a dashboard that presents these key financial metrics, showing stock performance trends to support informed investment decisions.
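
For readers unfamiliar with the metrics named above, the following is a minimal Python sketch of the standard textbook definitions of EMA and MACD. It is not the project's implementation (the pipeline computes its metrics in dbt). The function names, the seeding of the EMA with the first value, the MACD periods (12, 26, 9), and the sample prices are all assumptions made for illustration.

```python
from typing import List, Tuple


def ema(values: List[float], span: int) -> List[float]:
    """Exponential moving average with smoothing factor alpha = 2 / (span + 1).

    Assumption: the series is seeded with its first value, one common choice.
    """
    if span < 1:
        raise ValueError("span must be >= 1")
    alpha = 2.0 / (span + 1)
    out: List[float] = []
    for i, value in enumerate(values):
        if i == 0:
            out.append(value)
        else:
            # Blend the new value with the previous EMA value.
            out.append(alpha * value + (1.0 - alpha) * out[-1])
    return out


def macd(
    closes: List[float], fast: int = 12, slow: int = 26, signal: int = 9
) -> Tuple[List[float], List[float]]:
    """Return (macd_line, signal_line) for a series of closing prices.

    Assumption: the conventional 12/26/9 periods; the source does not say
    which periods the project uses.
    """
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    macd_line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(macd_line, signal)
    return macd_line, signal_line


# Hypothetical closing prices, for illustration only.
closes = [100.0, 101.5, 103.0, 102.0, 104.5]
ema20 = ema(closes, span=20)
macd_line, signal_line = macd(closes)
```

With the first value used as the seed, every output list has the same length as the input, which keeps the metric columns aligned with the price rows.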

Technologies Used

Cloud: Google Cloud Platform (GCP)
Data Ingestion: Apache Spark
Data Lake Storage: Google Cloud Storage (GCS)
Data Warehousing: BigQuery
ETL/ELT Process: dbt (data build tool)
Workflow Orchestration: Apache Airflow
Analytics and Visualization: Looker
Programming Languages: SQL, Python
Version Control: Git

Data Pipeline Diagram

This diagram illustrates the flow of data from source to visualization, showcasing how each technology is utilized within the pipeline.

Prerequisites

Before you begin setting up this project, ensure you have the following:

A Google Cloud account with billing enabled.
Access to Google Cloud services like BigQuery and Google Cloud Storage.
Apache Spark and Apache Airflow installed either locally or in a cloud environment.
Looker or another compatible visualization tool set up to connect to your BigQuery datasets.

Project Build & Setup

Follow the steps in Setup.md.

Dashboard

Here is the link.

Testing

Tests are defined on the dbt models. As further improvements, Airflow tests and a CI/CD process should be added.
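
As an illustration of what model-level data tests assert, here is a small Python sketch of two checks that mirror dbt's generic `not_null` and `unique` tests. The source does not say which tests the project defines, so the helper names, the column names (`ticker`, `date`, `close`), and the sample rows are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, object]


def check_not_null(rows: List[Row], column: str) -> List[int]:
    """Return the indexes of rows where `column` is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def check_unique(rows: List[Row], columns: Sequence[str]) -> List[Tuple[object, ...]]:
    """Return each key (a tuple of the given columns) that repeats an earlier row."""
    seen = set()
    duplicates: List[Tuple[object, ...]] = []
    for row in rows:
        key = tuple(row.get(c) for c in columns)
        if key in seen:
            duplicates.append(key)
        else:
            seen.add(key)
    return duplicates


# Hypothetical rows, for illustration only.
rows = [
    {"ticker": "AAPL", "date": "2024-01-02", "close": 185.0},
    {"ticker": "AAPL", "date": "2024-01-02", "close": 185.0},  # duplicate key
    {"ticker": "NFLX", "date": "2024-01-02", "close": None},   # missing close
]

null_rows = check_not_null(rows, "close")
duplicate_keys = check_unique(rows, ["ticker", "date"])
```

An empty result from either helper means the check passes, which matches how a dbt test passes when its query returns no failing rows.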

