World Economic Growth Forecast

The International Monetary Fund (IMF), since its inception, has played a pivotal role in stabilizing the international landscape of finance and economics. Its comprehensive databases and analytical prowess have been instrumental in aiding member countries to navigate through economic fluctuations and policy-making processes. In the contemporary globalized economy, the IMF's insights are more valuable than ever, providing a wealth of information that can be leveraged to predict future economic trends and potential market shifts.

Building on this foundation, our project aims to harness the vast repository of financial data curated by the IMF to develop robust models for forecasting Gross Domestic Product (GDP) growth across various economies. By employing advanced machine learning techniques and deep learning algorithms, we intend to analyze patterns within the data that could indicate future economic outcomes. This predictive model could serve as a crucial tool for economists, policymakers, and investors alike, offering a glimpse into the economic trajectory of nations in the ever-evolving financial landscape.

Data Source

World Economic Outlook 2024 Database

The dataset was obtained from the International Monetary Fund (IMF) website. The WEO database contains data on selected macroeconomic indicators for individual countries, regions, and the world as a whole, including national accounts, inflation, unemployment rates, balance of payments, and fiscal indicators.

Source: Access the dataset here

Data Card

  • Size: 58 columns, 8624 rows
  • File Format: *.xls
  • Data Format: Grouped by Subject then Country

Variables

Variable Name                   Role      Type          Description
WEO Country Code                ID        Integer       Unique code for each country
WEO Subject Code                ID        String        Code for the GDP parameter
Country                         Feature   String        Countries of the world (196)
Subject Descriptor              Feature   Categorical   Various factors affecting GDP
Units                           ID        String        Unit of the GDP factor
Scale                           ID        String        Scale of the units
Country/Series-specific notes   ID        String        Information about the source
Years (multiple columns)        Feature   Continuous    Years covered (1980-2029)
Estimate Start After            Feature   Integer       Last year of actual data; later values are IMF estimates
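
A quick way to sanity-check this layout is to load the file and melt the year columns into a long format. The sketch below assumes a local copy saved as WEO_Data.xls (the filename is illustrative):

import pandas as pd

# The WEO "xls" export is sometimes tab-separated text rather than a true
# Excel file; if pd.read_excel fails, try pd.read_csv with sep="\t".
df = pd.read_excel("WEO_Data.xls")

# Melt the 1980-2029 year columns into one row per country/subject/year.
id_cols = [c for c in ["WEO Country Code", "WEO Subject Code", "Country",
                       "Subject Descriptor", "Units", "Scale",
                       "Estimate Start After"] if c in df.columns]
year_cols = [c for c in df.columns if str(c).strip().isdigit()]
long_df = df.melt(id_vars=id_cols, value_vars=year_cols,
                  var_name="Year", value_name="Value")
long_df["Year"] = long_df["Year"].astype(int)
print(long_df.shape)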

Setup Instructions

Prerequisites

Ensure you have the following installed on your system:

  • Docker
  • Python 3.8 or later
  • Access to Google Cloud Platform (GCP)

Step-by-Step Setup

  1. Clone the Repository
git clone https://github.com/namansnghl/World-Econ-Growth-Forecast
cd World-Econ-Growth-Forecast
  2. Check Python version
python --version
  3. Install Python Dependencies
pip install -r requirements.txt
  4. Install Airflow
pip install "apache-airflow[celery]==2.9.1" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.9.1/constraints-3.8.txt"

Make sure to adjust the constraints URL above to match your Python version ("constraints-3.x").

  5. Check that you have enough memory to run Docker (at least 4 GB is recommended)
docker run --rm "debian:bullseye-slim" bash -c 'numfmt --to iec $(echo $(($(getconf _PHYS_PAGES) * $(getconf PAGE_SIZE))))'
  6. Download docker-compose.yaml
curl -LfO 'https://airflow.apache.org/docs/apache-airflow/2.9.1/docker-compose.yaml'
  7. Initialize the database (first time only)
docker compose up airflow-init
  8. Run Airflow with Docker
docker compose up
  9. Visit localhost:8080 and log in with the default credentials
user: airflow
password: airflow
  10. Run the DAG by clicking the play button on the right side of the window.
    Hide the example DAGs by setting load examples to false in docker-compose.yaml:
AIRFLOW__CORE__LOAD_EXAMPLES: 'false'
  11. DVC Setup
pip install dvc
  12. Initialize DVC
dvc init
  13. Add files to DVC
dvc add <file-path>
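
The project's DVC storage is backed by GCP (see Tools Used for MLOps below), so after dvc init you will typically also point DVC at a remote and push. A minimal sketch, with a placeholder bucket name (reading gs:// remotes requires the dvc[gs] extra):

dvc remote add -d gcs_remote gs://<your-bucket>/dvc-store
dvc push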

Tools Used for MLOps

  • GitHub

  • Jupyter Notebook

  • Docker

  • Airflow

  • DVC

  • MLflow

  • TensorFlow

  • Flask

  • Streamlit

  • Google Cloud Platform (GCP)

  1. GitHub: GitHub hosts the project's source code and documentation and manages issues and pull requests. The repository has three branches: main, test-stage, and dev. pytest is configured with GitHub Actions, which builds and tests on every push.

  2. Jupyter Notebook: We have used it to experiment with different data cleaning and feature engineering techniques, as well as to visualize initial model results.

  3. DVC (Data Version Control): Manages versioning of datasets and machine learning models, ensuring reproducibility and efficient data handling. DVC allows us to track data and model versions, making it easy to reproduce results and collaborate on data-intensive tasks without version conflicts. It is configured with GCP; the .dvc pointer files stored in Git hold hashes of the actual files, which live in cloud storage.
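
On the consuming side, DVC-tracked data can be read directly in Python via dvc.api, which resolves the .dvc pointer and streams the file from the configured remote. A minimal sketch; the path is hypothetical:

import dvc.api

# Open a DVC-tracked file from the remote configured for this repo.
# "data/WEO_Data.xls" is an illustrative path, not necessarily the real one.
with dvc.api.open("data/WEO_Data.xls", mode="rb") as f:
    raw = f.read()
print(len(raw), "bytes")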

  4. PyTest: For writing and running unit tests to ensure code quality and functionality of individual components.
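
For illustration, a self-contained test in the pytest style; the cleaning helper here is a toy stand-in, not the project's actual code:

# test_cleaning.py
import pandas as pd

def drop_empty_rows(df):
    # Toy stand-in for a project data-cleaning helper.
    return df.dropna(how="all")

def test_drop_empty_rows():
    df = pd.DataFrame({"gdp": [1.0, None], "country": ["A", None]})
    assert len(drop_empty_rows(df)) == 1

Running pytest from the repository root discovers and executes any test_*.py files like this one.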

  5. GitHub Actions: To automate workflows for continuous integration and continuous deployment (CI/CD), including running tests, building Docker images, and deploying models.

  6. Docker: Containerizes applications and their dependencies, ensuring consistency across different environments and simplifying deployment. The Dockerfile is hosted in the project root.

  7. Airflow: It manages the entire data pipeline, scheduling and monitoring tasks to ensure timely and reliable execution of the data processing and model training workflows.
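
As a sketch of what such a pipeline definition looks like (the DAG id and task are illustrative, not the project's actual DAG):

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_weo():
    # Placeholder for the real download/cleaning step.
    print("downloading WEO data...")

with DAG(
    dag_id="weo_pipeline_sketch",  # illustrative name
    start_date=datetime(2024, 1, 1),
    schedule=None,                 # trigger manually from the UI
    catchup=False,
) as dag:
    PythonOperator(task_id="extract", python_callable=extract_weo)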

  8. TensorFlow: It provides a comprehensive ecosystem for developing, training, and deploying machine learning models. TensorFlow is used to build and train the predictive models in this project, leveraging its powerful APIs and tools to handle complex data and modeling tasks.
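
A minimal sketch of the kind of model this enables; the layer sizes and input shape (10 time steps of 5 indicators) are illustrative, not the project's exact architecture:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(10, 5)),  # 10 years x 5 indicators
    tf.keras.layers.Dense(1),                       # next-period GDP growth
])
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.summary()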

  9. MLflow: Manages the machine learning lifecycle, including experimentation, reproducibility, and model deployment, along with tracking metrics and parameters.

    MLflow model monitoring: During training we tracked and saved hyperparameters such as the number of epochs and batch size, and key performance metrics such as loss, validation loss, mean absolute error (MAE), mean squared error (MSE), and R-squared (R²). MLflow also logged the trained LSTM model and its architecture as artifacts, so the model's performance can be reproduced and evaluated across runs.
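
The logging calls look roughly like this; all values shown are placeholders, not results from the project's runs:

import mlflow

with mlflow.start_run():
    mlflow.log_params({"epochs": 50, "batch_size": 32})         # placeholders
    mlflow.log_metrics({"loss": 0.12, "val_loss": 0.15,
                        "mae": 0.09, "mse": 0.02, "r2": 0.88})  # placeholders
    # With a trained Keras model in scope, the artifact is logged in one call:
    # mlflow.tensorflow.log_model(model, artifact_path="lstm_model")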


  10. FastAPI/Flask: Serves as the web framework for building RESTful APIs that expose the machine learning models as services for integration with other applications.
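
A minimal sketch of such a service in Flask; the route and dummy response are illustrative:

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()
    # A real handler would feed the payload to the trained model;
    # this returns a dummy forecast.
    return jsonify({"gdp_growth_forecast": 0.0, "received": payload})

if __name__ == "__main__":
    app.run(port=5000)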

  11. Logging: Implements logging mechanisms to track the performance, errors, and usage of the deployed models, providing insight into the models' behavior in production and helping to identify and troubleshoot issues quickly.
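
A minimal sketch using Python's standard logging module; the file name and messages are illustrative:

import logging

logging.basicConfig(
    filename="model_service.log",  # illustrative file name
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
logger = logging.getLogger("weo_forecast")

logger.info("model loaded")
logger.warning("input had %d missing values; imputed with column means", 3)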

  12. Looker: Used for business intelligence and data visualization, creating interactive dashboards and reports that provide insights into model performance and business impact.

  13. Streamlit: Creates interactive web applications for visualizing data, model predictions, and performance metrics, making the results easier to explore.
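
A minimal sketch of such an app; the countries and forecast values are placeholders:

import pandas as pd
import streamlit as st

st.title("World Economic Growth Forecast")
country = st.selectbox("Country", ["United States", "India", "Germany"])
years = list(range(2020, 2030))
forecast = pd.DataFrame({"Year": years, "GDP growth (%)": [2.0] * len(years)})
st.line_chart(forecast.set_index("Year"))
st.caption(f"Dummy series for {country}; a real app would query the trained model.")

Saved as app.py, this runs with: streamlit run app.py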

  14. Google Cloud Platform: Provides scalable cloud infrastructure for hosting, managing, and deploying the machine learning models and applications, ensuring high availability and performance for the deployed services.