A full-stack data analytics platform built on AWS, featuring automated data pipelines, interactive dashboards, and API integration, all orchestrated with Docker containers.
This project is a comprehensive data analytics platform that:
- Collects CO2 emissions reports from the EU-MRV system with Python scripts and stores them in AWS S3
- Processes raw data through an ETL pipeline using AWS Glue
- Provides data access through a REST API (AWS API Gateway)
- Visualizes data through an Apache Superset dashboard
- Features a React-based landing page
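The collection step in the list above could be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the key layout, the `raw/eu-mrv` prefix, the `.xlsx` extension, and the function names `report_key` and `store_report` are all assumptions. The only real API relied on is the boto3 S3 client method `put_object`, and the client is passed in so the function can be exercised without AWS credentials.

```python
import datetime as dt
from typing import Optional


def report_key(reporting_year: int, fetched_on: dt.date,
               prefix: str = "raw/eu-mrv") -> str:
    """Build a partitioned S3 object key for one yearly emissions report.

    The prefix and file name are hypothetical; adjust them to the real bucket layout.
    """
    return (f"{prefix}/year={reporting_year}/"
            f"fetched={fetched_on.isoformat()}/report.xlsx")


def store_report(s3_client, bucket: str, reporting_year: int, payload: bytes,
                 fetched_on: Optional[dt.date] = None) -> str:
    """Upload the raw report bytes and return the key they were written to."""
    key = report_key(reporting_year, fetched_on or dt.date.today())
    # put_object writes a single object; s3_client is expected to be boto3.client("s3").
    s3_client.put_object(Bucket=bucket, Key=key, Body=payload)
    return key
```

With boto3 installed, a call might look like `store_report(boto3.client("s3"), "my-bucket", 2023, data)`, where `my-bucket` and `data` are placeholders.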
- Frontend: React.js landing page
- Backend:
  - AWS API Gateway for REST endpoints
  - Python ETL scripts for data processing
  - Apache Superset for data visualization
- Infrastructure:
  - AWS EC2 for hosting
  - Docker containers for service orchestration
  - Nginx as reverse proxy
  - AWS S3 for data storage
  - AWS Route 53 for domain management
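To make the "Python ETL scripts for data processing" item more concrete, here is a small sketch of a cleaning step using only the standard library. The column names `imo_number` and `total_co2_tonnes` and the function `clean_rows` are hypothetical and do not reflect the real EU-MRV schema or the project's Glue job; the sketch only shows the general shape of a parse, validate, and cast pass.

```python
import csv
import io


def clean_rows(raw_csv: str) -> list:
    """Parse raw CSV text, drop incomplete rows, and cast emissions to float."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        # Column names below are assumptions for illustration only.
        imo = (row.get("imo_number") or "").strip()
        emissions = (row.get("total_co2_tonnes") or "").strip()
        if not imo or not emissions:
            continue  # skip rows missing a required field
        try:
            value = float(emissions)
        except ValueError:
            continue  # skip rows whose emissions value is not numeric
        cleaned.append({"imo_number": imo, "total_co2_tonnes": value})
    return cleaned
```

Rows that fail validation are silently dropped here to keep the example short; a real pipeline would probably log or quarantine them instead.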
- Cloud & Infrastructure:
  - AWS (EC2, S3, API Gateway, Route 53)
  - Docker & Container Orchestration
  - Nginx
- Backend & Data:
  - Python
  - Apache Superset
  - ETL Pipeline
  - API Development
- Frontend:
  - React.js
  - HTML/CSS
  - JavaScript
- AWS Account
- Docker and Docker Compose
- Node.js
- Python 3.x
- Clone the repository:
git clone [repository-url]
cd [repository-name]
- Start the frontend application:
cd frontend
npm install
npm start
- Run the Docker containers:
docker-compose up -d
- Initialize Superset (first time only):
docker-compose -f superset-docker-compose.yml exec superset superset-init
- Configure AWS services:
  - Set up EC2 instance
  - Configure S3 bucket
  - Set up API Gateway
  - Configure Route 53 for domain management
- Deploy the application:
docker-compose -f docker-compose.prod.yml up -d
- Set up SSL certificates:
sudo certbot --nginx -d yourdomain.com -d www.yourdomain.com -d dashboard.yourdomain.com
project/
├── docker-compose.yml # Main application composition
├── superset-docker-compose.yml # Superset setup
├── frontend/ # React application
│ ├── public/
│ ├── src/
│ └── Dockerfile
├── backend/ # Backend
│ ├── src/
│ ├── app/
│ ├── Dockerfile
│ └── compose.yml
├── nginx/ # Nginx configuration
│ └── conf.d/
└── superset/ # Superset configuration
  └── superset_config.py
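For orientation, a `superset_config.py` for a setup like this one might look roughly like the sketch below. The setting names (`SECRET_KEY`, `SQLALCHEMY_DATABASE_URI`, `ENABLE_PROXY_FIX`, `ROW_LIMIT`) are standard Superset configuration keys, but the environment variable names, default values, and database path are assumptions and not taken from this repository.

```python
import os

# Superset loads this module when the SUPERSET_CONFIG_PATH environment
# variable points at it (or when it is importable on PYTHONPATH).

# Secret used to sign session cookies; the env var name and the fallback
# value are placeholders and must be replaced for any real deployment.
SECRET_KEY = os.environ.get("SUPERSET_SECRET_KEY", "change-me-before-deploying")

# Metadata database for Superset itself (hypothetical default path).
SQLALCHEMY_DATABASE_URI = os.environ.get(
    "SUPERSET_DATABASE_URI", "sqlite:////app/superset_home/superset.db"
)

# Trust X-Forwarded-* headers, which matters when Nginx sits in front as a reverse proxy.
ENABLE_PROXY_FIX = True

# Default cap on rows returned to charts.
ROW_LIMIT = 5000
```

Reading secrets from the environment keeps them out of version control, which fits a Docker Compose setup where values can be supplied per environment.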
- Move containers to AWS ECS
- Automate infrastructure creation and deletion with Terraform
- Set up monitoring and alerting with CloudWatch
- Add data quality checks
- Implement error handling and retry mechanisms
- Implement an automated testing suite with pytest
- Create an automated data backup system
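As one possible starting point for the planned error handling and retry mechanisms, the sketch below wraps a call in retries with exponential backoff. It is not part of the current codebase; the helper name `with_retries` and its parameters are hypothetical, and the `sleep` argument is injectable only so the helper can be tested without real delays.

```python
import time


def with_retries(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on an exception, wait and retry with exponential backoff.

    Delays grow as base_delay, 2 * base_delay, 4 * base_delay, and so on.
    The last exception is re-raised once all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
```

A pipeline step could then be wrapped as `with_retries(lambda: fetch_report(2023))`, where `fetch_report` stands in for any flaky network call. Catching a narrower exception type than `Exception` would be a sensible refinement.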
This project is licensed under the terms of the MIT license.
For any inquiries, please email [email protected].