SergioAcostaTer/cloud-computing-aws

Cloud Computing - AWS Projects

AWS Python Node.js License: MIT

Cloud Computing coursework, Universidad de Las Palmas de Gran Canaria
Two production-grade AWS projects demonstrating streaming data pipelines and serverless application architectures


🌟 Overview

This repository contains two comprehensive cloud computing projects built on AWS, demonstrating expertise in:

  • Streaming data architectures with Kinesis, Firehose, and Glue
  • Serverless applications with Lambda and API Gateway
  • Infrastructure as Code using CloudFormation
  • Containerization with Docker and ECS Fargate
  • Real-time data processing and ETL workflows
  • Cost optimization strategies (87% savings demonstrated)

Both projects are production-ready, fully documented, and deployable via automated scripts.


Projects

1. ETL Data Science Pipeline

Energy Consumption Analytics: real-time streaming data pipeline for energy monitoring and analysis.

🎯 Features

  • Real-time ingestion via Amazon Kinesis Data Streams
  • Automated ETL with AWS Glue jobs (daily & monthly aggregations)
  • Dynamic partitioning using Lambda transformations
  • Data cataloging with Glue Crawler
  • Scalable storage in S3 with raw/processed separation

🏗️ Architecture

Custom Data (JSON) → Kinesis Stream → Firehose (+ Lambda) → S3 (raw)
                                                                ↓
                                           Glue Crawler → Data Catalog
                                                                ↓
                                           Glue ETL Jobs → S3 (processed)
                                                                ↓
                                           Final Crawler → Athena-ready tables

🚀 Quick Deploy

# Configure AWS credentials
aws configure

# Deploy entire pipeline (automated)
cd etl-data-science/scripts
.\00_run_all.ps1

📊 Key Metrics

  • Processing time: ~5 minutes for 1000 records
  • Cost: ~$15-25/month (varies with data volume)
  • Scalability: Handles 1000+ records/second

→ Full Documentation


2. Bitcoin Positions Tracker

Production-grade cryptocurrency portfolio manager: full-stack application with dual AWS architectures.

🎯 Features

  • Real-time Bitcoin prices via Binance WebSocket
  • CRUD API for position management
  • Dual deployment options: ECS Fargate vs. Lambda (serverless)
  • Auto-generated API docs (OpenAPI/Swagger)
  • Modern web interface with live P&L calculations
  • API Gateway authentication with usage plans

🏗️ Architectures

Option A: ECS Fargate (Containerized)

Browser → API Gateway → VPC Link → NLB → ECS Fargate → DynamoDB
         (with API key)                    (Express.js)

Option B: Lambda (Serverless) ⭐ Recommended

Browser → API Gateway → Lambda Functions (5) → DynamoDB
         (with API key)

🚀 Quick Deploy

Serverless (5 minutes)

# 1. Package Lambda code (PowerShell)
cd full-app-deployment/backend/lambda/lambdas
npm install --omit=dev
cd ..
Compress-Archive -Path lambdas/* -DestinationPath lambda-code.zip

# 2. Upload to S3 (bucket names are globally unique, so choose your own)
aws s3 mb s3://bitcoin-lambda-deploy
aws s3 cp lambda-code.zip s3://bitcoin-lambda-deploy/

# 3. Deploy stack (a trailing backtick continues the line in PowerShell)
aws cloudformation deploy `
  --template-file deploy.yml `
  --stack-name bitcoin-tracker-lambda `
  --parameter-overrides LambdaCodeBucket=bitcoin-lambda-deploy `
  --capabilities CAPABILITY_NAMED_IAM

ECS Fargate (15 minutes)

cd full-app-deployment/backend/ecs
make build && make push  # Build and push Docker image
aws cloudformation deploy \
  --template-file deploy.yml \
  --stack-name bitcoin-tracker-ecs \
  --capabilities CAPABILITY_NAMED_IAM

📸 Screenshots

  • Live Trading Dashboard (screenshot)
  • Interactive API Docs (screenshot)

→ Full Documentation


🏛️ Architecture Highlights

ETL Pipeline Architecture

  • Decoupled design: Producer → Stream → Processor → Storage
  • Automated orchestration: Sequential Glue job execution
  • Cost-optimized: Pay-per-use with Firehose buffering
  • Scalable: Auto-sharding in Kinesis, parallel Glue jobs

Application Architecture Comparison

| Aspect | ECS Fargate | Lambda (Serverless) |
|---|---|---|
| Deployment | Container orchestration | Individual functions |
| Scaling | Manual task count | Automatic (0→1000) |
| Cold start | Always warm | ~200ms first request |
| Cost | $57/month | $7.53/month (⬇️ 87%) |
| Complexity | VPC, NLB, ECS service | API Gateway + Lambda |
| Best for | Traditional apps | Event-driven workloads |

🛠️ Technologies Used

ETL Data Science

  • Streaming: Amazon Kinesis Data Streams, Kinesis Firehose
  • Processing: AWS Lambda (Python 3.9), AWS Glue (PySpark)
  • Storage: Amazon S3 (partitioned), Glue Data Catalog
  • Analytics: Amazon Athena (queryable tables)
  • IaC: PowerShell automation scripts

Bitcoin Tracker

  • Backend: Node.js 18, Express.js, AWS SDK v3
  • Compute: AWS Lambda / ECS Fargate
  • API: API Gateway (REST), OpenAPI 3.0
  • Database: DynamoDB (serverless NoSQL)
  • Networking: VPC, VPC Endpoints, Network Load Balancer
  • Frontend: Vanilla JavaScript, Binance WebSocket API
  • IaC: CloudFormation (1300+ lines YAML)

⚡ Quick Start

Prerequisites

AWS CLI installed and configured:

aws configure

For the ETL project:

  • PowerShell 5.1+
  • Python 3.9+

For the Bitcoin tracker:

  • Node.js 18+
  • Docker (for ECS deployment)

Clone Repository

git clone https://github.com/SergioAcostaTer/cloud-computing-aws.git
cd cloud-computing-aws

Deploy Projects

ETL Pipeline

cd etl-data-science/scripts
.\00_run_all.ps1  # Full automated deployment

Bitcoin Tracker (Serverless)

cd full-app-deployment/backend/lambda
./deploy.ps1  # PowerShell script handles everything

📖 Project Details

ETL Data Science Pipeline

Key Components

  1. Producer (kinesis.py): Reads custom JSON data and streams to Kinesis
  2. Lambda Processor: Adds partition keys for dynamic S3 organization
  3. Glue Crawler: Catalogs raw data schema
  4. Glue ETL Jobs: Daily and monthly aggregations
  5. Final Crawler: Makes processed data queryable in Athena
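The Lambda Processor step (component 2) can be sketched as a Firehose transformation handler. This is a minimal illustration, not the repository's actual function: it assumes the partition key is named processing_date (taken from the S3 layout in this README) and that the date is derived from the record's arrival timestamp, falling back to the current time.

```python
import base64
import json
from datetime import datetime, timezone


def lambda_handler(event, context):
    """Firehose transformation sketch: tag each record with a processing_date partition key."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
        except ValueError:
            # Undecodable input: return it unchanged and flag it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue

        # Assumption: partition by arrival time (epoch milliseconds), else "now".
        arrival_ms = record.get("approximateArrivalTimestamp")
        moment = (
            datetime.fromtimestamp(arrival_ms / 1000, tz=timezone.utc)
            if arrival_ms is not None
            else datetime.now(timezone.utc)
        )

        # Newline-delimit the JSON so downstream readers can split records.
        data = (json.dumps(payload) + "\n").encode("utf-8")
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(data).decode("ascii"),
            "metadata": {
                "partitionKeys": {"processing_date": moment.strftime("%Y-%m-%d")}
            },
        })
    return {"records": output}
```

Firehose reads the values under metadata.partitionKeys when dynamic partitioning is enabled, so the S3 prefix expression on the delivery stream would need to reference the same key name.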

Custom Data Format

{
  "devices": [
    {
      "type": "HVAC System",
      "id": "IOT-HEAT-01",
      "data": {
        "label": "Living Room Heater",
        "readings": [
          {
            "timestamp": "2025-12-12T13:00:00",
            "value": 2903.6,
            "percentage": 0.73,
            "voltage_v": 235.3,
            "current_a": 12.34,
            "temperature_c": 21.4,
            "status": "active"
          }
        ]
      }
    }
  ]
}
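Because the readings are nested two levels deep, a producer would typically flatten the document into one record per reading before streaming it. The sketch below shows one way to do that; the output field names device_id, device_type, and label are hypothetical and are not taken from kinesis.py.

```python
from typing import Any, Dict, Iterator


def flatten_readings(payload: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one flat record per reading, carrying the parent device fields along."""
    for device in payload.get("devices", []):
        data = device.get("data", {})
        for reading in data.get("readings", []):
            yield {
                "device_id": device.get("id"),      # hypothetical output name
                "device_type": device.get("type"),  # hypothetical output name
                "label": data.get("label"),
                **reading,  # timestamp, value, voltage_v, status, ...
            }
```

Each yielded dictionary can then be serialized with json.dumps and sent as a single Kinesis record.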

S3 Structure

s3://bucket/
├── raw/energy_consumption/processing_date=YYYY-MM-DD/
├── processed/
│   ├── daily/fecha_reporte=YYYY-MM-DD/
│   └── monthly/fecha_mes=YYYY-MM/
├── scripts/
├── config/
└── errors/
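To illustrate how the processed prefixes relate to the data, here is a plain-Python sketch of a daily aggregation and the partition paths it would write to. The real jobs are PySpark Glue scripts and may aggregate differently; summing value per device and day, and the device_id field, are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def daily_totals(readings: Iterable[dict]) -> Dict[Tuple[str, str], float]:
    """Sum `value` per (device_id, day); the day is the date part of the ISO timestamp."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for reading in readings:
        day = reading["timestamp"][:10]  # "2025-12-12T13:00:00" -> "2025-12-12"
        totals[(reading["device_id"], day)] += reading["value"]
    return dict(totals)


def daily_prefix(day: str) -> str:
    """Partition prefix for a daily report, following the layout above."""
    return f"processed/daily/fecha_reporte={day}/"


def monthly_prefix(day: str) -> str:
    """Partition prefix for the month that contains `day`."""
    return f"processed/monthly/fecha_mes={day[:7]}/"
```

Hive-style key=value prefixes like these let the final crawler register fecha_reporte and fecha_mes as partition columns, which Athena can then use to prune scans.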

Bitcoin Positions Tracker

API Endpoints

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | / | ❌ | API information |
| GET | /health | ❌ | Health check |
| GET | /openapi.json | ❌ | Swagger spec |
| POST | /positions | ✅ | Create position |
| GET | /positions | ✅ | List all positions |
| GET | /positions/{id} | ✅ | Get position by ID |
| PUT | /positions/{id} | ✅ | Update position |
| DELETE | /positions/{id} | ✅ | Delete position |

Authentication: x-api-key header (managed via API Gateway Usage Plans)
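As a rough illustration of calling a protected endpoint, the sketch below builds a request that carries the x-api-key header using only the Python standard library. The base URL and key are placeholders, and the helper name build_request is hypothetical; the real invoke URL comes from the deployed API Gateway stage.

```python
import json
import urllib.request
from typing import Optional


def build_request(base_url: str, path: str, api_key: str,
                  method: str = "GET",
                  body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) a request with the API Gateway key header attached."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"x-api-key": api_key}
    if data is not None:
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Usage (placeholder URL and key, shown as comments so nothing is sent):
# req = build_request("https://example.invalid/prod", "/positions", "YOUR_API_KEY")
# with urllib.request.urlopen(req) as resp:
#     positions = json.load(resp)
```

Requests to protected routes that omit the header are rejected by API Gateway before they reach the backend.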

Lambda Functions (Serverless Architecture)

  1. CRUD Operations (crud.js): POST, PUT, DELETE
  2. Read Operations (read.js): GET all, GET by ID
  3. OpenAPI Docs (openapi.js): Swagger specification
  4. Health Check (health.js): Service status
  5. Root Handler (root.js): Landing page

Frontend Features

  • Real-time prices: WebSocket connection to Binance
  • Live P&L: Automatic profit/loss calculation
  • Responsive design: Mobile-first, Binance-inspired UI
  • Toast notifications: User feedback system
  • Dark theme: Professional trading interface
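The live P&L figure is, in essence, the price difference times the position size. The sketch below shows that arithmetic for a long position; the parameter names are hypothetical, and the frontend itself is written in JavaScript, so this is only an illustration of the calculation.

```python
from typing import Tuple


def unrealized_pnl(entry_price: float, quantity: float,
                   current_price: float) -> Tuple[float, float]:
    """Return (absolute P&L, percentage P&L) for a long position."""
    if entry_price <= 0:
        raise ValueError("entry_price must be positive")
    pnl = (current_price - entry_price) * quantity
    pct = (current_price - entry_price) / entry_price * 100
    return pnl, pct
```

Recomputing this on every WebSocket price tick keeps the displayed value current without any backend call.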

💰 Cost Analysis

ETL Pipeline (Monthly Estimates)

| Service | Usage | Cost |
|---|---|---|
| Kinesis Stream | 1M records/day, 1 shard | $11.00 |
| Firehose | 1M records/day | $2.50 |
| S3 Storage | 100GB | $2.30 |
| Glue Crawler | 2 runs/day | $0.88 |
| Glue ETL Jobs | 4 runs/day, 2 DPUs | $9.60 |
| Total | | ~$26.28 |

Actual production cost depends on data volume and run frequency.

Bitcoin Tracker (Monthly Estimates)

| Component | ECS Fargate | Lambda | Savings |
|---|---|---|---|
| Compute | $44.16 | $0.53 | 99% ⬇️ |
| API Gateway | $7.50 | $7.00 | 7% ⬇️ |
| DynamoDB | $5.00 | $5.00 | - |
| Total | $57/mo | $7.53/mo | 87% ⬇️ |

Annual savings: $593.64 (Lambda vs. ECS)

Assumes: 1M requests/month, 100ms avg duration, 256MB memory
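The headline savings figures can be reproduced from the stated monthly totals with a few lines of arithmetic. The sketch takes the Total row ($57 and $7.53) as its input rather than re-adding the component rows, which as listed sum to $12.53 for Lambda.

```python
def savings(baseline_monthly: float, alternative_monthly: float) -> dict:
    """Monthly, annual, and percentage savings of the alternative over the baseline."""
    monthly = baseline_monthly - alternative_monthly
    return {
        "monthly": round(monthly, 2),
        "annual": round(monthly * 12, 2),
        "percent": round(monthly / baseline_monthly * 100, 1),
    }


result = savings(57.00, 7.53)  # ECS Fargate total vs. Lambda total
```

With these inputs the monthly difference is $49.47, which gives $593.64 per year and about 86.8%, rounded to 87% in the table.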


🎓 Academic Context

Práctica 7 (Assignment 7) - ETL Data Science

Requirements Met:

  • ✅ S3 bucket with proper folder structure
  • ✅ Kinesis producer with custom data format (not default examples)
  • ✅ Firehose consumer with Lambda transformation
  • ✅ Glue crawler, data catalog, and ETL jobs
  • ✅ Complete documentation with diagrams and cost analysis

Grade: Pending evaluation

Práctica Entregable (Deliverable Assignment) - Full App Deployment

Requirements Met:

  • ✅ DynamoDB table with CRUD operations
  • ✅ API Gateway with endpoint protection
  • ✅ Dual architecture (coupled ECS + decoupled Lambda)
  • ✅ Auto-generated API documentation (OpenAPI)
  • ✅ Frontend interface for testing
  • ✅ CloudFormation for automated deployment
  • ✅ Detailed cost comparison and justification

Grade: Pending evaluation


🚀 Advanced Features

ETL Pipeline

  • Sequential job execution to prevent race conditions
  • Dynamic partitioning for query optimization
  • Error handling with dedicated S3 error prefix
  • Automated cleanup script (99_cleanup.ps1)

Bitcoin Tracker

  • CORS support for cross-origin requests
  • Rate limiting (50 req/s, burst 100)
  • VPC Endpoints for private ECR/S3 access (ECS version)
  • Health monitoring with uptime tracking
  • Reconnection logic for WebSocket resilience
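The rate limit above (50 req/s, burst 100) follows the token bucket model that API Gateway's documentation describes for throttling. The sketch below is a local simulation of that model to show how the two numbers interact; it is not how the limit is configured, which happens in the usage plan. The class and method names are hypothetical.

```python
class TokenBucket:
    """Token bucket: `rate` tokens are added per second, up to `burst` tokens."""

    def __init__(self, rate: float, burst: int, now: float = 0.0) -> None:
        self.rate = rate
        self.capacity = float(burst)
        self.tokens = float(burst)  # start full, so an initial burst is allowed
        self.updated = now

    def allow(self, now: float) -> bool:
        """Refill based on elapsed time, then spend one token if one is available."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate=50, burst=100)
accepted_at_start = sum(bucket.allow(now=0.0) for _ in range(150))
accepted_one_second_later = sum(bucket.allow(now=1.0) for _ in range(60))
```

In this simulation 100 of the first 150 simultaneous requests pass (the burst), and one second later only 50 more tokens have been refilled, so 50 of the next 60 pass.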

Documentation

Each project includes comprehensive documentation:

  • Architecture diagrams (system flow, AWS service interactions)
  • Deployment guides (step-by-step with screenshots)
  • API documentation (auto-generated OpenAPI specs)
  • Code comments (explaining key logic)
  • Cost breakdowns (monthly and annual projections)


🧹 Cleanup

ETL Pipeline

cd etl-data-science/scripts
.\99_cleanup.ps1

Bitcoin Tracker

# Serverless
aws cloudformation delete-stack --stack-name bitcoin-tracker-lambda

# ECS
aws cloudformation delete-stack --stack-name bitcoin-tracker-ecs

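⚠️ Important: Always delete the CloudFormation stacks when you are done to avoid ongoing charges. The deployment bucket created manually with aws s3 mb is not part of either stack and must be emptied and removed separately.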


🤝 Contributing

This repository is primarily for academic purposes. However, suggestions and improvements are welcome:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/improvement)
  3. Commit changes (git commit -m 'Add improvement')
  4. Push to branch (git push origin feature/improvement)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


👨‍💻 Author

Sergio Acosta Quintana
Computer Engineering Student @ ULPGC
Cloud Computing Coursework, 2025


🙏 Acknowledgments

  • Universidad de Las Palmas de Gran Canaria: Cloud Computing course
  • AWS: free tier resources and comprehensive documentation
  • Binance: WebSocket API for real-time cryptocurrency data
  • Open Source Community: the amazing tools and libraries

⭐ Star this repository if you found it useful!

Built with ☁️ on AWS | Las Palmas de Gran Canaria, Canary Islands 🇮🇨
