@r4770 r4770 commented Jun 15, 2025

🎯 Overview

This PR adds production-ready infrastructure components to the DataViz service, including health checks, metrics, structured logging, and a complete Helm chart for Kubernetes deployment.

📋 Summary of Changes

🏥 Health & Monitoring Infrastructure

  • Health Check Endpoints: Added /api/health, /api/ready, /api/startup for Kubernetes probes
  • Prometheus Metrics: Application metrics at /api/metrics (uptime, memory, database status, user/chart counts)
  • Structured JSON Logging: Centralized logger with configurable levels for production monitoring
  • Database Health Monitoring: Connection testing with timeout handling
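
The "connection testing with timeout handling" item could look roughly like the sketch below. It is illustrative only: `withTimeout`, `checkDatabase`, the `DbHealth` shape, and the 2000 ms default are assumptions, not names or values taken from routes/health.ts.

```typescript
// Hypothetical result shape; the PR's actual type may differ.
type DbHealth = { status: "up" | "down"; latencyMs: number; error?: string };

// Rejects if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// `ping` is a hypothetical callback, e.g. one that runs `SELECT 1` on the pool.
async function checkDatabase(
  ping: () => Promise<unknown>,
  timeoutMs = 2000, // assumed default, not from the PR
): Promise<DbHealth> {
  const started = Date.now();
  try {
    await withTimeout(ping(), timeoutMs);
    return { status: "up", latencyMs: Date.now() - started };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { status: "down", latencyMs: Date.now() - started, error: message };
  }
}
```

Returning a value instead of throwing keeps the route handler simple: a hung connection becomes a "down" result after the timeout rather than a request that never completes.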

🐳 Container & Orchestration

  • Complete Helm Chart: Production-ready Kubernetes deployment with multi-environment support
  • Docker Compose: Local development stack for testing infrastructure changes without the devcontainer
  • Auto-Scaling: HPA with CPU/memory thresholds and custom scaling behavior
  • Security: Network policies, pod security contexts, secret management

🛠️ Development Experience

  • Enhanced Local Setup: Docker Compose with PostgreSQL, Redis, and PgAdmin for testing
  • Development Container: Hot-reload Docker setup with proper health checks
  • Environment Parity: Same monitoring and health checks from dev to production

🗂️ Key Files Added

Infrastructure Components

routes/health.ts             # Health check endpoints for K8s probes
lib/logger.ts                # Structured logging system
docker-compose.yaml          # Local testing environment
Dockerfile.dev               # Development container with hot reload

Kubernetes Deployment

charts/dataviz-srv/          # Complete Helm chart
├── Chart.yaml               # Chart metadata with PostgreSQL dependency
├── values.yaml              # Default values (production-oriented)
├── values-dev.yaml          # Development overrides
├── values-production.yaml   # Production settings with external DB
├── values-external-db.yaml  # Azure PostgreSQL configuration
└── templates/               # K8s manifests (deployment, service, ingress, HPA, etc.)

🔧 Enhanced Features

Health Checks for Kubernetes

  • Liveness Probe: /api/health - basic service health
  • Readiness Probe: /api/ready - database connectivity check
  • Startup Probe: /api/startup - application initialization verification
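
As a rough sketch of how the three probes differ, the function below maps each one to a status code. The kubelet treats any HTTP status from 200 to 399 as success, so 200 and 503 are a conventional pair; `evaluateProbe`, `AppState`, and the response body are hypothetical and not taken from the PR.

```typescript
type ProbeKind = "health" | "ready" | "startup";

// Hypothetical application state tracked by the service.
type AppState = { initialized: boolean; databaseUp: boolean };

type ProbeResult = {
  statusCode: 200 | 503;
  body: { status: "ok" | "unavailable"; check: ProbeKind };
};

function evaluateProbe(kind: ProbeKind, state: AppState): ProbeResult {
  let passing: boolean;
  if (kind === "health") {
    // Liveness: being able to answer at all means the process is alive.
    passing = true;
  } else if (kind === "ready") {
    // Readiness: only accept traffic while the database is reachable.
    passing = state.databaseUp;
  } else {
    // Startup: passes once initialization has finished.
    passing = state.initialized;
  }
  return passing
    ? { statusCode: 200, body: { status: "ok", check: kind } }
    : { statusCode: 503, body: { status: "unavailable", check: kind } };
}
```

Keeping the database check out of the liveness path is a deliberate choice in this sketch: a database outage then removes the pod from service endpoints without triggering container restarts.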

Production Monitoring

  • Prometheus Integration: ServiceMonitor for metrics collection
  • Structured Logs: JSON output with request tracking and error correlation
  • Resource Monitoring: Memory usage, database status, application metrics
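
For reference, Prometheus scrapes a plain-text format made of `# HELP`, `# TYPE`, and sample lines. The sketch below renders that format by hand; the `Metric` type, `renderMetrics`, and the metric names in the usage note are assumptions, and the PR may well use a client library instead.

```typescript
type Metric = {
  name: string;
  help: string;
  type: "gauge" | "counter";
  value: number;
};

// Renders metrics in the Prometheus text exposition format.
function renderMetrics(metrics: Metric[]): string {
  const lines: string[] = [];
  for (const m of metrics) {
    lines.push(`# HELP ${m.name} ${m.help}`);
    lines.push(`# TYPE ${m.name} ${m.type}`);
    lines.push(`${m.name} ${m.value}`);
  }
  // The format expects a trailing newline after the last sample.
  return lines.join("\n") + "\n";
}
```

A handler for /api/metrics could pass in values such as a hypothetical `dataviz_uptime_seconds` gauge or a `dataviz_db_up` gauge set to 1 or 0, and return the string with a `text/plain` content type.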

Multi-Environment Support

  • Development: Single replica, debug logging, internal PostgreSQL
  • Production: Multi-replica, external database, auto-scaling, security policies
  • Flexible Database: Internal PostgreSQL or external managed databases (Azure, AWS RDS)

🚀 Local Testing

The Docker Compose setup allows testing all infrastructure components locally:

# Start full stack with monitoring
docker-compose up --build

# Test health endpoints
curl http://localhost:3003/api/health
curl http://localhost:3003/api/ready
curl http://localhost:3003/api/metrics

# Access PgAdmin at http://localhost:5050
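
The same checks can be scripted instead of run by hand. The sketch below is one possible shape, with the fetch function injected so it can be exercised without a running stack; `checkEndpoints`, `FetchLike`, and `EndpointResult` are hypothetical names.

```typescript
// Minimal response shape; the global `fetch` in Node 18+ satisfies it.
type FetchLike = (url: string) => Promise<{ status: number }>;

type EndpointResult = { path: string; ok: boolean; status: number | null };

// Requests each path in turn and records whether it answered with 2xx/3xx.
async function checkEndpoints(
  baseUrl: string,
  paths: string[],
  fetchFn: FetchLike,
): Promise<EndpointResult[]> {
  const results: EndpointResult[] = [];
  for (const path of paths) {
    try {
      const res = await fetchFn(baseUrl + path);
      results.push({ path, ok: res.status >= 200 && res.status < 400, status: res.status });
    } catch (_err) {
      // Connection refused, DNS failure, and similar errors land here.
      results.push({ path, ok: false, status: null });
    }
  }
  return results;
}
```

Against the Compose stack this would be called as `checkEndpoints("http://localhost:3003", ["/api/health", "/api/ready", "/api/metrics"], fetch)`.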

📊 Kubernetes Deployment

# Development environment
helm upgrade --install dataviz-dev ./charts/dataviz-srv \
  --namespace dataviz-dev \
  --values charts/dataviz-srv/values-dev.yaml

# Production deployment
helm upgrade --install dataviz-prod ./charts/dataviz-srv \
  --namespace dataviz \
  --values charts/dataviz-srv/values-production.yaml

🛡️ Production Features

  • Auto-scaling: 2-10 replicas based on CPU/memory usage
  • Security: Non-root containers, network policies, secret management
  • Reliability: Pod disruption budgets, health checks, graceful shutdown
  • Monitoring: Prometheus metrics, structured logging, database health
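
To make the auto-scaling range concrete: the Horizontal Pod Autoscaler computes ceil(currentReplicas × currentMetric / targetMetric) for each metric, takes the largest result, and clamps it to the configured bounds. The sketch below applies that rule with the 2-10 range from this PR. The target utilization values are assumptions, and the real controller also applies a tolerance and the configured scaling behavior, which are omitted here.

```typescript
// Average utilization in percent, e.g. { cpu: 90, memory: 40 }.
type Utilization = { cpu: number; memory: number };

function desiredReplicas(
  current: number,
  usage: Utilization,
  target: Utilization, // assumed targets; not taken from the chart
  min = 2,
  max = 10,
): number {
  const byCpu = Math.ceil(current * (usage.cpu / target.cpu));
  const byMemory = Math.ceil(current * (usage.memory / target.memory));
  // With several metrics the HPA uses the largest proposal, then clamps it.
  return Math.min(max, Math.max(min, byCpu, byMemory));
}
```

For example, with 4 replicas, 90% CPU against an assumed 70% target, and 40% memory against an assumed 80% target, the CPU proposal is ceil(5.14) = 6 and the memory proposal is 2, so the result is 6 replicas.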

🔄 Backward Compatibility

All existing APIs and functionality remain unchanged. The additions are:

  • New health/metrics endpoints
  • Enhanced logging (configurable, defaults to existing behavior)
  • Optional Docker Compose for local development
  • Kubernetes deployment capability
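
A level-filtered JSON logger of the kind described could be sketched as follows. This is not the code in lib/logger.ts: `parseLevel`, `createLogger`, the field names, and the `info` fallback are all assumptions for illustration.

```typescript
const LEVELS = { debug: 10, info: 20, warn: 30, error: 40 } as const;
type Level = keyof typeof LEVELS;

// Maps a raw setting (e.g. the LOG_LEVEL variable) to a known level.
function parseLevel(raw: string | undefined, fallback: Level = "info"): Level {
  const value = (raw ?? "").toLowerCase();
  return Object.prototype.hasOwnProperty.call(LEVELS, value) ? (value as Level) : fallback;
}

// `write` receives one JSON document per log entry; defaults to stdout.
function createLogger(minLevel: Level, write: (line: string) => void = console.log) {
  const log = (level: Level, msg: string, fields: Record<string, unknown> = {}): void => {
    if (LEVELS[level] < LEVELS[minLevel]) return; // below threshold: drop
    write(JSON.stringify({ time: new Date().toISOString(), level, msg, ...fields }));
  };
  return {
    debug: (msg: string, fields?: Record<string, unknown>) => log("debug", msg, fields),
    info: (msg: string, fields?: Record<string, unknown>) => log("info", msg, fields),
    warn: (msg: string, fields?: Record<string, unknown>) => log("warn", msg, fields),
    error: (msg: string, fields?: Record<string, unknown>) => log("error", msg, fields),
  };
}
```

In a Node service the threshold would typically come from `parseLevel(process.env.LOG_LEVEL)`, and extra fields such as a request ID can be passed as the second argument to correlate entries.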

✅ Benefits

  1. Kubernetes Ready: Complete production deployment with best practices
  2. Local Testing: Docker Compose removes the devcontainer dependency for infrastructure testing
  3. Production Monitoring: Health checks, metrics, and structured logging
  4. Scalable: Auto-scaling and resource management
  5. Maintainable: Comprehensive documentation and testing infrastructure

This PR enables production deployment while providing better local development tools for testing the new infrastructure components.

r4770 added 4 commits June 15, 2025 00:56
- Add health check routes (/api/health, /api/ready, /api/metrics)
- Add Docker Compose for local development
- Add Dockerfile.dev for development builds
- Update email service with development mode
- Fix middleware order for health endpoints
- Update .gitignore and .dockerignore
- Add centralized logger with JSON output
- Fix duplicate 404 logging issue
- Clean error responses without stack traces
- Add request logging middleware
- Support LOG_LEVEL environment variable
- Add production-ready Helm chart with configurable options
- Support internal PostgreSQL or external Azure Database
- Include health checks, security contexts, and network policies
- Add HPA with custom scaling behavior
- Include pre-install migration job and post-install tests
- Support multiple environments (dev/staging/prod)
- Add ServiceMonitor for Prometheus integration
- Add comprehensive documentation with Mermaid diagrams
- Document API workflows, database schema, and architecture
- Include local development setup (containerized and non-containerized)
- Add Fly.io deployment instructions
- Document data format requirements with examples
- Add monitoring, troubleshooting, and API usage examples
@r4770 r4770 self-assigned this Jun 23, 2025
@r4770 r4770 requested a review from Lorezz June 23, 2025 08:47