This repository contains all the code and resources needed to create the infrastructure used in the IAM project, or more precisely for DinoPark.
The repository is organized into top-level folders containing the different kinds of resources: Terraform code to create AWS resources (`terraform` folder), manifests to configure the Kubernetes clusters (`kubernetes` folder), documentation (`docs` folder), scripts that automate common operations, and Dockerfiles specific to the IAM project.
The Terraform code and Kubernetes manifests are organized into folders representing the different environments of the infrastructure, and inside those they are organized by the AWS region where the resources reside. This layout clearly has the disadvantage of repeating code, but it has the advantage of knowing exactly what is deployed and where just by looking at the folders. This way you don't have to look at templates and render values in your head; you just go and read the code. While this has helped us in the beginning, once we reach a more mature state we might decide to use Helm for templating Kubernetes manifests or to turn the Terraform resources into modules.
The Terraform code, as stated above, is divided into folders representing environments and locations. Resources needed by both the staging and production environments go into the `global` folder, for example a policy allowing other AWS accounts to fetch metrics.
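As a rough sketch of the layout (the region and component names below are only illustrative, the actual folders are the authoritative reference):

```
terraform/
├── global/         # resources shared by staging and production (e.g. cross-account policies)
├── production/
│   └── us-west-2/  # example region; each component lives in its own folder with its own state
└── staging/
    └── us-west-2/
```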
Inside each environment and location the code is organized into independent modules. This means that each component maintains its own state file rather than sharing one for all the resources. This design has two main implications that can be considered either an advantage or a disadvantage. The first is that you can run `terraform destroy` affecting only the resources you want: if, for example, we stop using Graylog, running `terraform destroy` in the Graylog folder will delete the ES cluster and the DNS name but leave the rest of the infrastructure as it is. The second is that in order to share state between modules we use `remote_state` data sources pointing to the state file of the module that owns the resource; for example, most of the services need to know the VPC id and pull it in as a remote state.
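As a minimal sketch of how this looks, assuming an S3 backend and placeholder bucket/key names (not the real ones used in this repository), a service module can read the VPC id from the networking module's state roughly like this:

```hcl
# Read the state written by the VPC module (bucket and key are placeholders).
data "terraform_remote_state" "vpc" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"
    key    = "production/us-west-2/vpc/terraform.tfstate"
    region = "us-west-2"
  }
}

# Use an output exposed by that state, assuming the VPC module declares an
# output named "vpc_id" (Terraform 0.12+ syntax).
resource "aws_security_group" "service" {
  name   = "example-service"
  vpc_id = data.terraform_remote_state.vpc.outputs.vpc_id
}
```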
The Kubernetes manifests in this repository are organized in a similar fashion to the Terraform ones. Inside the `kubernetes` folder there are two more folders, each corresponding to one of the clusters: one for production and one for staging. These contain only the infrastructure resources; the manifests with application-specific resources such as namespaces and deployments live in each application's repository.
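Schematically (the component folders shown are illustrative examples based on the services covered in the docs below, not an exhaustive listing):

```
kubernetes/
├── production/
│   ├── prometheus-operator/   # example: cluster-wide infrastructure components
│   └── kube2iam/
└── staging/
    └── ...
```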
Documentation about different topics, such as managing users in EKS, deploying applications into the cluster, troubleshooting problems, and monitoring applications, can be found in separate files inside the `docs` folder. These files are intended for both developers and administrators.
Here is a table of contents with the different topics:
- 1 EKS cluster management
- 1.1 Introduction
- 1.2 Overview
- 1.2.1 Terraform
- 1.2.2 Remote state
- 1.2.3 EKS workers
- 1.3 Deploy your first EKS cluster
- 1.3.1 Requirements
- 1.3.2 Terraform options
- 1.3.3 Create resources
- 1.3.4 Test cluster authentication
- 1.3.5 Add workers
- 1.3.6 Cleanup
- 1.4 Upgrades
- 1.4.1 EKS cluster
- 1.4.2 EKS Workers
- 1.4.3 Prometheus operator
- 2 Kubernetes administration
- 2.1 Introduction
- 2.2 User management
- 2.2.1 Allow Codebuild to deploy
- 2.2.2 Add a new user
- 2.3 Kube2IAM
- 3 Deploying applications to the cluster
- 3.1 Introduction
- 3.2 Overview
- 3.3 CI/CD Pipeline using Terraform
- 3.3.1 Create the Pipeline
- 3.3.2 The build stage
- 3.3.3 The deployment stage
- 3.4 CI/CD Pipeline using AWS Console
- 3.4.1 Create the Pipeline
- 3.4.2 The build stage
- 3.4.3 The deployment stage
- 4 Monitoring applications
- 4.1 Metrics in Kubernetes
- 4.2 Central logging stack
- 4.2.1 Intro
- 4.2.2 Components
- 4.2.3 Deployment
- 4.2.4 Configuration
- 4.2.5 Usage
- 5 Runbooks and Troubleshooting
- 5.1 Disaster recovery and backups
- 5.2 General problems
- 5.3 Cluster services
- 5.3.1 MongoDB
- 5.4 Applications
- 5.4.1 SSO Dashboard