
Commit

Merge pull request #13 from comet-ml/eks-storage
Configure ephemeral storage for EKS worker nodes
burmek authored Oct 23, 2023
2 parents 5fff3c0 + 9ab6225 commit 52f7a30
Showing 7 changed files with 34 additions and 7 deletions.
12 changes: 7 additions & 5 deletions README.md
@@ -11,7 +11,7 @@ Terraform module for deploying infrastructure components to run CometML.

**Infrastructure Deployment:**
- Follow the steps below to deploy directly from the GitHub repository.
-- Clone the repository to your local machine: `git clone https://github.com/comet-ml/dply-terraform-aws.git`
+- Clone the repository to your local machine: `git clone https://github.com/comet-ml/terraform_aws_comet.git`
- Move into the deployment directory: `cd terraform-aws-comet`
- Initialize the directory: `terraform init`
- Within terraform.tfvars, set your module toggles to enable the desired infrastructure components and set any related inputs
@@ -36,7 +36,7 @@ terraform {
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | ~> 5.1 |
-| <a name="requirement_helm"></a> [helm](#requirement\_helm) | ~>2.10 |
+| <a name="requirement_helm"></a> [helm](#requirement\_helm) | ~> 2.10 |
| <a name="requirement_kubernetes"></a> [kubernetes](#requirement\_kubernetes) | ~> 2.21 |

## Providers
@@ -79,12 +79,14 @@ terraform {
| <a name="input_comet_vpc_id"></a> [comet\_vpc\_id](#input\_comet\_vpc\_id) | ID of an existing VPC to provision resources in | `string` | `null` | no |
| <a name="input_eks_aws_cloudwatch_metrics"></a> [eks\_aws\_cloudwatch\_metrics](#input\_eks\_aws\_cloudwatch\_metrics) | Enables AWS Cloudwatch Metrics in the EKS cluster | `bool` | `true` | no |
| <a name="input_eks_aws_load_balancer_controller"></a> [eks\_aws\_load\_balancer\_controller](#input\_eks\_aws\_load\_balancer\_controller) | Enables the AWS Load Balancer Controller in the EKS cluster | `bool` | `true` | no |
-| <a name="input_eks_cert_manager"></a> [eks\_cert\_manager](#input\_eks\_cert\_manager) | Enables cert-manager in the EKS cluster | `bool` | `true` | no |
+| <a name="input_eks_cert_manager"></a> [eks\_cert\_manager](#input\_eks\_cert\_manager) | Enables cert-manager in the EKS cluster | `bool` | `false` | no |
| <a name="input_eks_cluster_name"></a> [eks\_cluster\_name](#input\_eks\_cluster\_name) | Name for EKS cluster | `string` | `"comet-eks"` | no |
-| <a name="input_eks_cluster_version"></a> [eks\_cluster\_version](#input\_eks\_cluster\_version) | Kubernetes version of the EKS cluster | `string` | `"1.26"` | no |
-| <a name="input_eks_external_dns"></a> [eks\_external\_dns](#input\_eks\_external\_dns) | Enables ExternalDNS in the EKS cluster | `bool` | `true` | no |
+| <a name="input_eks_cluster_version"></a> [eks\_cluster\_version](#input\_eks\_cluster\_version) | Kubernetes version of the EKS cluster | `string` | `"1.27"` | no |
+| <a name="input_eks_external_dns"></a> [eks\_external\_dns](#input\_eks\_external\_dns) | Enables ExternalDNS in the EKS cluster | `bool` | `false` | no |
| <a name="input_eks_external_dns_r53_zones"></a> [eks\_external\_dns\_r53\_zones](#input\_eks\_external\_dns\_r53\_zones) | Route 53 zones for external-dns to have access to | `list(string)` | <pre>[<br> "arn:aws:route53:::hostedzone/XYZ"<br>]</pre> | no |
+| <a name="input_eks_mng_ami_type"></a> [eks\_mng\_ami\_type](#input\_eks\_mng\_ami\_type) | AMI family to use for the EKS nodes | `string` | `"AL2_x86_64"` | no |
| <a name="input_eks_mng_desired_size"></a> [eks\_mng\_desired\_size](#input\_eks\_mng\_desired\_size) | Desired number of nodes in EKS cluster | `number` | `3` | no |
+| <a name="input_eks_mng_disk_size"></a> [eks\_mng\_disk\_size](#input\_eks\_mng\_disk\_size) | Size of the storage disks for nodes in EKS cluster | `number` | `500` | no |
| <a name="input_eks_mng_max_size"></a> [eks\_mng\_max\_size](#input\_eks\_mng\_max\_size) | Maximum number of nodes in EKS cluster | `number` | `6` | no |
| <a name="input_eks_mng_name"></a> [eks\_mng\_name](#input\_eks\_mng\_name) | Name for the EKS managed nodegroup | `string` | `"mng"` | no |
| <a name="input_eks_node_types"></a> [eks\_node\_types](#input\_eks\_node\_types) | Node instance types for EKS managed node group | `list(string)` | <pre>[<br> "m5.4xlarge"<br>]</pre> | no |
1 change: 1 addition & 0 deletions main.tf
@@ -65,6 +65,7 @@ module "comet_eks" {
eks_node_types = var.eks_node_types
eks_mng_desired_size = var.eks_mng_desired_size
eks_mng_max_size = var.eks_mng_max_size
+eks_mng_disk_size = var.eks_mng_disk_size
eks_aws_load_balancer_controller = var.eks_aws_load_balancer_controller
eks_cert_manager = var.eks_cert_manager
eks_aws_cloudwatch_metrics = var.eks_aws_cloudwatch_metrics
4 changes: 3 additions & 1 deletion modules/comet_eks/README.md
@@ -16,7 +16,7 @@
| Name | Source | Version |
|------|--------|---------|
| <a name="module_eks"></a> [eks](#module\_eks) | terraform-aws-modules/eks/aws | ~> 19.9 |
-| <a name="module_eks_blueprints_addons"></a> [eks\_blueprints\_addons](#module\_eks\_blueprints\_addons) | aws-ia/eks-blueprints-addons/aws | 0.2.0 |
+| <a name="module_eks_blueprints_addons"></a> [eks\_blueprints\_addons](#module\_eks\_blueprints\_addons) | aws-ia/eks-blueprints-addons/aws | 1.9.1 |
| <a name="module_irsa-ebs-csi"></a> [irsa-ebs-csi](#module\_irsa-ebs-csi) | terraform-aws-modules/iam/aws//modules/iam-assumable-role-with-oidc | 4.7.0 |

## Resources
@@ -36,8 +36,10 @@
| <a name="input_eks_cluster_name"></a> [eks\_cluster\_name](#input\_eks\_cluster\_name) | Name for the EKS cluster | `string` | n/a | yes |
| <a name="input_eks_cluster_version"></a> [eks\_cluster\_version](#input\_eks\_cluster\_version) | Kubernetes version for the EKS cluster | `string` | n/a | yes |
| <a name="input_eks_external_dns"></a> [eks\_external\_dns](#input\_eks\_external\_dns) | Enables ExternalDNS in the EKS cluster | `bool` | n/a | yes |
| <a name="input_eks_external_dns_r53_zones"></a> [eks\_external\_dns\_r53\_zones](#input\_eks\_external\_dns\_r53\_zones) | Route 53 zones for external-dns to have access to | `list(string)` | n/a | yes |
+| <a name="input_eks_mng_ami_type"></a> [eks\_mng\_ami\_type](#input\_eks\_mng\_ami\_type) | AMI family to use for the EKS nodes | `string` | n/a | yes |
| <a name="input_eks_mng_desired_size"></a> [eks\_mng\_desired\_size](#input\_eks\_mng\_desired\_size) | Desired number of nodes in EKS cluster | `number` | n/a | yes |
+| <a name="input_eks_mng_disk_size"></a> [eks\_mng\_disk\_size](#input\_eks\_mng\_disk\_size) | Size of the storage disks for nodes in EKS cluster | `number` | n/a | yes |
| <a name="input_eks_mng_max_size"></a> [eks\_mng\_max\_size](#input\_eks\_mng\_max\_size) | Maximum number of nodes in EKS cluster | `number` | n/a | yes |
| <a name="input_eks_mng_name"></a> [eks\_mng\_name](#input\_eks\_mng\_name) | Name for the EKS managed nodegroup | `string` | n/a | yes |
| <a name="input_eks_node_types"></a> [eks\_node\_types](#input\_eks\_node\_types) | Node instance types for EKS managed node group | `list(string)` | n/a | yes |
11 changes: 11 additions & 0 deletions modules/comet_eks/main.tf
@@ -29,6 +29,17 @@ module "eks" {
min_size = var.eks_mng_desired_size
max_size = var.eks_mng_max_size
desired_size = var.eks_mng_desired_size
+block_device_mappings = {
+  xvda = {
+    device_name = "/dev/xvda"
+    ebs = {
+      volume_size = var.eks_mng_disk_size
+      volume_type = "gp3"
+      encrypted = false
+      delete_on_termination = true
+    }
+  }
+}

iam_role_additional_policies = var.s3_enabled ? { comet_s3_access = var.comet_ec2_s3_iam_policy } : {}
}
5 changes: 5 additions & 0 deletions modules/comet_eks/variables.tf
@@ -48,6 +48,11 @@ variable "eks_mng_max_size" {
type = number
}

+variable "eks_mng_disk_size" {
+  description = "Size of the storage disks for nodes in EKS cluster"
+  type = number
+}
+
variable "eks_aws_load_balancer_controller" {
description = "Enables the AWS Load Balancer Controller in the EKS cluster"
type = bool
6 changes: 6 additions & 0 deletions variables.tf
@@ -169,6 +169,12 @@ variable "eks_mng_max_size" {
default = 6
}

+variable "eks_mng_disk_size" {
+  description = "Size of the storage disks for nodes in EKS cluster"
+  type = number
+  default = 500
+}
+
variable "eks_aws_load_balancer_controller" {
description = "Enables the AWS Load Balancer Controller in the EKS cluster"
type = bool
2 changes: 1 addition & 1 deletion versions.tf
@@ -12,7 +12,7 @@ terraform {
}
helm = {
source = "hashicorp/helm"
-version = "~>2.10"
+version = "~> 2.10"
}
}
}
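
Usage note for the new `eks_mng_disk_size` input: a minimal sketch of a `terraform.tfvars` override, assuming the root module is applied directly as the README describes. The value `200` is illustrative only and does not come from this commit. The unit is GiB, since the value is passed through to the EBS `volume_size` argument of the node group's `/dev/xvda` mapping.

```hcl
# terraform.tfvars (illustrative value, not taken from the commit)

# Root volume size, in GiB, for each node in the EKS managed node group.
# Omit this line to keep the default of 500 declared in variables.tf.
eks_mng_disk_size = 200
```

Because the value feeds `block_device_mappings`, changing it on an existing cluster is likely to produce a new launch template version and a rolling replacement of the nodes, so it is worth reviewing the `terraform plan` output before applying.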
