multi-arch-builders: extend AWS tofu bits to work for x86_64 or aarch64 #986

Open: wants to merge 1 commit into base: main
171 changes: 11 additions & 160 deletions multi-arch-builders/README.md
@@ -3,87 +3,11 @@ Here are some rough instructions for bringing up multi-arch builders.

### aarch64

The aarch64 builder runs on an AWS bare metal node. We use a bare
metal node for `/dev/kvm` access.
The aarch64 builder runs on an AWS bare metal node (we use a bare
metal node for `/dev/kvm` access) and is deployed via terraform/tofu.
Change into the directory linked below and follow the directions there:

```bash
# Add your credentials to the environment.
HISTCONTROL='ignoreboth'
export AWS_DEFAULT_REGION=us-east-1
export AWS_ACCESS_KEY_ID=XXXX
export AWS_SECRET_ACCESS_KEY=YYYYYYYY
```

Create the Ignition config

```bash
cat builder-common.bu | butane --pretty --strict > builder-common.ign
cat coreos-aarch64-builder.bu | butane --pretty --strict --files-dir=. > coreos-aarch64-builder.ign
```

Bring the instance up with appropriate details:

```bash
NAME="coreos-aarch64-builder-$(date +%Y%m%d)"
AMI=''
TYPE='a1.metal'
DISK='200'
SUBNET='subnet-050b478f586723c62'
SECURITY_GROUPS='sg-0ff537e445349ca0e'
USERDATA="${PWD}/coreos-aarch64-builder.ign"
aws ec2 run-instances \
--output json \
--image-id $AMI \
--instance-type $TYPE \
--subnet-id $SUBNET \
--security-group-ids $SECURITY_GROUPS \
--user-data "file://${USERDATA}" \
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=${NAME}}]" \
--block-device-mappings "VirtualName=/dev/xvda,DeviceName=/dev/xvda,Ebs={VolumeSize=${DISK},VolumeType=gp3}" \
> out.json
```

Wait for the instance to come up (`a1.metal` instances can take 5-10 minutes to
come up) and log in:

```bash
INSTANCE=$(jq --raw-output .Instances[0].InstanceId out.json)
IP=$(aws ec2 describe-instances --instance-ids $INSTANCE --output json \
| jq -r '.Reservations[0].Instances[0].PublicIpAddress')
ssh "core@${IP}"
```

Make sure the instance came up fine:

```bash
sudo systemctl --failed
```

Now that the instance is up we can re-assign the floating IP address.
This removes the IP from the existing instance (if there is one) so you'll
want to make sure no jobs are currently running on the existing instance
by checking to make sure Jenkins is idle (i.e. no build-cosa or multi-arch
aarch64 jobs are running).

```bash
# Grab the instance ID and associate the IP address
INSTANCE=$(jq --raw-output .Instances[0].InstanceId out.json)
EIP='18.233.54.49'
EIPID='eipalloc-4305254a'
aws ec2 associate-address --instance-id $INSTANCE --allow-reassociation --allocation-id $EIPID
```

Now you should be able to `ssh "core@${EIP}"`.

NOTE: Just this once ignore the ssh host key changed warning if you see it.


Once you are ready the old builder can be taken down:

```
OLDINSTANCEID=<foo> # use `aws ec2 describe-instances` to find
aws ec2 terminate-instances --instance-ids $OLDINSTANCEID
```
- [Provisioning aarch64 builder](provisioning/aarch64/README.md)

### ppc64le

@@ -217,84 +141,11 @@ ibmcloud is instance-delete $OLDINSTANCEID

### x86_64

The x86_64 builder runs on an AWS node without `/dev/kvm` access. Right now this
builder only builds the COSA container image and does not do FCOS builds so it
doesn't need `/dev/kvm`. If that need changes then we can switch the instance type
in the future.
The x86_64 builder runs in AWS. It is currently not used to build
Fedora CoreOS, but is used to build and push the x86_64 version
of various container images. Because of this, the builder doesn't
need `/dev/kvm` access and so doesn't need to be an AWS bare metal node.
It is deployed via terraform/tofu. Change into the directory linked
below and follow the directions there:

```bash
# Add your credentials to the environment.
HISTCONTROL='ignoreboth'
export AWS_DEFAULT_REGION=us-east-1
export AWS_ACCESS_KEY_ID=XXXX
export AWS_SECRET_ACCESS_KEY=YYYYYYYY
```

Create the Ignition config

```bash
cat builder-common.bu | butane --pretty --strict > builder-common.ign
cat coreos-x86_64-builder.bu | butane --pretty --strict --files-dir=. > coreos-x86_64-builder.ign
```

Bring the instance up with appropriate details:

```bash
NAME="coreos-x86_64-builder-$(date +%Y%m%d)"
AMI=''
TYPE='c6a.xlarge'
DISK='100'
SUBNET='subnet-050b478f586723c62'
SECURITY_GROUPS='sg-0ff537e445349ca0e'
USERDATA="${PWD}/coreos-x86_64-builder.ign"
aws ec2 run-instances \
--output json \
--image-id $AMI \
--instance-type $TYPE \
--subnet-id $SUBNET \
--security-group-ids $SECURITY_GROUPS \
--user-data "file://${USERDATA}" \
--tag-specifications "ResourceType=instance,Tags=[{Key=Name,Value=${NAME}}]" \
--block-device-mappings "VirtualName=/dev/xvda,DeviceName=/dev/xvda,Ebs={VolumeSize=${DISK},VolumeType=gp3}" \
> out.json
```

Wait for the instance to come up and log in:

```bash
INSTANCE=$(jq --raw-output .Instances[0].InstanceId out.json)
IP=$(aws ec2 describe-instances --instance-ids $INSTANCE --output json \
| jq -r '.Reservations[0].Instances[0].PublicIpAddress')
ssh "core@${IP}"
```

Make sure the instance came up fine:

```bash
sudo systemctl --failed
```

Now that the instance is up we can re-assign the floating IP address.
This removes the IP from the existing instance (if there is one) so you'll
want to make sure no jobs are currently running on the existing instance
by checking to make sure Jenkins is idle (i.e. no build-cosa jobs are running).

```bash
# Grab the instance ID and associate the IP address
INSTANCE=$(jq --raw-output .Instances[0].InstanceId out.json)
EIP='34.199.112.205'
EIPID='eipalloc-01bfbeca9d47b2202'
aws ec2 associate-address --instance-id $INSTANCE --allow-reassociation --allocation-id $EIPID
```

Now you should be able to `ssh "core@${EIP}"`.

NOTE: Just this once ignore the ssh host key changed warning if you see it.


Once you are ready the old builder can be taken down:

```
OLDINSTANCEID=<foo> # use `aws ec2 describe-instances` to find
aws ec2 terminate-instances --instance-ids $OLDINSTANCEID
```
- [Provisioning x86_64 builder](provisioning/x86_64/README.md)
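
For anyone skimming the README change, the per-arch workflow it now points to can be sketched roughly as follows. This is a minimal sketch, not the contents of the linked READMEs: `print_provision_steps` is a hypothetical helper, it only prints the commands rather than running them, the `tofu init` / `tofu apply` sequence is an assumption about what those directions boil down to, and the directory names are taken from the two links in the README.

```bash
#!/bin/sh
# Hypothetical helper: print (not run) the assumed provisioning steps
# for one architecture. Directory names come from the README links.
print_provision_steps() {
    arch="$1"
    case "$arch" in
        aarch64|x86_64) ;;
        *)
            echo "unsupported arch: $arch" >&2
            return 1
            ;;
    esac
    echo "cd multi-arch-builders/provisioning/$arch"
    # Assumption: AWS credentials are already exported in the environment,
    # as the provisioning README describes.
    echo "tofu init"
    echo "tofu apply"
}

print_provision_steps x86_64
```

Printing the steps keeps the sketch safe to run anywhere; the provisioning READMEs remain the source of truth for the real sequence.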
1 change: 0 additions & 1 deletion multi-arch-builders/coreos-aarch64-builder.bu
@@ -1,6 +1,5 @@
# This butane config will do the following:
#
# - Merge in the builder-common.ign Ignition file
# - Allow the builder user to log in with the associated ssh key
# - Set a hostname
#
5 changes: 0 additions & 5 deletions multi-arch-builders/coreos-x86_64-builder.bu
@@ -1,15 +1,10 @@
# This butane config will do the following:
#
# - Merge in the builder-common.ign Ignition file
# - Allow the builder user to log in with the associated ssh key
# - Set a hostname
#
variant: fcos
version: 1.4.0
ignition:
config:
merge:
- local: builder-common.ign
passwd:
users:
- name: builder
12 changes: 6 additions & 6 deletions multi-arch-builders/provisioning/aarch64/README.md
@@ -13,7 +13,7 @@

```bash
# Add your credentials to the environment.
# Be aware for aarch64 the region is us-east-2
# Be aware for rhcos the region is us-east-2
HISTCONTROL='ignoreboth'
export AWS_DEFAULT_REGION=us-east-2
export AWS_ACCESS_KEY_ID=XXXX
@@ -61,7 +61,7 @@ export TF_VAR_itpaas_splunk_repo=...
# If you plan to make changes to the code as modules/plugins, go ahead and run it:
tofu init -upgrade
# To destroy it run:
tofu destroy -target aws_instance.coreos-aarch64-builder
tofu destroy -target aws_instance.coreos-builder
```
## Generating additional resources with unique names

@@ -74,18 +74,18 @@ To achieve this, you'll need to manually edit the resource name
in the Tofu configuration.

```
resource "aws_instance" "coreos-aarch64-builder"
resource "aws_instance" "coreos-builder"
```
Make sure the resource name is unique, in this case
if I already have a resource named `coreos-aarch64-builder`,
I need to change it to `coreos-aarch64-devel-builder` for example.
if I already have a resource named `coreos-builder`,
I need to change it to `coreos-devel-builder` for example.

I may also want to update the project var:

```
variable "project" {
type = string
default = "coreos-aarch64-devel-builder"
default = "coreos-devel-builder"
}
```

6 changes: 6 additions & 0 deletions multi-arch-builders/provisioning/aarch64/architecture.tf
@@ -0,0 +1,6 @@
# Which architecture are we deploying for? aarch64 here
variable "arch" {
type = string
default = "aarch64"
}

42 changes: 25 additions & 17 deletions multi-arch-builders/provisioning/aarch64/main.tf
Review comment (Member):
Does TF support includes? I feel like it might be clearer to rename this file, move it one level up and have the aarch64 and x86_64 bits just define the arch and include this one?

@@ -19,11 +19,6 @@ provider "aws" {}
provider "ct" {}
provider "http" {}

variable "project" {
type = string
default = "coreos-aarch64-builder"
}

# Which distro are we deploying a builder for? Override the
# default by setting the env var: TF_VAR_distro=rhcos
variable "distro" {
@@ -40,6 +35,11 @@ check "health_check_distro" {
}
}

locals {
project = "coreos-${var.arch}-builder"
}



# Variables used for splunk deployment, which is only
# for RHCOS builders. Define them in the environment with:
@@ -74,10 +74,10 @@ check "health_check_rhcos_splunk_vars" {

locals {
fcos_snippets = [
file("../../coreos-aarch64-builder.bu"),
file("../../coreos-${var.arch}-builder.bu"),
]
rhcos_snippets = [
file("../../coreos-aarch64-builder.bu"),
file("../../coreos-${var.arch}-builder.bu"),
templatefile("../../builder-splunk.bu", {
SPLUNK_HOSTNAME = var.splunk_hostname
SPLUNK_SIDECAR_REPO = var.splunk_sidecar_repo
@@ -95,15 +95,16 @@ data "aws_region" "aws_region" {}

# Gather information about the AWS image for the current region
data "http" "stream_metadata" {
url = "https://builds.coreos.fedoraproject.org/streams/stable.json"

url = var.distro == "rhcos" ? "https://builds.coreos.fedoraproject.org/streams/stable.json" : "https://builds.coreos.fedoraproject.org/streams/testing.json"
request_headers = {
Accept = "application/json"
}
}
# Lookup the aarch64 AWS image for the current AWS region
# Lookup the AWS image for the current AWS region/architecture
# Also set the instance_type based on the architecture
locals {
ami = lookup(jsondecode(data.http.stream_metadata.body).architectures.aarch64.images.aws.regions, data.aws_region.aws_region.name).image
ami = lookup(lookup(jsondecode(data.http.stream_metadata.body).architectures, var.arch).images.aws.regions, data.aws_region.aws_region.name).image
instance_type = var.arch == "aarch64" ? "m6g.metal" : "c6a.xlarge"
}

variable "rhcos_aws_vpc_prod" {
@@ -122,22 +123,29 @@ locals {
}


resource "aws_instance" "coreos-aarch64-builder" {
resource "aws_instance" "coreos-builder" {
tags = {
Name = "${var.project}-${formatdate("YYYYMMDD", timestamp())}"
Name = "${local.project}-${formatdate("YYYYMMDD", timestamp())}"
}
ami = local.ami
user_data = data.ct_config.butane.rendered
instance_type = "m6g.metal"
instance_type = local.instance_type
vpc_security_group_ids = [aws_security_group.sg.id]
subnet_id = local.aws_subnet_id
root_block_device {
volume_size = "200"
volume_size = "300"
volume_type = "gp3"
}
associate_public_ip_address = var.distro == "fcos" ? "true" : "false"
associate_public_ip_address = "false"
}

# associate the elastic IP
resource "aws_eip_association" "aws_eip_association" {
count = var.distro == "fcos" ? 1 : 0
instance_id = aws_instance.coreos-builder.id
public_ip = var.arch == "aarch64" ? "18.233.54.49" : "34.199.112.205"
}

output "instance_ip_addr" {
value = var.distro == "rhcos" ? aws_instance.coreos-aarch64-builder.private_ip : aws_instance.coreos-aarch64-builder.public_ip
value = var.distro == "rhcos" ? aws_instance.coreos-builder.private_ip : aws_eip_association.aws_eip_association[0].public_ip
}
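
The arch-dependent values in this hunk (project name, Butane file, instance type, Elastic IP) can be summarized in a small shell sketch. It only illustrates the mapping and is not code from the PR: `builder_settings` is a hypothetical name, and unlike the `var.arch == "aarch64" ? ... : ...` expressions, which treat any non-aarch64 value as x86_64, the sketch rejects unknown architectures.

```bash
#!/bin/sh
# Hypothetical helper mirroring the arch-dependent locals in main.tf.
# Prints: <project> <butane file> <instance type> <elastic IP>
# Note: in main.tf the Elastic IP is only associated when distro is fcos.
builder_settings() {
    arch="$1"
    case "$arch" in
        aarch64) instance_type="m6g.metal";  eip="18.233.54.49" ;;
        x86_64)  instance_type="c6a.xlarge"; eip="34.199.112.205" ;;
        *)
            echo "unsupported arch: $arch" >&2
            return 1
            ;;
    esac
    echo "coreos-$arch-builder coreos-$arch-builder.bu $instance_type $eip"
}

builder_settings aarch64
builder_settings x86_64
```

A similar guard could be added on the Tofu side with a `validation` block on `variable "arch"`, if it is desirable for a typo in `TF_VAR_arch` to fail early rather than silently select the x86_64 values.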
6 changes: 3 additions & 3 deletions multi-arch-builders/provisioning/aarch64/networks.tf
@@ -2,7 +2,7 @@ resource "aws_vpc" "vpc" {
count = var.distro == "fcos" ? 1 : 0
cidr_block = "172.31.0.0/16"
tags = {
Name = "${var.project}-vpc"
Name = "${local.project}-vpc"
}
}

@@ -27,7 +27,7 @@ resource "aws_subnet" "private_subnets" {
cidr_block = element(var.private_subnet_cidrs, count.index)
availability_zone = element(data.aws_availability_zones.azs.names, count.index)
tags = {
Name = "${var.project}-private-subnet-${count.index + 1}"
Name = "${local.project}-private-subnet-${count.index + 1}"
}
}

@@ -40,7 +40,7 @@ resource "aws_route_table" "internet_route" {
gateway_id = aws_internet_gateway.gw[0].id
}
tags = {
Name = "${var.project}-ig"
Name = "${local.project}-ig"
}
}
