This is a demonstration of how the setup of a load-balanced database cluster with one master and two slaves can be automated using Packer, Terraform and bash scripts. In this example, we will configure one HAProxy server to load balance traffic to the three back-end MySQL servers. We will configure a total of two MySQL users for HAProxy: one to perform a health check and verify services, and another to access MySQL for data retrieval. Please follow the instructions carefully to get the desired result.
- Google Cloud Platform
- MySQL
- Packer
- Terraform
- HAProxy
- Bash
Before you can test out the scripts, the following must have been taken care of:
- You should have an account on Google Cloud Platform. You can visit here to sign up if you don't.
- Create a project from your GCP console for this test.
- Create a service account key from the Google Cloud console and download it in `json` format. You can find a guide in the Setting Up Authentication section of this post. After downloading the key, place it in the root of the repository and rename the file to `gcp_account.json`.
- Add the following parameters as project-wide metadata (similar to environment variables). These variables will be used in the configuration process. Follow the same pattern shown in the sample file, but feel free to update the values to whatever you choose (see the sketch after this list).
- Install Packer and Terraform if you do not already have them installed on your local machine.
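For example, project-wide metadata can be added from your terminal with the `gcloud` CLI. This is only a sketch: `sql_root_password` is the key referenced later in this guide, while the other key name and all values are illustrative placeholders, so follow the sample file in the repository for the exact keys expected.

```bash
# Add project-wide metadata (key names other than sql_root_password are illustrative)
gcloud compute project-info add-metadata \
  --project <your-project-id> \
  --metadata sql_root_password=<your-sql-root-password>,sql_replication_password=<your-replication-password>
```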
This project was designed to be hosted in a VPC on Google Cloud with both public and privately routed subnets. The details of the considerations made while designing this are contained in this document.
This step will create the base images that will be used to build the master, slave and HAProxy servers. Each image comes preconfigured for its respective role.
1. Clone the repository and navigate to the project folder in your terminal.

2. Change directory to the `packer` directory:

   ```
   cd packer
   ```

3. Provide your project id as an environment variable:

   ```
   export PROJECT_ID=<your-project-id>
   ```

4. Validate that the packer template files for these images are correct. Each command should return a message indicating successful validation:

   ```
   packer validate ./templates/master.json
   packer validate ./templates/slave1.json
   packer validate ./templates/slave2.json
   packer validate ./templates/ha_proxy.json
   ```

5. Build the packer images for the master, slave and HAProxy servers with the following commands:

   ```
   packer build ./templates/master.json
   packer build ./templates/slave1.json
   packer build ./templates/slave2.json
   packer build ./templates/ha_proxy.json
   ```
You will find your custom images in the registry once they are successfully created:
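You can also confirm this from your terminal (a quick check, assuming the `gcloud` CLI is authenticated and `PROJECT_ID` is still exported from the previous step):

```bash
# List only the custom images in your project (excludes Google's public base images)
gcloud compute images list --project "$PROJECT_ID" --no-standard-images
```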
With the base images in place, we can now proceed to spin up our servers with them.
1. Move out from the `packer` directory into `terraform`:

   ```
   cd ../terraform
   ```

2. Initialize Terraform and download the necessary modules:

   ```
   terraform init
   ```

3. View a plan of the resources that will be added to your infrastructure:

   ```
   terraform plan
   ```

4. Run `terraform apply` to set up the infrastructure, and follow the prompts to provide the required confirmation.

Once the `apply` step is done, the required resources will have been added to your project. Take note of the following instances (servers) that were created:
- db01: Master database server
- db02 & db03: Slave database servers
- ha-proxy: HAProxy load balancer server
- nat-instance: The NAT gateway instance from which you can access the master and slave servers.
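You can confirm these from the VM instances page, or from your terminal (a quick check, assuming your `gcloud` config points at the test project):

```bash
# List the instances Terraform created, along with their zones and internal/external IPs
gcloud compute instances list
```

The internal IPs shown here will come in handy later, when copying the database dump between servers and filling in the HAProxy configuration.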
The following configuration has already been done for the master server (via the custom base image):
- Installed MySQL and updated its configuration to make the server a master.
- Created a replication user
- Created a test database `pets`, with a `dogs` table
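For reference, below is a minimal sketch of what "making the server a master" typically amounts to. This is an assumption about what the base image's provisioning does, not a listing of the actual Packer scripts; the config path, user name and password are placeholders.

```bash
# Typical my.cnf settings for a master: unique server id, binary logging, reachable bind address
sudo tee -a /etc/mysql/mysql.conf.d/mysqld.cnf <<'EOF'
server-id    = 1
log_bin      = /var/log/mysql/mysql-bin.log
bind-address = 0.0.0.0
EOF
sudo systemctl restart mysql

# Replication user that the slaves will connect with (name and password are placeholders)
sudo mysql -uroot -p -e \
  "CREATE USER 'replica'@'%' IDENTIFIED BY '<replication-password>';
   GRANT REPLICATION SLAVE ON *.* TO 'replica'@'%';
   FLUSH PRIVILEGES;"
```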
However, the slave servers only come with MySQL installed out of the box. The remaining setup will be done in the steps below.
1. Log in to the `nat-instance` server by using the `SSH` link as shown in the screenshot below:
2. Once in the `nat-instance`, log in to the master server as the `packer` user:

   ```
   gcloud beta compute ssh db01 --internal-ip --zone us-central1-a
   sudo su - packer
   ```
3. Using a different browser window, log in to one slave server, say db02, as the `packer` user as well:

   ```
   gcloud beta compute ssh db02 --internal-ip --zone us-central1-b
   sudo su - packer
   ```
4. From master, execute `create_haproxy_users.sh` to configure the users required for load balancing (a sketch of what this script typically does appears after this step list):

   ```
   sudo chmod +x ./create_haproxy_users.sh
   ./create_haproxy_users.sh
   ```

   You will be prompted for your SQL password when executing the script. Ensure you enter the correct `sql_root_password` value as specified in the metadata.
5. From master, take a dump of the database using `mysqldump`:

   ```
   sudo mysqldump -uroot -p<your-sql-root-password> --all-databases --master-data > masterdump.sql
   ```

   Note that `<your-sql-root-password>` should match what you added to the metadata. After taking the dump, run `ls` to confirm that the dump file `masterdump.sql` exists.
6. From master, create an `ssh` key that will be added to the slave servers for `ssh` authentication:

   ```
   ssh-keygen -t rsa
   ```

   Just press `enter` when prompted to enter a file and passphrase for your key pair. Once the key pair has been created, display the content of your `id_rsa.pub` key and copy the output for the next step:

   ```
   cat ~/.ssh/id_rsa.pub
   ```
7. From slave, add the public key from step 6 to the `authorized_keys` file:

   ```
   echo public_key_string >> ~/.ssh/authorized_keys
   ```

   Substitute `public_key_string` with the output of the `cat ~/.ssh/id_rsa.pub` command that you copied.
8. From master, copy the dump file to the slave:

   ```
   scp -i ./.ssh/id_rsa masterdump.sql packer@<internal-ip-of-db02>:~/masterdump.sql
   ```

   Substitute `<internal-ip-of-db02>` with the internal IP of db02. After the copy step, confirm that the file now exists on the slave with an `ls` command.
9. From slave: now that we have a copy of the dump file on the slave, execute a script to finish up the slave replication setup:

   ```
   sudo chmod +x ./slave_replication.sh
   ./slave_replication.sh
   ```

   If you get an output of `Slave IO state OK` at the end of the script execution, then your slave replication was set up correctly.
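As referenced in step 4 above, here is a minimal sketch of what a script like `create_haproxy_users.sh` might do. The user names (`haproxy_check` for health checks, `haproxy_root` for data access) come from this guide, but the host wildcard, grants and prompt are assumptions; the actual script in the repository may differ.

```bash
#!/usr/bin/env bash
# Sketch: create the two MySQL users HAProxy relies on
read -rsp "Enter the MySQL root password: " SQL_ROOT_PASSWORD; echo

sudo mysql -uroot -p"${SQL_ROOT_PASSWORD}" <<'SQL'
-- Passwordless user that HAProxy's mysql-check option uses to probe each backend
CREATE USER IF NOT EXISTS 'haproxy_check'@'%';

-- User that clients connect through the load balancer with for data retrieval
-- (the test command later in this guide connects as haproxy_root without a password)
CREATE USER IF NOT EXISTS 'haproxy_root'@'%';
GRANT ALL PRIVILEGES ON *.* TO 'haproxy_root'@'%';
FLUSH PRIVILEGES;
SQL
echo "HAProxy MySQL users created."
```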
With the first slave server sorted out, you should set up the second slave server (db03) as well. `SSH` into db03 through the NAT gateway instance and repeat steps 7 to 9.

You don't need to generate a new `ssh` key pair for the master; just work with the one created in step 6. Also, when running the command to copy the dump file to the slave, be sure to update the destination IP address to match that of db03, i.e.:

```
scp -i ./.ssh/id_rsa masterdump.sql packer@<internal-ip-of-db03>:~/masterdump.sql
```
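Before moving on to testing, you can optionally double-check the replication threads on each slave with standard MySQL commands (a quick sanity check, not part of the original scripts):

```bash
# Run on db02 and db03; both Slave_IO_Running and Slave_SQL_Running should report "Yes"
sudo mysql -uroot -p<your-sql-root-password> -e "SHOW SLAVE STATUS\G" | grep -E "Slave_IO_Running|Slave_SQL_Running"
```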
1. From master, log in to mysql:

   ```
   sudo mysql -uroot -p<your-sql-root-password>
   ```
2. Once in the MySQL prompt, add a new entry to the `dogs` table in the database:

   ```
   INSERT INTO pets.dogs values ('bingo');
   ```
3. Confirm that the entry was added successfully:

   ```
   SELECT * from pets.dogs;
   ```

   You should see `fluffy` and `bingo` in the result output.
4. From any slave, log in to mysql:

   ```
   sudo mysql -uroot -p<your-sql-root-password>
   ```
5. Retrieve the contents of the `dogs` table:

   ```
   SELECT * from pets.dogs;
   ```

   Voila! You should see `fluffy` and `bingo` in the result as well. You can repeat steps 4 and 5 for the second slave instance and, hopefully, the result will be the same.
At this point, the database cluster should be up, with replication correctly set up. What's left is to update the configuration of the load balancer so that it starts listening for requests and distributing traffic to the servers based on the algorithm specified in the config.
1. `SSH` into the HAProxy server. This is a public-facing instance with a public IP, so you can connect directly using the SSH shortcut on the VM instances page (on Google Cloud).
2. Log in as the user `packer` and execute the setup script in the home directory:

   ```
   sudo su - packer
   chmod +x ha_proxy.sh
   ./ha_proxy.sh
   ```
   The setup script:

   - Installs MySQL
   - Installs and enables HAProxy
   - Creates a copy of the existing configuration file as a backup:
     - Working copy: `/etc/haproxy/haproxy.cfg`
     - Backup copy: `/etc/haproxy/haproxy.cfg.original`

   If you ever need to restore the original config file, remember to comment out lines 23 and 24 again; this was done programmatically by the setup script:
   One error I encountered while testing the script was this:

   ```
   ......
   About to install MySQL....
   E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily unavailable)
   E: Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?
   ```

   This error usually means that there is already an apt process running. I got past it by waiting a few minutes before executing the script again.
   If the script executes successfully, run `sudo service haproxy status` to ensure that HAProxy is active and running.
3. Switch to the home directory of the root user:

   ```
   sudo su
   cd ~
   ```

   Open the working copy of the config file using vim and add the following blocks of configuration:

   ```
   vi /etc/haproxy/haproxy.cfg
   ```

   ```
   listen mysql-cluster
       bind <haproxy-internal-ip>:3306
       mode tcp
       option mysql-check user haproxy_check
       balance roundrobin
       server db01 <db01-internal-ip>:3306 check
       server db02 <db02-internal-ip>:3306 check
       server db03 <db03-internal-ip>:3306 check
   ```
   Substitute `<haproxy-internal-ip>`, `<db01-internal-ip>`, `<db02-internal-ip>` and `<db03-internal-ip>` with the respective internal IPs of these servers:

   - Line 41 specifies the user (already created in the base image) that will perform a check to ascertain whether the MySQL servers are up or not (via the `mysql-cluster` configuration)
   - Line 42 specifies the load balancing algorithm to be used. Round Robin works by selecting servers sequentially from the cluster. The load balancer will select the first server on its list for the first request, then move down the list in order, starting over at the top when it reaches the end.
   ```
   listen stats
       bind *:8080
       mode http
       option httplog
       stats enable
       stats uri /
       stats realm Strictly\ Private
       stats auth username:password
   ```

   This config block enables HAProxy's web UI so we can see the statistics of load balancing. `username` and `password` in `stats auth` will be used to log in to the web UI, so you can update them to whatever values you wish. Your config file should now look like this after adding these blocks:
4. Once the updated configuration has been saved, reload HAProxy and check the status to ensure that it reloaded correctly with no errors:

   ```
   service haproxy reload
   service haproxy status
   :q # to exit the status page
   ```

   The messages `Proxy mysql-cluster started.` and `Proxy stats started.` in the status output indicate that the load balancing configuration was applied successfully.
5. Access the web UI:

   ```
   http://<Public IP of Load Balancer>:8080/
   ```

   If everything goes well, you will have your database servers up (indicated by the green color code) on the dashboard like so:
6. If everything still looks good at this point, then it's time to test that the load balancing actually works:

   ```
   for i in `seq 1 6`; do mysql -h 0.0.0.0 -uhaproxy_root -e "show variables like 'server_id'"; done
   ```

   Substitute `0.0.0.0` with the private IP of the load balancer (the same as that specified on line 39 of the screenshot in step 3). The output of the above command should look like this:
   The command queries the cluster six times for a server's `server_id`, and the result demonstrates round robin load balancing, where the servers are picked sequentially to fulfil the request. The first request goes to `db01`, which has a `server_id` of 1 (as the master). The next request goes to `db02` with a `server_id` of 2, and the next request goes to `db03` with a `server_id` of 3. With the list of servers exhausted, subsequent requests go round again, starting from the first server on the list.
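As a final sanity check, you can also pull actual data through the load balancer using the data-retrieval user mentioned at the start of this guide (a sketch; it assumes `haproxy_root` can connect without a password, as in the test command above):

```bash
# Query the replicated table through HAProxy instead of hitting a database server directly
mysql -h <haproxy-internal-ip> -uhaproxy_root -e "SELECT * FROM pets.dogs;"
```

You should get back the same `fluffy` and `bingo` rows regardless of which backend server ends up serving the request.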