Multi-cluster Ingress (MCI) for GKE is a Google-hosted Ingress controller for GKE clusters. It deploys shared load balancing resources across clusters and across regions.
- Disaster recovery for internet traffic across clusters or regions
- Flexible migration between clusters
- Low-latency serving of traffic to globally distributed GKE clusters
- Multi-cluster Ingress Concepts
- Setting Up Multi-cluster Ingress
- Deploying Ingress Across Clusters
- Google Cloud External HTTP(S) Load Balancing
- GKE clusters on GCP
- All versions of GKE supported
- Tested and validated with 1.25.4-gke.1600 on Jan 3rd 2023
This recipe demonstrates deploying Multi-cluster Ingress across two clusters to expose two different Services hosted across both clusters. The cluster `gke-1` is in us-west1-a and `gke-2` is hosted in us-east1-b, demonstrating multi-regional load balancing across clusters. All Services will share the same MultiClusterIngress and load balancer IP, but the load balancer will match traffic and send it to the right region, cluster, and Service depending on the request.
There are two applications in this example, foo and bar, each deployed on both clusters. The External HTTP(S) Load Balancer routes traffic to the closest available backend (relative to the client) that has capacity. Traffic from clients is load balanced to the closest backend cluster, subject to the traffic matching specified in the MultiClusterIngress resource.
The two clusters in this example can be backends to MCI only if they are registered through Hub. Hub is a central registry of clusters that determines which clusters MCI can function across. A cluster must first be registered to Hub before it can be used with MCI.
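Registration itself is a `gcloud` operation. As a hedged sketch (membership name and flags here are illustrative; follow your cluster setup guide for the authoritative steps):

```bash
# Sketch: register gke-1 with Hub; the zone/name pair matches this recipe's clusters.
gcloud container hub memberships register gke-1 \
    --gke-cluster us-west1-a/gke-1 \
    --enable-workload-identity

# Confirm the cluster now appears as a Hub member.
gcloud container hub memberships list
```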
There are two Custom Resources (CRs) that control multi-cluster load balancing: the MultiClusterIngress (MCI) and the MultiClusterService (MCS). The MCI below describes the desired traffic matching and routing behavior. Similar to an Ingress resource, it can specify host and path matching with Services. This MCI specifies two host rules and a default backend which will receive all traffic that does not have a match. The `serviceName` field in this MCI specifies the name of an MCS resource.
```yaml
apiVersion: networking.gke.io/v1
kind: MultiClusterIngress
metadata:
  name: foobar-ingress
  namespace: multi-cluster-demo
spec:
  template:
    spec:
      backend:
        serviceName: default-backend
        servicePort: 8080
      rules:
      - host: foo.example.com
        http:
          paths:
          - backend:
              serviceName: foo
              servicePort: 8080
      - host: bar.example.com
        http:
          paths:
          - backend:
              serviceName: bar
              servicePort: 8080
```
Similar to the Kubernetes Service, the MultiClusterService (MCS) describes label selectors and other backend parameters to group pods in the desired way. This `foo` MCS specifies that all Pods with the following characteristics will be selected as backends for `foo`:

- Pods with the label `app: foo`
- In the `multi-cluster-demo` Namespace
- In any of the clusters that are registered as members to the Hub

If more clusters are added to the Hub, then any Pods in those clusters that match these characteristics will also be registered as backends to `foo`.
```yaml
apiVersion: networking.gke.io/v1
kind: MultiClusterService
metadata:
  name: foo
  namespace: multi-cluster-demo
  annotations:
    beta.cloud.google.com/backend-config: '{"ports": {"8080":"backend-health-check"}}'
spec:
  template:
    spec:
      selector:
        app: foo
      ports:
      - name: http
        protocol: TCP
        port: 8080
        targetPort: 8080
```
Each of the three MCSes referenced in the `foobar-ingress` MCI has its own manifest to describe the matching parameters of that MCS. A BackendConfig resource is also referenced. This allows settings specific to a Service to be configured. We use it here to configure the health check that the Google Cloud load balancer uses.
```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: backend-health-check
  namespace: multi-cluster-demo
spec:
  healthCheck:
    requestPath: /healthz
    port: 8080
    type: HTTP
```
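Once the resources are deployed (steps below), you can optionally confirm that the controller created health checks with these parameters. The `mci-` name prefix is an assumption taken from the Status output shown later in this recipe:

```bash
# List load balancer health checks created by the MCI controller (name prefix assumed).
gcloud compute health-checks list --filter="name~^mci-"
```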
Now that you have the background knowledge and understanding of MCI, you can try it out yourself.
- Download this repo and navigate to this folder:

  ```bash
  $ git clone https://github.com/GoogleCloudPlatform/gke-networking-recipes.git
  Cloning into 'gke-networking-recipes'...
  $ cd gke-networking-recipes/ingress/multi-cluster/mci-basic
  ```
- Set up environment variables:

  ```bash
  export PROJECT=$(gcloud config get-value project) # or your preferred project
  export GKE1_ZONE=us-west1-a # or pick another supported zone for cluster gke-1
  export GKE2_ZONE=us-east1-b # or pick another supported zone for cluster gke-2
  ```
- Deploy the two clusters `gke-1` and `gke-2` as specified in cluster setup and follow the steps for cluster registration with Hub and enablement of Multi-cluster Ingress (a hedged sketch of the enablement command follows the list below).

  There are two manifests in this folder:
  - app.yaml is the manifest for the foo and bar Deployments. This manifest should be deployed on both clusters.
  - ingress.yaml is the manifest for the MultiClusterIngress and MultiClusterService resources. These will be deployed only on the `gke-1` cluster, as this was set as the config cluster and is the cluster that the MCI controller is listening to for updates.
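  The enablement step could look roughly like this (a sketch, not the authoritative flow; using `gke-1` as the config cluster matches this recipe):

  ```bash
  # Sketch: enable Multi-cluster Ingress, naming gke-1 as the config cluster.
  gcloud container hub ingress enable \
      --config-membership=projects/${PROJECT}/locations/global/memberships/gke-1
  ```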
- Separately log in to each cluster and deploy the app.yaml manifest. You can configure these contexts as shown here.

  ```bash
  $ kubectl --context=gke-1 apply -f app.yaml
  namespace/multi-cluster-demo created
  deployment.apps/foo created
  deployment.apps/bar created
  deployment.apps/default-backend created

  $ kubectl --context=gke-2 apply -f app.yaml
  namespace/multi-cluster-demo created
  deployment.apps/foo created
  deployment.apps/bar created
  deployment.apps/default-backend created
  ```
- Check that workloads are deployed and running in both clusters:

  ```bash
  $ kubectl --context=gke-2 get deploy -n multi-cluster-demo
  NAME              READY   UP-TO-DATE   AVAILABLE   AGE
  bar               2/2     2            2           44m
  default-backend   1/1     1            1           44m
  foo               2/2     2            2           44m

  $ kubectl --context=gke-1 get deploy -n multi-cluster-demo
  NAME              READY   UP-TO-DATE   AVAILABLE   AGE
  bar               2/2     2            2           44m
  default-backend   1/1     1            1           44m
  foo               2/2     2            2           44m
  ```
- Now log into `gke-1` and deploy the ingress.yaml manifest.

  ```bash
  $ kubectl --context=gke-1 apply -f ingress.yaml
  multiclusteringress.networking.gke.io/foobar-ingress created
  multiclusterservice.networking.gke.io/foo created
  multiclusterservice.networking.gke.io/bar created
  multiclusterservice.networking.gke.io/default-backend created
  backendconfig.cloud.google.com/backend-health-check created
  ```
- It can take up to 10 minutes for the load balancer to deploy fully. Inspect the MCI resource to watch for events that indicate how the deployment is going. Then capture the IP address for the MCI ingress resource.

  ```bash
  $ kubectl --context=gke-1 describe mci/foobar-ingress -n multi-cluster-demo
  Name:         foobar-ingress
  Namespace:    multi-cluster-demo
  Labels:       <none>
  Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                  {"apiVersion":"networking.gke.io/v1","kind":"MultiClusterIngress","metadata":{"annotations":{},"name":"foobar-ingress","namespace":"multi-...
                networking.gke.io/last-reconcile-time: Saturday, 14-Nov-20 21:46:46 UTC
  API Version:  networking.gke.io/v1
  Kind:         MultiClusterIngress
  Metadata:
    Resource Version:  144786
    Self Link:         /apis/networking.gke.io/v1/namespaces/multi-cluster-demo/multiclusteringresses/foobar-ingress
    UID:               47fe4406-9660-4968-8eea-0a2f028f03d2
  Spec:
    Template:
      Spec:
        Backend:
          Service Name:  default-backend
          Service Port:  8080
        Rules:
          Host:  foo.example.com
          Http:
            Paths:
              Backend:
                Service Name:  foo
                Service Port:  8080
          Host:  bar.example.com
          Http:
            Paths:
              Backend:
                Service Name:  bar
                Service Port:  8080
  Status:
    Cloud Resources:
      Backend Services:
        mci-8se3df-8080-multi-cluster-demo-bar
        mci-8se3df-8080-multi-cluster-demo-default-backend
        mci-8se3df-8080-multi-cluster-demo-foo
      Firewalls:
        mci-8se3df-default-l7
      Forwarding Rules:
        mci-8se3df-fw-multi-cluster-demo-foobar-ingress
      Health Checks:
        mci-8se3df-8080-multi-cluster-demo-bar
        mci-8se3df-8080-multi-cluster-demo-default-backend
        mci-8se3df-8080-multi-cluster-demo-foo
      Network Endpoint Groups:
        zones/us-east1-b/networkEndpointGroups/k8s1-b1f3fb3a-multi-cluste-mci-default-backend-svc--80-c7b851a2
        zones/us-east1-b/networkEndpointGroups/k8s1-b1f3fb3a-multi-cluster--mci-bar-svc-067a3lzs8-808-45cc57ea
        zones/us-east1-b/networkEndpointGroups/k8s1-b1f3fb3a-multi-cluster--mci-foo-svc-820zw3izx-808-c453c71e
        zones/us-west1-a/networkEndpointGroups/k8s1-0dfd9a8f-multi-cluste-mci-default-backend-svc--80-f964d3fc
        zones/us-west1-a/networkEndpointGroups/k8s1-0dfd9a8f-multi-cluster--mci-bar-svc-067a3lzs8-808-cd95ae93
        zones/us-west1-a/networkEndpointGroups/k8s1-0dfd9a8f-multi-cluster--mci-foo-svc-820zw3izx-808-3996ee76
      Target Proxies:
        mci-8se3df-multi-cluster-demo-foobar-ingress
      URL Map:  mci-8se3df-multi-cluster-demo-foobar-ingress
    VIP:        35.201.75.57
  Events:
    Type    Reason  Age                From                              Message
    ----    ------  ----               ----                              -------
    Normal  ADD     50m                multi-cluster-ingress-controller  multi-cluster-demo/foobar-ingress
    Normal  UPDATE  49m (x2 over 50m)  multi-cluster-ingress-controller  multi-cluster-demo/foobar-ingress
  ```
  ```bash
  # capture the IP address for the MCI resource
  $ export MCI_ENDPOINT=$(kubectl --context=gke-1 get mci -n multi-cluster-demo -o yaml | grep "VIP" | awk 'END{ print $2}')
  ```
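  Alternatively, assuming the status field is named `VIP` as shown in the describe output above, a JSONPath query avoids the grep/awk pipeline:

  ```bash
  # Read the VIP directly from the MCI status (field name assumed from the output above).
  export MCI_ENDPOINT=$(kubectl --context=gke-1 get mci foobar-ingress \
      -n multi-cluster-demo -o jsonpath='{.status.VIP}')
  ```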
- Now use the IP address from the MCI output to reach the load balancer. Try hitting the load balancer on the different host rules to confirm that traffic is being routed correctly. We use `jq` to filter the output to make it easier to read, but you could drop the `jq` portion of the command to see the full output.

  ```bash
  # Hitting the default backend
  $ curl -s ${MCI_ENDPOINT} | jq -r '.zone, .cluster_name, .pod_name'
  us-west1-a
  gke-1
  default-backend-6b9bd45db8-gzdjc

  # Hitting the foo Service
  $ curl -s -H "host: foo.example.com" ${MCI_ENDPOINT} | jq -r '.zone, .cluster_name, .pod_name'
  us-west1-a
  gke-1
  foo-7b994cdbd5-wxgpk

  # Hitting the bar Service
  $ curl -s -H "host: bar.example.com" ${MCI_ENDPOINT} | jq -r '.zone, .cluster_name, .pod_name'
  us-west1-a
  gke-1
  bar-5bdf58646c-rbbdn
  ```
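  If you prefer not to set the `host` header by hand, curl's `--resolve` flag can pin the hostname to the VIP for a single request (a minimal equivalent of the `foo` check above, without the `jq` filtering):

  ```bash
  # Map foo.example.com to the MCI VIP for this request only.
  curl -s --resolve foo.example.com:80:${MCI_ENDPOINT} http://foo.example.com/
  ```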
- Now, to demonstrate the health checking and failover ability of MCI, let's crash the pods in `gke-1` for one of the Services. We'll update the replicas of the `foo` Deployment to zero so that there won't be any available backends in that cluster. To confirm that traffic is not dropped, we can set a continuous curl to watch as traffic fails over. In one shell, start a continuous curl against the `foo` Service.

  ```bash
  $ while true; do curl -s -H "host: foo.example.com" ${MCI_ENDPOINT} | jq -c '{cluster: .cluster_name, pod: .pod_name}'; sleep 2; done
  {"cluster":"gke-1","pod":"foo-7b994cdbd5-p2n59"}
  {"cluster":"gke-1","pod":"foo-7b994cdbd5-2jnks"}
  {"cluster":"gke-1","pod":"foo-7b994cdbd5-2jnks"}
  {"cluster":"gke-1","pod":"foo-7b994cdbd5-p2n59"}
  ...
  ```
Note: Traffic will be load balanced to the closest cluster to the client. If you are curling from your laptop, your traffic will be directed to the closest GKE cluster to you. Whichever cluster is receiving traffic in this step is the closest one to you, so fail the pods in that cluster in the next step and watch traffic fail over to the other cluster.
- Open up a second shell to scale the replicas down to zero.

  ```bash
  # Do this in the same cluster where the response came from in the previous step
  $ kubectl --context=gke-1 scale --replicas=0 deploy foo -n multi-cluster-demo
  deployment.apps/foo scaled

  $ kubectl get deploy -n multi-cluster-demo foo
  NAME   READY   UP-TO-DATE   AVAILABLE   AGE
  foo    0/0     0            0           63m
  ```
- Watch how traffic switches from one cluster to another as the Pods disappear from `gke-1`. Because the `foo` Pods from both clusters are active-active backends to the load balancer, there is no traffic interruption or delay when switching over traffic from one cluster to the other. Traffic is seamlessly routed to the available backends in the other cluster.

  ```bash
  ...
  {"cluster":"gke-1","pod":"foo-7b994cdbd5-2jnks"}
  {"cluster":"gke-1","pod":"foo-7b994cdbd5-p2n59"}
  {"cluster":"gke-1","pod":"foo-7b994cdbd5-2jnks"}
  {"cluster":"gke-2","pod":"foo-7b994cdbd5-hnfsv"} # <----- cutover happens here
  {"cluster":"gke-2","pod":"foo-7b994cdbd5-hnfsv"}
  {"cluster":"gke-2","pod":"foo-7b994cdbd5-hnfsv"}
  {"cluster":"gke-2","pod":"foo-7b994cdbd5-97wmt"}
  {"cluster":"gke-2","pod":"foo-7b994cdbd5-97wmt"}
  {"cluster":"gke-2","pod":"foo-7b994cdbd5-97wmt"}
  {"cluster":"gke-2","pod":"foo-7b994cdbd5-hnfsv"}
  ...
  ```
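  To see traffic shift back before cleaning up, you can restore the Deployment to its original two replicas and watch the continuous curl return to `gke-1`:

  ```bash
  # Scale foo back up so gke-1 receives traffic again.
  kubectl --context=gke-1 scale --replicas=2 deploy foo -n multi-cluster-demo
  ```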
To clean up, delete the resources from both clusters:

```bash
kubectl --context=gke-1 delete -f app.yaml
kubectl --context=gke-1 delete -f ingress.yaml # not strictly necessary: deleting app.yaml already removed the namespace hosting these resources
kubectl --context=gke-2 delete -f app.yaml
```
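If you also want to tear down the environment itself, a hedged sketch (membership names and zones assumed from this recipe; adapt to your setup):

```bash
# Sketch: disable the MCI feature and unregister the clusters from Hub.
gcloud container hub ingress disable
gcloud container hub memberships unregister gke-1 --gke-cluster ${GKE1_ZONE}/gke-1
gcloud container hub memberships unregister gke-2 --gke-cluster ${GKE2_ZONE}/gke-2

# Delete the clusters themselves.
gcloud container clusters delete gke-1 --zone ${GKE1_ZONE}
gcloud container clusters delete gke-2 --zone ${GKE2_ZONE}
```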