This repository has been archived by the owner on Nov 2, 2021. It is now read-only.

Note: The demo artifacts in this repo correspond to litmus 1.x and have been transferred to https://github.com/litmuschaos/litmus/tree/master/demo/1.x. This repo is no longer active.

Litmus Kubernetes Demo Environment

This repository helps you get familiar with running Litmus chaos experiments in a realistic application environment that runs multiple services on different Kubernetes clusters.

It makes it easy to spin up a fully deployed GKE or EKS cluster, or a lightweight, easy-to-use KinD (Kubernetes-in-Docker) cluster, together with a microservice application (Sock Shop) and the Litmus Chaos Engine to create chaos scenarios.

After cloning this repository, start the Litmus demo container and use the start command to create the fully deployed cluster. You can then run Litmus Chaos experiments in the cluster with the test command. All the experiment configuration is under the /litmus directory of this repository, and the script that deploys and runs the experiments is manage.py.

The demo currently works with KinD, GKE, and EKS. You can use a KinD cluster by following the steps below. Running on GKE requires a Google Cloud account, and running on EKS requires an AWS account. Support for Azure is planned.

Requirements

  1. Docker 18.09 or greater

Setup Docker Container

You can set up and run the demo from a containerized environment by following the steps below.

Build Docker Image

git clone https://github.com/litmuschaos/litmus-demo.git
cd litmus-demo

docker build -t litmuschaos/litmus-demo .

OR

make build

Run the Docker container interactively. Inside it, you can run any of the commands mentioned here with python3.

docker run -v /var/run/docker.sock:/var/run/docker.sock --net="host" -it --entrypoint bash litmuschaos/litmus-demo
$ python3 -h

OR

make exec

You can run commands inside the container {-h, start, test, list, stop} ...

$ ./runcmd -h

You can also run the manage.py demo script in a non-containerized environment, in which case you have to install the dependencies yourself. Refer to the "Get Started with LitmusChaos in Minutes" blog post for setting up a non-containerized Litmus demo environment.

Startup

To start a cluster and deploy all the required components:

for kind cluster

./manage.py start --platform kind

for GKE cluster

./manage.py start --platform GKE --project {GC_PROJECT} --key {ZE_KEY}

for EKS cluster

./manage.py start --platform EKS --name {EKS_CLUSTER_NAME}

Flag values for start

| Flag | Description | Default |
| --- | --- | --- |
| --platform or -pt | Sets the platform on which to start the demo environment. Available platforms are kind, GKE, and EKS. Support for other platforms will also be added. | kind |
| --name or -n | Required when --platform is GKE. Sets the GKE cluster name. | litmus-k8s-demo |
| --zone or -z | Required when --platform is GKE. Sets the GCloud zone in which to spin up the GKE cluster. | us-central1-a |
| --project or -p | Required when --platform is GKE. Sets the GCloud project in which to spin up the GKE cluster. | No default |
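
The defaults and the "required for GKE" rule in the table can be sketched in a few lines of Python. This is only an illustration and not the code in manage.py: the helper name resolve_start_options and the START_DEFAULTS dictionary are hypothetical, and only the default values are taken from the table.

```python
from typing import Any, Dict, Optional

# Default values copied from the flag table; the names below are hypothetical.
START_DEFAULTS: Dict[str, Optional[str]] = {
    "platform": "kind",
    "name": "litmus-k8s-demo",
    "zone": "us-central1-a",
    "project": None,  # the table lists no default for --project
}


def resolve_start_options(**overrides: Any) -> Dict[str, Any]:
    """Merge user-supplied flag values over the documented defaults."""
    options = {**START_DEFAULTS, **overrides}
    # --project has no default, so it must be supplied for a GKE cluster.
    if str(options["platform"]).upper() == "GKE" and not options["project"]:
        raise ValueError("--project must be set when --platform is GKE")
    return options


if __name__ == "__main__":
    print(resolve_start_options())
    print(resolve_start_options(platform="GKE", project="my-project"))
```

With no overrides the helper returns the kind defaults, which matches running the start command without any flags.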

Test

To run all the Litmus ChaosEngine experiments:

./manage.py test

You can optionally add the --wait= argument to change the wait time between experiments in minutes. By default, it is 1 min.

To run a specific experiment (found under the ./litmus directory):

./manage.py test --test=pod-delete

Flag values for test

| Flag | Description | Default |
| --- | --- | --- |
| --test or -t | Name of the test to run, based on the YAML file name under the /litmus folder. | * (all) |
| --wait or -w | Number of minutes to wait between experiments. | 1 |
| --type or -ty | Type of chaos to perform: pod for pod-level chaos, node for infra/node-level chaos, or all to perform all chaos. | all |
| --platform or -pt | Sets the platform on which to perform chaos. Available platforms are kind and GKE. | kind |
| --report or -r | Set to yes to generate a PDF report of the experiment result summary. | no |
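
To make the --test matching concrete, here is a minimal sketch of how a test name could be resolved against YAML file names, with * selecting everything. It is an assumption about the behavior described in the table, not the logic in manage.py; the function select_experiments and the example file paths are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Iterable, List


def select_experiments(yaml_files: Iterable[str], test: str = "*") -> List[str]:
    """Return the experiment names whose YAML file stem matches the --test value."""
    names = {
        PurePosixPath(path).stem
        for path in yaml_files
        if path.endswith((".yaml", ".yml"))  # ignore non-YAML files
    }
    return sorted(name for name in names if fnmatchcase(name, test))


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    files = ["litmus/pod-delete.yaml", "litmus/container-kill.yaml"]
    print(select_experiments(files))                     # all experiments
    print(select_experiments(files, test="pod-delete"))  # a single experiment
```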

Usage

To see full command-line options use the -h flag:

./manage.py -h

This will output the following:

usage: manage.py [-h] {start,test,list,stop} ...

Spin up Litmus Demo Environment on Kubernetes.

positional arguments:
  {start,test,list,stop}
    start               Start a Cluster with the demo environment deployed.
    test                Run Litmus ChaosEngine Experiments inside litmus demo
                        environment.
    list                List all available Litmus ChaosEngine Experiments
                        available to run.
    stop                Shutdown the Cluster with the demo environment
                        deployed.
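
For readers who want to see how a command layout like this is typically produced, the following is a minimal argparse sketch with the same four subcommands. It is not the parser from manage.py: the subcommand names, help strings, and description are copied from the output above, while the assumption that every subcommand takes --platform with a default of kind is mine.

```python
import argparse

# Subcommand names and help strings copied from the usage output above.
SUBCOMMANDS = {
    "start": "Start a Cluster with the demo environment deployed.",
    "test": "Run Litmus ChaosEngine Experiments inside litmus demo environment.",
    "list": "List all available Litmus ChaosEngine Experiments available to run.",
    "stop": "Shutdown the Cluster with the demo environment deployed.",
}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="manage.py",
        description="Spin up Litmus Demo Environment on Kubernetes.",
    )
    subparsers = parser.add_subparsers(dest="command")
    for name, help_text in SUBCOMMANDS.items():
        sub = subparsers.add_parser(name, help=help_text)
        # Assumption: every subcommand accepts --platform and defaults to kind.
        sub.add_argument("-pt", "--platform", default="kind")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["start", "--platform", "GKE"])
    print(args.command, args.platform)
```

In this layout the flags belong to the subcommand, so they are written after it, as in start --platform GKE.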

Notes

  • To view which application deployment was picked and whether the reconcile operations succeeded or failed (i.e., whether the chaos-runner pod was created), check the chaos operator logs. Ex:
kubectl logs -f chaos-operator-ce-6899bbdb9-jz6jv -n litmus
  • To view the parameters with which the experiment job is created, the status of experiment, the success of chaosengine patch operation, and cleanup of the experiment pod, check the logs of the chaos-runner pod. Ex:
kubectl logs sock-chaos-runner -n sock-shop
  • To view the logs of the chaos experiment itself, set .spec.jobCleanupPolicy to retain in the chaosengine CR so that the experiment pod is kept, then check its logs. Ex:
kubectl logs container-kill-1oo8wv-85lsl -n sock-shop

(A detailed troubleshooting FAQ is available here: https://docs.litmuschaos.io/docs/faq-troubleshooting/)

  • To re-run the chaos experiment, clean up and re-create the chaosengine CR:
kubectl delete chaosengine sock-chaos -n sock-shop
kubectl apply -f litmus/chaosengine.yaml
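
If you re-run experiments often, the delete-and-apply step can be wrapped in a small script. The sketch below is hypothetical and not part of this repository: the function names are made up, and the default engine name, namespace, and manifest path are the example values from the commands above. It prints the commands by default and only calls kubectl when dry_run is False.

```python
import subprocess
from typing import List


def rerun_commands(
    engine: str = "sock-chaos",
    namespace: str = "sock-shop",
    manifest: str = "litmus/chaosengine.yaml",
) -> List[List[str]]:
    """Build the kubectl commands that delete and re-create the chaosengine CR."""
    return [
        ["kubectl", "delete", "chaosengine", engine, "-n", namespace],
        ["kubectl", "apply", "-f", manifest],
    ]


def rerun_experiment(dry_run: bool = True, **kwargs: str) -> List[List[str]]:
    """Print the commands, or execute them in order when dry_run is False."""
    commands = rerun_commands(**kwargs)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops at the first failing kubectl call.
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    rerun_experiment()  # dry run: only prints the two kubectl commands
```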

Generate PDF of the experiment result summary

You can also generate a PDF report of the experiment result summary by using the --report flag as follows:

./manage.py test --report=yes

This generates a PDF report named chaos-report.pdf in the current directory, containing the chaos result summary.

List

Lists all the available Litmus Chaos Experiments in this repo under the ./litmus directory for a particular platform:

./manage.py list --platform <platform-name>

Shutdown

To shut down and destroy the cluster when you're finished:

for kind cluster

./manage.py stop --platform kind

for GKE cluster

./manage.py stop --platform GKE --project {GC_PROJECT}

for EKS cluster

./manage.py stop --platform EKS --name {EKS_CLUSTER_NAME} --awsregion {EKS_REGION_NAME}