From f6ab0090fb9755f1ae8d83cac9aba81137a50b81 Mon Sep 17 00:00:00 2001 From: ksatchit Date: Thu, 21 Nov 2024 11:31:06 +0000 Subject: [PATCH] Automated deployment: Thu Nov 21 11:31:06 UTC 2024 25ff937cba5cf5d30122cf4fad513d12916a99cd --- ROADMAP/index.html | 55 +++++++++++++++++++++++++--------------- search/search_index.json | 2 +- 2 files changed, 36 insertions(+), 21 deletions(-) diff --git a/ROADMAP/index.html b/ROADMAP/index.html index c97b68fe263..8794a1a9ad4 100644 --- a/ROADMAP/index.html +++ b/ROADMAP/index.html @@ -4142,9 +4142,9 @@
-In-Progress (Under Active Development)
+In-Progress (Under Design OR Active Development)
@@ -4236,9 +4236,9 @@
-In-Progress (Under Active Development)
+In-Progress (Under Design OR Active Development)
@@ -4289,49 +4289,64 @@
 Completed
-In-Progress (Under Active Development)
+In-Progress (Under Design OR Active Development)
 Backlog
    - diff --git a/search/search_index.json b/search/search_index.json index 0e65502dd91..352858edbb8 100644 --- a/search/search_index.json +++ b/search/search_index.json @@ -1 +1 @@ -{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"ROADMAP/","title":"Roadmap","text":""},{"location":"ROADMAP/#litmus-roadmap","title":"LITMUS ROADMAP","text":"

    This document captures only the high-level roadmap items. For the detailed backlog, see the issues list.

    "},{"location":"ROADMAP/#completed","title":"Completed","text":""},{"location":"ROADMAP/#in-progress-under-active-development","title":"In-Progress (Under Active Development)","text":""},{"location":"ROADMAP/#backlog","title":"Backlog","text":""},{"location":"experiments/api/contents/","title":"Litmus API Documentation","text":"Name Description References AUTH Server Contains AUTH Server API documentation AUTH Server GraphQL Server Contains GraphQL Server API documentation GraphQL Server"},{"location":"experiments/categories/contents/","title":"Experiments","text":"

    The experiment execution is triggered upon creation of the ChaosEngine resource (various examples of which are provided under the respective experiments). Typically, these ChaosEngines are embedded within the 'steps' of a Litmus Chaos Workflow. However, one may also create the ChaosEngines directly by hand; the chaos-operator reconciles this resource and triggers the experiment execution.
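
    For illustration, a minimal hand-created ChaosEngine might look like the following sketch (the pod-delete experiment, names, and labels here are placeholders); once applied, the chaos-operator reconciles it and launches the experiment:

    # minimal hand-created ChaosEngine (names and labels are placeholders)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\n  namespace: default\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  # namespace, label, and kind of the application under test\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n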

    Provided below are tables with links to the individual experiment docs for easy navigation.

    "},{"location":"experiments/categories/contents/#kubernetes-experiments","title":"Kubernetes Experiments","text":"

    This category contains chaos experiments that apply to resources running on the Kubernetes cluster. It includes the Generic experiments.

    The following Kubernetes chaos experiments are available:

    "},{"location":"experiments/categories/contents/#generic","title":"Generic","text":"

    Chaos actions that apply to generic Kubernetes resources are classified into this category. The following chaos experiments are supported under the Generic Chaos Chart:

    "},{"location":"experiments/categories/contents/#pod-chaos","title":"Pod Chaos","text":"Experiment Name Description User Guide Container Kill Kills the container in the application pod container-kill Disk Fill Fills up the ephemeral storage of a resource disk-fill Pod Autoscaler Scales the application replicas and tests the node autoscaling on the cluster pod-autoscaler Pod CPU Hog Exec Consumes CPU resources on the application container by invoking a utility within the app container base image pod-cpu-hog-exec Pod CPU Hog Consumes CPU resources on the application container pod-cpu-hog Pod Delete Deletes the application pod pod-delete Pod DNS Error Disrupts DNS resolution in the Kubernetes pod pod-dns-error Pod DNS Spoof Spoofs DNS resolution in the Kubernetes pod pod-dns-spoof Pod IO Stress Injects IO stress resources on the application container pod-io-stress Pod Memory Hog Exec Consumes Memory resources on the application container by invoking a utility within the app container base image pod-memory-hog-exec Pod Memory Hog Consumes Memory resources on the application container pod-memory-hog Pod Network Corruption Injects Network Packet Corruption into Application Pod pod-network-corruption Pod Network Duplication Injects Network Packet Duplication into Application Pod pod-network-duplication Pod Network Latency Injects Network latency into Application Pod pod-network-latency Pod Network Loss Injects Network loss into Application Pod pod-network-loss Pod HTTP Latency Injects HTTP latency into Application Pod pod-http-latency Pod HTTP Reset Peer Injects HTTP reset peer into Application Pod pod-http-reset-peer Pod HTTP Status Code Injects HTTP status code chaos into Application Pod pod-http-status-code Pod HTTP Modify Body Injects HTTP modify body into Application Pod pod-http-modify-body Pod HTTP Modify Header Injects HTTP Modify Header into Application Pod pod-http-modify-header"},{"location":"experiments/categories/contents/#node-chaos","title":"Node Chaos","text":"Experiment Name Description User Guide Docker Service Kill Kills the docker service on the application node docker-service-kill Kubelet Service Kill Kills the kubelet service on the application node kubelet-service-kill Node CPU Hog Exhausts CPU resources on the Kubernetes Node node-cpu-hog Node Drain Drains the target node node-drain Node IO Stress Injects IO stress resources on the application node node-io-stress Node Memory Hog Exhausts Memory resources on the Kubernetes Node node-memory-hog Node Restart Restarts the target node node-restart Node Taint Taints the target node node-taint"},{"location":"experiments/categories/contents/#application-chaos","title":"Application Chaos","text":"

    While chaos experiments under the Generic category offer the ability to induce chaos into Kubernetes resources, it is difficult to analyze and conclude whether the induced chaos found a weakness in a given application. The application-specific chaos experiments are built with pre-condition checks and expected outcomes defined for the period after chaos injection. The result of the chaos experiment is determined by matching the actual outcome with the expected outcome.
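
    As a sketch (assuming the target application embeds the Chaos Monkey for Spring Boot, and using a hypothetical service account name), an application-specific experiment can be wired up as follows; CM_PORT is the port on which the Chaos Monkey endpoint is exposed in the app:

    # sketch of an application-specific fault (SA name and labels are placeholders)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: spring-boot-app-kill-sa  # hypothetical SA name\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=spring-boot\"\n    appkind: \"deployment\"\n  experiments:\n  - name: spring-boot-app-kill\n    spec:\n      components:\n        env:\n        # port on which the Chaos Monkey for Spring Boot is exposed\n        - name: CM_PORT\n          value: '8080'\n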

    Experiment Name Description User Guide Spring Boot App Kill Kills the spring boot application spring-boot-app-kill Spring Boot CPU Stress Stresses the CPU of the spring boot application spring-boot-cpu-stress Spring Boot Memory Stress Stresses the memory of the spring boot application spring-boot-memory-stress Spring Boot Latency Injects latency into the spring boot application network spring-boot-latency Spring Boot Exception Raises exceptions in the spring boot application spring-boot-exceptions Spring Boot Faults Injects multiple spring boot faults simultaneously on the target pods spring-boot-faults"},{"location":"experiments/categories/contents/#load-chaos","title":"Load Chaos","text":"

    Load chaos contains chaos experiments that test app/platform service availability. It installs the experiments that can be used to inject load into services such as VMs, Pods, and so on.

    Experiment Name Description User Guide k6 Load Generator Generates load using a single JS script k6-loadgen"},{"location":"experiments/categories/contents/#cloud-infrastructure","title":"Cloud Infrastructure","text":"

    Chaos experiments that inject chaos into the platform resources of Kubernetes are classified into this category. Since the management of platform resources varies significantly across providers, Chaos Charts may be maintained separately for each platform (for example, AWS, GCP, Azure, etc.).

    The following Platform Chaos experiments are available:

    "},{"location":"experiments/categories/contents/#aws","title":"AWS","text":"Experiment Name Description User Guide EC2 Stop By ID Stop the EC2 instance matched by instance id ec2-stop-by-id EC2 Stop By Tag Stop the EC2 instance matched by instance tag ec2-stop-by-tag EBS Loss By ID Detach the EBS volume matched by volume id ebs-loss-by-id EBS Loss By Tag Detach the EBS volume matched by volume tag ebs-loss-by-tag"},{"location":"experiments/categories/contents/#gcp","title":"GCP","text":"Experiment Name Description User Guide GCP VM Instance Stop Stop the gcp vm instance gcp-vm-instance-stop GCP VM Disk Loss Detach the gcp disk gcp-vm-disk-loss"},{"location":"experiments/categories/contents/#azure","title":"Azure","text":"Experiment Name Description User Guide Azure Instance Stop Stop the azure instance azure-instance-stop Azure Disk Loss Detach azure disk from instance azure-disk-loss"},{"location":"experiments/categories/contents/#vmware","title":"VMWare","text":"Experiment Name Description User Guide VM Poweroff Poweroff the vmware VM vm-poweroff"},{"location":"experiments/categories/aws/AWS-experiments-tunables/","title":"AWS experiments tunables","text":"

    It contains the AWS specific experiment tunables.

    "},{"location":"experiments/categories/aws/AWS-experiments-tunables/#managed-nodegroup","title":"Managed Nodegroup","text":"

    It specifies whether the AWS instances are part of managed node groups. If the instances belong to managed node groups, then provide MANAGED_NODEGROUP as enable, else provide it as disable. The default value is disable.

    Use the following example to tune this:

    # provide as enable if instances are part of managed node groups\n# it is applicable for [ec2-terminate-by-id, ec2-terminate-by-tag]\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ec2-terminate-by-tag-sa\n  experiments:\n  - name: ec2-terminate-by-tag\n    spec:\n      components:\n        env:\n        # if instance is part of a managed node-group\n        # supports enable and disable values, default value: disable\n        - name: MANAGED_NODEGROUP\n          value: 'enable'\n        # region for the ec2 instance\n        - name: REGION\n          value: '<region for instances>'\n        # tag of the ec2 instance\n        - name: EC2_INSTANCE_TAG\n          value: 'key:value'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws/AWS-experiments-tunables/#mutiple-iterations-of-chaos","title":"Multiple Iterations Of Chaos","text":"

    Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between each iteration of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ec2-terminate-by-tag-sa\n  experiments:\n  - name: ec2-terminate-by-tag\n    spec:\n      components:\n        env:\n         # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '15'\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: REGION\n          value: '<region for instances>'\n        - name: EC2_INSTANCE_TAG\n          value: 'key:value'\n
    "},{"location":"experiments/categories/aws/ebs-loss-by-id/","title":"EBS Loss By ID","text":""},{"location":"experiments/categories/aws/ebs-loss-by-id/#introduction","title":"Introduction","text":"

    Scenario: Detach EBS Volume

    "},{"location":"experiments/categories/aws/ebs-loss-by-id/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/aws/ebs-loss-by-id/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/aws/ebs-loss-by-id/#default-validations","title":"Default Validations","text":"View the default validations "},{"location":"experiments/categories/aws/ebs-loss-by-id/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: ebs-loss-by-id-sa\n  namespace: default\n  labels:\n    name: ebs-loss-by-id-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: ebs-loss-by-id-sa\n  labels:\n    name: ebs-loss-by-id-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Perform CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap & secret details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating execs to run commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: ebs-loss-by-id-sa\n  labels:\n    name: ebs-loss-by-id-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: ebs-loss-by-id-sa\nsubjects:\n- kind: ServiceAccount\n  name: ebs-loss-by-id-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/aws/ebs-loss-by-id/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes EBS_VOLUME_ID Comma separated list of volume IDs subjected to ebs detach chaos REGION The region name for the target volumes

    Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The time duration between the attachment and detachment of the volumes (sec) Defaults to 30s SEQUENCE It defines sequence of chaos execution for multiple volumes Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec
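
    As a sketch of the optional fields above (values illustrative), the volumes can be detached one at a time with a ramp period before and after injection:

    # serial detach of multiple volumes with a ramp period (illustrative values)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ebs-loss-by-id-sa\n  experiments:\n  - name: ebs-loss-by-id\n    spec:\n      components:\n        env:\n        # detach the volumes one after the other\n        - name: SEQUENCE\n          value: 'serial'\n        # wait 10s before and after chaos injection\n        - name: RAMP_TIME\n          value: '10'\n        - name: EBS_VOLUME_ID\n          value: 'ebs-vol-1,ebs-vol-2'\n        - name: REGION\n          value: '<region for EBS_VOLUME_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n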

    "},{"location":"experiments/categories/aws/ebs-loss-by-id/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/aws/ebs-loss-by-id/#common-and-aws-specific-tunables","title":"Common and AWS specific tunables","text":"

    Refer to the common attributes and AWS specific tunables to tune the common tunables for all experiments and the AWS specific tunables.

    "},{"location":"experiments/categories/aws/ebs-loss-by-id/#detach-volumes-by-id","title":"Detach Volumes By ID","text":"

    It contains a comma-separated list of volume IDs subjected to EBS detach chaos. It can be tuned via the EBS_VOLUME_ID ENV.

    Use the following example to tune this:

    # contains ebs volume id \napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ebs-loss-by-id-sa\n  experiments:\n  - name: ebs-loss-by-id\n    spec:\n      components:\n        env:\n        # id of the ebs volume\n        - name: EBS_VOLUME_ID\n          value: 'ebs-vol-1'\n        # region for the ebs volume\n        - name: REGION\n          value: '<region for EBS_VOLUME_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/","title":"EBS Loss By Tag","text":""},{"location":"experiments/categories/aws/ebs-loss-by-tag/#introduction","title":"Introduction","text":"

    Scenario: Detach EBS Volume

    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#default-validations","title":"Default Validations","text":"View the default validations "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: ebs-loss-by-tag-sa\n  namespace: default\n  labels:\n    name: ebs-loss-by-tag-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: ebs-loss-by-tag-sa\n  labels:\n    name: ebs-loss-by-tag-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Perform CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap & secret details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating execs to run commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: ebs-loss-by-tag-sa\n  labels:\n    name: ebs-loss-by-tag-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: ebs-loss-by-tag-sa\nsubjects:\n- kind: ServiceAccount\n  name: ebs-loss-by-tag-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes EBS_VOLUME_TAG Provide the common tag for target volumes. It should be provided in the form key:value (Ex: 'team:devops') REGION The region name for the target volumes

    Variables Description Notes VOLUME_AFFECTED_PERC The Percentage of total ebs volumes to target Defaults to 0 (corresponds to 1 volume), provide numeric value only TOTAL_CHAOS_DURATION The time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The time duration between the attachment and detachment of the volumes (sec) Defaults to 30s SEQUENCE It defines sequence of chaos execution for multiple volumes Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/aws/ebs-loss-by-tag/#common-and-aws-specific-tunables","title":"Common and AWS specific tunables","text":"

    Refer to the common attributes and AWS specific tunables to tune the common tunables for all experiments and the AWS specific tunables.

    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#target-single-volume","title":"Target single volume","text":"

    It will detach a random single ebs volume with the given EBS_VOLUME_TAG tag and REGION region.

    Use the following example to tune this:

    # contains the tags for the ebs volumes \napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ebs-loss-by-tag-sa\n  experiments:\n  - name: ebs-loss-by-tag\n    spec:\n      components:\n        env:\n        # tag of the ebs volume\n        - name: EBS_VOLUME_TAG\n          value: 'key:value'\n        # region for the ebs volume\n        - name: REGION\n          value: '<region for EBS_VOLUME_TAG>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#target-percent-of-volumes","title":"Target Percent of volumes","text":"

    It will detach the VOLUME_AFFECTED_PERC percentage of ebs volumes with the given EBS_VOLUME_TAG tag and REGION region.

    Use the following example to tune this:

    # target percentage of the ebs volumes with the provided tag\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ebs-loss-by-tag-sa\n  experiments:\n  - name: ebs-loss-by-tag\n    spec:\n      components:\n        env:\n        # percentage of ebs volumes filter by tag\n        - name: VOLUME_AFFECTED_PERC\n          value: '100'\n        # tag of the ebs volume\n        - name: EBS_VOLUME_TAG\n          value: 'key:value'\n        # region for the ebs volume\n        - name: REGION\n          value: '<region for EBS_VOLUME_TAG>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws/ec2-stop-by-id/","title":"EC2 Stop By ID","text":""},{"location":"experiments/categories/aws/ec2-stop-by-id/#introduction","title":"Introduction","text":"

    Scenario: Stop EC2 Instance

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/aws/ec2-stop-by-id/#warning","title":"WARNING","text":"

    If the target EC2 instance is part of a self-managed nodegroup: make sure to drain the target node if any application is running on it, and cordon the target node before running the experiment so that the experiment pods are not scheduled on it.

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#default-validations","title":"Default Validations","text":"View the default validations "},{"location":"experiments/categories/aws/ec2-stop-by-id/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: ec2-stop-by-id-sa\n  namespace: default\n  labels:\n    name: ec2-stop-by-id-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: ec2-stop-by-id-sa\n  labels:\n    name: ec2-stop-by-id-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Perform CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap & secret details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating execs to run commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for the experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: ec2-stop-by-id-sa\n  labels:\n    name: ec2-stop-by-id-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: ec2-stop-by-id-sa\nsubjects:\n- kind: ServiceAccount\n  name: ec2-stop-by-id-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes EC2_INSTANCE_ID Instance ID of the target ec2 instance. Multiple IDs can also be provided as comma(,) separated values Multiple IDs can be provided as id1,id2 REGION The region name of the target instance

    Variables Description Notes TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive instance stops. Defaults to 30s MANAGED_NODEGROUP Set to enable if the target instance is part of self-managed nodegroups Defaults to disable SEQUENCE It defines the sequence of chaos execution for multiple instances Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/aws/ec2-stop-by-id/#common-and-aws-specific-tunables","title":"Common and AWS specific tunables","text":"

    Refer to the common attributes and AWS specific tunables to tune the common tunables for all experiments and the AWS specific tunables.

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#stop-instances-by-id","title":"Stop Instances By ID","text":"

    It contains a comma-separated list of instance IDs subjected to EC2 stop chaos. It can be tuned via the EC2_INSTANCE_ID ENV.

    Use the following example to tune this:

    # contains the instance id to be stopped\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ec2-stop-by-id-sa\n  experiments:\n  - name: ec2-stop-by-id\n    spec:\n      components:\n        env:\n        # id of the ec2 instance\n        - name: EC2_INSTANCE_ID\n          value: 'instance-1'\n        # region for the ec2 instance\n        - name: REGION\n          value: '<region for EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
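
    A variant of the above (IDs illustrative) stops multiple instances in one run by passing comma-separated IDs:

    # contains a comma separated list of instance ids to be stopped (illustrative)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ec2-stop-by-id-sa\n  experiments:\n  - name: ec2-stop-by-id\n    spec:\n      components:\n        env:\n        # ids of the ec2 instances (all in the same region)\n        - name: EC2_INSTANCE_ID\n          value: 'instance-1,instance-2'\n        # region for the ec2 instances\n        - name: REGION\n          value: '<region for EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n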
    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/","title":"EC2 Stop By Tag","text":""},{"location":"experiments/categories/aws/ec2-stop-by-tag/#introduction","title":"Introduction","text":"

    Scenario: Stop EC2 Instance

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#warning","title":"WARNING","text":"

    If the target EC2 instance is part of a self-managed nodegroup: make sure to drain the target node if any application is running on it, and cordon the target node before running the experiment so that the experiment pods are not scheduled on it.

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#default-validations","title":"Default Validations","text":"View the default validations "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: ec2-stop-by-tag-sa\n  namespace: default\n  labels:\n    name: ec2-stop-by-tag-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: ec2-stop-by-tag-sa\n  labels:\n    name: ec2-stop-by-tag-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Perform CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap & secret details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating execs to run commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for the experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: ec2-stop-by-tag-sa\n  labels:\n    name: ec2-stop-by-tag-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: ec2-stop-by-tag-sa\nsubjects:\n- kind: ServiceAccount\n  name: ec2-stop-by-tag-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes EC2_INSTANCE_TAG Instance Tag to filter the target ec2 instance. The EC2_INSTANCE_TAG should be provided as key:value ex: team:devops REGION The region name of the target instance

    Variables Description Notes INSTANCE_AFFECTED_PERC The Percentage of total ec2 instances to target Defaults to 0 (corresponds to 1 instance), provide numeric value only TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive instance termination. Defaults to 30s MANAGED_NODEGROUP Set to enable if the target instance is part of self-managed nodegroups Defaults to disable SEQUENCE It defines the sequence of chaos execution for multiple instances Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/aws/ec2-stop-by-tag/#common-and-aws-specific-tunables","title":"Common and AWS specific tunables","text":"

    Refer to the common attributes and AWS specific tunables to tune the common tunables for all experiments and the AWS specific tunables.

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#target-single-instance","title":"Target single instance","text":"

    It will stop a random single ec2 instance with the given EC2_INSTANCE_TAG tag and the REGION region.

    Use the following example to tune this:

    # target the ec2 instances with matching tag\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ec2-stop-by-tag-sa\n  experiments:\n  - name: ec2-stop-by-tag\n    spec:\n      components:\n        env:\n        # tag of the ec2 instance\n        - name: EC2_INSTANCE_TAG\n          value: 'key:value'\n        # region for the ec2 instance\n        - name: REGION\n          value: '<region for instance>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#target-percent-of-instances","title":"Target Percent of instances","text":"

    It will stop the INSTANCE_AFFECTED_PERC percentage of ec2 instances with the given EC2_INSTANCE_TAG tag and REGION region.

    Use the following example to tune this:

    # percentage of ec2 instances to be stopped, filtered by the provided tag\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ec2-stop-by-tag-sa\n  experiments:\n  - name: ec2-stop-by-tag\n    spec:\n      components:\n        env:\n        # percentage of ec2 instances filtered by tags\n        - name: INSTANCE_AFFECTED_PERC\n          value: '100'\n        # tag of the ec2 instance\n        - name: EC2_INSTANCE_TAG\n          value: 'key:value'\n        # region for the ec2 instance\n        - name: REGION\n          value: '<region for instance>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/AWS-SSM-experiments-tunables/","title":"AWS SSM experiments tunables","text":"

    It contains the aws-ssm specific experiment tunables.

    "},{"location":"experiments/categories/aws-ssm/AWS-SSM-experiments-tunables/#cpu-cores","title":"CPU Cores","text":"

    It stresses CPU_CORE cpu cores of the EC2_INSTANCE_ID ec2 instance in the REGION region for the TOTAL_CHAOS_DURATION duration.

    Use the following example to tune this:

    # provide the cpu cores to stress the ec2 instance\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # cpu cores for the stress\n        - name: CPU_CORE\n          value: '1'\n        # id of the ec2 instance\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/AWS-SSM-experiments-tunables/#memory-percentage","title":"Memory Percentage","text":"

    It stresses the MEMORY_PERCENTAGE percentage of free memory of the EC2_INSTANCE_ID ec2 instance in the REGION region for the TOTAL_CHAOS_DURATION duration.

    Use the following example to tune this:

    # provide the memory percentage to stress the instance memory\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # memory percentage for the stress\n        - name: MEMORY_PERCENTAGE\n          value: '80'\n        # id of the ec2 instance\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/AWS-SSM-experiments-tunables/#ssm-docs","title":"SSM Docs","text":"

    It contains the details of the SSM docs, i.e. the name, type, format, and path of the SSM docs.

    Use the following example to tune this:

    # provide the details of the ssm document\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # name of the ssm docs\n        - name: DOCUMENT_NAME\n          value: 'AWS-SSM-Doc'\n        # format of the ssm docs\n        - name: DOCUMENT_FORMAT\n          value: 'YAML'\n        # type of the ssm docs\n        - name: DOCUMENT_TYPE\n          value: 'command'\n        # path of the ssm docs\n        - name: DOCUMENT_PATH\n          value: ''\n        # id of the ec2 instance\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/AWS-SSM-experiments-tunables/#workers-count","title":"Workers Count","text":"

    It contains the NUMBER_OF_WORKERS workers for the stress.

    Use the following example to tune this:

    # worker details used to stress the instance\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # number of workers used for stress\n        - name: NUMBER_OF_WORKERS\n          value: '1'\n        # id of the ec2 instance\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/AWS-SSM-experiments-tunables/#mutiple-iterations-of-chaos","title":"Multiple Iterations Of Chaos","text":"

    Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between each iteration of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '15'\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: CPU_CORE\n          value: '1'\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01'\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n
    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/","title":"AWS SSM Chaos By ID","text":""},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#introduction","title":"Introduction","text":"

    Scenario: AWS SSM Chaos

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#default-validations","title":"Default Validations","text":"View the default validations "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: aws-ssm-chaos-by-id-sa\n  namespace: default\n  labels:\n    name: aws-ssm-chaos-by-id-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: aws-ssm-chaos-by-id-sa\n  labels:\n    name: aws-ssm-chaos-by-id-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Perform CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap & secret details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating execs to run commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: aws-ssm-chaos-by-id-sa\n  labels:\n    name: aws-ssm-chaos-by-id-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: aws-ssm-chaos-by-id-sa\nsubjects:\n- kind: ServiceAccount\n  name: aws-ssm-chaos-by-id-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes EC2_INSTANCE_ID Instance ID of the target ec2 instance. Multiple IDs can also be provided as comma(,) separated values Multiple IDs can be provided as id1,id2 REGION The region name of the target instance

    Variables Description Notes TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive chaos injections Defaults to 60s AWS_SHARED_CREDENTIALS_FILE Provide the path for aws secret credentials Defaults to /tmp/cloud_config.yml DOCUMENT_NAME Provide the name of added ssm docs (if not using the default docs) Defaults to LitmusChaos-AWS-SSM-Doc DOCUMENT_FORMAT Provide the format of the ssm docs. It can be YAML or JSON Defaults to YAML DOCUMENT_TYPE Provide the document type of added ssm docs (if not using the default docs) Defaults to Command DOCUMENT_PATH Provide the document path if added using configmaps Defaults to the litmus ssm docs path INSTALL_DEPENDENCIES Select to install dependencies used to run stress-ng with default docs. It can be either True or False Defaults to True NUMBER_OF_WORKERS Provide the number of workers to run stress-chaos with default ssm docs Defaults to 1 MEMORY_PERCENTAGE Provide the memory consumption in percentage on the instance for default ssm docs Defaults to 80 CPU_CORE Provide the number of cpu cores to run stress-chaos on EC2 with default ssm docs Defaults to 0. It means it'll consume all the available cpu cores on the instance SEQUENCE It defines the sequence of chaos execution for multiple instances Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec
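
    For instance, if the target instance already has stress-ng available, the dependency installation of the default SSM docs can be skipped; a sketch (values illustrative) assuming the default litmus SSM docs are in use:

    # skip dependency installation on the target instance (illustrative values)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # skip installing stress-ng on the instance\n        - name: INSTALL_DEPENDENCIES\n          value: 'False'\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01'\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n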

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#common-and-aws-ssm-specific-tunables","title":"Common and AWS-SSM specific tunables","text":"

    Refer to the common attributes and AWS-SSM specific tunables to tune the common tunables for all experiments and the AWS-SSM specific tunables.

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#stress-instances-by-id","title":"Stress Instances By ID","text":"

    It contains a comma-separated list of instance IDs subjected to AWS SSM stress chaos. It can be tuned via the EC2_INSTANCE_ID ENV.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # comma separated list of ec2 instance id(s)\n        # all instances should belong to the same region (REGION)\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01,instance-02'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/","title":"AWS SSM Chaos By Tag","text":""},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#introduction","title":"Introduction","text":"

    Scenario: AWS SSM Chaos

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#default-validations","title":"Default Validations","text":"View the default validations "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: aws-ssm-chaos-by-tag-sa\n  namespace: default\n  labels:\n    name: aws-ssm-chaos-by-tag-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: aws-ssm-chaos-by-tag-sa\n  labels:\n    name: aws-ssm-chaos-by-tag-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Perform CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap & secret details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating execs to run commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: aws-ssm-chaos-by-tag-sa\n  labels:\n    name: aws-ssm-chaos-by-tag-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: aws-ssm-chaos-by-tag-sa\nsubjects:\n- kind: ServiceAccount\n  name: aws-ssm-chaos-by-tag-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes EC2_INSTANCE_TAG Instance Tag to filter the target ec2 instance The EC2_INSTANCE_TAG should be provided as key:value ex: chaos:ssm REGION The region name of the target instance

    Variables Description Notes INSTANCE_AFFECTED_PERC The Percentage of total ec2 instances to target Defaults to 0 (corresponds to 1 instance), provide numeric value only TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive chaos injections Defaults to 60s AWS_SHARED_CREDENTIALS_FILE Provide the path for aws secret credentials Defaults to /tmp/cloud_config.yml DOCUMENT_NAME Provide the name of added ssm docs (if not using the default docs) Defaults to LitmusChaos-AWS-SSM-Doc DOCUMENT_FORMAT Provide the format of the ssm docs. It can be YAML or JSON Defaults to YAML DOCUMENT_TYPE Provide the document type of added ssm docs (if not using the default docs) Defaults to Command DOCUMENT_PATH Provide the document path if added using configmaps Defaults to the litmus ssm docs path INSTALL_DEPENDENCIES Select to install dependencies used to run stress-ng with default docs. It can be either True or False Defaults to True NUMBER_OF_WORKERS Provide the number of workers to run stress-chaos with default ssm docs Defaults to 1 MEMORY_PERCENTAGE Provide the memory consumption in percentage on the instance for default ssm docs Defaults to 80 CPU_CORE Provide the number of cpu cores to run stress-chaos on EC2 with default ssm docs Defaults to 0. It means it'll consume all the available cpu cores on the instance SEQUENCE It defines the sequence of chaos execution for multiple instances Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#common-and-aws-ssm-specific-tunables","title":"Common and AWS-SSM specific tunables","text":"

    Refer to the common attributes and AWS-SSM specific tunables to tune the common tunables for all experiments and the AWS-SSM specific tunables.

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#target-single-instance","title":"Target single instance","text":"

    It will stress a random single ec2 instance with the given EC2_INSTANCE_TAG tag and REGION region.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-tag-sa\n  experiments:\n  - name: aws-ssm-chaos-by-tag\n    spec:\n      components:\n        env:\n        # tag of the ec2 instances\n        - name: EC2_INSTANCE_TAG\n          value: 'key:value'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the ec2 instances>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#target-percent-of-instances","title":"Target Percent of instances","text":"

    It will stress the INSTANCE_AFFECTED_PERC percentage of ec2 instances with the given EC2_INSTANCE_TAG tag and REGION region.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-tag-sa\n  experiments:\n  - name: aws-ssm-chaos-by-tag\n    spec:\n      components:\n        env:\n        # percentage of the ec2 instances filtered by tags\n        - name: INSTANCE_AFFECTED_PERC\n          value: '100'\n        # tag of the ec2 instances\n        - name: EC2_INSTANCE_TAG\n          value: 'key:value'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the ec2 instances>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/azure/azure-disk-loss/","title":"Azure Disk Loss","text":""},{"location":"experiments/categories/azure/azure-disk-loss/#introduction","title":"Introduction","text":"

Scenario: Detach the virtual disk from the instance

    "},{"location":"experiments/categories/azure/azure-disk-loss/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/azure/azure-disk-loss/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/azure/azure-disk-loss/#default-validations","title":"Default Validations","text":"View the default validations "},{"location":"experiments/categories/azure/azure-disk-loss/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: azure-disk-loss-sa\n  namespace: default\n  labels:\n    name: azure-disk-loss-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: azure-disk-loss-sa\n  namespace: default\n  labels:\n    name: azure-disk-loss-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/azure/azure-disk-loss/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes VIRTUAL_DISK_NAMES Names of the virtual disks to target. Provide comma-separated names for multiple disks RESOURCE_GROUP The resource group of the target disk(s)

Variables Description Notes SCALE_SET Whether the disk is connected to a Scale Set instance Accepts \"enable\"/\"disable\". Default is \"disable\" TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive disk detachments Defaults to 30s SEQUENCE It defines the sequence of chaos execution for multiple disks Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/azure/azure-disk-loss/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/azure/azure-disk-loss/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/azure/azure-disk-loss/#detach-virtual-disks-by-name","title":"Detach Virtual Disks By Name","text":"

It contains a comma-separated list of disk names subjected to disk loss chaos. It can be tuned via the VIRTUAL_DISK_NAMES ENV.

    Use the following example to tune this:

    # detach multiple azure disks by their names \napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: azure-disk-loss-sa\n  experiments:\n  - name: azure-disk-loss\n    spec:\n      components:\n        env:\n        # comma separated names of the azure disks attached to VMs\n        - name: VIRTUAL_DISK_NAMES\n          value: 'disk-01,disk-02'\n        # name of the resource group\n        - name: RESOURCE_GROUP\n          value: '<resource group of VIRTUAL_DISK_NAMES>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/azure/azure-disk-loss/#detach-virtual-disks-attached-to-scale-set-instances-by-name","title":"Detach Virtual Disks Attached to Scale Set Instances By Name","text":"

It contains a comma-separated list of disk names, attached to scale set instances, subjected to disk loss chaos. It can be tuned via the VIRTUAL_DISK_NAMES and SCALE_SET ENVs.

    Use the following example to tune this:

    # detach multiple azure disks attached to scale set VMs by their names\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: azure-disk-loss-sa\n  experiments:\n  - name: azure-disk-loss\n    spec:\n      components:\n        env:\n        # comma separated names of the azure disks attached to scaleset VMs\n        - name: VIRTUAL_DISK_NAMES\n          value: 'disk-01,disk-02'\n        # name of the resource group\n        - name: RESOURCE_GROUP\n          value: '<resource group of VIRTUAL_DISK_NAMES>'\n        # VM belongs to scaleset or not\n        - name: SCALE_SET\n          value: 'enable'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/azure/azure-disk-loss/#multiple-iterations-of-chaos","title":"Multiple Iterations Of Chaos","text":"

Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: azure-disk-loss-sa\n  experiments:\n  - name: azure-disk-loss\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '10'\n         # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: VIRTUAL_DISK_NAMES\n          value: 'disk-01,disk-02'\n        - name: RESOURCE_GROUP\n          value: '<resource group of VIRTUAL_DISK_NAMES>'\n
    "},{"location":"experiments/categories/azure/azure-instance-stop/","title":"Azure Instance Stop","text":""},{"location":"experiments/categories/azure/azure-instance-stop/#introduction","title":"Introduction","text":"

    Scenario: Stop the azure instance

    "},{"location":"experiments/categories/azure/azure-instance-stop/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/azure/azure-instance-stop/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/azure/azure-instance-stop/#default-validations","title":"Default Validations","text":"View the default validations "},{"location":"experiments/categories/azure/azure-instance-stop/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: azure-instance-stop-sa\n  namespace: default\n  labels:\n    name: azure-instance-stop-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: azure-instance-stop-sa\n  labels:\n    name: azure-instance-stop-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: azure-instance-stop-sa\n  labels:\n    name: azure-instance-stop-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: azure-instance-stop-sa\nsubjects:\n- kind: ServiceAccount\n  name: azure-instance-stop-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/azure/azure-instance-stop/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes AZURE_INSTANCE_NAMES Names of the target azure instances For AKS nodes, the instance names are from the scale set section in Azure and not the node names from the AKS node pool RESOURCE_GROUP The resource group of the target instance(s)

Variables Description Notes SCALE_SET Whether the instances are part of a Scale Set Accepts \"enable\"/\"disable\". Default is \"disable\" TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive instance power-offs Defaults to 30s SEQUENCE It defines the sequence of chaos execution for multiple instances Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/azure/azure-instance-stop/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/azure/azure-instance-stop/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/azure/azure-instance-stop/#stop-instances-by-name","title":"Stop Instances By Name","text":"

It contains a comma-separated list of instance names subjected to instance stop chaos. It can be tuned via the AZURE_INSTANCE_NAMES ENV.

    Use the following example to tune this:

    ## contains the azure instance details\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: azure-instance-stop-sa\n  experiments:\n  - name: azure-instance-stop\n    spec:\n      components:\n        env:\n        # comma separated list of azure instance names\n        - name: AZURE_INSTANCE_NAMES\n          value: 'instance-01,instance-02'\n        # name of the resource group\n        - name: RESOURCE_GROUP\n          value: '<resource group of AZURE_INSTANCE_NAME>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/azure/azure-instance-stop/#stop-scale-set-instances","title":"Stop Scale Set Instances","text":"

It contains a comma-separated list of instance names, belonging to a Scale Set or AKS, subjected to instance stop chaos. It can be tuned via the SCALE_SET ENV.

    Use the following example to tune this:

    ## contains the azure instance details for scale set instances or AKS nodes\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: azure-instance-stop-sa\n  experiments:\n  - name: azure-instance-stop\n    spec:\n      components:\n        env:\n        # comma separated list of azure instance names\n        - name: AZURE_INSTANCE_NAMES\n          value: 'instance-01,instance-02'\n        # name of the resource group\n        - name: RESOURCE_GROUP\n          value: '<resource group of Scale set>'\n        # accepts enable/disable value. default is disable\n        - name: SCALE_SET\n          value: 'enable'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/azure/azure-instance-stop/#multiple-iterations-of-chaos","title":"Multiple Iterations Of Chaos","text":"

Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: azure-instance-stop-sa\n  experiments:\n  - name: azure-instance-stop\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '10'\n         # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: AZURE_INSTANCE_NAMES\n          value: 'instance-01,instance-02'\n        - name: RESOURCE_GROUP\n          value: '<resource group of AZURE_INSTANCE_NAME>'\n
    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/","title":"Common tunables for all experiments","text":"

It contains tunables that are common to all the experiments. These tunables can be provided at .spec.experiments[*].spec.components.env in the chaosengine.

    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/#duration-of-the-chaos","title":"Duration of the chaos","text":"

It defines the total time duration of the chaos injection. It can be tuned with the TOTAL_CHAOS_DURATION ENV. It is provided in seconds.

    Use the following example to tune this:

    # define the total chaos duration\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/#ramp-time","title":"Ramp Time","text":"

It defines the period to wait before and after the injection of chaos. It can be tuned with the RAMP_TIME ENV. It is provided in seconds.

    Use the following example to tune this:

    # waits for the ramp time before and after injection of chaos \napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # waits for the time interval before and after injection of chaos\n        - name: RAMP_TIME\n          value: '10' # in seconds\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/#sequence-of-chaos-execution","title":"Sequence of chaos execution","text":"

It defines the sequence of the chaos execution in the case of multiple targets. It can be tuned with the SEQUENCE ENV. It supports the following modes: serial and parallel. The default mode is parallel.

    Use the following example to tune this:

# define the order of execution of chaos in case of multiple targets\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # define the sequence of execution of chaos in case of multiple targets\n        # supports: serial, parallel. default: parallel\n        - name: SEQUENCE\n          value: 'parallel'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/#name-of-chaos-library","title":"Name of chaos library","text":"

    It defines the name of the chaos library used for the chaos injection. It can be tuned with the LIB ENV.

    Use the following example to tune this:

    # lib for the chaos injection\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # defines the name of the chaoslib used for the experiment\n        - name: LIB\n          value: 'litmus'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/#instance-id","title":"Instance ID","text":"

    It defines a user-defined string that holds metadata/info about the current run/instance of chaos. Ex: 04-05-2020-9-00. This string is appended as a suffix in the chaosresult CR name. It can be tuned with INSTANCE_ID ENV.

    Use the following example to tune this:

    # provide to append user-defined suffix in the end of chaosresult name\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # user-defined string appended as suffix in the chaosresult name\n        - name: INSTANCE_ID\n          value: '123'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/#image-used-by-the-helper-pod","title":"Image used by the helper pod","text":"

It defines the image used to launch the helper pod, where applicable. It can be tuned with the LIB_IMAGE ENV. It is supported by the [container-kill, network-experiments, stress-experiments, dns-experiments, disk-fill, kubelet-service-kill, docker-service-kill, node-restart] experiments.

    Use the following example to tune this:

# it contains the lib image used for the helper pod\n# it supports [container-kill, network-experiments, stress-experiments, dns-experiments, disk-fill,\n# kubelet-service-kill, docker-service-kill, node-restart] experiments\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        env:\n        # name of the lib image\n        - name: LIB_IMAGE\n          value: 'litmuschaos/go-runner:latest'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/","title":"GCP VM Disk Loss By Label","text":""},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#introduction","title":"Introduction","text":"

Scenario: Detach the gcp disk

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#default-validations","title":"Default Validations","text":"View the default validations "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: gcp-vm-disk-loss-by-label-sa\n  namespace: default\n  labels:\n    name: gcp-vm-disk-loss-by-label-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: gcp-vm-disk-loss-by-label-sa\n  labels:\n    name: gcp-vm-disk-loss-by-label-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: gcp-vm-disk-loss-by-label-sa\n  labels:\n    name: gcp-vm-disk-loss-by-label-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: gcp-vm-disk-loss-by-label-sa\nsubjects:\n- kind: ServiceAccount\n  name: gcp-vm-disk-loss-by-label-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes GCP_PROJECT_ID The ID of the GCP project the disk volumes are a part of All the target disk volumes should belong to a single GCP project DISK_VOLUME_LABEL Label of the targeted non-boot persistent disk volume The DISK_VOLUME_LABEL should be provided as key:value or key if the corresponding value is empty ex: disk:target-disk ZONES The zone of the target disk volumes Only one zone can be provided i.e. all target disks should lie in the same zone

Variables Description Notes TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive chaos iterations Defaults to 30s DISK_AFFECTED_PERC The percentage of total disks filtered using the label to target Defaults to 0 (corresponds to 1 disk), provide numeric value only SEQUENCE It defines the sequence of chaos execution for multiple disks Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#detach-volumes-by-label","title":"Detach Volumes By Label","text":"

    It contains the label of disk volumes to be subjected to disk loss chaos. It will detach all the disks with the label DISK_VOLUME_LABEL in zone ZONES within the GCP_PROJECT_ID project. It re-attaches the disk volume after waiting for the specified TOTAL_CHAOS_DURATION duration.

    NOTE: The DISK_VOLUME_LABEL accepts only one label and ZONES also accepts only one zone name. Therefore, all the disks must lie in the same zone.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-disk-loss-by-label-sa\n  experiments:\n  - name: gcp-vm-disk-loss-by-label\n    spec:\n      components:\n        env:\n        - name: DISK_VOLUME_LABEL\n          value: 'disk:target-disk'\n\n        - name: ZONES\n          value: 'us-east1-b'\n\n        - name: GCP_PROJECT_ID\n          value: 'my-project-4513'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
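
By default, only one of the disks filtered by the label is targeted (DISK_AFFECTED_PERC defaults to 0). A minimal sketch that widens the blast radius to all the filtered disks via the DISK_AFFECTED_PERC ENV:

apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-disk-loss-by-label-sa\n  experiments:\n  - name: gcp-vm-disk-loss-by-label\n    spec:\n      components:\n        env:\n        # percentage of the disks filtered by the label to target\n        # defaults to 0 (corresponds to 1 disk)\n        - name: DISK_AFFECTED_PERC\n          value: '100'\n\n        - name: DISK_VOLUME_LABEL\n          value: 'disk:target-disk'\n\n        - name: ZONES\n          value: 'us-east1-b'\n\n        - name: GCP_PROJECT_ID\n          value: 'my-project-4513'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n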
    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#mutiple-iterations-of-chaos","title":"Mutiple Iterations Of Chaos","text":"

Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-disk-loss-by-label-sa\n  experiments:\n  - name: gcp-vm-disk-loss-by-label\n    spec:\n      components:\n        env:\n        - name: CHAOS_INTERVAL\n          value: '15'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n\n        - name: DISK_VOLUME_LABEL\n          value: 'disk:target-disk'\n\n        - name: ZONES\n          value: 'us-east1-b'\n\n        - name: GCP_PROJECT_ID\n          value: 'my-project-4513'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/","title":"GCP VM Disk Loss","text":""},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#introduction","title":"Introduction","text":"

Scenario: Detach the gcp disk

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#default-validations","title":"Default Validations","text":"View the default validations "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: gcp-vm-disk-loss-sa\n  namespace: default\n  labels:\n    name: gcp-vm-disk-loss-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: gcp-vm-disk-loss-sa\n  labels:\n    name: gcp-vm-disk-loss-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: gcp-vm-disk-loss-sa\n  labels:\n    name: gcp-vm-disk-loss-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: gcp-vm-disk-loss-sa\nsubjects:\n- kind: ServiceAccount\n  name: gcp-vm-disk-loss-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes GCP_PROJECT_ID The ID of the GCP project the disk volumes are a part of All the target disk volumes should belong to a single GCP project DISK_VOLUME_NAMES Target non-boot persistent disk volume names Multiple disk volume names can be provided as disk1,disk2,... ZONES The zones of the respective target disk volumes Provide the zone for every target disk name as zone1,zone2... in the respective order of DISK_VOLUME_NAMES DEVICE_NAMES The device names of the respective target disk volumes Provide the device name for every target disk name as deviceName1,deviceName2... in the respective order of DISK_VOLUME_NAMES

Variables Description Notes TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive chaos iterations Defaults to 30s SEQUENCE It defines the sequence of chaos execution for multiple disks Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#detach-volumes-by-names","title":"Detach Volumes By Names","text":"

It contains a comma-separated list of volume names subjected to disk loss chaos. It will detach all the disks with the given DISK_VOLUME_NAMES disk names, corresponding ZONES zone names, and DEVICE_NAMES device names in the GCP_PROJECT_ID project. It re-attaches the volume after waiting for the specified TOTAL_CHAOS_DURATION duration.

    NOTE: The DISK_VOLUME_NAMES contains multiple comma-separated disk names. The comma-separated zone names should be provided in the same order as disk names.

    Use the following example to tune this:

    ## details of the gcp disk\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-disk-loss-sa\n  experiments:\n  - name: gcp-vm-disk-loss\n    spec:\n      components:\n        env:\n        # comma separated list of disk volume names\n        - name: DISK_VOLUME_NAMES\n          value: 'disk-01,disk-02'\n        # comma separated list of zone names corresponds to the DISK_VOLUME_NAMES\n        # it should be provided in same order of DISK_VOLUME_NAMES\n        - name: ZONES\n          value: 'zone-01,zone-02'\n        # comma separated list of device names corresponds to the DISK_VOLUME_NAMES\n        # it should be provided in same order of DISK_VOLUME_NAMES\n        - name: DEVICE_NAMES\n          value: 'device-01,device-02'\n        # gcp project id to which disk volume belongs\n        - name: GCP_PROJECT_ID\n          value: 'project-id'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#mutiple-iterations-of-chaos","title":"Mutiple Iterations Of Chaos","text":"

Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-disk-loss-sa\n  experiments:\n  - name: gcp-vm-disk-loss\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '15'\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: DISK_VOLUME_NAMES\n          value: 'disk-01,disk-02'\n        - name: ZONES\n          value: 'zone-01,zone-02'\n        - name: DEVICE_NAMES\n          value: 'device-01,device-02'\n        - name: GCP_PROJECT_ID\n          value: 'project-id'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/","title":"GCP VM Instance Stop By Label","text":""},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#introduction","title":"Introduction","text":"

Scenario: Stop the gcp vm

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#default-validations","title":"Default Validations","text":"View the default validations "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: gcp-vm-instance-stop-by-label-sa\n  namespace: default\n  labels:\n    name: gcp-vm-instance-stop-by-label-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: gcp-vm-instance-stop-by-label-sa\n  labels:\n    name: gcp-vm-instance-stop-by-label-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for the experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: gcp-vm-instance-stop-by-label-sa\n  labels:\n    name: gcp-vm-instance-stop-by-label-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: gcp-vm-instance-stop-by-label-sa\nsubjects:\n- kind: ServiceAccount\n  name: gcp-vm-instance-stop-by-label-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes GCP_PROJECT_ID GCP project ID to which the VM instances belong All the VM instances must belong to a single GCP project INSTANCE_LABEL Label of the target VM instances The INSTANCE_LABEL should be provided as key:value or key if the corresponding value is empty ex: vm:target-vm ZONES The zone of the target VM instances Only one zone can be provided i.e. all target instances should lie in the same zone

Variables Description Notes TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive instance terminations Defaults to 30s MANAGED_INSTANCE_GROUP Set to enable if the target instances are part of a managed instance group Defaults to disable INSTANCE_AFFECTED_PERC The percentage of total VMs filtered using the label to target Defaults to 0 (corresponds to 1 instance), provide numeric value only SEQUENCE It defines the sequence of chaos execution for multiple instances Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#target-gcp-instances","title":"Target GCP Instances","text":"

It will stop all the instances filtered by the INSTANCE_LABEL label in the ZONES zone within the GCP_PROJECT_ID project.

    NOTE: The INSTANCE_LABEL accepts only one label and ZONES also accepts only one zone name. Therefore, all the instances must lie in the same zone.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-by-label-sa\n  experiments:\n  - name: gcp-vm-instance-stop-by-label\n    spec:\n      components:\n        env:\n        - name: INSTANCE_LABEL\n          value: 'vm:target-vm'\n\n        - name: ZONES\n          value: 'us-east1-b'\n\n        - name: GCP_PROJECT_ID\n          value: 'my-project-4513'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
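
By default, only one of the instances filtered by the label is stopped (INSTANCE_AFFECTED_PERC defaults to 0). A minimal sketch that widens the blast radius to all the filtered instances via the INSTANCE_AFFECTED_PERC ENV:

apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-by-label-sa\n  experiments:\n  - name: gcp-vm-instance-stop-by-label\n    spec:\n      components:\n        env:\n        # percentage of the instances filtered by the label to target\n        # defaults to 0 (corresponds to 1 instance)\n        - name: INSTANCE_AFFECTED_PERC\n          value: '100'\n\n        - name: INSTANCE_LABEL\n          value: 'vm:target-vm'\n\n        - name: ZONES\n          value: 'us-east1-b'\n\n        - name: GCP_PROJECT_ID\n          value: 'my-project-4513'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n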
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#manged-instance-group","title":"Manged Instance Group","text":"

If the vm instances belong to a managed instance group then provide the MANAGED_INSTANCE_GROUP ENV as enable, else provide it as disable, which is the default value.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-by-label-sa\n  experiments:\n  - name: gcp-vm-instance-stop-by-label\n    spec:\n      components:\n        env:\n        - name: MANAGED_INSTANCE_GROUP\n          value: 'enable'\n\n        - name: INSTANCE_LABEL\n          value: 'vm:target-vm'\n\n        - name: ZONES\n          value: 'us-east1-b'\n\n        - name: GCP_PROJECT_ID\n          value: 'my-project-4513'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#mutiple-iterations-of-chaos","title":"Mutiple Iterations Of Chaos","text":"

Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-by-label-sa\n  experiments:\n  - name: gcp-vm-instance-stop-by-label\n    spec:\n      components:\n        env:\n        - name: CHAOS_INTERVAL\n          value: '15'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n\n        - name: INSTANCE_LABEL\n          value: 'vm:target-vm'\n\n        - name: ZONES\n          value: 'us-east1-b'\n\n        - name: GCP_PROJECT_ID\n          value: 'my-project-4513'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/","title":"GCP VM Instance Stop","text":""},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#introduction","title":"Introduction","text":"

Scenario: Stop the gcp vm

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#default-validations","title":"Default Validations","text":"View the default validations "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: gcp-vm-instance-stop-sa\n  namespace: default\n  labels:\n    name: gcp-vm-instance-stop-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: gcp-vm-instance-stop-sa\n  labels:\n    name: gcp-vm-instance-stop-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for the experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: gcp-vm-instance-stop-sa\n  labels:\n    name: gcp-vm-instance-stop-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: gcp-vm-instance-stop-sa\nsubjects:\n- kind: ServiceAccount\n  name: gcp-vm-instance-stop-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes GCP_PROJECT_ID GCP project ID to which the VM instances belong All the VM instances must belong to a single GCP project VM_INSTANCE_NAMES Name of target VM instances Multiple instance names can be provided as instance1,instance2,... ZONES The zones of the target VM instances Zone for every instance name has to be provided as zone1,zone2,... in the same order of VM_INSTANCE_NAMES

Variables Description Notes TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive instance terminations Defaults to 30s MANAGED_INSTANCE_GROUP Set to enable if the target instances are part of a managed instance group Defaults to disable SEQUENCE It defines the sequence of chaos execution for multiple instances Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#target-gcp-instances","title":"Target GCP Instances","text":"

It will stop all the instances with the given VM_INSTANCE_NAMES instance names and corresponding ZONES zone names in the GCP_PROJECT_ID project.

    NOTE: The VM_INSTANCE_NAMES contains multiple comma-separated vm instances. The comma-separated zone names should be provided in the same order as instance names.

    Use the following example to tune this:

    ## details of the gcp instance\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-sa\n  experiments:\n  - name: gcp-vm-instance-stop\n    spec:\n      components:\n        env:\n        # comma separated list of vm instance names\n        - name: VM_INSTANCE_NAMES\n          value: 'instance-01,instance-02'\n        # comma separated list of zone names corresponds to the VM_INSTANCE_NAMES\n        # it should be provided in same order of VM_INSTANCE_NAMES\n        - name: ZONES\n          value: 'zone-01,zone-02'\n        # gcp project id to which vm instance belongs\n        - name: GCP_PROJECT_ID\n          value: 'project-id'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#managed-instance-group","title":"Managed Instance Group","text":"

If the vm instances belong to a managed instance group then provide the MANAGED_INSTANCE_GROUP ENV as enable, else provide it as disable, which is the default value.

    Use the following example to tune this:

    ## scale up and down to maintain the available instance counts\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-sa\n  experiments:\n  - name: gcp-vm-instance-stop\n    spec:\n      components:\n        env:\n        # tells if instances are part of managed instance group\n        # supports: enable, disable. default: disable\n        - name: MANAGED_INSTANCE_GROUP\n          value: 'enable'\n        # comma separated list of vm instance names\n        - name: VM_INSTANCE_NAMES\n          value: 'instance-01,instance-02'\n        # comma separated list of zone names corresponds to the VM_INSTANCE_NAMES\n        # it should be provided in same order of VM_INSTANCE_NAMES\n        - name: ZONES\n          value: 'zone-01,zone-02'\n        # gcp project id to which vm instance belongs\n        - name: GCP_PROJECT_ID\n          value: 'project-id'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#mutiple-iterations-of-chaos","title":"Mutiple Iterations Of Chaos","text":"

Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-sa\n  experiments:\n  - name: gcp-vm-instance-stop\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '15'\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: VM_INSTANCE_NAMES\n          value: 'instance-01,instance-02'\n        - name: ZONES\n          value: 'zone-01,zone-02'\n        - name: GCP_PROJECT_ID\n          value: 'project-id'\n
    "},{"location":"experiments/categories/load/k6-loadgen/","title":"k6 Load Generator","text":""},{"location":"experiments/categories/load/k6-loadgen/#introduction","title":"Introduction","text":"

The k6 loadgen fault simulates load generation on the target hosts for a specific chaos duration. This fault: - Slows down or makes the target host unavailable due to heavy load. - Checks the performance of the application or process running on the instance. - Supports various types of load testing (e.g. spike, smoke, stress).

Scenario: Generating load with k6

    "},{"location":"experiments/categories/load/k6-loadgen/#uses","title":"Uses","text":"View the uses of the experiment

    Introduction to k6 Load Chaos in LitmusChaos

    "},{"location":"experiments/categories/load/k6-loadgen/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/load/k6-loadgen/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: k6-loadgen-sa\n  namespace: default\n  labels:\n    name: k6-loadgen-sa\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: k6-loadgen-sa\n  namespace: default\n  labels:\n    name: k6-loadgen-sa\nrules:\n- apiGroups: [\"\",\"litmuschaos.io\",\"batch\",\"apps\"]\n  resources: [\"pods\",\"configmaps\",\"jobs\",\"pods/exec\",\"pods/log\",\"events\",\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n  verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\",\"deletecollection\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: k6-loadgen-sa\n  namespace: default\n  labels:\n    name: k6-loadgen-sa\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: k6-loadgen-sa\nsubjects:\n- kind: ServiceAccount\n  name: k6-loadgen-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/load/k6-loadgen/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Defaults to 20s CHAOS_INTERVAL Time interval b/w two successive k6-loadgen (in sec) If the CHAOS_INTERVAL is not provided it will take the default value of 10s RAMP_TIME Period to wait before injection of chaos in sec LIB_IMAGE LIB Image used to execute the k6 engine Defaults to ghcr.io/grafana/k6-operator:latest-runner LIB_IMAGE_PULL_POLICY LIB Image pull policy Defaults to Always SCRIPT_SECRET_NAME Provide the k8s secret name of the JS script to run k6. Default value: k6-script SCRIPT_SECRET_KEY Provide the key of the k8s secret named SCRIPT_SECRET_NAME Default value: script.js

    "},{"location":"experiments/categories/load/k6-loadgen/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/load/k6-loadgen/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/load/k6-loadgen/#custom-k6-configuration","title":"Custom k6 configuration","text":"

You can add k6 options (e.g. hosts, thresholds) in the script's options object, as seen in the script example in the next section. More details can be found here

    "},{"location":"experiments/categories/load/k6-loadgen/#custom-secret-name-and-secret-key","title":"Custom Secret Name and Secret Key","text":"

    You can provide the secret name and secret key of the JS script to be used for k6-loadgen. The secret should be created in the same namespace where the chaos infrastructure is created. For example, if the chaos infrastructure is created in the litmus namespace, then the secret should also be created in the litmus namespace.

You can write a JS script like the one below. If you want to know more about the script, check out this documentation.

    import http from 'k6/http';\nimport { sleep } from 'k6';\nexport const options = {\n    vus: 100,\n    duration: '30s',\n};\nexport default function () {\n    http.get('http://<<target_domain_name>>/');\n    sleep(1);\n}\n

    Then create a secret with the above script.

    kubectl create secret generic custom-k6-script \\\n  --from-file=custom-script.js -n <<chaos_infrastructure_namespace>>\n

If we want to use the custom-k6-script secret and custom-script.js as the secret key, then the experiment tunables will look like this:

    ---\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: nginx-chaos\n  namespace: default\nspec:\n  engineState: 'active'\n  chaosServiceAccount: litmus-admin\n  experiments:\n    - name: k6-loadgen\n      spec:\n        components:\n          env:\n            # set chaos duration (in sec) as desired\n            - name: TOTAL_CHAOS_DURATION\n              value: \"30\"\n\n            # Interval between chaos injection in sec\n            - name: CHAOS_INTERVAL\n              value: \"30\"\n\n            # Period to wait before and after injection of chaos in sec\n            - name: RAMP_TIME\n              value: \"0\"\n\n            # Provide the secret name of the JS script\n            - name: SCRIPT_SECRET_NAME\n              value: \"custom-k6-script\"\n\n            # Provide the secret key of the JS script\n            - name: SCRIPT_SECRET_KEY\n              value: \"custom-script.js\"\n\n            # Provide the image name of the helper pod\n            - name: LIB_IMAGE\n              value: \"ghcr.io/grafana/k6-operator:latest-runner\"\n\n            # Provide the image pull policy of the helper pod\n            - name: LIB_IMAGE_PULL_POLICY\n              value: \"Always\"\n
    "},{"location":"experiments/categories/nodes/common-tunables-for-node-experiments/","title":"Common tunables for node experiments","text":"

It contains the tunables that are common to all the node experiments. These tunables can be provided at .spec.experiments[*].spec.components.env in the chaosengine.
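For reference, below is a minimal skeleton showing where these ENVs sit inside the chaosengine spec; it is illustrative only, and the <experiment-sa> and <node-experiment-name> placeholders are hypothetical values to be replaced with a real service account and experiment name.

## skeleton showing the placement of node-experiment tunables\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  chaosServiceAccount: <experiment-sa>\n  experiments:\n  - name: <node-experiment-name>\n    spec:\n      components:\n        env:\n        # node-specific tunables, e.g. TARGET_NODE(S) or NODE_LABEL, go here\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n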

    "},{"location":"experiments/categories/nodes/common-tunables-for-node-experiments/#target-single-node","title":"Target Single Node","text":"

    It defines the name of the target node subjected to chaos. The target node can be tuned via TARGET_NODE ENV. It contains only a single node name. NOTE: It is supported by [node-drain, node-taint, node-restart, kubelet-service-kill, docker-service-kill] experiments.

    Use the following example to tune this:

    ## provide the target node name\n## it is applicable for the [node-drain, node-taint, node-restart, kubelet-service-kill, docker-service-kill]\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-drain-sa\n  experiments:\n  - name: node-drain\n    spec:\n      components:\n        env:\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/common-tunables-for-node-experiments/#target-multiple-nodes","title":"Target Multiple Nodes","text":"

It defines the comma-separated names of the target nodes subjected to chaos. The target nodes can be tuned via the TARGET_NODES ENV. NOTE: It is supported by the [node-cpu-hog, node-memory-hog, node-io-stress] experiments.

    Use the following example to tune this:

    ## provide the comma separated target node names\n## it is applicable for the [node-cpu-hog, node-memory-hog, node-io-stress]\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-cpu-hog-sa\n  experiments:\n  - name: node-cpu-hog\n    spec:\n      components:\n        env:\n        # comma separated target node names\n        - name: TARGET_NODES\n          value: 'node01,node02'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/common-tunables-for-node-experiments/#target-nodes-with-labels","title":"Target Nodes With Labels","text":"

It defines the labels of the targeted node(s) subjected to chaos. The node labels can be tuned via the NODE_LABEL ENV. It is mutually exclusive with the TARGET_NODE(S) ENV. If the TARGET_NODE(S) ENV is set then it will use the nodes provided inside it; otherwise, it will derive the node name(s) from the matching node labels.

    Use the following example to tune this:

    ## provide the labels of the targeted nodes\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-cpu-hog-sa\n  experiments:\n  - name: node-cpu-hog\n    spec:\n      components:\n        env:\n        # labels of the targeted node\n        # it will derive the target nodes if TARGET_NODE(S) ENV is not set\n        - name: NODE_LABEL\n          value: 'key=value'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/common-tunables-for-node-experiments/#node-affected-percentage","title":"Node Affected Percentage","text":"

It defines the percentage of nodes subjected to chaos with matching node labels. It can be tuned with the NODES_AFFECTED_PERC ENV. If NODES_AFFECTED_PERC is provided as empty or 0 then it will target a minimum of one node. It is supported by the [node-cpu-hog, node-memory-hog, node-io-stress] experiments. The remaining node experiments select only a single node for chaos.

    Use the following example to tune this:

    ## provide the percentage of nodes to be targeted with matching labels\n## it is applicable for the [node-cpu-hog, node-memory-hog, node-io-stress]\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-cpu-hog-sa\n  experiments:\n  - name: node-cpu-hog\n    spec:\n      components:\n        env:\n        # percentage of nodes to be targeted with matching node labels\n        - name: NODES_AFFECTED_PERC\n          value: '100'\n        # labels of the targeted node\n        # it will derive the target nodes if TARGET_NODE(S) ENV is not set\n        - name: NODE_LABEL\n          value: 'key=value'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/docker-service-kill/","title":"Docker Service Kill","text":""},{"location":"experiments/categories/nodes/docker-service-kill/#introduction","title":"Introduction","text":"

    Scenario: Kill the docker service of the node

    "},{"location":"experiments/categories/nodes/docker-service-kill/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/docker-service-kill/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/nodes/docker-service-kill/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/docker-service-kill/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: docker-service-kill-sa\n  namespace: default\n  labels:\n    name: docker-service-kill-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: docker-service-kill-sa\n  labels:\n    name: docker-service-kill-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: docker-service-kill-sa\n  labels:\n    name: docker-service-kill-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: docker-service-kill-sa\nsubjects:\n- kind: ServiceAccount\n  name: docker-service-kill-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/docker-service-kill/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes TARGET_NODE Name of the target node NODE_LABEL It contains node label, which will be used to filter the target node if TARGET_NODE ENV is not set It is mutually exclusive with the TARGET_NODE ENV. If both are provided then it will use the TARGET_NODE

    Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Defaults to 60s LIB The chaos lib used to inject the chaos Defaults to litmus RAMP_TIME Period to wait before injection of chaos in sec

    "},{"location":"experiments/categories/nodes/docker-service-kill/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/docker-service-kill/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

Refer to the common attributes and Node-specific tunables to tune the common tunables for all experiments and the node-specific tunables.

    "},{"location":"experiments/categories/nodes/docker-service-kill/#kill-docker-service","title":"Kill Docker Service","text":"

It contains the name of the target node subjected to chaos. It can be tuned via the TARGET_NODE ENV.

    Use the following example to tune this:

    # kill the docker service of the target node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: docker-service-kill-sa\n  experiments:\n  - name: docker-service-kill\n    spec:\n      components:\n        env:\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
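The target node can also be derived from node labels via the NODE_LABEL ENV when TARGET_NODE is not set. A sketch, assuming the target nodes carry a key=value label (see the common tunables for node experiments):

# kill the docker service of a node selected by label\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: docker-service-kill-sa\n  experiments:\n  - name: docker-service-kill\n    spec:\n      components:\n        env:\n        # label used to derive the target node if TARGET_NODE is not set\n        - name: NODE_LABEL\n          value: 'key=value'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n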
    "},{"location":"experiments/categories/nodes/kubelet-service-kill/","title":"Kubelet Service Kill","text":""},{"location":"experiments/categories/nodes/kubelet-service-kill/#introduction","title":"Introduction","text":"

    Scenario: Kill the kubelet service of the node

    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/nodes/kubelet-service-kill/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: kubelet-service-kill-sa\n  namespace: default\n  labels:\n    name: kubelet-service-kill-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: kubelet-service-kill-sa\n  labels:\n    name: kubelet-service-kill-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: kubelet-service-kill-sa\n  labels:\n    name: kubelet-service-kill-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: kubelet-service-kill-sa\nsubjects:\n- kind: ServiceAccount\n  name: kubelet-service-kill-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes TARGET_NODE Name of the target node NODE_LABEL It contains node label, which will be used to filter the target node if TARGET_NODE ENV is not set It is mutually exclusive with the TARGET_NODE ENV. If both are provided then it will use the TARGET_NODE

Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Defaults to 60s LIB The chaos lib used to inject the chaos Defaults to litmus LIB_IMAGE The lib image used to inject the kubelet kill chaos; the image should have systemd installed in it Defaults to ubuntu:16.04 RAMP_TIME Period to wait before injection of chaos in sec

    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/kubelet-service-kill/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

Refer to the common attributes and Node-specific tunables to tune the common tunables for all experiments and the node-specific tunables.

    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#kill-kubelet-service","title":"Kill Kubelet Service","text":"

It contains the name of the target node subjected to chaos. It can be tuned via the TARGET_NODE ENV.

    Use the following example to tune this:

    # kill the kubelet service of the target node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: kubelet-service-kill-sa\n  experiments:\n  - name: kubelet-service-kill\n    spec:\n      components:\n        env:\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
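The helper image used to inject the kubelet-service-kill chaos can also be tuned via the LIB_IMAGE ENV; the image should have systemd installed in it. A sketch using the documented default image:

# override the helper image used to kill the kubelet service\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: kubelet-service-kill-sa\n  experiments:\n  - name: kubelet-service-kill\n    spec:\n      components:\n        env:\n        # image with systemd used by the helper pod (defaults to ubuntu:16.04)\n        - name: LIB_IMAGE\n          value: 'ubuntu:16.04'\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n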
    "},{"location":"experiments/categories/nodes/node-cpu-hog/","title":"Node CPU Hog","text":""},{"location":"experiments/categories/nodes/node-cpu-hog/#introduction","title":"Introduction","text":"

    Scenario: Stress the CPU of node

    "},{"location":"experiments/categories/nodes/node-cpu-hog/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/node-cpu-hog/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/nodes/node-cpu-hog/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/node-cpu-hog/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: node-cpu-hog-sa\n  namespace: default\n  labels:\n    name: node-cpu-hog-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: node-cpu-hog-sa\n  labels:\n    name: node-cpu-hog-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: node-cpu-hog-sa\n  labels:\n    name: node-cpu-hog-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: node-cpu-hog-sa\nsubjects:\n- kind: ServiceAccount\n  name: node-cpu-hog-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/node-cpu-hog/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes TARGET_NODES Comma separated list of nodes, subjected to node cpu hog chaos NODE_LABEL It contains node label, which will be used to filter the target nodes if TARGET_NODES ENV is not set It is mutually exclusive with the TARGET_NODES ENV. If both are provided then it will use the TARGET_NODES

Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Defaults to 60 LIB The chaos lib used to inject the chaos Defaults to litmus LIB_IMAGE Image used to run the stress command Defaults to litmuschaos/go-runner:latest RAMP_TIME Period to wait before & after injection of chaos in sec Optional NODE_CPU_CORE Number of cores of node CPU to be consumed Defaults to 2 NODES_AFFECTED_PERC The percentage of total nodes to target Defaults to 0 (corresponds to 1 node), provide numeric value only SEQUENCE It defines the sequence of chaos execution for multiple target nodes Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/nodes/node-cpu-hog/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/node-cpu-hog/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

Refer to the common attributes and Node-specific tunables to tune the common tunables for all experiments and the node-specific tunables.

    "},{"location":"experiments/categories/nodes/node-cpu-hog/#node-cpu-cores","title":"Node CPU Cores","text":"

It contains the number of node CPU cores to be consumed. It can be tuned via the NODE_CPU_CORE ENV.

    Use the following example to tune this:

    # stress the cpu of the targeted nodes\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-cpu-hog-sa\n  experiments:\n  - name: node-cpu-hog\n    spec:\n      components:\n        env:\n        # number of cpu cores to be stressed\n        - name: NODE_CPU_CORE\n          value: '2'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-cpu-hog/#node-cpu-load","title":"Node CPU Load","text":"

It contains the percentage of node CPU to be consumed. It can be tuned via the CPU_LOAD ENV.

    Use the following example to tune this:

# stress the cpu of the targeted nodes by load percentage\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-cpu-hog-sa\n  experiments:\n  - name: node-cpu-hog\n    spec:\n      components:\n        env:\n        # percentage of cpu to be stressed\n        - name: CPU_LOAD\n          value: \"100\"\n        # NODE_CPU_CORE should be provided as 0 for the cpu load\n        # to work; otherwise the cpu core count takes priority\n        - name: NODE_CPU_CORE\n          value: '0'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
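When multiple nodes are targeted, the order of chaos execution can be tuned via the SEQUENCE ENV (default: parallel; supported: serial, parallel). A sketch assuming serial execution:

# execute the chaos on the targeted nodes one after the other\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-cpu-hog-sa\n  experiments:\n  - name: node-cpu-hog\n    spec:\n      components:\n        env:\n        # sequence of chaos execution for multiple target nodes\n        # supports: serial, parallel. default: parallel\n        - name: SEQUENCE\n          value: 'serial'\n        # comma separated target node names\n        - name: TARGET_NODES\n          value: 'node01,node02'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n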
    "},{"location":"experiments/categories/nodes/node-drain/","title":"Node Drain","text":""},{"location":"experiments/categories/nodes/node-drain/#introduction","title":"Introduction","text":"

    Scenario: Drain the node

    "},{"location":"experiments/categories/nodes/node-drain/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/node-drain/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/nodes/node-drain/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/node-drain/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: node-drain-sa\n  namespace: default\n  labels:\n    name: node-drain-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: node-drain-sa\n  labels:\n    name: node-drain-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container and evicting the pods\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\",\"pods/eviction\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # ignore daemonsets while draining the node\n  - apiGroups: [\"apps\"]\n    resources: [\"daemonsets\"]\n    verbs: [\"list\",\"get\",\"delete\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\",\"patch\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: node-drain-sa\n  labels:\n    name: node-drain-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: node-drain-sa\nsubjects:\n- kind: ServiceAccount\n  name: node-drain-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/node-drain/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes TARGET_NODE Name of the node to be drained NODE_LABEL It contains node label, which will be used to filter the target node if TARGET_NODE ENV is not set It is mutually exclusive with the TARGET_NODE ENV. If both are provided then it will use the TARGET_NODE

    Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Defaults to 60s LIB The chaos lib used to inject the chaos Defaults to litmus RAMP_TIME Period to wait before injection of chaos in sec

    "},{"location":"experiments/categories/nodes/node-drain/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/node-drain/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

Refer to the common attributes and Node-specific tunables to tune the common tunables for all experiments and the node-specific tunables.

    "},{"location":"experiments/categories/nodes/node-drain/#drain-node","title":"Drain Node","text":"

It contains the name of the target node subjected to chaos. It can be tuned via the TARGET_NODE ENV.

    Use the following example to tune this:

    # drain the targeted node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-drain-sa\n  experiments:\n  - name: node-drain\n    spec:\n      components:\n        env:\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
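A wait period before the injection of chaos can be added via the RAMP_TIME ENV (in sec). A sketch assuming a 10s ramp:

# wait for the given ramp time before injecting the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-drain-sa\n  experiments:\n  - name: node-drain\n    spec:\n      components:\n        env:\n        # period to wait before injection of chaos (in sec)\n        - name: RAMP_TIME\n          value: '10'\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n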
    "},{"location":"experiments/categories/nodes/node-io-stress/","title":"Node IO Stress","text":""},{"location":"experiments/categories/nodes/node-io-stress/#introduction","title":"Introduction","text":"

    Scenario: Stress the IO of Node

    "},{"location":"experiments/categories/nodes/node-io-stress/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/node-io-stress/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites "},{"location":"experiments/categories/nodes/node-io-stress/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/node-io-stress/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: node-io-stress-sa\n  namespace: default\n  labels:\n    name: node-io-stress-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: node-io-stress-sa\n  labels:\n    name: node-io-stress-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: node-io-stress-sa\n  labels:\n    name: node-io-stress-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: node-io-stress-sa\nsubjects:\n- kind: ServiceAccount\n  name: node-io-stress-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/node-io-stress/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes TARGET_NODES Comma separated list of nodes, subjected to node io stress chaos NODE_LABEL It contains node label, which will be used to filter the target nodes if TARGET_NODES ENV is not set It is mutually exclusive with the TARGET_NODES ENV. If both are provided then it will use the TARGET_NODES

Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos (seconds) Defaults to 120 FILESYSTEM_UTILIZATION_PERCENTAGE Specify the size as a percentage of free space on the file system Defaults to 10% FILESYSTEM_UTILIZATION_BYTES Specify the size in GigaBytes(GB). FILESYSTEM_UTILIZATION_PERCENTAGE & FILESYSTEM_UTILIZATION_BYTES are mutually exclusive. If both are provided, FILESYSTEM_UTILIZATION_PERCENTAGE is prioritized. CPU Number of CPU cores to be used Defaults to 1 NUMBER_OF_WORKERS It is the number of IO workers involved in the IO disk stress Defaults to 4 VM_WORKERS It is the number of VM workers involved in the IO disk stress Defaults to 1 LIB The chaos lib used to inject the chaos Defaults to litmus LIB_IMAGE Image used to run the stress command Defaults to litmuschaos/go-runner:latest RAMP_TIME Period to wait before and after injection of chaos in sec NODES_AFFECTED_PERC The percentage of total nodes to target Defaults to 0 (corresponds to 1 node), provide numeric value only SEQUENCE It defines the sequence of chaos execution for multiple target nodes Default value: parallel. Supported: serial, parallel"},{"location":"experiments/categories/nodes/node-io-stress/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/node-io-stress/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

Refer to the common attributes and Node-specific tunables to tune the common tunables for all experiments and the node-specific tunables.

    "},{"location":"experiments/categories/nodes/node-io-stress/#filesystem-utilization-percentage","title":"Filesystem Utilization Percentage","text":"

It stresses the FILESYSTEM_UTILIZATION_PERCENTAGE percentage of the total free space available on the node.

    Use the following example to tune this:

    # stress the i/o of the targeted node with FILESYSTEM_UTILIZATION_PERCENTAGE of total free space \n# it is mutually exclusive with the FILESYSTEM_UTILIZATION_BYTES.\n# if both are provided then it will use FILESYSTEM_UTILIZATION_PERCENTAGE for stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-io-stress-sa\n  experiments:\n  - name: node-io-stress\n    spec:\n      components:\n        env:\n        # percentage of total free space of file system\n        - name: FILESYSTEM_UTILIZATION_PERCENTAGE\n          value: '10' # in percentage\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-io-stress/#filesystem-utilization-bytes","title":"Filesystem Utilization Bytes","text":"

It stresses the FILESYSTEM_UTILIZATION_BYTES GB of the i/o of the targeted node. It is mutually exclusive with the FILESYSTEM_UTILIZATION_PERCENTAGE ENV. If the FILESYSTEM_UTILIZATION_PERCENTAGE ENV is set then it will use the percentage for the stress; otherwise, it will stress the i/o based on the FILESYSTEM_UTILIZATION_BYTES ENV.

    Use the following example to tune this:

# stress the i/o of the targeted node with given FILESYSTEM_UTILIZATION_BYTES\n# it is mutually exclusive with the FILESYSTEM_UTILIZATION_PERCENTAGE.\n# if both are provided then it will use FILESYSTEM_UTILIZATION_PERCENTAGE for stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-io-stress-sa\n  experiments:\n  - name: node-io-stress\n    spec:\n      components:\n        env:\n        # file system to be stressed in GB\n        - name: FILESYSTEM_UTILIZATION_BYTES\n          value: '500' # in GB\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-io-stress/#limit-cpu-utilization","title":"Limit CPU Utilization","text":"

The CPU usage can be limited to the number of cores provided via the CPU ENV while performing the I/O stress.

    Use the following example to tune this:

# limit the cpu usage to the provided value while performing io stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-io-stress-sa\n  experiments:\n  - name: node-io-stress\n    spec:\n      components:\n        env:\n        # number of cpu cores to be stressed\n        - name: CPU\n          value: '1'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-io-stress/#workers-for-stress","title":"Workers For Stress","text":"

The I/O and VM worker counts for the stress can be tuned with the NUMBER_OF_WORKERS and VM_WORKERS ENVs respectively.

    Use the following example to tune this:

    # define the workers count for the i/o and vm\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-io-stress-sa\n  experiments:\n  - name: node-io-stress\n    spec:\n      components:\n        env:\n        # total number of io workers involved in stress\n        - name: NUMBER_OF_WORKERS\n          value: '4' \n          # total number of vm workers involved in stress\n        - name: VM_WORKERS\n          value: '1'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-memory-hog/","title":"Node Memory Hog","text":""},{"location":"experiments/categories/nodes/node-memory-hog/#introduction","title":"Introduction","text":"
• This experiment causes memory resource exhaustion on the Kubernetes node. The experiment aims to verify the resiliency of applications whose replicas may be evicted on account of nodes turning unschedulable (Not Ready) due to lack of memory resources.
• The memory chaos is injected using a helper pod running the linux stress-ng tool (a workload generator). The chaos is effected for a period equalling the TOTAL_CHAOS_DURATION, consuming up to MEMORY_CONSUMPTION_PERCENTAGE (out of 100) or MEMORY_CONSUMPTION_MEBIBYTES (in mebibytes, out of the total available memory).
• Here, application implies services. This can be reframed as: it tests application resiliency upon replica evictions caused due to lack of memory resources on the node.

    Scenario: Stress the memory of node

    "},{"location":"experiments/categories/nodes/node-memory-hog/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/node-memory-hog/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the node-memory-hog experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/nodes/node-memory-hog/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/node-memory-hog/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: node-memory-hog-sa\n  namespace: default\n  labels:\n    name: node-memory-hog-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: node-memory-hog-sa\n  labels:\n    name: node-memory-hog-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: node-memory-hog-sa\n  labels:\n    name: node-memory-hog-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: node-memory-hog-sa\nsubjects:\n- kind: ServiceAccount\n  name: node-memory-hog-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/node-memory-hog/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes TARGET_NODES Comma separated list of nodes, subjected to node memory hog chaos NODE_LABEL It contains node label, which will be used to filter the target nodes if TARGET_NODES ENV is not set It is mutually exclusive with the TARGET_NODES ENV. If both are provided then it will use the TARGET_NODES

Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos insertion (in seconds) Optional Defaults to 120 LIB The chaos lib used to inject the chaos Optional Defaults to litmus LIB_IMAGE Image used to run the stress command Optional Defaults to litmuschaos/go-runner:latest MEMORY_CONSUMPTION_PERCENTAGE Percent of the total node memory capacity Optional Defaults to 30 MEMORY_CONSUMPTION_MEBIBYTES The size in mebibytes out of the total available memory. When using this, keep MEMORY_CONSUMPTION_PERCENTAGE empty as the percentage has higher precedence Optional NUMBER_OF_WORKERS It is the number of VM workers involved in the memory stress Optional Defaults to 1 RAMP_TIME Period to wait before and after injection of chaos in sec Optional NODES_AFFECTED_PERC The percentage of total nodes to target Optional Defaults to 0 (corresponds to 1 node), provide numeric value only SEQUENCE It defines the sequence of chaos execution for multiple target nodes Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/nodes/node-memory-hog/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/node-memory-hog/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

Refer to the common attributes and Node-specific tunables to tune the common tunables for all experiments and the node-specific tunables.

    "},{"location":"experiments/categories/nodes/node-memory-hog/#memory-consumption-percentage","title":"Memory Consumption Percentage","text":"

It stresses the MEMORY_CONSUMPTION_PERCENTAGE percentage of the total memory capacity of the targeted node.

    Use the following example to tune this:

    # stress the memory of the targeted node with MEMORY_CONSUMPTION_PERCENTAGE of node capacity\n# it is mutually exclusive with the MEMORY_CONSUMPTION_MEBIBYTES.\n# if both are provided then it will use MEMORY_CONSUMPTION_PERCENTAGE for stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-memory-hog-sa\n  experiments:\n  - name: node-memory-hog\n    spec:\n      components:\n        env:\n        # percentage of total node capacity to be stressed\n        - name: MEMORY_CONSUMPTION_PERCENTAGE\n          value: '10' # in percentage\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-memory-hog/#memory-consumption-mebibytes","title":"Memory Consumption Mebibytes","text":"

It stresses the MEMORY_CONSUMPTION_MEBIBYTES MiB of the memory of the targeted node. It is mutually exclusive with the MEMORY_CONSUMPTION_PERCENTAGE ENV. If the MEMORY_CONSUMPTION_PERCENTAGE ENV is set then it will use the percentage for the stress; otherwise, it will stress the memory based on the MEMORY_CONSUMPTION_MEBIBYTES ENV.

    Use the following example to tune this:

    # stress the memory of the targeted node with given MEMORY_CONSUMPTION_MEBIBYTES\n# it is mutually exclusive with the MEMORY_CONSUMPTION_PERCENTAGE.\n# if both are provided then it will use MEMORY_CONSUMPTION_PERCENTAGE for stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-memory-hog-sa\n  experiments:\n  - name: node-memory-hog\n    spec:\n      components:\n        env:\n        # node memory to be stressed\n        - name: MEMORY_CONSUMPTION_MEBIBYTES\n          value: '500' # in MiBi\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-memory-hog/#workers-for-stress","title":"Workers For Stress","text":"

The worker count for the stress can be tuned with the NUMBER_OF_WORKERS ENV.

    Use the following example to tune this:

# provide the workers count for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-memory-hog-sa\n  experiments:\n  - name: node-memory-hog\n    spec:\n      components:\n        env:\n        # total number of workers involved in stress\n        - name: NUMBER_OF_WORKERS\n          value: '1'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-restart/","title":"Node Restart","text":""},{"location":"experiments/categories/nodes/node-restart/#introduction","title":"Introduction","text":"
• It causes chaos to disrupt the state of the node by restarting it.
• It tests deployment sanity (replica availability & uninterrupted service) and recovery workflows of the application pod.

    Scenario: Restart the node

    "},{"location":"experiments/categories/nodes/node-restart/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/node-restart/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the node-restart experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
• Create a Kubernetes secret named id-rsa in the namespace where the experiment will run; its contents will be the private SSH key for SSH_USER, stored in the secret field ssh-privatekey and used to connect to the node that hosts the target pod. A sample secret is shown below:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: id-rsa\ntype: kubernetes.io/ssh-auth\nstringData:\n  ssh-privatekey: |-\n    # SSH private key for ssh contained here\n

Creating the RSA key pair for remote SSH access should be a trivial exercise for those who are already familiar with an ssh client; it entails the following actions:

1. Create a new key pair and store the keys in files named my-id-rsa-key and my-id-rsa-key.pub for the private and public keys respectively:
      ssh-keygen -f ~/my-id-rsa-key -t rsa -b 4096\n
2. For each available node, run the following command to copy the public key of my-id-rsa-key:
      ssh-copy-id -i my-id-rsa-key user@node\n

    For further details, please check this documentation. Once you have copied the public key to all nodes and created the secret described earlier, you are ready to start your experiment.

    "},{"location":"experiments/categories/nodes/node-restart/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/node-restart/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: node-restart-sa\n  namespace: default\n  labels:\n    name: node-restart-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: node-restart-sa\n  labels:\n    name: node-restart-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap & secret details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\",\"secrets\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: node-restart-sa\n  labels:\n    name: node-restart-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: node-restart-sa\nsubjects:\n- kind: ServiceAccount\n  name: node-restart-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/node-restart/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes TARGET_NODE Name of the target node subjected to chaos If not provided, a random node is selected NODE_LABEL It contains the node label, which is used to filter the target node if the TARGET_NODE ENV is not set It is mutually exclusive with the TARGET_NODE ENV; if both are provided, TARGET_NODE takes precedence

Variables Description Notes LIB_IMAGE The image used to restart the node Defaults to litmuschaos/go-runner:latest SSH_USER Name of the SSH user Defaults to root TARGET_NODE_IP Internal IP of the target node subjected to chaos. If not provided, the experiment will look up the node IP of the TARGET_NODE Defaults to empty REBOOT_COMMAND Command used for the reboot Defaults to sudo systemctl reboot TOTAL_CHAOS_DURATION The time duration for chaos insertion (sec) Defaults to 30s RAMP_TIME Period to wait before and after injection of chaos in sec LIB The chaos lib used to inject the chaos Defaults to litmus; supported: litmus only

    "},{"location":"experiments/categories/nodes/node-restart/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/node-restart/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

Refer to the common attributes and Node specific tunables to tune the common tunables for all experiments and the node-specific tunables.

    "},{"location":"experiments/categories/nodes/node-restart/#reboot-command","title":"Reboot Command","text":"

    It defines the command used to restart the targeted node. It can be tuned via REBOOT_COMMAND ENV.

    Use the following example to tune this:

    # provide the reboot command\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-restart-sa\n  experiments:\n  - name: node-restart\n    spec:\n      components:\n        env:\n        # command used for the reboot\n        - name: REBOOT_COMMAND\n          value: 'sudo systemctl reboot'\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-restart/#ssh-user","title":"SSH User","text":"

    It defines the name of the SSH user for the targeted node. It can be tuned via SSH_USER ENV.

    Use the following example to tune this:

    # name of the ssh user used to ssh into targeted node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-restart-sa\n  experiments:\n  - name: node-restart\n    spec:\n      components:\n        env:\n        # name of the ssh user\n        - name: SSH_USER\n          value: 'root'\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-restart/#target-node-internal-ip","title":"Target Node Internal IP","text":"

It defines the internal IP of the targeted node. It is an optional field; if the internal IP is not provided, the experiment will derive it for the targeted node. It can be tuned via TARGET_NODE_IP ENV.

    Use the following example to tune this:

    # internal ip of the targeted node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-restart-sa\n  experiments:\n  - name: node-restart\n    spec:\n      components:\n        env:\n        # internal ip of the targeted node\n        - name: TARGET_NODE_IP\n          value: '<ip of node01>'\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-taint/","title":"Node Taint","text":""},{"location":"experiments/categories/nodes/node-taint/#introduction","title":"Introduction","text":"
• It taints the node to apply the desired effect. Only the resources that contain the corresponding tolerations can bypass the taints.

    Scenario: Taint the node

    "},{"location":"experiments/categories/nodes/node-taint/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/node-taint/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the node-taint experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
• Ensure that the node specified in the experiment ENV variable TARGET_NODE (the node which will be tainted) is cordoned before execution of the chaos experiment (before applying the chaosengine manifest), so that the litmus experiment runner pods are not scheduled on it / subjected to eviction. This can be achieved with the following steps:
  • Get the node names against the application pods: kubectl get pods -o wide
  • Cordon the node: kubectl cordon <nodename>
    "},{"location":"experiments/categories/nodes/node-taint/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/node-taint/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: node-taint-sa\n  namespace: default\n  labels:\n    name: node-taint-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: node-taint-sa\n  labels:\n    name: node-taint-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container and for pod eviction\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\",\"pods/eviction\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # ignore daemonsets while draining the node\n  - apiGroups: [\"apps\"]\n    resources: [\"daemonsets\"]\n    verbs: [\"list\",\"get\",\"delete\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for the experiment to perform node status checks and patch/update the node taints\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\",\"patch\",\"update\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: node-taint-sa\n  labels:\n    name: node-taint-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: node-taint-sa\nsubjects:\n- kind: ServiceAccount\n  name: node-taint-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/node-taint/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes TARGET_NODE Name of the node to be tainted NODE_LABEL It contains the node label, which is used to filter the target node if the TARGET_NODE ENV is not set It is mutually exclusive with the TARGET_NODE ENV; if both are provided, TARGET_NODE takes precedence TAINT_LABEL The label and effect to be applied as a taint on the target node

    Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Defaults to 60s LIB The chaos lib used to inject the chaos Defaults to litmus RAMP_TIME Period to wait before injection of chaos in sec

    "},{"location":"experiments/categories/nodes/node-taint/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/node-taint/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

Refer to the common attributes and Node specific tunables to tune the common tunables for all experiments and the node-specific tunables.

    "},{"location":"experiments/categories/nodes/node-taint/#taint-label","title":"Taint Label","text":"

It defines the label and effect to be applied as a taint on the target node. It can be tuned via TAINT_LABEL ENV.

    Use the following example to tune this:

    # node tainted with provided key and effect\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-taint-sa\n  experiments:\n  - name: node-taint\n    spec:\n      components:\n        env:\n        # label and effect to be tainted on the targeted node\n        - name: TAINT_LABEL\n          value: 'key=value:effect'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
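Where a specific node name is not known in advance, the NODE_LABEL ENV from the mandatory fields above can be used to filter the target node instead of TARGET_NODE. A minimal sketch, assuming a hypothetical node carrying the label kubernetes.io/hostname=worker-01:

# taint a node filtered by the given node label\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-taint-sa\n  experiments:\n  - name: node-taint\n    spec:\n      components:\n        env:\n        # node label used to filter the target node\n        - name: NODE_LABEL\n          value: 'kubernetes.io/hostname=worker-01'\n        # label and effect to be tainted on the targeted node\n        - name: TAINT_LABEL\n          value: 'key=value:effect'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n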
    "},{"location":"experiments/categories/pods/common-tunables-for-pod-experiments/","title":"Common tunables for pod experiments","text":"

It contains the tunables that are common to all pod-level experiments. These tunables can be provided at .spec.experiments[*].spec.components.env in the chaosengine.

    "},{"location":"experiments/categories/pods/common-tunables-for-pod-experiments/#target-specific-pods","title":"Target Specific Pods","text":"

It defines the comma-separated names of the target pods subjected to chaos. The target pods can be tuned via TARGET_PODS ENV.

    Use the following example to tune this:

    ## it contains comma separated target pod names\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        ## comma separated target pod names\n        - name: TARGET_PODS\n          value: 'pod1,pod2'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/common-tunables-for-pod-experiments/#pod-affected-percentage","title":"Pod Affected Percentage","text":"

It defines the percentage of pods, with labels matching .spec.appinfo.applabel inside the chaosengine, that are subjected to chaos. It can be tuned with PODS_AFFECTED_PERC ENV. If PODS_AFFECTED_PERC is provided as empty or 0, a minimum of one pod is targeted.

    Use the following example to tune this:

## it contains the percentage of application pods to be targeted, with matching labels or names, in the application namespace\n## supported for all pod-level experiments except pod-autoscaler\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # percentage of application pods\n        - name: PODS_AFFECTED_PERC\n          value: '100'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/common-tunables-for-pod-experiments/#target-specific-container","title":"Target Specific Container","text":"

    It defines the name of the targeted container subjected to chaos. It can be tuned via TARGET_CONTAINER ENV. If TARGET_CONTAINER is provided as empty then it will use the first container of the targeted pod.

    Use the following example to tune this:

    ## name of the target container\n## it will use first container as target container if TARGET_CONTAINER is provided as empty\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # name of the target container\n        - name: TARGET_CONTAINER\n          value: 'nginx'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/common-tunables-for-pod-experiments/#default-application-health-check","title":"Default Application Health Check","text":"

It defines the default application status checks as a tunable. It is helpful for the scenarios where you don\u2019t want to validate the application status as a mandatory check during pre & post chaos. It can be tuned via DEFAULT_APP_HEALTH_CHECK ENV. If DEFAULT_APP_HEALTH_CHECK is not provided, it defaults to true.

    Use the following example to tune this:

    ## application status check as tunable\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        - name: DEFAULT_APP_HEALTH_CHECK\n          value: 'false'\n
    "},{"location":"experiments/categories/pods/common-tunables-for-pod-experiments/#node-label-filter-for-selecting-the-target-pods","title":"Node Label Filter For Selecting The Target Pods","text":"

It supports selecting the target application pods from a specific node. It is helpful for scenarios where you want to select the pods scheduled on specific nodes as chaos candidates, considering the pod affected percentage. It can be tuned via NODE_LABEL ENV.

NOTE: This feature requires node-level permissions, i.e., a service account bound to a ClusterRole, for filtering pods on a specific node.

APP_LABEL TARGET_PODS NODE_LABEL SELECTED PODS Provided Provided Provided The pods filtered by the app label that reside on nodes bearing the given node label and are also listed in the TARGET_PODS ENV are selected Provided Not Provided Provided The pods filtered by the app label that reside on nodes bearing the given node label are selected Not Provided Provided Provided The target pods that reside on nodes bearing the given node label are selected Not Provided Not Provided Provided Invalid Not Provided Not Provided Not Provided Invalid

    Use the following example to tune this:

    ## node label to filter target pods\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        - name: NODE_LABEL\n          value: 'kubernetes.io/hostname=worker-01'\n
    "},{"location":"experiments/categories/pods/container-kill/","title":"Container Kill","text":""},{"location":"experiments/categories/pods/container-kill/#introduction","title":"Introduction","text":"
• It causes container failure of specific/random replicas of an application resource.
• It tests deployment sanity (replica availability & uninterrupted service) and the recovery workflow of the application
• Good for testing recovery of pods having sidecar containers

    Scenario: Kill target container

    "},{"location":"experiments/categories/pods/container-kill/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/container-kill/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the container-kill experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/container-kill/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/container-kill/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: container-kill-sa\n  namespace: default\n  labels:\n    name: container-kill-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: container-kill-sa\n  namespace: default\n  labels:\n    name: container-kill-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if the parent is any of {deployment, statefulset, replicaset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\",\"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if the parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if the parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if the parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: container-kill-sa\n  namespace: default\n  labels:\n    name: container-kill-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: container-kill-sa\nsubjects:\n- kind: ServiceAccount\n  name: container-kill-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/container-kill/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes TARGET_CONTAINER The name of the container to be killed inside the pod If TARGET_CONTAINER is not provided, it will kill the first container CHAOS_INTERVAL Time interval between two successive container kills (in sec) If CHAOS_INTERVAL is not provided, it takes the default value of 10s TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Defaults to 20s PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide a numeric value only TARGET_PODS Comma-separated list of application pod names subjected to container kill chaos If not provided, it will select target pods randomly based on the provided appLabels LIB_IMAGE The LIB image used to kill the container Defaults to litmuschaos/go-runner:latest LIB The category of lib used to inject chaos Default value: litmus, supported values: pumba and litmus RAMP_TIME Period to wait before injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel SIGNAL It contains the termination signal used for the container kill Default value: SIGKILL SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock CONTAINER_RUNTIME Container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for the litmus LIB, and only docker for the pumba LIB

    "},{"location":"experiments/categories/pods/container-kill/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/container-kill/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/container-kill/#kill-specific-container","title":"Kill Specific Container","text":"

    It defines the name of the targeted container subjected to chaos. It can be tuned via TARGET_CONTAINER ENV. If TARGET_CONTAINER is provided as empty then it will use the first container of the targeted pod.

    # kill the specific target container\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        env:\n        # name of the target container\n        - name: TARGET_CONTAINER\n          value: 'nginx'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/container-kill/#multiple-iterations-of-chaos","title":"Multiple Iterations Of Chaos","text":"

Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos. For example, a TOTAL_CHAOS_DURATION of 60s with a CHAOS_INTERVAL of 15s yields roughly four kill iterations.

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '15'\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/container-kill/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path:

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.
    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/container-kill/#signal-for-kill","title":"Signal For Kill","text":"

It defines the Linux signal passed while killing the container. It can be tuned via SIGNAL ENV. It defaults to SIGKILL.

# specific linux signal passed while killing the container\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        env:\n        # signal passed while killing the container\n        # defaults to SIGKILL\n        - name: SIGNAL\n          value: 'SIGKILL'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/container-kill/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

It specifies the Pumba chaos library for the chaos injection. It can be tuned via LIB ENV. The default chaos library is litmus.

# pumba chaoslib used to kill the container\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        env:\n        # name of the lib\n        # supports pumba and litmus\n        - name: LIB\n          value: 'pumba'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/disk-fill/","title":"Disk Fill","text":""},{"location":"experiments/categories/pods/disk-fill/#introduction","title":"Introduction","text":"
    • It causes Disk Stress by filling up the ephemeral storage of the pod on any given node.
    • It causes the application pod to get evicted if the capacity filled exceeds the pod's ephemeral storage limit.
    • It tests the Ephemeral Storage Limits, to ensure those parameters are sufficient.
    • It tests the application's resiliency to disk stress/replica evictions.

    Scenario: Fill ephemeral-storage

    "},{"location":"experiments/categories/pods/disk-fill/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/disk-fill/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the disk-fill experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Appropriate Ephemeral Storage Requests and Limits should be set for the application before running the experiment. An example specification is shown below:
      apiVersion: v1\nkind: Pod\nmetadata:\n  name: frontend\nspec:\n  containers:\n  - name: db\n    image: mysql\n    env:\n    - name: MYSQL_ROOT_PASSWORD\n      value: \"password\"\n    resources:\n      requests:\n        ephemeral-storage: \"2Gi\"\n      limits:\n        ephemeral-storage: \"4Gi\"\n  - name: wp\n    image: wordpress\n    resources:\n      requests:\n        ephemeral-storage: \"2Gi\"\n      limits:\n        ephemeral-storage: \"4Gi\"\n
    "},{"location":"experiments/categories/pods/disk-fill/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/disk-fill/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: disk-fill-sa\n  namespace: default\n  labels:\n    name: disk-fill-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: disk-fill-sa\n  namespace: default\n  labels:\n    name: disk-fill-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if the parent is any of {deployment, statefulset, replicaset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\",\"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if the parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if the parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if the parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: disk-fill-sa\n  namespace: default\n  labels:\n    name: disk-fill-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: disk-fill-sa\nsubjects:\n- kind: ServiceAccount\n  name: disk-fill-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/disk-fill/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes FILL_PERCENTAGE Percentage of the ephemeral-storage limit to fill It can also be set to more than 100 to force-evict the pod. The ephemeral-storage limit must be set in the targeted pod to use this ENV. EPHEMERAL_STORAGE_MEBIBYTES Ephemeral storage to be filled (unit: MiB) It is mutually exclusive with the FILL_PERCENTAGE ENV. If both are provided, it will use the FILL_PERCENTAGE

Variables Description Notes TARGET_CONTAINER Name of the container subjected to disk fill If not provided, the first container in the targeted pod will be subjected to chaos CONTAINER_RUNTIME Container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for litmus SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock TOTAL_CHAOS_DURATION The time duration for chaos insertion (sec) Defaults to 60s TARGET_PODS Comma-separated list of application pod names subjected to disk fill chaos If not provided, it will select target pods randomly based on the provided appLabels DATA_BLOCK_SIZE It contains the data block size used to fill the disk (in KB) Defaults to 256; KB is the only supported unit PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide a numeric value only LIB The chaos lib used to inject the chaos Defaults to litmus; supported: litmus only LIB_IMAGE The image used to fill the disk Defaults to litmuschaos/go-runner:latest RAMP_TIME Period to wait before injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/disk-fill/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/disk-fill/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/disk-fill/#disk-fill-percentage","title":"Disk Fill Percentage","text":"

It fills FILL_PERCENTAGE percent of the ephemeral-storage limit specified at resources.limits.ephemeral-storage inside the target application. For example, with a 4Gi limit and FILL_PERCENTAGE set to 80, roughly 3.2Gi is filled.

    Use the following example to tune this:

## percentage of the ephemeral storage limit specified at `resources.limits.ephemeral-storage` inside the target application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: disk-fill-sa\n  experiments:\n  - name: disk-fill\n    spec:\n      components:\n        env:\n        ## percentage of ephemeral storage limit, which needs to be filled\n        - name: FILL_PERCENTAGE\n          value: '80' # in percentage\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/disk-fill/#disk-fill-mebibytes","title":"Disk Fill Mebibytes","text":"

It fills EPHEMERAL_STORAGE_MEBIBYTES MiB of the ephemeral storage of the targeted pod. It is mutually exclusive with the FILL_PERCENTAGE ENV: if the FILL_PERCENTAGE ENV is set, the percentage is used for the fill; otherwise, the ephemeral storage is filled based on the EPHEMERAL_STORAGE_MEBIBYTES ENV.

    Use the following example to tune this:

# ephemeral storage to be filled in the target application\n# useful when the ephemeral-storage limit is not specified inside the target application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: disk-fill-sa\n  experiments:\n  - name: disk-fill\n    spec:\n      components:\n        env:\n        ## ephemeral storage size, which needs to be filled\n        - name: EPHEMERAL_STORAGE_MEBIBYTES\n          value: '256' #in MiB\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/disk-fill/#data-block-size","title":"Data Block Size","text":"

    It defines the size of the data block used to fill the ephemeral storage of the targeted pod. It can be tuned via DATA_BLOCK_SIZE ENV. Its unit is KB. The default value of DATA_BLOCK_SIZE is 256.

    Use the following example to tune this:

    # size of the data block used to fill the disk\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: disk-fill-sa\n  experiments:\n  - name: disk-fill\n    spec:\n      components:\n        env:\n        ## size of data block used to fill the disk\n        - name: DATA_BLOCK_SIZE\n          value: '256' #in KB\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/disk-fill/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

    • CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    # path inside node/vm where containers are present\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: disk-fill-sa\n  experiments:\n  - name: disk-fill\n    spec:\n      components:\n        env:\n        # provide the name of container runtime, it supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # provide the socket file path\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-autoscaler/","title":"Pod Autoscaler","text":""},{"location":"experiments/categories/pods/pod-autoscaler/#introduction","title":"Introduction","text":"
• The experiment aims to check the ability of nodes to accommodate the number of replicas of a given application pod.

    • This experiment can be used for other scenarios as well, such as for checking the Node auto-scaling feature. For example, check if the pods are successfully rescheduled within a specified period in cases where the existing nodes are already running at the specified limits.

    Scenario: Scale the replicas

    "},{"location":"experiments/categories/pods/pod-autoscaler/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-autoscaler/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-autoscaler experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-autoscaler/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-autoscaler/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-autoscaler-sa\n  namespace: default\n  labels:\n    name: pod-autoscaler-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: pod-autoscaler-sa\n  labels:\n    name: pod-autoscaler-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # performs CRUD operations on the deployments and statefulsets\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\"]\n    verbs: [\"list\",\"get\",\"patch\",\"update\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: pod-autoscaler-sa\n  labels:\n    name: pod-autoscaler-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: pod-autoscaler-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-autoscaler-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-autoscaler/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes REPLICA_COUNT Number of replicas up to which we want to scale nil

    Variables Description Notes TOTAL_CHAOS_DURATION The timeout for the chaos experiment (in seconds) Defaults to 60 LIB The chaos lib used to inject the chaos Defaults to litmus RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/pods/pod-autoscaler/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-autoscaler/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/pod-autoscaler/#replica-counts","title":"Replica counts","text":"

It defines the number of replicas that should be present in the targeted application during the chaos. It can be tuned via REPLICA_COUNT ENV.

    Use the following example to tune this:

# provide the number of replicas\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-autoscaler-sa\n  experiments:\n  - name: pod-autoscaler\n    spec:\n      components:\n        env:\n        # number of replicas to scale to\n        - name: REPLICA_COUNT\n          value: '3'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/","title":"Pod CPU Hog Exec","text":""},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#introduction","title":"Introduction","text":"
    • This experiment consumes the CPU resources of the application container

• It simulates conditions where app pods experience CPU spikes due to expected or undesired processes, thereby testing how the overall application stack behaves when this occurs.

    Scenario: Stress the CPU

    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#uses","title":"Uses","text":"View the uses of the experiment

Disk pressure or CPU hogging is another very common and frequent scenario found in Kubernetes applications; it can result in the eviction of application replicas and impact their delivery. Such scenarios can still occur despite whatever availability aids K8s provides. These problems are generally referred to as \"Noisy Neighbour\" problems.

By injecting a rogue process into a target container, we starve the main microservice process (typically PID 1) of the resources allocated to it (where limits are defined), causing slowness in application traffic; in other cases, unrestrained use can cause the node to exhaust resources, leading to the eviction of all pods. This category of chaos experiments therefore helps build immunity in applications undergoing any such stress scenario.

    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-cpu-hog-exec experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-cpu-hog-exec-sa\n  namespace: default\n  labels:\n    name: pod-cpu-hog-exec-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-cpu-hog-exec-sa\n  namespace: default\n  labels:\n    name: pod-cpu-hog-exec-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if the parent is any of {deployment, statefulset, replicaset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\",\"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if the parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if the parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if the parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-cpu-hog-exec-sa\n  namespace: default\n  labels:\n    name: pod-cpu-hog-exec-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-cpu-hog-exec-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-cpu-hog-exec-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes CPU_CORES Number of CPU cores subjected to CPU stress Defaults to 1 TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Defaults to 60s LIB The chaos lib used to inject the chaos. Available libs are litmus Defaults to litmus TARGET_PODS Comma-separated list of application pod names subjected to pod CPU hog chaos If not provided, it will select target pods randomly based on the provided appLabels TARGET_CONTAINER Name of the target container under chaos If not provided, it will select the first container of the target pod PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide a numeric value only CHAOS_INJECT_COMMAND The command to inject the CPU chaos Defaults to md5sum /dev/zero CHAOS_KILL_COMMAND The command to kill the chaos process Defaults to kill $(find /proc -name exe -lname '*/md5sum' 2>&1 | grep -v 'Permission denied' | awk -F/ '{print $(NF-1)}'). Another useful one that generally works (in case the default doesn't) is kill -9 $(ps afx | grep \"[md5sum] /dev/zero\" | awk '{print $1}' | tr '\\n' ' '). In case neither works, please check whether the target pod's base image offers a shell; if yes, identify an appropriate shell command to kill the chaos process RAMP_TIME Period to wait before injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#cpu-cores","title":"CPU Cores","text":"

It stresses the CPU_CORES CPU cores of the targeted pod for the TOTAL_CHAOS_DURATION duration.

    Use the following example to tune this:

    # cpu cores for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-cpu-hog-exec-sa\n  experiments:\n  - name: pod-cpu-hog-exec\n    spec:\n      components:\n        env:\n        # cpu cores for stress\n        - name: CPU_CORES\n          value: '1'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#chaos-inject-and-kill-commands","title":"Chaos Inject and Kill Commands","text":"

    It defines the CHAOS_INJECT_COMMAND and CHAOS_KILL_COMMAND ENV to set the chaos inject and chaos kill commands respectively. Default values of commands:

    • CHAOS_INJECT_COMMAND: \"md5sum /dev/zero\"
    • CHAOS_KILL_COMMAND: \"kill $(find /proc -name exe -lname '*/md5sum' 2>&1 | grep -v 'Permission denied' | awk -F/ '{print $(NF-1)}')\"

    Use the following example to tune this:

    # provide the chaos kill, used to kill the chaos process\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-cpu-hog-exec-sa\n  experiments:\n  - name: pod-cpu-hog-exec\n    spec:\n      components:\n        env:\n        # command to create the md5sum process to stress the cpu\n        - name: CHAOS_INJECT_COMMAND\n          value: 'md5sum /dev/zero'\n        # command to kill the md5sum process\n        # alternative command: \"kill -9 $(ps afx | grep \"[md5sum] /dev/zero\" | awk '{print $1}' | tr '\\n' ' ')\"\n        - name: CHAOS_KILL_COMMAND\n          value: \"kill $(find /proc -name exe -lname '*/md5sum' 2>&1 | grep -v 'Permission denied' | awk -F/ '{print $(NF-1)}')\"\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-cpu-hog/","title":"Pod CPU Hog","text":""},{"location":"experiments/categories/pods/pod-cpu-hog/#introduction","title":"Introduction","text":"
    • This experiment consumes the CPU resources of the application container
• It simulates conditions where app pods experience CPU spikes due to expected or undesired processes, thereby testing how the overall application stack behaves when this occurs.
    • It can test the application's resilience to potential slowness/unavailability of some replicas due to high CPU load

    Scenario: Stress the CPU

    "},{"location":"experiments/categories/pods/pod-cpu-hog/#uses","title":"Uses","text":"View the uses of the experiment

Disk pressure or CPU hogs are very common and frequent scenarios found in Kubernetes applications; they can result in the eviction of application replicas and impact their delivery. Such scenarios can still occur despite whatever availability aids Kubernetes provides. These problems are generally referred to as \"Noisy Neighbour\" problems.

By injecting a rogue process into a target container, we starve the main microservice process (typically pid 1) of the resources allocated to it (where limits are defined), causing slowness in application traffic; in other cases, unrestrained use can exhaust node resources, leading to the eviction of all pods. This category of chaos experiments helps build immunity in applications undergoing such stress scenarios.

    "},{"location":"experiments/categories/pods/pod-cpu-hog/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-cpu-hog experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-cpu-hog/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-cpu-hog/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-cpu-hog-sa\n  namespace: default\n  labels:\n    name: pod-cpu-hog-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-cpu-hog-sa\n  namespace: default\n  labels:\n    name: pod-cpu-hog-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)  \n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-cpu-hog-sa\n  namespace: default\n  labels:\n    name: pod-cpu-hog-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-cpu-hog-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-cpu-hog-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-cpu-hog/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

• CPU_CORES: Number of cpu cores subjected to CPU stress. Defaults to 1.
• TOTAL_CHAOS_DURATION: The time duration for chaos insertion, in seconds. Defaults to 60s.
• CPU_LOAD: Percentage of pod CPU to be consumed (see CPU Load below). CPU_CORES must be set to 0 for CPU_LOAD to take effect.
• LIB: The chaos lib used to inject the chaos. Available libs are litmus and pumba. Defaults to litmus.
• LIB_IMAGE: Image used to run the helper pod. Defaults to litmuschaos/go-runner:1.13.8.
• STRESS_IMAGE: Container run on the node at runtime by the pumba lib to inject stressors. Only used with the pumba LIB. Defaults to alexeiled/stress-ng:latest-ubuntu.
• TARGET_PODS: Comma-separated list of application pod names subjected to pod cpu hog chaos. If not provided, target pods are selected randomly based on the provided appLabels.
• TARGET_CONTAINER: Name of the target container under chaos. If not provided, the first container of the target pod is selected.
• PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only.
• CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd. Supported values: docker, containerd and crio for the litmus LIB, and only docker for the pumba LIB.
• SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock.
• RAMP_TIME: Period to wait before injection of chaos, in seconds.
• SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel.
"},{"location":"experiments/categories/pods/pod-cpu-hog/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-cpu-hog/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/pod-cpu-hog/#cpu-cores","title":"CPU Cores","text":"

It stresses CPU_CORES cpu cores of the targeted pod for the TOTAL_CHAOS_DURATION duration.

    Use the following example to tune this:

    # cpu cores for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-cpu-hog-sa\n  experiments:\n  - name: pod-cpu-hog\n    spec:\n      components:\n        env:\n        # cpu cores for stress\n        - name: CPU_CORES\n          value: '1'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-cpu-hog/#cpu-load","title":"CPU Load","text":"

It specifies the percentage of pod CPU to be consumed. It can be tuned via the CPU_LOAD ENV.

    Use the following example to tune this:

    # cpu load for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-cpu-hog-sa\n  experiments:\n  - name: pod-cpu-hog\n    spec:\n      components:\n        env:\n        # cpu load in percentage for the stress\n        - name: CPU_LOAD\n          value: '100'\n        # cpu core should be provided as 0 for cpu load\n        # to work, otherwise it will take cpu core as priority\n        - name: CPU_CORES\n          value: '0'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-cpu-hog/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For other runtimes, provide the appropriate socket path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-cpu-hog-sa\n  experiments:\n  - name: pod-cpu-hog\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-cpu-hog/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

It specifies the Pumba chaos library for the chaos injection. It can be tuned via the LIB ENV; the default chaos library is litmus. Provide the stress image via the STRESS_IMAGE ENV for the pumba library.

    Use the following example to tune this:

    # use pumba chaoslib for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-cpu-hog-sa\n  experiments:\n  - name: pod-cpu-hog\n    spec:\n      components:\n        env:\n        # name of chaos lib\n        # supports litmus and pumba\n        - name: LIB\n          value: 'pumba'\n        # stress image - applicable for pumba only\n        - name: STRESS_IMAGE\n          value: 'alexeiled/stress-ng:latest-ubuntu'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-delete/","title":"Pod Delete","text":""},{"location":"experiments/categories/pods/pod-delete/#introduction","title":"Introduction","text":"
• It causes (forced/graceful) pod failure of specific or random replicas of an application resource.
• It tests deployment sanity (replica availability & uninterrupted service) and the recovery workflow of the application

    Scenario: Deletes kubernetes pod

    "},{"location":"experiments/categories/pods/pod-delete/#uses","title":"Uses","text":"View the uses of the experiment

In a distributed system like Kubernetes, it is very likely that your application replicas will not be sufficient to manage the traffic (indicated by SLIs) when some of the replicas are unavailable due to a failure (system or application). The application needs to meet its SLO (service level objectives); for this, we need to make sure that the application has a minimum number of available replicas. Common aspects to verify when the pressure on the remaining replicas increases are how the horizontal pod autoscaler scales based on observed resource utilization, and how much time a PV mount takes upon rescheduling. Other important aspects to test are the MTTR for the application replica, re-election of leaders or followers (like broker leader election in a Kafka application), validating the minimum quorum needed to run the application (for example, in applications like Percona), and resync/redistribution of data.

    This experiment helps to reproduce such a scenario with forced/graceful pod failure on specific or random replicas of an application resource and checks the deployment sanity (replica availability & uninterrupted service) and recovery workflow of the application.

    "},{"location":"experiments/categories/pods/pod-delete/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-delete experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-delete/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-delete/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-delete-sa\n  namespace: default\n  labels:\n    name: pod-delete-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-delete-sa\n  namespace: default\n  labels:\n    name: pod-delete-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)  \n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-delete-sa\n  namespace: default\n  labels:\n    name: pod-delete-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-delete-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-delete-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-delete/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

• TOTAL_CHAOS_DURATION: The time duration for chaos insertion, in seconds. Defaults to 15s. NOTE: the overall run duration of the experiment may exceed TOTAL_CHAOS_DURATION by a few minutes.
• CHAOS_INTERVAL: Time interval between two successive pod failures, in seconds. Defaults to 5s.
• RANDOMNESS: Introduces randomness to pod deletions, with a minimum period defined by CHAOS_INTERVAL. Supports true or false. Default value: false.
• FORCE: Application pod deletion mode. false indicates graceful deletion with the default termination period of 30s; true indicates an immediate forceful deletion with a 0s grace period. Defaults to true (with terminationGracePeriodSeconds=0).
• TARGET_PODS: Comma-separated list of application pod names subjected to pod delete chaos. If not provided, target pods are selected randomly based on the provided appLabels.
• PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only.
• RAMP_TIME: Period to wait before and after injection of chaos, in seconds.
• SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel.

    "},{"location":"experiments/categories/pods/pod-delete/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-delete/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.
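For instance, the following minimal sketch (assuming the same nginx application and the pod-delete-sa service account used on this page) tunes the SEQUENCE and RAMP_TIME fields from the table above:

# run the chaos serially with a ramp period\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # sequence of chaos execution for multiple target pods\n        # supports serial and parallel\n        - name: SEQUENCE\n          value: 'serial'\n        # period to wait before and after injection of chaos (in sec)\n        - name: RAMP_TIME\n          value: '10'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n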

    "},{"location":"experiments/categories/pods/pod-delete/#force-delete","title":"Force Delete","text":"

    The targeted pod can be deleted forcefully or gracefully. It can be tuned with the FORCE env. It will delete the pod forcefully if FORCE is provided as true and it will delete the pod gracefully if FORCE is provided as false.

    Use the following example to tune this:

    # tune the deletion of target pods forcefully or gracefully\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # provided as true for the force deletion of pod\n        # supports true and false value\n        - name: FORCE\n          value: 'true'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-delete/#multiple-iterations-of-chaos","title":"Multiple Iterations Of Chaos","text":"

Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '15'\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-delete/#random-interval","title":"Random Interval","text":"

Randomness in the chaos interval can be enabled by setting the RANDOMNESS ENV to true. It supports boolean values; the default value is false. The chaos interval itself can be tuned via the CHAOS_INTERVAL ENV.

• If CHAOS_INTERVAL is set in the form l-r (e.g., 5-10), it will select a random interval between l & r.
• If CHAOS_INTERVAL is set as a single value (e.g., 10), it will select a random interval between 0 & that value.

    Use the following example to tune this:

    # contains random chaos interval with lower and upper bound of range i.e [l,r]\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # randomness enables iterations at random time interval\n        # it supports true and false value\n        - name: RANDOMNESS\n          value: 'true'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        # it will select a random interval within this range\n        # if only one value is provided then it will select a random interval within 0-CHAOS_INTERVAL range\n        - name: CHAOS_INTERVAL\n          value: '5-10' \n
    "},{"location":"experiments/categories/pods/pod-dns-error/","title":"Pod Dns Error","text":""},{"location":"experiments/categories/pods/pod-dns-error/#introduction","title":"Introduction","text":"
    • Pod-dns-error injects chaos to disrupt dns resolution in kubernetes pods.
    • It causes loss of access to services by blocking dns resolution of hostnames/domains

    Scenario: DNS error for the target pod

    "},{"location":"experiments/categories/pods/pod-dns-error/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-dns-error/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-dns-error experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-dns-error/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-dns-error/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-dns-error-sa\n  namespace: default\n  labels:\n    name: pod-dns-error-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-dns-error-sa\n  namespace: default\n  labels:\n    name: pod-dns-error-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-dns-error-sa\n  namespace: default\n  labels:\n    name: pod-dns-error-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-dns-error-sa\nsubjects:\n  - kind: ServiceAccount\n    name: pod-dns-error-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-dns-error/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

• TARGET_CONTAINER: Name of the container which is subjected to dns-error.
• TOTAL_CHAOS_DURATION: The time duration for chaos insertion, in seconds. Defaults to 60s.
• TARGET_HOSTNAMES: List of the target hostnames or keywords, e.g. '[\"litmuschaos\"]'. If not provided, all hostnames/domains will be targeted.
• MATCH_SCHEME: Determines whether the dns query has to match exactly with one of the targets or can have any of the targets as a substring. Can be either exact or substring. If not provided, it is set to exact.
• PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only.
• CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd. Supported values: docker, containerd.
• SOCKET_PATH: Path of the container runtime socket file. Defaults to /run/containerd/containerd.sock.
• LIB: The chaos lib used to inject the chaos. Default value: litmus. Supported values: litmus.
• LIB_IMAGE: Image used to run the netem command. Defaults to litmuschaos/go-runner:latest.
• RAMP_TIME: Period to wait before and after injection of chaos, in seconds.
• SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel.

    "},{"location":"experiments/categories/pods/pod-dns-error/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-dns-error/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.
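For instance, the following minimal sketch (assuming the same nginx application and the pod-dns-error-sa service account used on this page; the container name nginx is an assumption) pins the chaos to a single container via the TARGET_CONTAINER field from the table above:

# target a specific container for the dns error\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-dns-error-sa\n  experiments:\n  - name: pod-dns-error\n    spec:\n      components:\n        env:\n        # name of the target container (assumed to be nginx here)\n        - name: TARGET_CONTAINER\n          value: 'nginx'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n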

    "},{"location":"experiments/categories/pods/pod-dns-error/#target-host-names","title":"Target Host Names","text":"

It defines the comma-separated names of the target hosts subjected to chaos. It can be tuned with the TARGET_HOSTNAMES ENV. If TARGET_HOSTNAMES is not provided, all hostnames/domains will be targeted.

    Use the following example to tune this:

    # contains the target host names for the dns error\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-dns-error-sa\n  experiments:\n  - name: pod-dns-error\n    spec:\n      components:\n        env:\n        ## comma separated list of host names\n        ## if not provided, all hostnames/domains will be targeted\n        - name: TARGET_HOSTNAMES\n          value: '[\"litmuschaos\"]'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-dns-error/#match-scheme","title":"Match Scheme","text":"

    It determines whether the DNS query has to match exactly with one of the targets or can have any of the targets as a substring. It can be tuned with MATCH_SCHEME ENV. It supports exact or substring values.

    Use the following example to tune this:

    # contains match scheme for the dns error\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-dns-error-sa\n  experiments:\n  - name: pod-dns-error\n    spec:\n      components:\n        env:\n        ## it supports 'exact' and 'substring' values\n        - name: MATCH_SCHEME\n          value: 'exact' \n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-dns-error/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports the docker and containerd runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For the docker runtime, provide the docker socket path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-dns-error-sa\n  experiments:\n  - name: pod-dns-error\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-dns-spoof/","title":"Pod Dns Spoof","text":""},{"location":"experiments/categories/pods/pod-dns-spoof/#introduction","title":"Introduction","text":"
    • Pod-dns-spoof injects chaos to spoof dns resolution in kubernetes pods.
• It causes dns resolution of target hostnames/domains to return wrong IPs, as specified by SPOOF_MAP in the engine config.

    Scenario: DNS spoof for the target pod

    "},{"location":"experiments/categories/pods/pod-dns-spoof/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-dns-spoof/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-dns-spoof experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-dns-spoof/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-dns-spoof/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-dns-spoof-sa\n  namespace: default\n  labels:\n    name: pod-dns-spoof-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-dns-spoof-sa\n  namespace: default\n  labels:\n    name: pod-dns-spoof-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n    # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-dns-spoof-sa\n  namespace: default\n  labels:\n    name: pod-dns-spoof-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-dns-spoof-sa\nsubjects:\n  - kind: ServiceAccount\n    name: pod-dns-spoof-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-dns-spoof/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

• TARGET_CONTAINER: Name of the container which is subjected to dns spoof.
• TOTAL_CHAOS_DURATION: The time duration for chaos insertion, in seconds. Defaults to 60s.
• SPOOF_MAP: Map of the target hostnames, e.g. '{\"abc.com\":\"spoofabc.com\"}', where the key is the hostname that needs to be spoofed and the value is the hostname to which it will be spoofed/redirected. If not provided, no hostnames/domains will be spoofed.
• PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only.
• CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd. Supported values: docker, containerd.
• SOCKET_PATH: Path of the container runtime socket file. Defaults to /run/containerd/containerd.sock.
• LIB: The chaos lib used to inject the chaos. Default value: litmus. Supported values: litmus.
• LIB_IMAGE: Image used to run the netem command. Defaults to litmuschaos/go-runner:latest.
• RAMP_TIME: Period to wait before and after injection of chaos, in seconds.
• SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel.

    "},{"location":"experiments/categories/pods/pod-dns-spoof/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-dns-spoof/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/pod-dns-spoof/#spoof-map","title":"Spoof Map","text":"

It defines the map of the target hostnames, e.g. '{\"abc.com\":\"spoofabc.com\"}', where the key is the hostname that needs to be spoofed and the value is the hostname to which it will be spoofed/redirected. It can be tuned via the SPOOF_MAP ENV.

    Use the following example to tune this:

    # contains the spoof map for the dns spoofing\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-dns-spoof-sa\n  experiments:\n  - name: pod-dns-spoof\n    spec:\n      components:\n        env:\n        # map of host names\n        - name: SPOOF_MAP\n          value: '{\"abc.com\":\"spoofabc.com\"}'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-dns-spoof/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports the docker and containerd runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For the docker runtime, provide the docker socket path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-dns-spoof-sa\n  experiments:\n  - name: pod-dns-spoof\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        # map of host names\n        - name: SPOOF_MAP\n          value: '{\"abc.com\":\"spoofabc.com\"}'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-http-latency/","title":"Pod HTTP Latency","text":""},{"location":"experiments/categories/pods/pod-http-latency/#introduction","title":"Introduction","text":"
• It injects http response latency on the service whose port is provided as TARGET_SERVICE_PORT by starting a proxy server and then redirecting the traffic through it.
    • It can test the application's resilience to lossy/flaky http responses.

    Scenario: Add latency to the HTTP request

    "},{"location":"experiments/categories/pods/pod-http-latency/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-http-latency/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.17
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-http-latency experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-http-latency/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-http-latency/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-http-latency-sa\n  namespace: default\n  labels:\n    name: pod-http-latency-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-http-latency-sa\n  namespace: default\n  labels:\n    name: pod-http-latency-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-http-latency-sa\n  namespace: default\n  labels:\n    name: pod-http-latency-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-http-latency-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-http-latency-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-http-latency/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Mandatory Fields:
• TARGET_SERVICE_PORT: Port of the service to target. Defaults to port 80.
• LATENCY: Latency value, in ms, to be added to requests. Defaults to 2000.

Optional Fields:
• PROXY_PORT: Port where the proxy will be listening for requests. Defaults to 20000.
• NETWORK_INTERFACE: Network interface to be used for the proxy. Defaults to eth0.
• TOXICITY: Percentage of HTTP requests to be affected. Defaults to 100.
• CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd. Supported values: docker, containerd and crio for the litmus LIB, and only docker for the pumba LIB.
• SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock.
• TOTAL_CHAOS_DURATION: The duration of chaos injection, in seconds. Defaults to 60s.
• TARGET_PODS: Comma-separated list of application pod names subjected to pod http latency chaos. If not provided, target pods are selected randomly based on the provided appLabels.
• PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only.
• LIB_IMAGE: Image used to run the netem command. Defaults to litmuschaos/go-runner:latest.
• RAMP_TIME: Period to wait before and after injection of chaos, in seconds.
• SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel.

    "},{"location":"experiments/categories/pods/pod-http-latency/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-http-latency/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/pod-http-latency/#target-service-port","title":"Target Service Port","text":"

It defines the port of the service to be targeted. It can be tuned via the TARGET_SERVICE_PORT ENV.

    Use the following example to tune this:

    ## provide the port of the targeted service\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-latency-sa\n  experiments:\n  - name: pod-http-latency\n    spec:\n      components:\n        env:\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-latency/#proxy-port","title":"Proxy Port","text":"

    It defines the port on which the proxy server will listen for requests. It can be tuned via PROXY_PORT ENV.

    Use the following example to tune this:

    # provide the port for proxy server\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-latency-sa\n  experiments:\n  - name: pod-http-latency\n    spec:\n      components:\n        env:\n        # provide the port for proxy server\n        - name: PROXY_PORT\n          value: '8080'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-latency/#latency","title":"Latency","text":"

It defines the latency value (in ms) to be added to the http request. It can be tuned via the LATENCY ENV.

    Use the following example to tune this:

    ## provide the latency value\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-latency-sa\n  experiments:\n  - name: pod-http-latency\n    spec:\n      components:\n        env:\n        # provide the latency value\n        - name: LATENCY\n          value: '2000'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-latency/#toxicity","title":"Toxicity","text":"

It defines the percentage of the total number of http requests to be affected. It can be tuned via the TOXICITY ENV.

    Use the following example to tune this:

    ## provide the toxicity\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-latency-sa\n  experiments:\n  - name: pod-http-latency\n    spec:\n      components:\n        env:\n        # toxicity is the probability of the request to be affected\n        # provide the percentage value in the range of 0-100\n        # 0 means no request will be affected and 100 means all request will be affected\n        - name: TOXICITY\n          value: \"100\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-latency/#network-interface","title":"Network Interface","text":"

    It defines the network interface to be used for the proxy. It can be tuned via NETWORK_INTERFACE ENV.

    Use the following example to tune this:

    ## provide the network interface for proxy\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-latency-sa\n  experiments:\n  - name: pod-http-latency\n    spec:\n      components:\n        env:\n        # provide the network interface for proxy\n        - name: NETWORK_INTERFACE\n          value: \"eth0\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: '80'\n
    "},{"location":"experiments/categories/pods/pod-http-latency/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For other runtimes, provide the appropriate socket path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-latency-sa\n  experiments:\n  - name: pod-http-latency\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/","title":"Pod HTTP Modify Body","text":""},{"location":"experiments/categories/pods/pod-http-modify-body/#introduction","title":"Introduction","text":"
• It injects http modify body chaos on the service whose port is provided as TARGET_SERVICE_PORT by starting a proxy server and then redirecting the traffic through it.
• It can be used to overwrite the http response body by providing the new body value as RESPONSE_BODY.
• It can test the application's resilience to erroneous or incorrect http response bodies.

    Scenario: Modify Body of the HTTP response

    "},{"location":"experiments/categories/pods/pod-http-modify-body/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-http-modify-body/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.17
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-http-modify-body experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-http-modify-body/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-http-modify-body-sa\n  namespace: default\n  labels:\n    name: pod-http-modify-body-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-http-modify-body-sa\n  namespace: default\n  labels:\n    name: pod-http-modify-body-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-http-modify-body-sa\n  namespace: default\n  labels:\n    name: pod-http-modify-body-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-http-modify-body-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-http-modify-body-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-http-modify-body/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Mandatory fields:
• TARGET_SERVICE_PORT: Port of the service to target. Defaults to port 80.
• RESPONSE_BODY: Body string to overwrite the http response body. If no value is provided, the response will be an empty body. Defaults to empty body.

Optional fields:
• CONTENT_ENCODING: Encoding type to compress/encode the response body. Accepted values are: gzip, deflate, br, identity. Defaults to none (no encoding).
• CONTENT_TYPE: Content type of the response body. Defaults to text/plain.
• PROXY_PORT: Port where the proxy will be listening for requests. Defaults to 20000.
• NETWORK_INTERFACE: Network interface to be used for the proxy. Defaults to eth0.
• TOXICITY: Percentage of HTTP requests to be affected. Defaults to 100.
• CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd and crio for the litmus LIB, and only docker for the pumba LIB.
• SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock.
• TOTAL_CHAOS_DURATION: The duration of chaos injection, in seconds. Defaults to 60s.
• TARGET_PODS: Comma separated list of application pod names subjected to pod http modify body chaos. If not provided, it will select target pods randomly based on provided appLabels.
• PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only.
• LIB_IMAGE: Image used to run the helper pod. Defaults to litmuschaos/go-runner:latest.
• RAMP_TIME: Period to wait before and after injection of chaos, in seconds.
• SEQUENCE: It defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel.
"},{"location":"experiments/categories/pods/pod-http-modify-body/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-http-modify-body/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.
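
For instance, the following is a minimal sketch combining two of the common tunables, TOTAL_CHAOS_DURATION and TARGET_PODS, with this experiment; the chaos duration and pod names shown are illustrative assumptions:

## common tunables: chaos duration and explicit target pods\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # duration of the chaos injection in seconds (defaults to 60s)\n        - name: TOTAL_CHAOS_DURATION\n          value: '90'\n        # comma separated list of target pod names (illustrative names)\n        - name: TARGET_PODS\n          value: 'nginx-one,nginx-two'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # provide the body string to overwrite the response body\n        - name: RESPONSE_BODY\n          value: '2000'\n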

    "},{"location":"experiments/categories/pods/pod-http-modify-body/#target-service-port","title":"Target Service Port","text":"

It defines the port of the service that is to be targeted. It can be tuned via the TARGET_SERVICE_PORT ENV.

    Use the following example to tune this:

    ## provide the port of the targeted service\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # provide the body string to overwrite the response body\n        - name: RESPONSE_BODY\n          value: '2000'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#proxy-port","title":"Proxy Port","text":"

    It defines the port on which the proxy server will listen for requests. It can be tuned via PROXY_PORT ENV.

    Use the following example to tune this:

    ## provide the port for proxy server\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # provide the port for proxy server\n        - name: PROXY_PORT\n          value: '8080'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#response-body","title":"RESPONSE BODY","text":"

    It defines the body string that will overwrite the http response body. It can be tuned via RESPONSE_BODY ENV.

    Use the following example to tune this:

    ## provide the response body value\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # provide the body string to overwrite the response body\n        - name: RESPONSE_BODY\n          value: '2000'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#toxicity","title":"Toxicity","text":"

It defines the toxicity, i.e., the percentage of the total number of http requests to be affected. It can be tuned via the TOXICITY ENV.

    Use the following example to tune this:

## provide the toxicity\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # toxicity is the percentage of requests to be affected\n        # provide the percentage value in the range of 0-100\n        # 0 means no request will be affected and 100 means all requests will be affected\n        - name: TOXICITY\n          value: \"100\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#content-encoding-and-content-type","title":"Content Encoding and Content Type","text":"

    It defines the content encoding and content type of the response body. It can be tuned via CONTENT_ENCODING and CONTENT_TYPE ENV.

    Use the following example to tune this:

apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # provide the encoding type for the response body\n        # currently supported values are gzip, deflate\n        # if empty no encoding will be applied\n        - name: CONTENT_ENCODING\n          value: 'gzip'\n        # provide the content type for the response body\n        - name: CONTENT_TYPE\n          value: 'text/html'\n        # provide the body string to overwrite the response body\n        - name: RESPONSE_BODY\n          value: '2000'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#network-interface","title":"Network Interface","text":"

    It defines the network interface to be used for the proxy. It can be tuned via NETWORK_INTERFACE ENV.

    Use the following example to tune this:

    ## provide the network interface for proxy\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # provide the network interface for proxy\n        - name: NETWORK_INTERFACE\n          value: \"eth0\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: '80'\n        # provide the body string to overwrite the response body\n        - name: RESPONSE_BODY\n          value: '2000'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

It defines the CONTAINER_RUNTIME and SOCKET_PATH ENVs, which set the container runtime and the runtime socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to /run/containerd/containerd.sock (the containerd socket). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # provide the body string to overwrite the response body\n        - name: RESPONSE_BODY\n          value: '2000'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-header/","title":"Pod HTTP Modify Header","text":""},{"location":"experiments/categories/pods/pod-http-modify-header/#introduction","title":"Introduction","text":"
• It injects http modify header chaos on the service whose port is provided as TARGET_SERVICE_PORT by starting a proxy server and redirecting the traffic through it.
• It can modify the headers of the requests and responses of the service. This can be used to test the service's resilience towards incorrect or incomplete headers.

    Scenario: Modify Header of the HTTP request

    "},{"location":"experiments/categories/pods/pod-http-modify-header/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-http-modify-header/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.17
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-http-modify-header experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-http-modify-header/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-http-modify-header-sa\n  namespace: default\n  labels:\n    name: pod-http-modify-header-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-http-modify-header-sa\n  namespace: default\n  labels:\n    name: pod-http-modify-header-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-http-modify-header-sa\n  namespace: default\n  labels:\n    name: pod-http-modify-header-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-http-modify-header-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-http-modify-header-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-http-modify-header/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Mandatory fields:
• TARGET_SERVICE_PORT: Port of the service to target. Defaults to port 80.
• HEADERS_MAP: Map of headers to modify/add. Eg: {\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}. To remove a header, just set the value to \"\"; Eg: {\"X-Litmus-Test-Header\": \"\"}.
• HEADER_MODE: Whether to modify response headers or request headers. Accepted values: request, response. Defaults to response.

Optional fields:
• PROXY_PORT: Port where the proxy will be listening for requests. Defaults to 20000.
• NETWORK_INTERFACE: Network interface to be used for the proxy. Defaults to eth0.
• TOXICITY: Percentage of HTTP requests to be affected. Defaults to 100.
• CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd and crio for the litmus LIB, and only docker for the pumba LIB.
• SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock.
• TOTAL_CHAOS_DURATION: The duration of chaos injection, in seconds. Defaults to 60s.
• TARGET_PODS: Comma separated list of application pod names subjected to pod http modify header chaos. If not provided, it will select target pods randomly based on provided appLabels.
• PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only.
• LIB_IMAGE: Image used to run the helper pod. Defaults to litmuschaos/go-runner:latest.
• RAMP_TIME: Period to wait before and after injection of chaos, in seconds.
• SEQUENCE: It defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel.

    "},{"location":"experiments/categories/pods/pod-http-modify-header/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-http-modify-header/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.
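
As a sketch of those common tunables applied here, the example below sets PODS_AFFECTED_PERC (which defaults to 0, i.e., a single replica) to target half of the application replicas; the percentage shown is an illustrative assumption:

## common tunable: percentage of pods to target\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # percentage of total pods to target (illustrative value)\n        - name: PODS_AFFECTED_PERC\n          value: '50'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # map of headers to modify/add\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n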

    "},{"location":"experiments/categories/pods/pod-http-modify-header/#target-service-port","title":"Target Service Port","text":"

It defines the port of the service that is to be targeted. It can be tuned via the TARGET_SERVICE_PORT ENV.

    Use the following example to tune this:

## provide the port of the targeted service\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # map of headers to modify/add\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#proxy-port","title":"Proxy Port","text":"

It defines the port on which the proxy server will listen for requests. It can be tuned via the PROXY_PORT ENV.

    Use the following example to tune this:

## provide the port for proxy server\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # provide the port for proxy server\n        - name: PROXY_PORT\n          value: '8080'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # map of headers to modify/add\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#headers-map","title":"Headers Map","text":"

It is the map of headers that are to be modified or added to the HTTP request/response. It can be tuned via the HEADERS_MAP ENV.

    Use the following example to tune this:

## provide the headers as a map\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # map of headers to modify/add; Eg: {\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}\n        # to remove a header, just set the value to \"\"; Eg: {\"X-Litmus-Test-Header\": \"\"}\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
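
For the removal case described above, here is a minimal sketch of the same engine with the header value set to an empty string so that the header is dropped:

## remove a header by setting its value to an empty string\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # an empty value removes the header from the request/response\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"\"}'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n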
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#header-mode","title":"Header Mode","text":"

It defines whether the request headers or the response headers are to be modified. It can be tuned via the HEADER_MODE ENV.

    Use the following example to tune this:

## provide the mode of the header modification; request/response\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # whether to modify response headers or request headers. Accepted values: request, response\n        - name: HEADER_MODE\n          value: 'response'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # map of headers to modify/add\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#toxicity","title":"Toxicity","text":"

It defines the toxicity, i.e., the percentage of the total number of http requests to be affected. It can be tuned via the TOXICITY ENV.

    Use the following example to tune this:

## provide the toxicity\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # toxicity is the percentage of requests to be affected\n        # provide the percentage value in the range of 0-100\n        # 0 means no request will be affected and 100 means all requests will be affected\n        - name: TOXICITY\n          value: \"100\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#network-interface","title":"Network Interface","text":"

    It defines the network interface to be used for the proxy. It can be tuned via NETWORK_INTERFACE ENV.

    Use the following example to tune this:

## provide the network interface for proxy\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # provide the network interface for proxy\n        - name: NETWORK_INTERFACE\n          value: \"eth0\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: '80'\n        # map of headers to modify/add\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

It defines the CONTAINER_RUNTIME and SOCKET_PATH ENVs, which set the container runtime and the runtime socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to /run/containerd/containerd.sock (the containerd socket). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # map of headers to modify/add\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/","title":"Pod HTTP Reset Peer","text":""},{"location":"experiments/categories/pods/pod-http-reset-peer/#introduction","title":"Introduction","text":"
• It injects http reset chaos on the service whose port is provided as TARGET_SERVICE_PORT by starting a proxy server and redirecting the traffic through it; ongoing http requests are stopped by resetting the TCP connection.
• It can be used to test the application's resilience to lossy/flaky http connections.

    Scenario: Add reset peer to the HTTP request

    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.17
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-http-reset-peer experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-http-reset-peer-sa\n  namespace: default\n  labels:\n    name: pod-http-reset-peer-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-http-reset-peer-sa\n  namespace: default\n  labels:\n    name: pod-http-reset-peer-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-http-reset-peer-sa\n  namespace: default\n  labels:\n    name: pod-http-reset-peer-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-http-reset-peer-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-http-reset-peer-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Mandatory fields:
• TARGET_SERVICE_PORT: Port of the service to target. Defaults to port 80.
• RESET_TIMEOUT: Specifies the duration after which the connection is reset. Defaults to 0.

Optional fields:
• PROXY_PORT: Port where the proxy will be listening for requests. Defaults to 20000.
• NETWORK_INTERFACE: Network interface to be used for the proxy. Defaults to eth0.
• TOXICITY: Percentage of HTTP requests to be affected. Defaults to 100.
• CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd and crio for the litmus LIB, and only docker for the pumba LIB.
• SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock.
• TOTAL_CHAOS_DURATION: The duration of chaos injection, in seconds. Defaults to 60s.
• TARGET_PODS: Comma separated list of application pod names subjected to pod http reset peer chaos. If not provided, it will select target pods randomly based on provided appLabels.
• PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only.
• LIB_IMAGE: Image used to run the helper pod. Defaults to litmuschaos/go-runner:latest.
• RAMP_TIME: Period to wait before and after injection of chaos, in seconds.
• SEQUENCE: It defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel.

    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-http-reset-peer/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.
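
For example, a minimal sketch that switches the common SEQUENCE tunable from its default parallel mode to serial execution across multiple target pods; the choice of serial here is illustrative:

## common tunable: serial chaos execution across target pods\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # sequence of chaos execution for multiple target pods; supports serial, parallel\n        - name: SEQUENCE\n          value: 'serial'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n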

    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#target-service-port","title":"Target Service Port","text":"

It defines the port of the service that is to be targeted. It can be tuned via the TARGET_SERVICE_PORT ENV.

    Use the following example to tune this:

    ## provide the port of the targeted service\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#proxy-port","title":"Proxy Port","text":"

    It defines the port on which the proxy server will listen for requests. It can be tuned via PROXY_PORT ENV. Use the following example to tune this:

    ## provide the port for proxy server\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # provide the port for proxy server\n        - name: PROXY_PORT\n          value: '8080'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#reset-timeout","title":"RESET TIMEOUT","text":"

It defines the reset timeout, in milliseconds, after which the connection is reset. It can be tuned via the RESET_TIMEOUT ENV.

    Use the following example to tune this:

    ## provide the reset timeout value\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # reset timeout specifies after how much duration to reset the connection\n        - name: RESET_TIMEOUT #in ms\n          value: '2000'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#toxicity","title":"Toxicity","text":"

It defines the toxicity, i.e., the percentage of the total number of http requests to be affected. It can be tuned via the TOXICITY ENV.

    Use the following example to tune this:

## provide the toxicity\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # toxicity is the percentage of requests to be affected\n        # provide the percentage value in the range of 0-100\n        # 0 means no request will be affected and 100 means all requests will be affected\n        - name: TOXICITY\n          value: \"100\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#network-interface","title":"Network Interface","text":"

    It defines the network interface to be used for the proxy. It can be tuned via NETWORK_INTERFACE ENV.

    Use the following example to tune this:

    ## provide the network interface for proxy\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # provide the network interface for proxy\n        - name: NETWORK_INTERFACE\n          value: \"eth0\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: '80'\n
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

It defines the CONTAINER_RUNTIME and SOCKET_PATH ENVs, which set the container runtime and the runtime socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to /run/containerd/containerd.sock (the containerd socket). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/","title":"Pod HTTP Status Code","text":""},{"location":"experiments/categories/pods/pod-http-status-code/#introduction","title":"Introduction","text":"
• It injects http status code chaos inside the pod, modifying the status code of responses from the application server to the desired status code provided by the user, on the service whose port is provided as TARGET_SERVICE_PORT, by starting a proxy server and redirecting the traffic through it.
• It can be used to test the application's resilience to error-code http responses from the application server.

    Scenario: Modify http response status code of the HTTP request

    "},{"location":"experiments/categories/pods/pod-http-status-code/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-http-status-code/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.17
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-http-status-code experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-http-status-code/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-http-status-code/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-http-status-code-sa\n  namespace: default\n  labels:\n    name: pod-http-status-code-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-http-status-code-sa\n  namespace: default\n  labels:\n    name: pod-http-status-code-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-http-status-code-sa\n  namespace: default\n  labels:\n    name: pod-http-status-code-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-http-status-code-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-http-status-code-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-http-status-code/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Mandatory fields:
• TARGET_SERVICE_PORT: Port of the service to target. This should be the port on which the application container runs at the pod level, not at the service level. Defaults to port 80.
• STATUS_CODE: Modified status code for the HTTP response. If no value is provided, a random value is selected from the list of supported values. Multiple values can be provided as comma separated; a random value from the provided list will be selected. Supported values: [200, 201, 202, 204, 300, 301, 302, 304, 307, 400, 401, 403, 404, 500, 501, 502, 503, 504]. Defaults to a random status code.
• MODIFY_RESPONSE_BODY: Whether to modify the body as per the status code provided. If true, the body is replaced by a default template for the status code. Defaults to true.

Optional fields:
• RESPONSE_BODY: Body string to overwrite the http response body. This will be used only if MODIFY_RESPONSE_BODY is set to true. If no value is provided, the response will be an empty body. Defaults to empty body.
• CONTENT_ENCODING: Encoding type to compress/encode the response body. Accepted values are: gzip, deflate, br, identity. Defaults to none (no encoding).
• CONTENT_TYPE: Content type of the response body. Defaults to text/plain.
• PROXY_PORT: Port where the proxy will be listening for requests. Defaults to 20000.
• NETWORK_INTERFACE: Network interface to be used for the proxy. Defaults to eth0.
• TOXICITY: Percentage of HTTP requests to be affected. Defaults to 100.
• CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd and crio for the litmus LIB, and only docker for the pumba LIB.
• SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock.
• TOTAL_CHAOS_DURATION: The duration of chaos injection, in seconds. Defaults to 60s.
• TARGET_PODS: Comma separated list of application pod names subjected to pod http status code chaos. If not provided, it will select target pods randomly based on provided appLabels.
• PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only.
• LIB_IMAGE: Image used to run the helper pod. Defaults to litmuschaos/go-runner:latest.
• RAMP_TIME: Period to wait before and after injection of chaos, in seconds.
• SEQUENCE: It defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel.
"},{"location":"experiments/categories/pods/pod-http-status-code/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-http-status-code/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.
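
As an illustration of the common tunables with this experiment, the sketch below adds RAMP_TIME, the wait period (in seconds) before and after chaos injection, alongside the mandatory target port; the values shown are assumptions:

## common tunable: ramp time before and after chaos injection\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # period to wait before and after chaos injection, in seconds (illustrative value)\n        - name: RAMP_TIME\n          value: '10'\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n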

    "},{"location":"experiments/categories/pods/pod-http-status-code/#target-service-port","title":"Target Service Port","text":"

It defines the port of the service that is to be targeted. It can be tuned via the TARGET_SERVICE_PORT ENV. This should be the port where the application runs at the pod level, not at the service level: if the application pod serves on port 8080 and a service exposes it at port 80, the target service port should be 8080 (the pod-level port), not 80.
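
To make the pod-level vs service-level distinction concrete, here is a minimal sketch, assuming a standard Kubernetes Service that exposes container port 8080 on service port 80; in this setup TARGET_SERVICE_PORT must be set to 8080:

## illustrative Service: service port 80 maps to pod port 8080\napiVersion: v1\nkind: Service\nmetadata:\n  name: nginx\n  namespace: default\nspec:\n  selector:\n    app: nginx\n  ports:\n  - port: 80         # service-level port\n    targetPort: 8080 # pod-level port; use this value as TARGET_SERVICE_PORT\n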

    Use the following example to tune this:

    ## provide the port of the targeted service\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#proxy-port","title":"Proxy Port","text":"

    It defines the port on which the proxy server will listen for requests. It can be tuned via PROXY_PORT ENV.

    Use the following example to tune this:

    ## provide the port for proxy server\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # provide the port for proxy server\n        - name: PROXY_PORT\n          value: '8080'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#status-code","title":"Status Code","text":"

    It defines the status code value for the http response. It can be tuned via STATUS_CODE ENV.

    Use the following example to tune this:

## modified status code for the http response\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # modified status code for the http response\n        # if no value is provided, a random status code from the supported code list will be selected\n        # if multiple comma separated values are provided, then a random value from the provided list will be selected\n        # if an invalid status code is provided, the experiment will fail\n        # supported status code list: [200, 201, 202, 204, 300, 301, 302, 304, 307, 400, 401, 403, 404, 500, 501, 502, 503, 504]\n        - name: STATUS_CODE\n          value: '500'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#modify-response-body","title":"Modify Response Body","text":"

It defines whether to replace the response body with a pre-defined template matching the status code of the http response. It can be tuned via the MODIFY_RESPONSE_BODY ENV.

    Use the following example to tune this:

    ##  whether to modify the body as per the status code provided\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        #  whether to modify the body as per the status code provided\n        - name: \"MODIFY_RESPONSE_BODY\"\n          value: \"true\"\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#toxicity","title":"Toxicity","text":"

It defines the toxicity, i.e., the percentage of the total number of http requests to be affected. It can be tuned via the TOXICITY ENV.

    Use the following example to tune this:

## provide the toxicity\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # toxicity is the probability of a request being affected\n        # provide the percentage value in the range of 0-100\n        # 0 means no requests will be affected and 100 means all requests will be affected\n        - name: TOXICITY\n          value: \"100\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#response-body","title":"RESPONSE BODY","text":"

    It defines the body string that will overwrite the http response body. It can be tuned via RESPONSE_BODY and MODIFY_RESPONSE_BODY ENV. The MODIFY_RESPONSE_BODY ENV should be set to true to enable this feature.

    Use the following example to tune this:

    ## provide the response body value\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # provide the body string to overwrite the response body. This will be used only if MODIFY_RESPONSE_BODY is set to true\n        - name: RESPONSE_BODY\n          value: '<h1>Hello World</h1>'\n        #  whether to modify the body as per the status code provided\n        - name: \"MODIFY_RESPONSE_BODY\"\n          value: \"true\"\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#content-encoding-and-content-type","title":"Content Encoding and Content Type","text":"

    It defines the content encoding and content type of the response body. It can be tuned via CONTENT_ENCODING and CONTENT_TYPE ENV.

    Use the following example to tune this:

##  whether to modify the body as per the status code provided\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # provide the encoding type for the response body\n        # currently supported values are gzip, deflate\n        # if empty, no encoding will be applied\n        - name: CONTENT_ENCODING\n          value: 'gzip'\n        # provide the content type for the response body\n        - name: CONTENT_TYPE\n          value: 'text/html'\n        #  whether to modify the body as per the status code provided\n        - name: \"MODIFY_RESPONSE_BODY\"\n          value: \"true\"\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#network-interface","title":"Network Interface","text":"

    It defines the network interface to be used for the proxy. It can be tuned via NETWORK_INTERFACE ENV.

    Use the following example to tune this:

    ## provide the network interface for proxy\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # provide the network interface for proxy\n        - name: NETWORK_INTERFACE\n          value: \"eth0\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: '80'\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the container runtime socket file; it defaults to /run/containerd/containerd.sock. For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n
    "},{"location":"experiments/categories/pods/pod-io-stress/","title":"Pod IO Stress","text":""},{"location":"experiments/categories/pods/pod-io-stress/#introduction","title":"Introduction","text":"
• This experiment causes disk stress on the application pod. The experiment aims to verify the resiliency of applications that share this disk resource for ephemeral or persistent storage purposes.

    Scenario: Stress the IO of the target pod

    "},{"location":"experiments/categories/pods/pod-io-stress/#uses","title":"Uses","text":"View the uses of the experiment

Disk pressure or CPU hogs are very common and frequent scenarios found in Kubernetes applications; they can result in the eviction of the application replica and impact its delivery. Such scenarios can still occur despite whatever availability aids K8s provides. These problems are generally referred to as \"Noisy Neighbour\" problems.

Stressing the disk with continuous and heavy IO, for example, can cause degradation in the reads and writes performed by other microservices that use this shared disk (modern storage solutions for Kubernetes, for instance, use the concept of storage pools out of which virtual volumes/devices are carved out). Another issue is the amount of scratch space eaten up on a node, which leads to a lack of space for newer containers to get scheduled (Kubernetes eventually applies an \"eviction\" taint like \"disk-pressure\") and causes a wholesale movement of all pods to other nodes.

    "},{"location":"experiments/categories/pods/pod-io-stress/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-io-stress experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-io-stress/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-io-stress/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed, and executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-io-stress-sa\n  namespace: default\n  labels:\n    name: pod-io-stress-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-io-stress-sa\n  namespace: default\n  labels:\n    name: pod-io-stress-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the logs of the runner, experiment, and helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-io-stress-sa\n  namespace: default\n  labels:\n    name: pod-io-stress-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-io-stress-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-io-stress-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-io-stress/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes FILESYSTEM_UTILIZATION_PERCENTAGE Specify the size as a percentage of free space on the file system Defaults to 10% FILESYSTEM_UTILIZATION_BYTES Specify the size in gigabytes (GB). FILESYSTEM_UTILIZATION_PERCENTAGE & FILESYSTEM_UTILIZATION_BYTES are mutually exclusive. If both are provided, FILESYSTEM_UTILIZATION_PERCENTAGE is prioritized. NUMBER_OF_WORKERS It is the number of IO workers involved in IO disk stress Defaults to 4 TOTAL_CHAOS_DURATION The time duration for chaos (seconds) Defaults to 120s VOLUME_MOUNT_PATH Fill the given volume mount path LIB The chaos lib used to inject the chaos Defaults to litmus. Available: litmus and pumba. LIB_IMAGE Image used to run the stress command Defaults to litmuschaos/go-runner:latest TARGET_PODS Comma separated list of application pod names subjected to pod io stress chaos If not provided, it will select target pods randomly based on provided appLabels PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only CONTAINER_RUNTIME Container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for litmus and only docker for pumba LIB SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock RAMP_TIME Period to wait before and after injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel"},{"location":"experiments/categories/pods/pod-io-stress/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-io-stress/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod specific tunables.

    "},{"location":"experiments/categories/pods/pod-io-stress/#filesystem-utilization-percentage","title":"Filesystem Utilization Percentage","text":"

It stresses the FILESYSTEM_UTILIZATION_PERCENTAGE percentage of the total free space available in the pod.

    Use the following example to tune this:

# stress the i/o of the targeted pod with FILESYSTEM_UTILIZATION_PERCENTAGE of total free space \n# it is mutually exclusive with the FILESYSTEM_UTILIZATION_BYTES.\n# if both are provided then it will use FILESYSTEM_UTILIZATION_PERCENTAGE for stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # percentage of free space of the file system to be stressed\n        - name: FILESYSTEM_UTILIZATION_PERCENTAGE\n          value: '10' # in percentage\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-io-stress/#filesystem-utilization-bytes","title":"Filesystem Utilization Bytes","text":"

It stresses the FILESYSTEM_UTILIZATION_BYTES GB of the i/o of the targeted pod. It is mutually exclusive with the FILESYSTEM_UTILIZATION_PERCENTAGE ENV. If the FILESYSTEM_UTILIZATION_PERCENTAGE ENV is set, then it will use the percentage for the stress; otherwise, it will stress the i/o based on the FILESYSTEM_UTILIZATION_BYTES ENV.

    Use the following example to tune this:

    # stress the i/o of the targeted pod with given FILESYSTEM_UTILIZATION_BYTES\n# it is mutually exclusive with the FILESYSTEM_UTILIZATION_PERCENTAGE.\n# if both are provided then it will use FILESYSTEM_UTILIZATION_PERCENTAGE for stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # size of io to be stressed\n        - name: FILESYSTEM_UTILIZATION_BYTES\n          value: '1' #in GB\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-io-stress/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the container runtime socket file; it defaults to /run/containerd/containerd.sock. For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-io-stress/#mount-path","title":"Mount Path","text":"

It defines the volume mount path that needs to be filled. It can be tuned via the VOLUME_MOUNT_PATH ENV.

    Use the following example to tune this:

# provide the volume mount path, which needs to be filled\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # path to be stressed/filled\n        - name: VOLUME_MOUNT_PATH\n          value: '/some-dir-in-container'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-io-stress/#workers-for-stress","title":"Workers For Stress","text":"

The number of workers for the stress can be tuned via the NUMBER_OF_WORKERS ENV.

    Use the following example to tune this:

    # number of workers for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # number of io workers \n        - name: NUMBER_OF_WORKERS\n          value: '4'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-io-stress/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

It specifies the Pumba chaos library for the chaos injection. It can be tuned via LIB ENV. The default chaos library is litmus.

    Use the following example to tune this:

    # use the pumba lib for io stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # name of lib\n        # it supports litmus and pumba lib\n        - name: LIB\n          value: 'pumba'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/","title":"Pod Memory Hog Exec","text":""},{"location":"experiments/categories/pods/pod-memory-hog-exec/#introduction","title":"Introduction","text":"
• This experiment consumes memory resources of the application container, based on the specified amount in megabytes.

• It simulates conditions where app pods experience memory spikes due to expected or undesired processes, thereby testing how the overall application stack behaves when this occurs.

    Scenario: Stress the Memory

    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#uses","title":"Uses","text":"View the uses of the experiment

Memory usage within containers is subject to various constraints in Kubernetes. If the limits are specified in their spec, exceeding them can cause termination of the container (due to OOMKill of the primary process, often pid 1), followed by a restart of the container by kubelet, subject to the restart policy specified. For containers with no limits placed, memory usage is uninhibited until the node-level OOM behaviour takes over. In this case, containers on the node can be killed based on their oom_score and the QoS class a given pod belongs to (bestEffort ones are the first to be targeted). This evaluation extends to all pods running on the node, thereby causing a bigger blast radius.

This experiment launches a stress process within the target container, which can cause either the primary process in the container to be resource constrained in cases where the limits are enforced, or eat up the available system memory on the node in cases where the limits are not specified.
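For reference, below is a minimal sketch of a pod spec with an enforced memory limit; the names and values are illustrative, not taken from this experiment's docs. Exceeding the 256Mi limit would trigger an OOMKill of the container's primary process:

apiVersion: v1\nkind: Pod\nmetadata:\n  # illustrative name, not part of the experiment\n  name: memory-limited-app\nspec:\n  containers:\n  - name: app\n    image: nginx\n    resources:\n      # requests guide scheduling decisions\n      requests:\n        memory: \"128Mi\"\n      # exceeding this limit triggers an OOMKill of the container\n      limits:\n        memory: \"256Mi\"\n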

    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-memory-hog-exec experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed, and executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-memory-hog-exec-sa\n  namespace: default\n  labels:\n    name: pod-memory-hog-exec-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-memory-hog-exec-sa\n  namespace: default\n  labels:\n    name: pod-memory-hog-exec-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the logs of the runner, experiment, and helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-memory-hog-exec-sa\n  namespace: default\n  labels:\n    name: pod-memory-hog-exec-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-memory-hog-exec-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-memory-hog-exec-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes MEMORY_CONSUMPTION The amount of memory used for hogging a Kubernetes pod (megabytes) Defaults to 500MB (up to 2000MB) TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Defaults to 60s LIB The chaos lib used to inject the chaos. Available lib: litmus Defaults to litmus TARGET_PODS Comma separated list of application pod names subjected to pod memory hog chaos If not provided, it will select target pods randomly based on provided appLabels TARGET_CONTAINER Name of the target container under chaos If not provided, it will select the first container of the target pod CHAOS_KILL_COMMAND The command to kill the chaos process Defaults to kill $(find /proc -name exe -lname '*/dd' 2>&1 | grep -v 'Permission denied' | awk -F/ '{print $(NF-1)}' | head -n 1). Another useful one that generally works (in case the default doesn't) is kill -9 $(ps afx | grep \"[dd] if=/dev/zero\" | awk '{print $1}' | tr '\\n' ' '). In case neither works, please check whether the target pod's base image offers a shell. If yes, identify an appropriate shell command to kill the chaos process PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only RAMP_TIME Period to wait before injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-memory-hog-exec/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod specific tunables.

    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#memory-consumption","title":"Memory Consumption","text":"

It stresses MEMORY_CONSUMPTION MB of memory of the targeted pod for the TOTAL_CHAOS_DURATION duration. The memory consumption limit is 2000MB.

    Use the following example to tune this:

# memory to be stressed in MB\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-memory-hog-exec-sa\n  experiments:\n  - name: pod-memory-hog-exec\n    spec:\n      components:\n        env:\n        # memory consumption value in MB\n        # it is limited to 2000MB\n        - name: MEMORY_CONSUMPTION\n          value: '500' #in MB\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#chaos-kill-commands","title":"Chaos Kill Commands","text":"

It defines the CHAOS_KILL_COMMAND ENV to set the chaos kill command. The default value of the CHAOS_KILL_COMMAND ENV:

    • CHAOS_KILL_COMMAND: \"kill $(find /proc -name exe -lname '*/dd' 2>&1 | grep -v 'Permission denied' | awk -F/ '{print $(NF-1)}' | head -n 1)\"

    Use the following example to tune this:

    # provide the chaos kill command used to kill the chaos process\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-memory-hog-exec-sa\n  experiments:\n  - name: pod-memory-hog-exec\n    spec:\n      components:\n        env:\n        # command to kill the dd process\n        # alternative command: \"kill -9 $(ps afx | grep \"[dd] if=/dev/zero\" | awk '{print $1}' | tr '\\n' ' ')\"\n        - name: CHAOS_KILL_COMMAND\n          value: \"kill $(find /proc -name exe -lname '*/dd' 2>&1 | grep -v 'Permission denied' | awk -F/ '{print $(NF-1)}' | head -n 1)\"\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-memory-hog/","title":"Pod Memory Hog","text":""},{"location":"experiments/categories/pods/pod-memory-hog/#introduction","title":"Introduction","text":"
• This experiment consumes memory resources of the application container, based on the specified amount in megabytes.
• It simulates conditions where app pods experience memory spikes due to expected or undesired processes, thereby testing how the overall application stack behaves when this occurs.

    Scenario: Stress the Memory

    "},{"location":"experiments/categories/pods/pod-memory-hog/#uses","title":"Uses","text":"View the uses of the experiment

Memory usage within containers is subject to various constraints in Kubernetes. If the limits are specified in their spec, exceeding them can cause termination of the container (due to OOMKill of the primary process, often pid 1), followed by a restart of the container by kubelet, subject to the restart policy specified. For containers with no limits placed, memory usage is uninhibited until the node-level OOM behaviour takes over. In this case, containers on the node can be killed based on their oom_score and the QoS class a given pod belongs to (bestEffort ones are the first to be targeted). This evaluation extends to all pods running on the node, thereby causing a bigger blast radius.

This experiment launches a stress process within the target container, which can cause either the primary process in the container to be resource constrained in cases where the limits are enforced, or eat up the available system memory on the node in cases where the limits are not specified.

    "},{"location":"experiments/categories/pods/pod-memory-hog/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-memory-hog experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-memory-hog/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-memory-hog/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed, and executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-memory-hog-sa\n  namespace: default\n  labels:\n    name: pod-memory-hog-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-memory-hog-sa\n  namespace: default\n  labels:\n    name: pod-memory-hog-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the logs of the runner, experiment, and helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-memory-hog-sa\n  namespace: default\n  labels:\n    name: pod-memory-hog-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-memory-hog-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-memory-hog-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-memory-hog/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes MEMORY_CONSUMPTION The amount of memory used for hogging a Kubernetes pod (megabytes) Defaults to 500MB NUMBER_OF_WORKERS The number of workers used to run the stress process Defaults to 1 TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Defaults to 60s LIB The chaos lib used to inject the chaos. Available libs are litmus and pumba Defaults to litmus LIB_IMAGE Image used to run the helper pod. Defaults to litmuschaos/go-runner:1.13.8 STRESS_IMAGE Container run on the node at runtime by the pumba lib to inject stressors. Only used in LIB pumba Defaults to alexeiled/stress-ng:latest-ubuntu TARGET_PODS Comma separated list of application pod names subjected to pod memory hog chaos If not provided, it will select target pods randomly based on provided appLabels TARGET_CONTAINER Name of the target container under chaos. If not provided, it will select the first container of the target pod CONTAINER_RUNTIME Container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for litmus and only docker for pumba LIB SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only RAMP_TIME Period to wait before injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel"},{"location":"experiments/categories/pods/pod-memory-hog/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-memory-hog/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod specific tunables.

    "},{"location":"experiments/categories/pods/pod-memory-hog/#memory-consumption","title":"Memory Consumption","text":"

It stresses MEMORY_CONSUMPTION MB of memory of the targeted pod for the TOTAL_CHAOS_DURATION duration.

    Use the following example to tune this:

    # define the memory consumption in MB\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-memory-hog-sa\n  experiments:\n  - name: pod-memory-hog\n    spec:\n      components:\n        env:\n        # memory consumption value\n        - name: MEMORY_CONSUMPTION\n          value: '500' #in MB\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-memory-hog/#workers-for-stress","title":"Workers For Stress","text":"

The number of workers for the stress can be tuned via the NUMBER_OF_WORKERS ENV.

    Use the following example to tune this:

    # number of workers used for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-memory-hog-sa\n  experiments:\n  - name: pod-memory-hog\n    spec:\n      components:\n        env:\n        # number of workers for stress\n        - name: NUMBER_OF_WORKERS\n          value: '1'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-memory-hog/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the container runtime socket file; it defaults to /run/containerd/containerd.sock. For other runtimes, provide the appropriate path.
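Use the following example to tune this (a sketch mirroring the container runtime examples of the sibling experiments, assuming the containerd runtime and the pod-memory-hog-sa service account created above):

## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-memory-hog-sa\n  experiments:\n  - name: pod-memory-hog\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n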
    "},{"location":"experiments/categories/pods/pod-memory-hog/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

It specifies the Pumba chaos library for the chaos injection. It can be tuned via LIB ENV. The default chaos library is litmus. Provide the stress image via STRESS_IMAGE ENV for the pumba library.

    Use the following example to tune this:

    # use the pumba lib for the memory stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-memory-hog-sa\n  experiments:\n  - name: pod-memory-hog\n    spec:\n      components:\n        env:\n        # name of chaoslib\n        # it supports litmus and pumba lib\n        - name: LIB\n          value: 'pumba'\n        # stress image - applicable for pumba lib only\n        - name: STRESS_IMAGE\n          value: 'alexeiled/stress-ng:latest-ubuntu'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/","title":"Pod Network Corruption","text":""},{"location":"experiments/categories/pods/pod-network-corruption/#introduction","title":"Introduction","text":"
• It injects packet corruption on the specified container by starting a traffic control (tc) process with netem rules to add egress packet corruption.
• It can test the application's resilience to a lossy/flaky network.

    Scenario: Corrupt the network packets of target pod

    "},{"location":"experiments/categories/pods/pod-network-corruption/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-network-corruption/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-network-corruption experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-network-corruption/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-network-corruption/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed, and executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-network-corruption-sa\n  namespace: default\n  labels:\n    name: pod-network-corruption-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-network-corruption-sa\n  namespace: default\n  labels:\n    name: pod-network-corruption-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the logs of the runner, experiment, and helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-network-corruption-sa\n  namespace: default\n  labels:\n    name: pod-network-corruption-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-network-corruption-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-network-corruption-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-network-corruption/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes NETWORK_INTERFACE Name of the ethernet interface considered for shaping traffic TARGET_CONTAINER Name of the container subjected to network corruption Applicable for containerd & CRI-O runtime only. Even with these runtimes, if the value is not provided, it injects chaos on the first container of the pod NETWORK_PACKET_CORRUPTION_PERCENTAGE Packet corruption in percentage Default (100) CONTAINER_RUNTIME Container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for litmus and only docker for pumba LIB SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Default (60s) TARGET_PODS Comma separated list of application pod names subjected to pod network corruption chaos If not provided, it will select target pods randomly based on provided appLabels DESTINATION_IPS IP addresses of the services or pods or the CIDR blocks (range of IPs), the accessibility to which is impacted Comma separated IP(s) or CIDR(s) can be provided. If not provided, it will induce network chaos for all ips/destinations DESTINATION_HOSTS DNS names/FQDNs of the services, the accessibility to which is impacted If not provided, it will induce network chaos for all ips/destinations or DESTINATION_IPS if already defined SOURCE_PORTS Ports of the target application, the accessibility to which is impacted Comma separated port(s) can be provided. If not provided, it will induce network chaos for all ports DESTINATION_PORTS Ports of the destination services or pods or the CIDR blocks (range of IPs), the accessibility to which is impacted Comma separated port(s) can be provided. If not provided, it will induce network chaos for all ports PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only LIB The chaos lib used to inject the chaos Default value: litmus, supported values: pumba and litmus TC_IMAGE Image used for traffic control in linux Default value: gaiadocker/iproute2 LIB_IMAGE Image used to run the netem command Defaults to litmuschaos/go-runner:latest RAMP_TIME Period to wait before and after injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/pod-network-corruption/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-network-corruption/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod specific tunables.

    "},{"location":"experiments/categories/pods/pod-network-corruption/#network-packet-corruption","title":"Network Packet Corruption","text":"

    It defines the network packet corruption percentage to be injected in the targeted application. It can be tuned via NETWORK_PACKET_CORRUPTION_PERCENTAGE ENV.

    Use the following example to tune this:

# it injects the network-corruption for the egress traffic\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # network packet corruption percentage\n        - name: NETWORK_PACKET_CORRUPTION_PERCENTAGE\n          value: '100' #in percentage\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/#destination-ips-and-destination-hosts","title":"Destination IPs And Destination Hosts","text":"

    The network experiments interrupt traffic for all the IPs/hosts by default. The interruption of specific IPs/Hosts can be tuned via DESTINATION_IPS and DESTINATION_HOSTS ENV.

• DESTINATION_IPS: It contains the IP addresses of the services or pods or the CIDR blocks (range of IPs), the accessibility to which is impacted.
• DESTINATION_HOSTS: It contains the DNS names/FQDNs of the services, the accessibility to which is impacted.

    Use the following example to tune this:

# it injects the chaos for the egress traffic for specific ips/hosts\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # supports comma separated destination ips\n        - name: DESTINATION_IPS\n          value: '8.8.8.8,192.168.5.6'\n        # supports comma separated destination hosts\n        - name: DESTINATION_HOSTS\n          value: 'nginx.default.svc.cluster.local,google.com'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/#source-and-destination-ports","title":"Source And Destination Ports","text":"

    The network experiments interrupt traffic for all the source & destination ports by default. The interruption of specific port(s) can be tuned via SOURCE_PORTS and DESTINATION_PORTS ENV.

• SOURCE_PORTS: It contains the ports of the target application, the accessibility to which is impacted.
• DESTINATION_PORTS: It contains the ports of the destination services or pods or the CIDR blocks (range of IPs), the accessibility to which is impacted.

    Use the following example to tune this:

# it injects the chaos for the ingress and egress traffic for specific ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # supports comma separated source ports\n        - name: SOURCE_PORTS\n          value: '80'\n        # supports comma separated destination ports\n        - name: DESTINATION_PORTS\n          value: '8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/#blacklist-source-and-destination-ports","title":"Blacklist Source and Destination Ports","text":"

    By default, the network experiments disrupt traffic for all the source and destination ports. The specific ports can be blacklisted via SOURCE_PORTS and DESTINATION_PORTS ENV.

• SOURCE_PORTS: Provide the comma separated source ports, preceded by !, that you'd like to blacklist from the chaos.
• DESTINATION_PORTS: Provide the comma separated destination ports, preceded by !, that you'd like to blacklist from the chaos.

    Use the following example to tune this:

    # blacklist the source and destination ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # it will blacklist 80 and 8080 source ports\n        - name: SOURCE_PORTS\n          value: '!80,8080'\n        # it will blacklist 8080 and 9000 destination ports\n        - name: DESTINATION_PORTS\n          value: '!8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/#network-interface","title":"Network Interface","text":"

It defines the name of the ethernet interface considered for shaping traffic. It can be tuned via NETWORK_INTERFACE ENV. Its default value is eth0.

    Use the following example to tune this:

    # provide the network interface\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # name of the network interface\n        - name: NETWORK_INTERFACE\n          value: 'eth0'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

    • CONTAINER_RUNTIME: It supports the docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the runtime's socket file, which defaults to the containerd socket path (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

    It specifies the Pumba chaos library for the chaos injection. It can be tuned via the LIB ENV; the default chaos library is litmus. Provide the traffic control image via the TC_IMAGE ENV when using the pumba library.

    Use the following example to tune this:

    # use pumba chaoslib for the network chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # name of the chaoslib\n        # supports litmus and pumba lib\n        - name: LIB\n          value: 'pumba'\n        # image used for the traffic control in linux\n        # applicable for pumba lib only\n        - name: TC_IMAGE\n          value: 'gaiadocker/iproute2'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/","title":"Pod Network Duplication","text":""},{"location":"experiments/categories/pods/pod-network-duplication/#introduction","title":"Introduction","text":"
    • It injects chaos to disrupt network connectivity to kubernetes pods.
    • It injects network packet duplication on the specified container by starting a traffic control (tc) process with netem rules that duplicate egress packets. It can test the application's resilience to duplicated network packets.

    Scenario: Duplicate the network packets of target pod

    "},{"location":"experiments/categories/pods/pod-network-duplication/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-network-duplication/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-network-duplication experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-network-duplication/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-network-duplication/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-network-duplication-sa\n  namespace: default\n  labels:\n    name: pod-network-duplication-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-network-duplication-sa\n  namespace: default\n  labels:\n    name: pod-network-duplication-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-network-duplication-sa\n  namespace: default\n  labels:\n    name: pod-network-duplication-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-network-duplication-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-network-duplication-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-network-duplication/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

    Variables with their description and notes:
    • NETWORK_INTERFACE: Name of the ethernet interface considered for shaping traffic.
    • TARGET_CONTAINER: Name of the container subjected to network duplication. Optional; applicable for the containerd & CRI-O runtimes only. Even with these runtimes, if the value is not provided, it injects chaos on the first container of the pod.
    • NETWORK_PACKET_DUPLICATION_PERCENTAGE: The packet duplication in percentage. Optional; defaults to 100 (percent).
    • CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd and crio for the litmus LIB, and only docker for the pumba LIB.
    • SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock.
    • TOTAL_CHAOS_DURATION: The time duration for chaos insertion (seconds). Defaults to 60s.
    • TARGET_PODS: Comma separated list of application pod names subjected to pod network duplication chaos. If not provided, it will select target pods randomly based on the provided appLabels.
    • DESTINATION_IPS: IP addresses of the services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted. Comma separated IP(s) or CIDR(s) can be provided. If not provided, it will induce network chaos for all IPs/destinations.
    • DESTINATION_HOSTS: DNS names/FQDNs of the services whose accessibility is impacted. If not provided, it will induce network chaos for all IPs/destinations, or for DESTINATION_IPS if already defined.
    • SOURCE_PORTS: Ports of the target application whose accessibility is impacted. Comma separated port(s) can be provided. If not provided, it will induce network chaos for all ports.
    • DESTINATION_PORTS: Ports of the destination services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted. Comma separated port(s) can be provided. If not provided, it will induce network chaos for all ports.
    • PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only.
    • LIB: The chaos lib used to inject the chaos. Default value: litmus; supported values: pumba and litmus.
    • TC_IMAGE: Image used for traffic control in linux. Default value is gaiadocker/iproute2.
    • LIB_IMAGE: Image used to run the netem command. Defaults to litmuschaos/go-runner:latest.
    • RAMP_TIME: Period to wait before and after injection of chaos, in seconds.
    • SEQUENCE: It defines the sequence of chaos execution for multiple target pods. Default value: parallel; supported: serial, parallel.

    "},{"location":"experiments/categories/pods/pod-network-duplication/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-network-duplication/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

    Refer to the common attributes and the Pod specific tunables to tune the tunables common to all experiments as well as those specific to pod experiments.

    "},{"location":"experiments/categories/pods/pod-network-duplication/#network-packet-duplication","title":"Network Packet Duplication","text":"

    It defines the network packet duplication percentage to be injected into the targeted application. It can be tuned via the NETWORK_PACKET_DUPLICATION_PERCENTAGE ENV.

    Use the following example to tune this:

    # it injects the network duplication for the egress traffic\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # network packet duplication percentage\n        - name: NETWORK_PACKET_DUPLICATION_PERCENTAGE\n          value: '100'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/#destination-ips-and-destination-hosts","title":"Destination IPs And Destination Hosts","text":"

    The network experiments interrupt traffic for all the IPs/hosts by default. The interruption of specific IPs/Hosts can be tuned via DESTINATION_IPS and DESTINATION_HOSTS ENV.

    • DESTINATION_IPS: It contains the IP addresses of the services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted.
    • DESTINATION_HOSTS: It contains the DNS names/FQDNs of the services whose accessibility is impacted.

    Use the following example to tune this:

    # it injects the chaos on the egress traffic for specific ips/hosts\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # supports comma separated destination ips\n        - name: DESTINATION_IPS\n          value: '8.8.8.8,192.168.5.6'\n        # supports comma separated destination hosts\n        - name: DESTINATION_HOSTS\n          value: 'nginx.default.svc.cluster.local,google.com'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/#source-and-destination-ports","title":"Source And Destination Ports","text":"

    The network experiments interrupt traffic for all the source & destination ports by default. The interruption of specific port(s) can be tuned via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: The ports of the target application whose accessibility is to be impacted.
    • DESTINATION_PORTS: The ports of the destination services, pods, or CIDR blocks (IP ranges) whose accessibility is to be impacted.

    Use the following example to tune this:

    # it injects the chaos on the ingress and egress traffic for specific ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # supports comma separated source ports\n        - name: SOURCE_PORTS\n          value: '80'\n        # supports comma separated destination ports\n        - name: DESTINATION_PORTS\n          value: '8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/#blacklist-source-and-destination-ports","title":"Blacklist Source and Destination Ports","text":"

    By default, the network experiments disrupt traffic for all the source and destination ports. The specific ports can be blacklisted via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: Provide the comma-separated source ports, preceded by !, that you'd like to exclude (blacklist) from the chaos.
    • DESTINATION_PORTS: Provide the comma-separated destination ports, preceded by !, that you'd like to exclude (blacklist) from the chaos.

    Use the following example to tune this:

    # blacklist the source and destination ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # it will blacklist 80 and 8080 source ports\n        - name: SOURCE_PORTS\n          value: '!80,8080'\n        # it will blacklist 8080 and 9000 destination ports\n        - name: DESTINATION_PORTS\n          value: '!8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/#network-interface","title":"Network Interface","text":"

    The name of the ethernet interface considered for shaping traffic. It can be tuned via the NETWORK_INTERFACE ENV; its default value is eth0.

    Use the following example to tune this:

    # provide the network interface\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # name of the network interface\n        - name: NETWORK_INTERFACE\n          value: 'eth0'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    The CONTAINER_RUNTIME and SOCKET_PATH ENVs set the container runtime and the path of the runtime's socket file.

    • CONTAINER_RUNTIME: It supports the docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the runtime's socket file, which defaults to the containerd socket path (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

    It specifies the Pumba chaos library for the chaos injection. It can be tuned via the LIB ENV; the default chaos library is litmus. Provide the traffic control image via the TC_IMAGE ENV when using the pumba library.

    Use the following example to tune this:

    # use pumba chaoslib for the network chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # name of the chaoslib\n        # supports litmus and pumba lib\n        - name: LIB\n          value: 'pumba'\n        # image used for the traffic control in linux\n        # applicable for pumba lib only\n        - name: TC_IMAGE\n          value: 'gaiadocker/iproute2'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/","title":"Pod Network Latency","text":""},{"location":"experiments/categories/pods/pod-network-latency/#introduction","title":"Introduction","text":"
    • It injects latency on the specified container by starting a traffic control (tc) process with netem rules to add egress delays
    • It can test the application's resilience to a lossy/flaky network

    Scenario: Induce latency in the network of the target pod

    "},{"location":"experiments/categories/pods/pod-network-latency/#uses","title":"Uses","text":"View the uses of the experiment

    The experiment causes network degradation without the pod being marked unhealthy/unworthy of traffic by kube-proxy (unless you have a liveness probe of sorts that measures latency and restarts/crashes the container). The idea of this experiment is to simulate issues within your pod network OR microservice communication across services in different availability zones/regions etc.

    Mitigation (in this case keeping the timeout, i.e., access latency, low) could be via some middleware that can switch traffic based on some SLOs/perf parameters. If such an arrangement is not available, the next best thing would be to verify whether such a degradation is highlighted via notifications/alerts etc., so the admin/SRE has the opportunity to investigate and fix things. Another utility of the test would be to see the extent of impact caused to the end-user OR the last point in the app stack on account of degradation in access to a downstream/dependent microservice, and whether it is acceptable OR breaks the system to an unacceptable degree. The experiment provides DESTINATION_IPS or DESTINATION_HOSTS so that you can control the chaos against specific services within or outside the cluster.

    The applications may stall or get corrupted while they wait endlessly for a packet. The experiment limits the impact (blast radius) to only the traffic you want to test by specifying IP addresses or application information. This experiment will help to improve the resilience of your services over time.

    "},{"location":"experiments/categories/pods/pod-network-latency/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-network-latency experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-network-latency/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-network-latency/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-network-latency-sa\n  namespace: default\n  labels:\n    name: pod-network-latency-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-network-latency-sa\n  namespace: default\n  labels:\n    name: pod-network-latency-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-network-latency-sa\n  namespace: default\n  labels:\n    name: pod-network-latency-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-network-latency-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-network-latency-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-network-latency/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

    Variables with their description and notes:
    • NETWORK_INTERFACE: Name of the ethernet interface considered for shaping traffic.
    • TARGET_CONTAINER: Name of the container subjected to network latency. Applicable for the containerd & CRI-O runtimes only. Even with these runtimes, if the value is not provided, it injects chaos on the first container of the pod.
    • NETWORK_LATENCY: The latency/delay in milliseconds. Defaults to 2000; provide a numeric value only.
    • JITTER: The network jitter value in ms. Defaults to 0; provide a numeric value only.
    • CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd and crio for the litmus LIB, and only docker for the pumba LIB.
    • SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock.
    • TOTAL_CHAOS_DURATION: The time duration for chaos insertion (seconds). Defaults to 60s.
    • TARGET_PODS: Comma separated list of application pod names subjected to pod network latency chaos. If not provided, it will select target pods randomly based on the provided appLabels.
    • DESTINATION_IPS: IP addresses of the services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted. Comma separated IP(s) or CIDR(s) can be provided. If not provided, it will induce network chaos for all IPs/destinations.
    • DESTINATION_HOSTS: DNS names/FQDNs of the services whose accessibility is impacted. If not provided, it will induce network chaos for all IPs/destinations, or for DESTINATION_IPS if already defined.
    • SOURCE_PORTS: Ports of the target application whose accessibility is impacted. Comma separated port(s) can be provided. If not provided, it will induce network chaos for all ports.
    • DESTINATION_PORTS: Ports of the destination services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted. Comma separated port(s) can be provided. If not provided, it will induce network chaos for all ports.
    • PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only.
    • LIB: The chaos lib used to inject the chaos. Default value: litmus; supported values: pumba and litmus.
    • TC_IMAGE: Image used for traffic control in linux. Default value is gaiadocker/iproute2.
    • LIB_IMAGE: Image used to run the netem command. Defaults to litmuschaos/go-runner:latest.
    • RAMP_TIME: Period to wait before and after injection of chaos, in seconds.
    • SEQUENCE: It defines the sequence of chaos execution for multiple target pods. Default value: parallel; supported: serial, parallel.

    "},{"location":"experiments/categories/pods/pod-network-latency/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-network-latency/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

    Refer to the common attributes and the Pod specific tunables to tune the tunables common to all experiments as well as those specific to pod experiments.

    "},{"location":"experiments/categories/pods/pod-network-latency/#network-latency","title":"Network Latency","text":"

    It defines the network latency (in ms) to be injected into the targeted application. It can be tuned via the NETWORK_LATENCY ENV.

    Use the following example to tune this:

    # it injects the network latency for the egress traffic\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # network latency to be injected\n        - name: NETWORK_LATENCY\n          value: '2000' #in ms\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#destination-ips-and-destination-hosts","title":"Destination IPs And Destination Hosts","text":"

    The network experiments interrupt traffic for all the IPs/hosts by default. The interruption of specific IPs/Hosts can be tuned via DESTINATION_IPS and DESTINATION_HOSTS ENV.

    • DESTINATION_IPS: It contains the IP addresses of the services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted.
    • DESTINATION_HOSTS: It contains the DNS names/FQDNs of the services whose accessibility is impacted.

    Use the following example to tune this:

    # it injects the chaos on the egress traffic for specific ips/hosts\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # supports comma separated destination ips\n        - name: DESTINATION_IPS\n          value: '8.8.8.8,192.168.5.6'\n        # supports comma separated destination hosts\n        - name: DESTINATION_HOSTS\n          value: 'nginx.default.svc.cluster.local,google.com'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#source-and-destination-ports","title":"Source And Destination Ports","text":"

    The network experiments interrupt traffic for all the source & destination ports by default. The interruption of specific port(s) can be tuned via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: The ports of the target application whose accessibility is to be impacted.
    • DESTINATION_PORTS: The ports of the destination services, pods, or CIDR blocks (IP ranges) whose accessibility is to be impacted.

    Use the following example to tune this:

    # it injects the chaos on the ingress and egress traffic for specific ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # supports comma separated source ports\n        - name: SOURCE_PORTS\n          value: '80'\n        # supports comma separated destination ports\n        - name: DESTINATION_PORTS\n          value: '8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#blacklist-source-and-destination-ports","title":"Blacklist Source and Destination Ports","text":"

    By default, the network experiments disrupt traffic for all the source and destination ports. The specific ports can be blacklisted via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: Provide the comma-separated source ports, preceded by !, that you'd like to exclude (blacklist) from the chaos.
    • DESTINATION_PORTS: Provide the comma-separated destination ports, preceded by !, that you'd like to exclude (blacklist) from the chaos.

    Use the following example to tune this:

    # blacklist the source and destination ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # it will blacklist 80 and 8080 source ports\n        - name: SOURCE_PORTS\n          value: '!80,8080'\n        # it will blacklist 8080 and 9000 destination ports\n        - name: DESTINATION_PORTS\n          value: '!8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#network-interface","title":"Network Interface","text":"

    The name of the ethernet interface considered for shaping traffic. It can be tuned via the NETWORK_INTERFACE ENV; its default value is eth0.

    Use the following example to tune this:

    # provide the network interface\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # name of the network interface\n        - name: NETWORK_INTERFACE\n          value: 'eth0'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#jitter","title":"Jitter","text":"

    It defines the jitter (in ms), a parameter that introduces variation in the network delay. It can be tuned via the JITTER ENV; its default value is 0.

    Use the following example to tune this:

    # provide the network latency jitter\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # value of the network latency jitter (in ms)\n        - name: JITTER\n          value: '200'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    The CONTAINER_RUNTIME and SOCKET_PATH ENVs set the container runtime and the path of the runtime's socket file.

    • CONTAINER_RUNTIME: It supports the docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the runtime's socket file, which defaults to the containerd socket path (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

    It specifies the Pumba chaos library for the chaos injection. It can be tuned via the LIB ENV; the default chaos library is litmus. Provide the traffic control image via the TC_IMAGE ENV when using the pumba library.

    Use the following example to tune this:

    # use pumba chaoslib for the network chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # name of the chaoslib\n        # supports litmus and pumba lib\n        - name: LIB\n          value: 'pumba'\n        # image used for the traffic control in linux\n        # applicable for pumba lib only\n        - name: TC_IMAGE\n          value: 'gaiadocker/iproute2'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/","title":"Pod Network Loss","text":""},{"location":"experiments/categories/pods/pod-network-loss/#introduction","title":"Introduction","text":"
    • It injects packet loss on the specified container by starting a traffic control (tc) process with netem rules to add egress loss
    • It can test the application's resilience to a lossy/flaky network

    Scenario: Induce network loss on the target pod

    "},{"location":"experiments/categories/pods/pod-network-loss/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-network-loss/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-network-loss experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-network-loss/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-network-loss/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-network-loss-sa\n  namespace: default\n  labels:\n    name: pod-network-loss-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-network-loss-sa\n  namespace: default\n  labels:\n    name: pod-network-loss-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-network-loss-sa\n  namespace: default\n  labels:\n    name: pod-network-loss-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-network-loss-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-network-loss-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-network-loss/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

    Variables with their description and notes:
    • NETWORK_INTERFACE: Name of the ethernet interface considered for shaping traffic.
    • TARGET_CONTAINER: Name of the container subjected to network loss. Optional; applicable for the containerd & CRI-O runtimes only. Even with these runtimes, if the value is not provided, it injects chaos on the first container of the pod.
    • NETWORK_PACKET_LOSS_PERCENTAGE: The packet loss in percentage. Optional; defaults to 100 (percent).
    • CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd and crio for the litmus LIB, and only docker for the pumba LIB.
    • SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock.
    • TOTAL_CHAOS_DURATION: The time duration for chaos insertion (seconds). Defaults to 60s.
    • TARGET_PODS: Comma separated list of application pod names subjected to pod network loss chaos. If not provided, it will select target pods randomly based on the provided appLabels.
    • DESTINATION_IPS: IP addresses of the services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted. Comma separated IP(s) or CIDR(s) can be provided. If not provided, it will induce network chaos for all IPs/destinations.
    • DESTINATION_HOSTS: DNS names/FQDNs of the services whose accessibility is impacted. If not provided, it will induce network chaos for all IPs/destinations, or for DESTINATION_IPS if already defined.
    • SOURCE_PORTS: Ports of the target application whose accessibility is impacted. Comma separated port(s) can be provided. If not provided, it will induce network chaos for all ports.
    • DESTINATION_PORTS: Ports of the destination services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted. Comma separated port(s) can be provided. If not provided, it will induce network chaos for all ports.
    • PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only.
    • LIB: The chaos lib used to inject the chaos. Default value: litmus; supported values: pumba and litmus.
    • TC_IMAGE: Image used for traffic control in linux. Default value is gaiadocker/iproute2.
    • LIB_IMAGE: Image used to run the netem command. Defaults to litmuschaos/go-runner:latest.
    • RAMP_TIME: Period to wait before and after injection of chaos, in seconds.
    • SEQUENCE: It defines the sequence of chaos execution for multiple target pods. Default value: parallel; supported: serial, parallel.

    "},{"location":"experiments/categories/pods/pod-network-loss/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-network-loss/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

    Refer to the common attributes and the Pod specific tunables to tune the tunables common to all experiments as well as those specific to pod experiments.

    "},{"location":"experiments/categories/pods/pod-network-loss/#network-packet-loss","title":"Network Packet Loss","text":"

    It defines the network packet loss percentage to be injected into the targeted application. It can be tuned via the NETWORK_PACKET_LOSS_PERCENTAGE ENV.

    Use the following example to tune this:

    # it injects the network loss for the egress traffic\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # network packet loss percentage\n        - name: NETWORK_PACKET_LOSS_PERCENTAGE\n          value: '100'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/#destination-ips-and-destination-hosts","title":"Destination IPs And Destination Hosts","text":"

    The network experiments interrupt traffic for all the IPs/hosts by default. The interruption of specific IPs/Hosts can be tuned via DESTINATION_IPS and DESTINATION_HOSTS ENV.

    • DESTINATION_IPS: It contains the IP addresses of the services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted.
    • DESTINATION_HOSTS: It contains the DNS names/FQDNs of the services whose accessibility is impacted.

    Use the following example to tune this:

    # it injects the chaos on the egress traffic for specific ips/hosts\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # supports comma separated destination ips\n        - name: DESTINATION_IPS\n          value: '8.8.8.8,192.168.5.6'\n        # supports comma separated destination hosts\n        - name: DESTINATION_HOSTS\n          value: 'nginx.default.svc.cluster.local,google.com'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/#source-and-destination-ports","title":"Source And Destination Ports","text":"

    The network experiments interrupt traffic for all the source & destination ports by default. The interruption of specific port(s) can be tuned via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: The ports of the target application whose accessibility is to be impacted.
    • DESTINATION_PORTS: The ports of the destination services, pods, or CIDR blocks (IP ranges) whose accessibility is to be impacted.

    Use the following example to tune this:

    # it injects the chaos on the ingress and egress traffic for specific ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # supports comma separated source ports\n        - name: SOURCE_PORTS\n          value: '80'\n        # supports comma separated destination ports\n        - name: DESTINATION_PORTS\n          value: '8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/#blacklist-source-and-destination-ports","title":"Blacklist Source and Destination Ports","text":"

    By default, the network experiments disrupt traffic for all the source and destination ports. The specific ports can be blacklisted via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: Provide the comma-separated source ports, preceded by !, that you'd like to exclude (blacklist) from the chaos.
    • DESTINATION_PORTS: Provide the comma-separated destination ports, preceded by !, that you'd like to exclude (blacklist) from the chaos.

    Use the following example to tune this:

    # blacklist the source and destination ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # it will blacklist 80 and 8080 source ports\n        - name: SOURCE_PORTS\n          value: '!80,8080'\n        # it will blacklist 8080 and 9000 destination ports\n        - name: DESTINATION_PORTS\n          value: '!8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/#network-interface","title":"Network Interface","text":"

    The name of the ethernet interface considered for shaping traffic. It can be tuned via the NETWORK_INTERFACE ENV; its default value is eth0.

    Use the following example to tune this:

    # provide the network interface\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # name of the network interface\n        - name: NETWORK_INTERFACE\n          value: 'eth0'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    The CONTAINER_RUNTIME and SOCKET_PATH ENVs set the container runtime and the path of the runtime's socket file.

    • CONTAINER_RUNTIME: It supports the docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the runtime's socket file, which defaults to the containerd socket path (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

    It specifies the Pumba chaos library for the chaos injection. It can be tuned via the LIB ENV; the default chaos library is litmus. Provide the traffic control image via the TC_IMAGE ENV when using the pumba library.

    Use the following example to tune this:

    # use pumba chaoslib for the network chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # name of the chaoslib\n        # supports litmus and pumba lib\n        - name: LIB\n          value: 'pumba'\n        # image used for the traffic control in linux\n        # applicable for pumba lib only\n        - name: TC_IMAGE\n          value: 'gaiadocker/iproute2'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-partition/","title":"Pod Network Partition","text":""},{"location":"experiments/categories/pods/pod-network-partition/#introduction","title":"Introduction","text":"
    • It blocks 100% of the ingress and egress traffic of the target application by creating a network policy (a conceptual sketch of such a policy is shown below).
    • It can test the application's resilience to a lossy/flaky network.

    Scenario: Partition the network of the target pod
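
    For illustration, the deny-all behaviour described above is conceptually similar to the following NetworkPolicy. This is only a sketch: the experiment generates and manages its own policy, and the name, namespace, and pod labels below are placeholders.

    # conceptual sketch of a deny-all policy (illustrative only, not the exact resource the experiment creates)\napiVersion: networking.k8s.io/v1\nkind: NetworkPolicy\nmetadata:\n  name: pod-network-partition-policy\n  namespace: default\nspec:\n  # selects the target application pods\n  podSelector:\n    matchLabels:\n      app: nginx\n  policyTypes:\n  - Ingress\n  - Egress\n  # with no ingress/egress rules listed, all traffic to/from the selected pods is denied\n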

    "},{"location":"experiments/categories/pods/pod-network-partition/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-network-partition/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-network-partition experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-network-partition/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-network-partition/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-network-partition-sa\n  namespace: default\n  labels:\n    name: pod-network-partition-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-network-partition-sa\n  namespace: default\n  labels:\n    name: pod-network-partition-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # performs CRUD operations on the network policies\n  - apiGroups: [\"networking.k8s.io\"]\n    resources: [\"networkpolicies\"]\n    verbs: [\"create\",\"delete\",\"list\",\"get\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-network-partition-sa\n  namespace: default\n  labels:\n    name: pod-network-partition-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-network-partition-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-network-partition-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.
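
    Note that, in addition to the permissions common to the pod-level experiments, the role above grants CRUD on networking.k8s.io/networkpolicies: the partition is enforced by creating a NetworkPolicy around the target pods for the chaos duration. Shown below is a minimal illustrative sketch of such a policy (hypothetical name and labels; the experiment generates and cleans up its own manifest):

    apiVersion: networking.k8s.io/v1\nkind: NetworkPolicy\nmetadata:\n  # hypothetical name, for illustration only\n  name: nginx-partition-example\n  namespace: default\nspec:\n  # selects the chaos target pods\n  podSelector:\n    matchLabels:\n      app: nginx\n  # with no ingress/egress rules listed, all traffic\n  # to and from the selected pods is denied\n  policyTypes:\n  - Ingress\n  - Egress\n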

    "},{"location":"experiments/categories/pods/pod-network-partition/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

    Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Default (60s) POLICY_TYPES Contains the type of network policy It supports egress, ingress and all values POD_SELECTOR Contains labels of the destination pods NAMESPACE_SELECTOR Contains labels of the destination namespaces PORTS Comma-separated list of the targeted ports DESTINATION_IPS IP addresses of the services or pods or the CIDR blocks (range of IPs), the accessibility to which is impacted Comma-separated IP(s) or CIDR(s) can be provided. If not provided, it will induce network chaos for all ips/destinations DESTINATION_HOSTS DNS names/FQDNs of the services, the accessibility to which is impacted If not provided, it will induce network chaos for all ips/destinations or DESTINATION_IPS if already defined LIB The chaos lib used to inject the chaos Supported value: litmus RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/pods/pod-network-partition/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-network-partition/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

    Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/pod-network-partition/#destination-ips-and-destination-hosts","title":"Destination IPs And Destination Hosts","text":"

    The network partition experiment interrupts traffic for all the IPs/hosts by default. The interruption of specific IPs/hosts can be tuned via the DESTINATION_IPS and DESTINATION_HOSTS ENVs.

    • DESTINATION_IPS: It contains the IP addresses of the services or pods or the CIDR blocks (range of IPs), the accessibility to which is impacted.
    • DESTINATION_HOSTS: It contains the DNS names/FQDNs of the services, the accessibility to which is impacted.

    Use the following example to tune this:

    # it injects the chaos for specific ips/hosts\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-partition-sa\n  experiments:\n  - name: pod-network-partition\n    spec:\n      components:\n        env:\n        # supports comma separated destination ips\n        - name: DESTINATION_IPS\n          value: '8.8.8.8,192.168.5.6'\n        # supports comma separated destination hosts\n        - name: DESTINATION_HOSTS\n          value: 'nginx.default.svc.cluster.local,google.com'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-partition/#target-specific-namespaces","title":"Target Specific Namespace(s)","text":"

    The network partition experiment interrupts traffic for all the namespaces by default. Access to/from pods in specific namespaces can be allowed by providing namespace labels inside the NAMESPACE_SELECTOR ENV.

    Use the following example to tune this:

    # it injects the chaos for specified namespaces, matched by labels\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-partition-sa\n  experiments:\n  - name: pod-network-partition\n    spec:\n      components:\n        env:\n        # labels of the destination namespace\n        - name: NAMESPACE_SELECTOR\n          value: 'key=value'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-partition/#target-specific-pods","title":"Target Specific Pod(s)","text":"

    The network partition experiment interrupts traffic for all the external pods by default. Access to/from specific pod(s) can be allowed by providing pod labels inside the POD_SELECTOR ENV.

    Use the following example to tune this:

    # it injects the chaos for specified pods, matched by labels\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-partition-sa\n  experiments:\n  - name: pod-network-partition\n    spec:\n      components:\n        env:\n        # labels of the destination pods\n        - name: POD_SELECTOR\n          value: 'key=value'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-partition/#policy-type","title":"Policy Type","text":"

    The network partition experiment interrupts both ingress and egress traffic by default. The interruption of either ingress or egress traffic can be tuned via the POLICY_TYPES ENV.

    Use the following example to tune this:

    # inject the network partition for only ingress, only egress, or all traffic\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-partition-sa\n  experiments:\n  - name: pod-network-partition\n    spec:\n      components:\n        env:\n        # provide the network policy type\n        # it supports `ingress`, `egress`, and `all` values\n        # default value is `all`\n        - name: POLICY_TYPES\n          value: 'all'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-partition/#destination-ports","title":"Destination Ports","text":"

    The network partition experiment interrupts traffic for all the external ports by default. Access to specific port(s) can be allowed by providing a comma-separated list of ports inside the PORTS ENV.

    Note:

    • If PORTS is not set and none of POD_SELECTOR, NAMESPACE_SELECTOR, and DESTINATION_IPS are provided, then it will block traffic on all ports for all pods/IPs
    • If PORTS is not set but any of POD_SELECTOR, NAMESPACE_SELECTOR, or DESTINATION_IPS is provided, then it will allow all ports for the pods/IPs filtered by the specified selectors

    Use the following example to tune this:

    # it injects the chaos for specified ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-partition-sa\n  experiments:\n  - name: pod-network-partition\n    spec:\n      components:\n        env:\n        # comma separated list of ports\n        - name: PORTS\n          value: 'tcp: [8080,80], udp: [9000,90]'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
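
    As described in the notes above, PORTS interacts with the selectors. The following sketch (illustrative label and port values, mirroring the format of the example above) combines PORTS with POD_SELECTOR so that only the listed ports remain open for the selected pods:

    # it injects the chaos for specified ports on the selected pods\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-partition-sa\n  experiments:\n  - name: pod-network-partition\n    spec:\n      components:\n        env:\n        # labels of the destination pods\n        - name: POD_SELECTOR\n          value: 'key=value'\n        # comma separated list of ports\n        - name: PORTS\n          value: 'tcp: [80]'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n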
    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/","title":"Spring Boot App Kill","text":""},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#introduction","title":"Introduction","text":"
    • It can target random pods with a Spring Boot application and allows configuring the assaults to inject app-kill. When the configured methods are called in the application, it will shut down the application.

    Scenario: Kill Spring Boot Application

    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites

    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the spring-boot-app-kill experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Chaos Monkey Spring Boot dependency should be present in the application. It can be enabled in two ways:
      1. Add internal dependency inside the spring boot application
        1. Add Chaos Monkey for Spring Boot as a dependency for your project
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot App with the chaos-monkey spring profile enabled
          java -jar your-app.jar --spring.profiles.active=chaos-monkey --chaos.monkey.enabled=true\n
      2. Add as external dependency
        1. You can extend your existing application with the chaos-monkey and add it as an external dependency at startup; for this, it is necessary to use the PropertiesLauncher of Spring Boot
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <classifier>jar-with-dependencies</classifier>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot application, adding the Chaos Monkey for Spring Boot JAR and properties (a sample configuration is sketched below)
          java -cp your-app.jar -Dloader.path=chaos-monkey-spring-boot-2.6.1-jar-with-dependencies.jar org.springframework.boot.loader.PropertiesLauncher --spring.profiles.active=chaos-monkey --spring.config.location=file:./chaos-monkey.properties\n
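
    A minimal sketch of the chaos-monkey settings referenced in the command above, written as Spring Boot YAML configuration. The property names are assumptions based on the Chaos Monkey for Spring Boot documentation (an equivalent .properties form can be passed via the --spring.config.location flag shown), and the actuator endpoint is exposed on the assumption that the experiments drive Chaos Monkey through it on CM_PORT:

    # chaos-monkey.yml (hedged sketch; property names assumed, verify against your Chaos Monkey version)\nchaos:\n  monkey:\n    enabled: true\n    watcher:\n      # watch @RestController beans (the default watcher used by these experiments)\n      rest-controller: true\n    assaults:\n      # attack every request (level 1)\n      level: 1\nmanagement:\n  endpoint:\n    chaosmonkey:\n      enabled: true\n  endpoints:\n    web:\n      exposure:\n        include: health,chaosmonkey\n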

    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#default-validations","title":"Default Validations","text":"View the default validations
    • Spring boot pods are healthy before and after chaos injection
    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow constructed, scheduled & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: spring-boot-app-kill-sa\n  namespace: default\n  labels:\n    name: spring-boot-app-kill-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: spring-boot-app-kill-sa\n  namespace: default\n  labels:\n    name: spring-boot-app-kill-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: spring-boot-app-kill-sa\n  namespace: default\n  labels:\n    name: spring-boot-app-kill-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: spring-boot-app-kill-sa\nsubjects:\n  - kind: ServiceAccount\n    name: spring-boot-app-kill-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes CM_PORT It contains the port of the spring boot application

    Variables Description Notes CM_LEVEL It contains the number of requests to be attacked; a value of n means the nth request will be affected Default value is 1, it lies in the [1,10000] range CM_WATCHED_CUSTOM_SERVICES It limits the watched packages/classes/methods by providing a comma-separated list of fully qualified packages (class and/or method names) Default is an empty list, which means it will target all services CM_WATCHERS It contains a comma-separated list of watchers from the following watchers list [controller, restController, service, repository, component, webClient] Default is restController SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0% (corresponds to 1 replica) LIB The chaos lib used to inject the chaos Defaults to litmus. Supported litmus only RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes and Spring Boot specific tunables to tune the common tunables for all experiments and the spring-boot specific tunables.

    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#spring-boot-application-port","title":"Spring Boot Application Port","text":"

    It tunes the spring-boot application port via the CM_PORT ENV.

    Use the following example to tune this:

    # kill spring-boot target application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-app-kill-sa\n  experiments:\n    - name: spring-boot-app-kill\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/","title":"Spring Boot CPU Stress","text":""},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#introduction","title":"Introduction","text":"
    • It can target random pods with a Spring Boot application and allows configuring the assaults to inject cpu-stress, which attacks the CPU of the Java Virtual Machine. It tests the resiliency of the system when some applications exhibit unexpected faulty behavior.

    Scenario: Stress CPU of Spring Boot Application

    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites

    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the spring-boot-cpu-stress experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Chaos Monkey Spring Boot dependency should be present in the application. It can be enabled in two ways:
      1. Add internal dependency inside the spring boot application
        1. Add Chaos Monkey for Spring Boot as a dependency for your project
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot App with the chaos-monkey spring profile enabled
          java -jar your-app.jar --spring.profiles.active=chaos-monkey --chaos.monkey.enabled=true\n
      2. Add as external dependency
        1. You can extend your existing application with the chaos-monkey and add it as an external dependency at startup; for this, it is necessary to use the PropertiesLauncher of Spring Boot
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <classifier>jar-with-dependencies</classifier>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot application, add Chaos Monkey for Spring Boot JAR and properties
          java -cp your-app.jar -Dloader.path=chaos-monkey-spring-boot-2.6.1-jar-with-dependencies.jar org.springframework.boot.loader.PropertiesLauncher --spring.profiles.active=chaos-monkey --spring.config.location=file:./chaos-monkey.properties\n

    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#default-validations","title":"Default Validations","text":"View the default validations
    • Spring boot pods are healthy before and after chaos injection
    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow constructed, scheduled & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: spring-boot-cpu-stress-sa\n  namespace: default\n  labels:\n    name: spring-boot-cpu-stress-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: spring-boot-cpu-stress-sa\n  namespace: default\n  labels:\n    name: spring-boot-cpu-stress-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: spring-boot-cpu-stress-sa\n  namespace: default\n  labels:\n    name: spring-boot-cpu-stress-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: spring-boot-cpu-stress-sa\nsubjects:\n  - kind: ServiceAccount\n    name: spring-boot-cpu-stress-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes CM_PORT It contains the port of the spring boot application CPU_LOAD_FRACTION It contains the fraction of CPU to be stressed, e.g., 0.95 equals 95% Default value is 0.9. It supports a value in the range [0.1,1.0]

    Variables Description Notes CM_LEVEL It contains the number of requests to be attacked; a value of n means the nth request will be affected Default value: 1, it lies in the [1,10000] range CM_WATCHED_CUSTOM_SERVICES It limits the watched packages/classes/methods; it contains a comma-separated list of fully qualified packages (class and/or method names) Default is an empty list, which means it will target all services CM_WATCHERS It contains a comma-separated list of watchers from the following watchers list [controller, restController, service, repository, component, webClient] Default is restController TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Defaults to 30 SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0% (corresponds to 1 replica) LIB The chaos lib used to inject the chaos Defaults to litmus. Supported litmus only RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes and Spring Boot specific tunables to tune the common tunables for all experiments and the spring-boot specific tunables.

    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#spring-boot-application-port","title":"Spring Boot Application Port","text":"

    It tunes the spring-boot application port via the CM_PORT ENV.

    Use the following example to tune this:

    # stress cpu of spring-boot application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-cpu-stress-sa\n  experiments:\n    - name: spring-boot-cpu-stress\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#cpu-load-fraction","title":"CPU Load Fraction","text":"

    It contains the fraction of CPU to be stressed; 0.95 equals 95%. It can be tuned via the CPU_LOAD_FRACTION ENV.

    Use the following example to tune this:

    # provide the cpu load fraction to be stressed\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-cpu-stress-sa\n  experiments:\n    - name: spring-boot-cpu-stress\n      spec:\n        components:\n          env:\n            # it contains the fraction of the used CPU. Eg: 0.95 equals 95%.\n            # it supports value in range [0.1,1.0]\n            - name: CPU_LOAD_FRACTION\n              value: '0.9'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/","title":"Spring Boot Exceptions","text":""},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#introduction","title":"Introduction","text":"
    • It can target random pods with a Spring Boot application and allows configuring the assaults to inject exceptions at runtime when a watched method is called. It tests the resiliency of the system when some applications exhibit unexpected faulty behavior.

    Scenario: Inject exceptions to Spring Boot Application

    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites

    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the spring-boot-exceptions experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Chaos Monkey Spring Boot dependency should be present in the application. It can be enabled in two ways:
      1. Add internal dependency inside the spring boot application
        1. Add Chaos Monkey for Spring Boot as a dependency for your project
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot App with the chaos-monkey spring profile enabled
          java -jar your-app.jar --spring.profiles.active=chaos-monkey --chaos.monkey.enabled=true\n
      2. Add as external dependency
        1. You can extend your existing application with the chaos-monkey and add it as an external dependency at startup; for this, it is necessary to use the PropertiesLauncher of Spring Boot
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <classifier>jar-with-dependencies</classifier>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot application, add Chaos Monkey for Spring Boot JAR and properties
          java -cp your-app.jar -Dloader.path=chaos-monkey-spring-boot-2.6.1-jar-with-dependencies.jar org.springframework.boot.loader.PropertiesLauncher --spring.profiles.active=chaos-monkey --spring.config.location=file:./chaos-monkey.properties\n

    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#default-validations","title":"Default Validations","text":"View the default validations
    • Spring boot pods are healthy before and after chaos injection
    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow constructed, scheduled & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: spring-boot-exceptions-sa\n  namespace: default\n  labels:\n    name: spring-boot-exceptions-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: spring-boot-exceptions-sa\n  namespace: default\n  labels:\n    name: spring-boot-exceptions-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: spring-boot-exceptions-sa\n  namespace: default\n  labels:\n    name: spring-boot-exceptions-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: spring-boot-exceptions-sa\nsubjects:\n  - kind: ServiceAccount\n    name: spring-boot-exceptions-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes CM_PORT It contains the port of the spring boot application

    Variables Description Notes CM_EXCEPTIONS_TYPE It contains the type of the raised exception Default value: java.lang.IllegalArgumentException CM_EXCEPTIONS_ARGUMENTS It contains the argument of the raised exception Default value: java.lang.String:custom illegal argument exception CM_LEVEL It contains the number of requests to be attacked; a value of n means the nth request will be affected Default value: 1, it lies in the [1,10000] range CM_WATCHED_CUSTOM_SERVICES It limits the watched packages/classes/methods; it contains a comma-separated list of fully qualified packages (class and/or method names) By default, it is an empty list, which means it targets all services CM_WATCHERS It contains a comma-separated list of watchers from the following watchers list [controller, restController, service, repository, component, webClient] By default, it is restController TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Defaults to 30 SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0% (corresponds to 1 replica) LIB The chaos lib used to inject the chaos Defaults to litmus. Supported litmus only RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes and Spring Boot specific tunables to tune the common tunables for all experiments and the spring-boot specific tunables.

    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#spring-boot-application-port","title":"Spring Boot Application Port","text":"

    It tunes the spring-boot application port via the CM_PORT ENV.

    Use the following example to tune this:

    # inject exceptions into the spring-boot target application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-exceptions-sa\n  experiments:\n    - name: spring-boot-exceptions\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#exception-type-and-arguments","title":"Exception Type and Arguments","text":"

    The Spring Boot exception type and arguments can be tuned via the CM_EXCEPTIONS_TYPE and CM_EXCEPTIONS_ARGUMENTS ENVs.

    Use the following example to tune this:

    # provide the exception type and args\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-exceptions-sa\n  experiments:\n    - name: spring-boot-exceptions\n      spec:\n        components:\n          env:\n            # Type of raised exception\n            - name: CM_EXCEPTIONS_TYPE\n              value: 'java.lang.IllegalArgumentException'\n\n             # Argument of the raised exception\n            - name: CM_EXCEPTIONS_ARGUMENTS\n              value: 'java.lang.String:custom illegal argument exception'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-experiments-tunables/","title":"Spring boot experiments tunables","text":"

    It contains the Spring Boot specific experiment tunables.

    "},{"location":"experiments/categories/spring-boot/spring-boot-experiments-tunables/#spring-boot-request-level","title":"Spring Boot request Level","text":"

    It contains the number of requests to be attacked; a value of n means every nth request will be affected. It can be tuned via the CM_LEVEL ENV.

    Use the following example to tune this:

    # limits the number of requests to be attacked\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-app-kill-sa\n  experiments:\n    - name: spring-boot-app-kill\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n\n            # it contains the number of requests that are to be attacked.\n            # n value means nth request will be affected\n            - name: CM_LEVEL\n              value: '1'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-experiments-tunables/#watch-custom-services","title":"Watch Custom Services","text":"

    It contains a comma-separated list of fully qualified packages (class and/or method names), which limits the watched packages/classes/methods. It can be tuned via the CM_WATCHED_CUSTOM_SERVICES ENV.

    Use the following example to tune this:

    # it contains comma separated list of custom services\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-app-kill-sa\n  experiments:\n    - name: spring-boot-app-kill\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n\n            # it limits watched packages/classes/methods\n            - name: CM_WATCHED_CUSTOM_SERVICES\n              value: 'com.example.chaosdemo.controller.HelloController.sayHello,com.example.chaosdemo.service.HelloService'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-experiments-tunables/#watchers","title":"Watchers","text":"

    It contains a comma-separated list of watchers from the following watchers list: [controller, restController, service, repository, component, webClient]. It can be tuned via the CM_WATCHERS ENV.

    Use the following example to tune this:

    # it contains comma separated list of watchers\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-app-kill-sa\n  experiments:\n    - name: spring-boot-app-kill\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n\n            # provide name of watcher\n            # it supports controller, restController, service, repository, component, webClient\n            - name: CM_WATCHERS\n              value: 'restController'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/","title":"Spring Boot Faults","text":""},{"location":"experiments/categories/spring-boot/spring-boot-faults/#introduction","title":"Introduction","text":"
    • It can target random pods with a Spring Boot application and allows configuring the assaults to inject multiple spring boot faults simultaneously on the target pod.
    • It supports app-kill, cpu-stress, memory-stress, latency, and exceptions faults

    Scenario: Inject Spring Boot Faults

    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites

    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the spring-boot-faults experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Chaos Monkey Spring Boot dependency should be present in the application. It can be enabled in two ways:
      1. Add internal dependency inside the spring boot application
        1. Add Chaos Monkey for Spring Boot as a dependency for your project
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot App with the chaos-monkey spring profile enabled
          java -jar your-app.jar --spring.profiles.active=chaos-monkey --chaos.monkey.enabled=true\n
      2. Add as external dependency
        1. You can extend your existing application with the chaos-monkey and add it as an external dependency at startup; for this, it is necessary to use the PropertiesLauncher of Spring Boot
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <classifier>jar-with-dependencies</classifier>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot application, add Chaos Monkey for Spring Boot JAR and properties
          java -cp your-app.jar -Dloader.path=chaos-monkey-spring-boot-2.6.1-jar-with-dependencies.jar org.springframework.boot.loader.PropertiesLauncher --spring.profiles.active=chaos-monkey --spring.config.location=file:./chaos-monkey.properties\n

    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#default-validations","title":"Default Validations","text":"View the default validations
    • Spring boot pods are healthy before and after chaos injection
    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow constructed, scheduled & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: spring-boot-faults-sa\n  namespace: default\n  labels:\n    name: spring-boot-faults-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: spring-boot-faults-sa\n  namespace: default\n  labels:\n    name: spring-boot-faults-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: spring-boot-faults-sa\n  namespace: default\n  labels:\n    name: spring-boot-faults-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: spring-boot-faults-sa\nsubjects:\n  - kind: ServiceAccount\n    name: spring-boot-faults-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes CM_PORT It contains the port of the spring boot application CM_KILL_APPLICATION_ACTIVE It enables the app-kill fault It supports boolean values. Default is false CM_LATENCY_ACTIVE It enables the latency fault It supports boolean values. Default is false CM_MEMORY_ACTIVE It enables the memory stress fault It supports boolean values. Default is false CM_CPU_ACTIVE It enables the cpu stress fault It supports boolean values. Default is false CM_EXCEPTIONS_ACTIVE It enables the exceptions fault It supports boolean values. Default is false CPU_LOAD_FRACTION It contains the fraction of CPU to be stressed; 0.95 equals 95% Default value is 0.9. It supports a value in the range [0.1,1.0] CM_EXCEPTIONS_TYPE It contains the type of the raised exception Default value: java.lang.IllegalArgumentException CM_EXCEPTIONS_ARGUMENTS It contains the argument of the raised exception Default value: java.lang.String:custom illegal argument exception LATENCY It contains the network latency to be injected (in ms) Default value is 2000 MEMORY_FILL_FRACTION It contains the fraction of memory to be stressed; 0.7 equals 70% Default value is 0.70. It supports a value in the range [0.01,0.95]

    Variables Description Notes CM_LEVEL It contains the number of requests to be attacked; a value of n means the nth request will be affected Default value: 1, it lies in the [1,10000] range CM_WATCHED_CUSTOM_SERVICES It limits the watched packages/classes/methods; it contains a comma-separated list of fully qualified packages (class and/or method names) By default, it is an empty list, which means it targets all services CM_WATCHERS It contains a comma-separated list of watchers from the following watchers list [controller, restController, service, repository, component, webClient] By default, it is restController TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Defaults to 30 SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0% (corresponds to 1 replica) LIB The chaos lib used to inject the chaos Defaults to litmus. Supported litmus only RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/spring-boot/spring-boot-faults/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes and Spring Boot specific tunables to tune the common tunables for all experiments and the spring-boot specific tunables.

    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#inject-multiple-faults-simultaneously-cpu-latency-and-exceptions","title":"Inject Multiple Faults Simultaneously (CPU, Latency and Exceptions)","text":"

    It injects cpu, latency, and exceptions faults simultaneously on the target pods

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-faults-sa\n  experiments:\n    - name: spring-boot-faults\n      spec:\n        components:\n          env:\n            # set chaos duration (in sec) as desired\n            - name: TOTAL_CHAOS_DURATION\n              value: '30'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n\n            # it enables spring-boot latency fault\n            - name: CM_LATENCY_ACTIVE\n              value: 'true'\n\n            # provide the latency (ms)\n            # it is applicable when latency is active\n            - name: LATENCY\n              value: '2000'\n\n            # it enables spring-boot cpu stress fault\n            - name: CM_CPU_ACTIVE\n              value: 'true'\n\n            # it contains fraction of cpu to be stressed(0.95 equals 95%)\n            # it supports value in range [0.1,1.0]\n            # it is applicable when cpu is active\n            - name: CPU_LOAD_FRACTION\n              value: '0.9'\n\n            # it enables spring-boot exceptions fault\n            - name: CM_EXCEPTIONS_ACTIVE\n              value: 'true'\n\n            # Type of raised exception\n            # it is applicable when exceptions is active\n            - name: CM_EXCEPTIONS_TYPE\n              value: 'java.lang.IllegalArgumentException'\n\n              # Argument of raised exception\n              # it is applicable when exceptions is active\n            - name: CM_EXCEPTIONS_ARGUMENTS\n              value: 'java.lang.String:custom illegal argument exception'\n\n            ## percentage of total pods to target\n            - name: PODS_AFFECTED_PERC\n              value: ''\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#inject-multiple-faults-simultaneously-appkill-and-memory","title":"Inject Multiple Faults Simultaneously (Appkill and Memory)","text":"

    It injects app-kill and memory stress faults simultaneously on the target pods

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-faults-sa\n  experiments:\n    - name: spring-boot-faults\n      spec:\n        components:\n          env:\n            # set chaos duration (in sec) as desired\n            - name: TOTAL_CHAOS_DURATION\n              value: '30'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n\n            # it enables spring app-kill fault\n            - name: CM_KILL_APPLICATION_ACTIVE\n              value: 'true'\n\n            # it enables spring-boot memory stress fault\n            - name: CM_MEMORY_ACTIVE\n              value: 'true'\n\n            # it contains fraction of memory to be stressed (0.70 equals 70%)\n            # it supports value in range [0.01,0.95]\n            # it is applicable when memory is active\n            - name: MEMORY_FILL_FRACTION\n              value: '0.70'\n\n            ## percentage of total pods to target\n            - name: PODS_AFFECTED_PERC\n              value: ''\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/","title":"Spring Boot Latency","text":""},{"location":"experiments/categories/spring-boot/spring-boot-latency/#introduction","title":"Introduction","text":"
    • It can target random pods with a Spring Boot application and allows configuring the assaults to inject network latency to every nth request. This can be tuned via CM_LEVEL ENV.

    Scenario: Inject network latency to Spring Boot Application

    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites

    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the spring-boot-latency experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Chaos Monkey Spring Boot dependency should be present in the application. It can be enabled in two ways:
      1. Add internal dependency inside the spring boot application
        1. Add Chaos Monkey for Spring Boot as a dependency for your project
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot App with the chaos-monkey spring profile enabled
          java -jar your-app.jar --spring.profiles.active=chaos-monkey --chaos.monkey.enabled=true\n
      2. Add as external dependency
        1. You can extend your existing application with the chaos-monkey and add it as an external dependency at startup; for this, it is necessary to use the PropertiesLauncher of Spring Boot
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <classifier>jar-with-dependencies</classifier>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot application, add Chaos Monkey for Spring Boot JAR and properties
          java -cp your-app.jar -Dloader.path=chaos-monkey-spring-boot-2.6.1-jar-with-dependencies.jar org.springframework.boot.loader.PropertiesLauncher --spring.profiles.active=chaos-monkey --spring.config.location=file:./chaos-monkey.properties\n

    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#default-validations","title":"Default Validations","text":"View the default validations
    • Spring boot pods are healthy before and after chaos injection
    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow constructed, scheduled & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: spring-boot-latency-sa\n  namespace: default\n  labels:\n    name: spring-boot-latency-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: spring-boot-latency-sa\n  namespace: default\n  labels:\n    name: spring-boot-latency-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: spring-boot-latency-sa\n  namespace: default\n  labels:\n    name: spring-boot-latency-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: spring-boot-latency-sa\nsubjects:\n  - kind: ServiceAccount\n    name: spring-boot-latency-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes CM_PORT It contains the port of the spring boot application LATENCY It contains the network latency to be injected (in ms) Default value is 2000

    Variables Description Notes CM_LEVEL It contains the number of requests to be attacked; a value of n means the nth request will be affected Default value: 1, it lies in the [1,10000] range CM_WATCHED_CUSTOM_SERVICES It limits the watched packages/classes/methods; it contains a comma-separated list of fully qualified packages (class and/or method names) By default, it is an empty list, which means it targets all services CM_WATCHERS It contains a comma-separated list of watchers from the following watchers list [controller, restController, service, repository, component, webClient] By default, it is restController TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Defaults to 30 SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0% (corresponds to 1 replica) LIB The chaos lib used to inject the chaos Defaults to litmus. Supported litmus only RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/spring-boot/spring-boot-latency/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes and Spring Boot specific tunables to tune the common tunables for all experiments and the spring-boot specific tunables.

    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#spring-boot-application-port","title":"Spring Boot Application Port","text":"

    It tunes the spring-boot application port via the CM_PORT ENV.

    Use the following example to tune this:

    # inject latency into the spring-boot target application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-latency-sa\n  experiments:\n    - name: spring-boot-latency\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#network-latency","title":"Network Latency","text":"

    It contains the network latency value in ms. It can be tuned via the LATENCY ENV.

    Use the following example to tune this:

    # provide the network latency\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-latency-sa\n  experiments:\n    - name: spring-boot-latency\n      spec:\n        components:\n          env:\n            # provide the latency (ms)\n            - name: LATENCY\n              value: '2000'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/","title":"Spring Boot Memory Stress","text":""},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#introduction","title":"Introduction","text":"
    • It can target random pods with a Spring Boot application and allows configuring the assaults to inject memory stress, which attacks the memory of the Java Virtual Machine. It tests the resiliency of the system when some applications exhibit unexpected faulty behavior.

    Scenario: Stress Memory of Spring Boot Application

    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites

    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the spring-boot-memory-stress experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • The Chaos Monkey Spring Boot dependency should be present in the application. It can be enabled in two ways:
      1. Add an internal dependency inside the Spring Boot application
        1. Add Chaos Monkey for Spring Boot as a dependency for your project
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot App with the chaos-monkey spring profile enabled
          java -jar your-app.jar --spring.profiles.active=chaos-monkey --chaos.monkey.enabled=true\n
      2. Add as an external dependency
        1. You can extend your existing application with the chaos-monkey and add it as an external dependency at startup; for this, it is necessary to use the PropertiesLauncher of Spring Boot
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <classifier>jar-with-dependencies</classifier>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot application, add Chaos Monkey for Spring Boot JAR and properties
          java -cp your-app.jar -Dloader.path=chaos-monkey-spring-boot-2.6.1-jar-with-dependencies.jar org.springframework.boot.loader.PropertiesLauncher --spring.profiles.active=chaos-monkey --spring.config.location=file:./chaos-monkey.properties\n
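    For reference, a minimal chaos-monkey.properties might look like the following. This is only a sketch: the property keys come from the Chaos Monkey for Spring Boot documentation, and the values shown are assumptions to be tuned per application.

    # enable chaos monkey and expose its management endpoint\nchaos.monkey.enabled=true\nmanagement.endpoint.chaosmonkey.enabled=true\nmanagement.endpoints.web.exposure.include=health,info,chaosmonkey\n# watch rest controllers (mirrors the CM_WATCHERS default)\nchaos.monkey.watcher.restController=true\n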

    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#default-validations","title":"Default Validations","text":"View the default validations
    • Spring boot pods are healthy before and after chaos injection
    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a Litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: spring-boot-memory-stress-sa\n  namespace: default\n  labels:\n    name: spring-boot-memory-stress-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: spring-boot-memory-stress-sa\n  namespace: default\n  labels:\n    name: spring-boot-memory-stress-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing to execute commands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: spring-boot-memory-stress-sa\n  namespace: default\n  labels:\n    name: spring-boot-memory-stress-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: spring-boot-memory-stress-sa\nsubjects:\n  - kind: ServiceAccount\n    name: spring-boot-memory-stress-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes CM_PORT It contains the port of the spring boot application MEMORY_FILL_FRACTION It contains the fraction of memory to be stressed; 0.7 equals 70% Default value is 0.70. It supports values in the [0.01,0.95] range

    Variables Description Notes CM_LEVEL It contains the number of requests to be attacked; an n value means every nth request will be affected Default value: 1, it lies in the [1,10000] range CM_WATCHED_CUSTOM_SERVICES It limits the watched packages/classes/methods; it contains a comma-separated list of fully qualified packages (class and/or method names) By default it is an empty list, which means it targets all services CM_WATCHERS It contains a comma-separated list of watchers from the following watchers list [controller, restController, service, repository, component, webClient] By default it is restController TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Defaults to 30 SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0% (corresponds to 1 replica) LIB The chaos lib used to inject the chaos Defaults to litmus. Supported: litmus only RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes and Spring Boot specific tunables to tune the common tunables for all experiments and the spring-boot specific tunables.

    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#spring-boot-application-port","title":"Spring Boot Application Port","text":"

    It tunes the spring-boot application port via the CM_PORT ENV.

    Use the following example to tune this:

    # stress memory of spring-boot application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-memory-stress-sa\n  experiments:\n    - name: spring-boot-memory-stress\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#memory-fill-fraction","title":"Memory Fill Fraction","text":"

    It contains the fraction of memory to be stressed; 0.70 equals 70%. It can be tuned via the MEMORY_FILL_FRACTION ENV.

    Use the following example to tune this:

    # provide the memory fraction to be filled\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-memory-stress-sa\n  experiments:\n    - name: spring-boot-memory-stress\n      spec:\n        components:\n          env:\n            # it contains the fraction of memory to be stressed. Eg: 0.70 equals 70%.\n            # it supports values in the range [0.01,0.95]\n            - name: MEMORY_FILL_FRACTION\n              value: '0.70'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/vmware/vm-poweroff/","title":"VM Poweroff","text":""},{"location":"experiments/categories/vmware/vm-poweroff/#introduction","title":"Introduction","text":"
    • It causes VMware VMs to stop/power off, before bringing them back to the powered-on state after the specified chaos duration, using the VMware APIs to start/stop the target VM.
    • It helps to check the performance of the application/process running on the VMware VMs.

    Scenario: poweroff the vm

    "},{"location":"experiments/categories/vmware/vm-poweroff/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/vmware/vm-poweroff/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the vm-poweroff experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that you have sufficient vCenter access to stop and start the VM.
    • (Optional) Create a Kubernetes secret containing the vCenter credentials in the CHAOS_NAMESPACE. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: vcenter-secret\n  namespace: litmus\ntype: Opaque\nstringData:\n    VCENTERSERVER: XXXXXXXXXXX\n    VCENTERUSER: XXXXXXXXXXXXX\n    VCENTERPASS: XXXXXXXXXXXXX\n

    Note: You can pass the VM credentials as secrets or as chaosengine ENV variables.
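    For instance, a sketch of passing the credentials as chaosengine ENV variables instead of a secret; the ENV names below are assumed to mirror the secret keys above and should be verified against the experiment before use:

    # pass the vCenter credentials as chaosengine ENV variables\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: vm-poweroff-sa\n  experiments:\n  - name: vm-poweroff\n    spec:\n      components:\n        env:\n        # vCenter credentials (assumed to mirror the secret keys)\n        - name: VCENTERSERVER\n          value: 'XXXXXXXXXXX'\n        - name: VCENTERUSER\n          value: 'XXXXXXXXXXXXX'\n        - name: VCENTERPASS\n          value: 'XXXXXXXXXXXXX'\n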

    "},{"location":"experiments/categories/vmware/vm-poweroff/#default-validations","title":"Default Validations","text":"View the default validations
    • VM should be in healthy state.
    "},{"location":"experiments/categories/vmware/vm-poweroff/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a Litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: vm-poweroff-sa\n  namespace: default\n  labels:\n    name: vm-poweroff-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: vm-poweroff-sa\n  labels:\n    name: vm-poweroff-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute commands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: vm-poweroff-sa\n  labels:\n    name: vm-poweroff-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: vm-poweroff-sa\nsubjects:\n- kind: ServiceAccount\n  name: vm-poweroff-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/vmware/vm-poweroff/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes APP_VM_MOIDS MOIDs of the VMware instances Once you open the VM in the vCenter WebClient, you can find the MOID in the address field (VirtualMachine:vm-5365). Alternatively, you can use the CLI to fetch the MOID. Eg: vm-5365

    Variables Description Notes TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive instance terminations Defaults to 30s SEQUENCE It defines the sequence of chaos execution for multiple instances Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/vmware/vm-poweroff/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/vmware/vm-poweroff/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/vmware/vm-poweroff/#stoppoweroff-vm-by-moid","title":"Stop/Poweroff VM By MOID","text":"

    It contains the MOIDs of the VM instances. It can be tuned via the APP_VM_MOIDS ENV.

    Use the following example to tune this:

    # power-off the VMWare VM\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: vm-poweroff-sa\n  experiments:\n  - name: vm-poweroff\n    spec:\n      components:\n        env:\n        # MOID of the VM\n        - name: APP_VM_MOIDS\n          value: 'vm-53,vm-65'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/concepts/IAM/awsIamIntegration/","title":"IAM integration for Litmus service accounts","text":"

    You can execute Litmus AWS experiments to target different AWS services from the EKS cluster itself. For this, we need to authenticate Litmus with the AWS platform, which we can do in two different ways:

    • Using secrets: It is one of the common ways to authenticate litmus with AWS, irrespective of the Kubernetes cluster used for the deployment. In other words, it is Kubernetes\u2019 native way of authenticating litmus with the AWS platform.
    • IAM Integration: It can be used when we\u2019ve deployed Litmus on an EKS cluster; we can associate an IAM role with a Kubernetes service account. This service account can then provide AWS permissions to the experiment pod that uses that service account. We\u2019ll discuss this method further in the sections below.
    "},{"location":"experiments/concepts/IAM/awsIamIntegration/#why-should-we-use-iam-integration-for-aws-authentication","title":"Why should we use IAM integration for AWS authentication?","text":"

    The IAM roles for service accounts feature provides the following benefits:

    • Least privilege: By using the IAM roles for service accounts feature, you no longer need to provide extended permissions to the node IAM role so that pods on that node can call AWS APIs. You can scope IAM permissions to a service account, and only pods that use that service account have access to those permissions.
    • Credential isolation: The experiment can only retrieve credentials for the IAM role that is associated with the service account to which it belongs. The experiment never has access to credentials that are intended for another experiment that belongs to another pod.
    "},{"location":"experiments/concepts/IAM/awsIamIntegration/#enable-service-accounts-to-access-aws-resources","title":"Enable service accounts to access AWS resources:","text":""},{"location":"experiments/concepts/IAM/awsIamIntegration/#step-1-create-an-iam-oidc-provider-for-your-cluster","title":"Step 1: Create an IAM OIDC provider for your cluster","text":"

    We need to perform this once for a cluster. We\u2019re going to follow the AWS documentation to set up an OIDC provider with eksctl.

    Check whether you have an existing IAM OIDC provider for your cluster. To check this, you can follow the given instructions.

    Note: For demonstration, we\u2019ll be using litmus-demo as the cluster name and us-west-1 as the region; you can replace these values according to your ENV.

    aws eks describe-cluster --name <litmus-demo> --query \"cluster.identity.oidc.issuer\" --output text\n
    Output:

    https://oidc.eks.us-west-1.amazonaws.com/id/D054E55B6947B1A7B3F200297789662C\n

    Now list the IAM OIDC providers in your account.

    Command:

    aws iam list-open-id-connect-providers | grep <D054E55B6947B1A7B3F200297789662C>\n

    Replace <D054E55B6947B1A7B3F200297789662C> (including <>) with the value returned from the previous command.

    If an IAM OIDC identity provider doesn\u2019t yet exist for the cluster, create one with the following command. Replace <litmus-demo> (including <>) with your own value.

    eksctl utils associate-iam-oidc-provider --cluster litmus-demo --approve\n2021-09-07 14:54:01 [\u2139]  eksctl version 0.52.0\n2021-09-07 14:54:01 [\u2139]  using region us-west-1\n2021-09-07 14:54:04 [\u2139]  will create IAM Open ID Connect provider for cluster \"litmus-demo\" in \"us-west-1\"\n2021-09-07 14:54:05 [\u2714]  created IAM Open ID Connect provider for cluster \"litmus-demo\" in \"us-west-1\"\n
    "},{"location":"experiments/concepts/IAM/awsIamIntegration/#step-2-creating-an-iam-role-and-policy-for-your-service-account","title":"Step 2: Creating an IAM role and policy for your service account","text":"

    You must create an IAM policy that specifies the permissions that you would like the experiment to have. There are several ways to create a new IAM permission policy; check out the AWS docs for creating the IAM policy. We will make use of the eksctl command to set up the same.

    eksctl create iamserviceaccount \\\n--name <service_account_name> \\\n--namespace <service_account_namespace> \\\n--cluster <cluster_name> \\\n--attach-policy-arn <IAM_policy_ARN> \\\n--approve \\\n--override-existing-serviceaccounts\n
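    For example, a hypothetical invocation for the litmus-demo cluster might look like the following; the service account name, namespace, and policy ARN are illustrative placeholders, not prescriptions:

    eksctl create iamserviceaccount \\\n--name litmus-admin \\\n--namespace litmus \\\n--cluster litmus-demo \\\n--attach-policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess \\\n--approve \\\n--override-existing-serviceaccounts\n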
    "},{"location":"experiments/concepts/IAM/awsIamIntegration/#step-3-associate-an-iam-role-with-a-service-account","title":"Step 3: Associate an IAM role with a service account","text":"

    Complete this task for each Kubernetes service account that needs access to AWS resources. We can do this by adding the following annotation to the service account, which defines the IAM role to associate with it.

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  annotations:\n    eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/<IAM_ROLE_NAME>\n

    You can also annotate the experiment service account by running the following command.

    Notes: 1. Ideally, annotating the litmus-admin service account in the litmus namespace should work for most of the experiments. 2. For the cluster autoscaler experiment, annotate the service account in the kube-system namespace.

    kubectl annotate serviceaccount -n <SERVICE_ACCOUNT_NAMESPACE> <SERVICE_ACCOUNT_NAME> \\\neks.amazonaws.com/role-arn=arn:aws:iam::<ACCOUNT_ID>:role/<IAM_ROLE_NAME>\n
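    For example, to annotate the litmus-admin service account mentioned in the note above (the account ID and role name remain placeholders):

    kubectl annotate serviceaccount -n litmus litmus-admin \\\neks.amazonaws.com/role-arn=arn:aws:iam::<ACCOUNT_ID>:role/<IAM_ROLE_NAME>\n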

    Verify that the experiment service account is now associated with the IAM role.

    If you run an experiment and describe one of the pods, you can verify that the AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN environment variables exist.

    kubectl exec -n litmus <ec2-terminate-by-id-z4zdf> env | grep AWS\n
    Output:
    AWS_VPC_K8S_CNI_LOGLEVEL=DEBUG\nAWS_ROLE_ARN=arn:aws:iam::<ACCOUNT_ID>:role/<IAM_ROLE_NAME>\nAWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token\n

    Now we have successfully enabled the experiment service accounts to access AWS resources.

    "},{"location":"experiments/concepts/IAM/awsIamIntegration/#configure-the-experiment-cr","title":"Configure the Experiment CR.","text":"

    Since we have already configured IAM for the experiment service account, we don\u2019t need to create a secret and mount it in the experiment CR (which is enabled by default). To remove the secret mount, remove the following lines from the experiment YAML.

    secrets:\n  - name: cloud-secret\n    mountPath: /tmp/\n
    We can now run the experiment with the direct IAM integration.

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/","title":"IAM integration for Litmus service accounts","text":"

    To execute LitmusChaos GCP experiments, one needs to authenticate with GCP by means of a service account before trying to access the target resources. Usually, you have only one way of providing the service account credentials to the experiment, using a service account key, but if you're using a GKE cluster you have a keyless medium of authentication as well.

    Therefore you have two ways of providing the service account credentials to your GKE cluster:

    • Using Secrets: As you would normally do, you can create a secret containing the GCP service account in your GKE cluster, which gets utilized by the experiment for authentication to access your GCP resources.

    • IAM Integration: When you're using a GKE cluster, you can bind a GCP service account to a Kubernetes service account as an IAM policy, which can be then used by the experiment for keyless authentication using GCP Workload Identity. We\u2019ll discuss more on this method in the following sections.

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#why-use-iam-integration-for-gcp-authentication","title":"Why use IAM integration for GCP authentication?","text":"

    A Google API request can be made using a GCP IAM service account, which is an identity that an application uses to make calls to Google APIs. As an application developer, you might create individual IAM service accounts for each application, then download and save the keys as a Kubernetes secret that you manually rotate. Not only is this a time-consuming process, but service account keys only last ten years (or until you manually rotate them). An unaccounted-for key could give an attacker extended access in the event of a breach or compromise. Using service account keys as secrets is not an optimal way of authenticating GKE workloads due to this potential blind spot and the management cost of key inventory and rotation.

    Workload Identity allows you to restrict the possible \"blast radius\" of a breach or compromise while enforcing the principle of least privilege across your environment. It accomplishes this by automating workload authentication best practices, eliminating the need for workarounds, and making it simple to implement recommended security best practices.

    • Following the principle of least privilege, your tasks will only have the permissions they require to fulfil their role. It minimizes the breadth of a potential compromise by not granting broad permissions.

    • Unlike the 10-year lifetime service account keys, credentials supplied to the Workload Identity are only valid for a short time, decreasing the blast radius in the case of a compromise.

    • The risk of unintentional disclosure of credentials due to a human mistake is greatly reduced because Google controls the namespace service account credentials for you. It also eliminates the need for you to manually rotate these credentials.

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#how-to-enable-service-accounts-to-access-gcp-resources","title":"How to enable service accounts to access GCP resources?","text":"

    We will be following the steps from the GCP Documentation for Workload Identity

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#step-1-enable-workload-identity","title":"STEP 1: Enable Workload Identity","text":"

    You can enable Workload Identity on clusters and node pools using the Google Cloud CLI or the Google Cloud Console. Workload Identity must be enabled at the cluster level before you can enable Workload Identity on node pools.

    Workload Identity can be enabled for an existing cluster as well as a new cluster. To enable Workload Identity on a new cluster, run the following command:

    gcloud container clusters create CLUSTER_NAME \\\n    --region=COMPUTE_REGION \\\n    --workload-pool=PROJECT_ID.svc.id.goog\n
    Replace the following: - CLUSTER_NAME: the name of your new cluster. - COMPUTE_REGION: the Compute Engine region of your cluster. For zonal clusters, use --zone=COMPUTE_ZONE. - PROJECT_ID: your Google Cloud project ID.

    You can enable Workload Identity on an existing Standard cluster by using the gcloud CLI or the Cloud Console. Existing node pools are unaffected, but any new node pools in the cluster use Workload Identity. To enable Workload Identity on an existing cluster, run the following command:

    gcloud container clusters update CLUSTER_NAME \\\n    --region=COMPUTE_REGION \\\n    --workload-pool=PROJECT_ID.svc.id.goog\n
    Replace the following: - CLUSTER_NAME: the name of your existing cluster. - COMPUTE_REGION: the Compute Engine region of your cluster. For zonal clusters, use --zone=COMPUTE_ZONE. - PROJECT_ID: your Google Cloud project ID.

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#step-2-configure-litmuschaos-to-use-workload-identity","title":"STEP 2: Configure LitmusChaos to use Workload Identity","text":"

    Assuming that you already have LitmusChaos installed in your GKE cluster as well as the Kubernetes service account you want to use for your GCP experiments, execute the following steps.

    1. Get Credentials for your cluster.

      gcloud container clusters get-credentials CLUSTER_NAME\n
      Replace CLUSTER_NAME with the name of your cluster that has Workload Identity enabled.

    2. Create an IAM service account for your application or use an existing IAM service account instead. You can use any IAM service account in any project in your organization. For Config Connector, apply the IAMServiceAccount object for your selected service account. To create a new IAM service account using the gcloud CLI, run the following command:

      gcloud iam service-accounts create GSA_NAME \\\n    --project=GSA_PROJECT\n
      Replace the following:

    3. GSA_NAME: the name of the new IAM service account.
    4. GSA_PROJECT: the project ID of the Google Cloud project for your IAM service account.

    5. Please ensure that this service account has all the requisite roles for interacting with the Compute Engine resources, including VM Instances and Persistent Disks, according to the GCP experiments that you're willing to run. You can grant additional roles using the following command:

      gcloud projects add-iam-policy-binding PROJECT_ID \\\n    --member \"serviceAccount:GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com\" \\\n    --role \"ROLE_NAME\"\n
      Replace the following:

    6. PROJECT_ID: your Google Cloud project ID.
    7. GSA_NAME: the name of your IAM service account.
    8. GSA_PROJECT: the project ID of the Google Cloud project of your IAM service account.
    9. ROLE_NAME: the IAM role to assign to your service account, like roles/spanner.viewer.

    10. Allow the Kubernetes service account to be used for the GCP experiments to impersonate the GCP IAM service account by adding an IAM policy binding between the two service accounts. This binding allows the Kubernetes service account to act as the IAM service account.

      gcloud iam service-accounts add-iam-policy-binding GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com \\\n    --role roles/iam.workloadIdentityUser \\\n    --member \"serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]\"\n
      Replace the following:

    11. GSA_NAME: the name of your IAM service account.
    12. GSA_PROJECT: the project ID of the Google Cloud project of your IAM service account.
    13. KSA_NAME: the name of the service account to be used for LitmusChaos GCP experiments.
    14. NAMESPACE: the namespace in which the Kubernetes service account to be used for LitmusChaos GCP experiments is present.

    15. Annotate the Kubernetes service account to be used for LitmusChaos GCP experiments with the email address of the GCP IAM service account.

      kubectl annotate serviceaccount KSA_NAME \\\n    --namespace NAMESPACE \\\n    iam.gke.io/gcp-service-account=GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com\n
      Replace the following:

    16. KSA_NAME: the name of the service account to be used for LitmusChaos GCP experiments.
    17. NAMESPACE: the namespace in which the Kubernetes service account to be used for LitmusChaos GCP experiments is present.
    18. GSA_NAME: the name of your IAM service account.
    19. GSA_PROJECT: the project ID of the Google Cloud project of your IAM service account.
    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#step-3-update-chaosengine-manifest","title":"STEP 3: Update ChaosEngine Manifest","text":"

    Add the following value to the ChaosEngine manifest field .spec.experiments[].spec.components.nodeSelector to schedule the experiment pod on nodes that use Workload Identity.

    iam.gke.io/gke-metadata-server-enabled: \"true\"\n
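    A sketch of where this lands in the ChaosEngine manifest; the experiment name gcp-vm-instance-stop and its service account are used only as illustrations:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  chaosServiceAccount: gcp-vm-instance-stop-sa\n  experiments:\n  - name: gcp-vm-instance-stop\n    spec:\n      components:\n        # schedule the experiment pod on nodes that use Workload Identity\n        nodeSelector:\n          iam.gke.io/gke-metadata-server-enabled: \"true\"\n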

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#step-4-update-chaosexperiment-manifest","title":"STEP 4: Update ChaosExperiment Manifest","text":"

    Remove cloud-secret at .spec.definition.secrets in the ChaosExperiment manifest as we are not using a secret to provide our GCP Service Account credentials.
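    The block to remove typically looks like the following, mirroring the AWS section above; the mount path may differ per experiment:

    secrets:\n  - name: cloud-secret\n    mountPath: /tmp/\n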

    Now you can run your GCP experiments with a keyless authentication provided by GCP using Workload Identity.

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#how-to-disable-iam-service-accounts-from-accessing-gcp-resources","title":"How to disable IAM service accounts from accessing GCP resources?","text":"

    To stop using Workload Identity, revoke access to the GCP IAM service account and disable Workload Identity on the cluster.

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#step-1-revoke-access-to-the-iam-service-account","title":"STEP 1: Revoke access to the IAM service account","text":"
    1. To revoke access to the GCP IAM service account, use the following command:
      gcloud iam service-accounts remove-iam-policy-binding GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com \\\n    --role roles/iam.workloadIdentityUser \\\n    --member \"serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]\"\n
      Replace the following:
    2. PROJECT_ID: the project ID of the GKE cluster.
    3. NAMESPACE: the namespace in which the Kubernetes service account to be used for LitmusChaos GCP experiments is present.
    4. KSA_NAME: the name of the service account to be used for LitmusChaos GCP experiments.
    5. GSA_NAME: the name of the IAM service account.
    6. GSA_PROJECT: the project ID of the IAM service account.

    It can take up to 30 minutes for cached tokens to expire.

    1. Remove the annotation from the service account being used for LitmusChaos GCP experiments:
      kubectl annotate serviceaccount KSA_NAME \\\n    --namespace NAMESPACE iam.gke.io/gcp-service-account-\n
    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#step-2-disable-workload-identity","title":"STEP 2: Disable Workload Identity","text":"
    1. Disable Workload Identity on each node pool:

      gcloud container node-pools update NODEPOOL_NAME \\\n    --cluster=CLUSTER_NAME \\\n    --workload-metadata=GCE_METADATA\n
      Repeat this command for every node pool in the cluster.

    2. Disable Workload Identity in the cluster:

      gcloud container clusters update CLUSTER_NAME \\\n    --disable-workload-identity\n

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#troubleshooting-guide","title":"Troubleshooting Guide","text":"

    Refer to the GCP documentation on troubleshooting Workload Identity here.

    "},{"location":"experiments/concepts/chaos-resources/contents/","title":"Chaos Resources","text":"

    At the heart of the Litmus Platform are the chaos custom resources. This section consists of the specification (details of each field within the .spec & .status of the resources) as well as standard examples for tuning the supported parameters.

    Chaos Resource Name Description User Guide ChaosEngine Contains the ChaosEngine specifications user-guide ChaosEngine ChaosExperiment Contains the ChaosExperiment specifications user-guide ChaosExperiment ChaosResult Contains the ChaosResult specifications user-guide ChaosResult ChaosScheduler Contains the ChaosScheduler specifications user-guide ChaosScheduler Probes Contains the Probes specifications user-guide Probes"},{"location":"experiments/concepts/chaos-resources/chaos-engine/application-details/","title":"Application Specifications","text":"

    It contains the AUT and auxiliary application details provided at spec.appinfo and spec.auxiliaryAppInfo respectively inside the chaosengine.

    View the application specification schema

    Field .spec.appinfo.appns Description Flag to specify namespace of application under test Type Optional Range user-defined (type: string) Default n/a Notes The appns in the spec specifies the namespace of the AUT. Usually provided as a quoted string. It is optional for the infra chaos.

    Field .spec.appinfo.applabel Description Flag to specify unique label of application under test Type Optional Range user-defined (type: string)(pattern: \"label_key=label_value\") Default n/a Notes The applabel in the spec specifies a unique label of the AUT. Usually provided as a quoted string of pattern key=value. Note that if multiple applications share the same label within a given namespace, the AUT is filtered based on the presence of the chaos annotation litmuschaos.io/chaos: \"true\". If, however, the annotationCheck is disabled, then a random application (pod) sharing the specified label is selected for chaos. It is optional for the infra chaos.

    Field .spec.appinfo.appkind Description Flag to specify resource kind of application under test Type Optional Range deployment, statefulset, daemonset, deploymentconfig, rollout Default n/a (depends on app type) Notes The appkind in the spec specifies the Kubernetes resource type of the app deployment. The Litmus ChaosOperator supports chaos on deployments, statefulsets and daemonsets. Application health check routines are dependent on the resource types, in case of some experiments. It is optional for the infra chaos

    Field .spec.auxiliaryAppInfo Description Flag to specify one or more app namespace-label pairs whose health is also monitored as part of the chaos experiment, in addition to a primary application specified in the .spec.appInfo. NOTE: If the auxiliary applications are deployed in namespaces other than the AUT, ensure that the chaosServiceAccount is bound to a cluster role and has adequate permissions to list pods on other namespaces. Type Optional Range user-defined (type: string)(pattern: \"namespace:label_key=label_value\"). Default n/a Notes The auxiliaryAppInfo in the spec specifies a (comma-separated) list of namespace-label pairs for downstream (dependent) apps of the primary app specified in .spec.appInfo in case of pod-level chaos experiments. In case of infra-level chaos experiments, this flag specifies those apps that may be directly impacted by chaos and upon which health checks are necessary.

    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/application-details/#application-under-test","title":"Application Under Test","text":"

    It defines the appns, applabel, and appkind to set the namespace, labels, and kind of the application under test.

    • appkind: It supports deployment, statefulset, daemonset, deploymentconfig, and rollout. It is mandatory for the pod-level experiments and optional for the rest of the experiments.

    Use the following example to tune this:

    # contains details of the AUT(application under test)\n# appns: namespace of the application\n# applabel: label of the application\n# appkind: kind of the application. supports: deployment, statefulset, daemonset, rollout, deploymentconfig\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  # AUT details\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/application-details/#auxiliary-application-info","title":"Auxiliary Application Info","text":"

    It contains a (comma-separated) list of namespace-label pairs for downstream (dependent) apps of the primary app specified in .spec.appInfo in the case of pod-level chaos experiments. In the case of infra-level chaos experiments, this flag specifies those apps that may be directly impacted by chaos and upon which health checks are necessary. It can be tuned via the auxiliaryAppInfo field. It supports input in the below format:

    • auxiliaryAppInfo: <namespace1>:<key1=value1>,<namespace2>:<key2=value2>

    Note: Auxiliary application check is only supported for node-level experiments.

    Use the following example to tune this:

    # contains the comma separated list of auxiliary application details\n# it is provided in `<namespace1>:<key1=value1>,<namespace2>:<key2=value2>` format\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  # provide the comma separated auxiliary applications details\n  auxiliaryAppInfo: \"nginx:app=nginx,default:app=busybox\"\n  chaosServiceAccount: node-drain-sa\n  experiments:\n  - name: node-drain\n    spec:\n      components:\n        env:\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/contents/","title":"Chaos Engine Specifications","text":"

    Bind an instance of a given app with one or more chaos experiments, define run characteristics, override chaos defaults, define steady-state hypothesis, reconciled by Litmus Chaos Operator.

    This section describes the fields in the ChaosEngine spec and the possible values that can be set against the same.

    Field Name Description User Guide State Specification It defines the state of the chaosengine State Specifications Application Specification It defines the details of AUT and auxiliary applications Application Specifications RBAC Specification It defines the chaos-service-account name RBAC Specifications Runtime Specification It defines the runtime details of the chaosengine Runtime Specifications Runner Specification It defines the runner pod specifications Runner Specifications Experiment Specification It defines the experiment pod specifications Experiment Specifications"},{"location":"experiments/concepts/chaos-resources/chaos-engine/engine-state/","title":"State Specifications","text":"

    It is a user-defined flag to trigger chaos. Setting it to active ensures the successful execution of chaos. Patching it with stop aborts ongoing experiments. It has a corresponding flag in the chaosengine status field, called engineStatus which is updated by the controller based on the actual state of the ChaosEngine. It can be tuned via engineState field. It supports active and stop values.
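    For example, an ongoing run can be aborted by patching the engineState of the chaosengine; the resource name engine-nginx and namespace default are assumptions taken from the examples in this section:

    # abort an ongoing chaos run by setting engineState to stop\nkubectl patch chaosengine engine-nginx -n default --type merge --patch '{\"spec\":{\"engineState\":\"stop\"}}'\n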

    View the state specification schema

    Field .spec.engineState Description Flag to control the state of the chaosengine Type Mandatory Range active, stop Default active Notes The engineState in the spec is a user defined flag to trigger chaos. Setting it to active ensures successful execution of chaos. Patching it with stop aborts ongoing experiments. It has a corresponding flag in the chaosengine status field, called engineStatus which is updated by the controller based on actual state of the ChaosEngine.

    Use the following example to tune this:

    # contains the chaosengine state\n# supports: active and stop states\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  # contains the state of engine\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/","title":"Experiment Specifications","text":"

    It contains all the experiment tunables provided at .spec.experiments[].spec.components inside chaosengine.

    View the experiment specification schema

    Field .spec.experiments[].spec.components.configMaps Description Configmaps passed to the chaos experiment Type Optional Range user-defined (type: {name: string, mountPath: string}) Default n/a Notes The experiment[].spec.components.configMaps provides for a means to insert config information into the experiment. The configmaps definition is validated for correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.

    Field .spec.experiments[].spec.components.secrets Description Kubernetes secrets passed to the chaos experiment Type Optional Range user-defined (type: {name: string, mountPath: string}) Default n/a Notes The experiment[].spec.components.secrets provides for a means to push secrets (typically project ids, access credentials etc.,) into the experiment pods. These are especially useful in case of platform-level/infra-level chaos experiments. The secrets definition is validated for correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.

    Field .spec.experiments[].spec.components.experimentImage Description Override the image of the chaos experiment Type Optional Range string Default n/a Notes The experiment[].spec.components.experimentImage overrides the experiment image for the chaosexperiment.

    Field .spec.experiments[].spec.components.experimentImagePullSecrets Description Flag to specify imagePullSecrets for the ChaosExperiment Type Optional Range user-defined (type: []corev1.LocalObjectReference) Default n/a Notes The experiment[].spec.components.experimentImagePullSecrets allows developers to specify the imagePullSecret name for the ChaosExperiment.

    Field .spec.experiments[].spec.components.nodeSelector Description Provide the node selector for the experiment pod Type Optional Range Labels in the form of label key=value Default n/a Notes The experiment[].spec.components.nodeSelector contains labels of the node on which the experiment pod should be scheduled. Typically used in case of infra/node level chaos.

    Field .spec.experiments[].spec.components.statusCheckTimeouts Description Provides the timeout and retry values for the status checks. Defaults to 180s & 90 retries (2s per retry) Type Optional Range It contains values in the form {delay: int, timeout: int} Default delay: 2s and timeout: 180s Notes The experiment[].spec.components.statusCheckTimeouts The statusCheckTimeouts override the status timeouts inside chaosexperiments. It contains timeout & delay in seconds.

    Field .spec.experiments[].spec.components.resources Description Specify the resource requirements for the ChaosExperiment pod Type Optional Range user-defined (type: corev1.ResourceRequirements) Default n/a Notes The experiment[].spec.components.resources contains the resource requirements for the ChaosExperiment Pod, where we can provide resource requests and limits for the pod.

    Field .spec.experiments[].spec.components.experimentAnnotations Description Annotations that need to be provided in the pod which will be created (experiment-pod) Type Optional Range user-defined (type: label key=value) Default n/a Notes The .spec.components.experimentAnnotations allows developers to specify the custom annotations for the experiment pod.

    Field .spec.experiments[].spec.components.tolerations Description Toleration for the experiment pod Type Optional Range user-defined (type: []corev1.Toleration) Default n/a Notes The .spec.components.tolerations specifies tolerations for the experiment pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node level chaos.

    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-annotations","title":"Experiment Annotations","text":"

    It allows developers to specify the custom annotations for the experiment pod. It can be tuned via experimentAnnotations field.

    Use the following example to tune this:

    # contains annotations for the experiment pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # annotations for the experiment pod\n        experimentAnnotations:\n          name: chaos-experiment\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-configmaps-and-secrets","title":"Experiment Configmaps And Secrets","text":"

    It defines the configMaps and secrets to set the configmaps and secrets mounted to the experiment pod respectively.

    • configMaps: It provides for a means to insert config information into the experiment. The configmaps definition is validated for the correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.
    • secrets: It provides for a means to push secrets (typically project ids, access credentials, etc.,) into the experiment pods. These are especially useful in the case of platform-level/infra-level chaos experiments. The secrets definition is validated for the correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.

    Use the following example to tune this:

    # contains configmaps and secrets for the experiment pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # configmaps details mounted to the experiment pod\n        configMaps:\n        - name: \"configmap-01\"\n          mountPath: \"/mnt\"\n        # secrets details mounted to the experiment pod\n        secrets:\n        - name: \"secret-01\"\n          mountPath: \"/tmp\"\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-image","title":"Experiment Image","text":"

    It overrides the experiment image for the chaosexperiment. It allows developers to specify the experiment image. It can be tuned via experimentImage field.

    Use the following example to tune this:

    # contains the custom image for the experiment pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # override the image of the experiment pod\n        experimentImage: \"litmuschaos/go-runner:ci\"\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-imagepullsecrets","title":"Experiment ImagePullSecrets","text":"

    It allows developers to specify the imagePullSecret name for ChaosExperiment. It can be tuned via experimentImagePullSecrets field.

    Use the following example to tune this:

    # contains the imagePullSecrets for the experiment pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # secret name for the experiment image, if using private registry\n        experimentImagePullSecrets:\n        - name: regcred\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-nodeselectors","title":"Experiment NodeSelectors","text":"

    The nodeSelector contains labels of the node on which the experiment pod should be scheduled. Typically used in case of infra/node level chaos. It can be tuned via the nodeSelector field.

    Use the following example to tune this:

    # contains the node-selector for the experiment pod\n# it will schedule the experiment pod on the corresponding node with matching labels\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # nodeselector for the experiment pod\n        nodeSelector:\n          context: chaos\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-resource-requirements","title":"Experiment Resource Requirements","text":"

    It contains the resource requirements for the ChaosExperiment Pod, where we can provide resource requests and limits for the pod. It can be tuned via resources field.

    Use the following example to tune this:

    # contains the resource requirements for the experiment pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # resource requirements for the experiment pod\n        resources:\n          requests:\n            cpu: \"250m\"\n            memory: \"64Mi\"\n          limits:\n            cpu: \"500m\"\n            memory: \"128Mi\"\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-tolerations","title":"Experiment Tolerations","text":"

    It provides tolerations for the experiment pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node level chaos. It can be tuned via tolerations field.

    Use the following example to tune this:

    # contains the tolerations for the experiment pod\n# it will schedule the experiment pod on the tainted node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # tolerations for the experiment pod\n        tolerations:\n        - key: \"key1\"\n          operator: \"Equal\"\n          value: \"value1\"\n          effect: \"NoSchedule\"\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-status-check-timeout","title":"Experiment Status Check Timeout","text":"

    It overrides the status timeouts inside chaosexperiments. It contains timeout & delay in seconds. It can be tuned via statusCheckTimeouts field.

    Use the following example to tune this:

    # contains status check timeout for the experiment pod\n# it will set this timeout as upper bound while checking application status, node status in experiments\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # status check timeout for the experiment pod\n        statusCheckTimeouts:\n          delay: 2\n          timeout: 180\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/rbac-details/","title":"RBAC Specifications","text":"

    It specifies the name of the serviceaccount mapped to a role/clusterRole with enough permissions to execute the desired chaos experiment. The minimum permissions needed for any given experiment are provided in the .spec.definition.permissions field of the respective chaosexperiment CR. It can be tuned via chaosServiceAccount field.

    View the RBAC specification schema

Field .spec.chaosServiceAccount Description Flag to specify serviceaccount used for chaos experiment Type Mandatory Range user-defined (type: string) Default n/a Notes The chaosServiceAccount in the spec specifies the name of the serviceaccount mapped to a role/clusterRole with enough permissions to execute the desired chaos experiment. The minimum permissions needed for any given experiment are provided in the .spec.definition.permissions field of the respective chaosexperiment CR.

    Use the following example to tune this:

    # contains name of the serviceAccount which contains all the RBAC permissions required for the experiment\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  # name of the service account w/ sufficient permissions\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
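For reference, below is a minimal sketch of the RBAC manifests that a serviceaccount such as pod-delete-sa may map to. The rules shown are illustrative assumptions; the authoritative resource/verb list must be taken from the .spec.definition.permissions of the chosen chaosexperiment CR:

# illustrative RBAC for a namespaced pod-delete setup (rules are an assumption; derive them from the experiment CR)\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-delete-sa\n  namespace: default\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-delete-sa\n  namespace: default\nrules:\n- apiGroups: [\"\", \"apps\", \"batch\", \"litmuschaos.io\"]\n  resources: [\"pods\", \"pods/log\", \"pods/exec\", \"deployments\", \"jobs\", \"events\", \"chaosengines\", \"chaosexperiments\", \"chaosresults\"]\n  verbs: [\"create\", \"list\", \"get\", \"patch\", \"update\", \"delete\", \"deletecollection\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-delete-sa\n  namespace: default\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-delete-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-delete-sa\n  namespace: default\n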
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/","title":"Runner Specifications","text":"

    It contains all the chaos-runner tunables provided at .spec.components.runner inside chaosengine.

    View the runner specification schema

    Field .spec.components.runner.image Description Flag to specify image of ChaosRunner pod Type Optional Range user-defined (type: string) Default n/a (refer Notes) Notes The .components.runner.image allows developers to specify their own debug runner images. Defaults for the runner image can be enforced via the operator env CHAOS_RUNNER_IMAGE

    Field .spec.components.runner.imagePullPolicy Description Flag to specify imagePullPolicy for the ChaosRunner Type Optional Range Always, IfNotPresent Default IfNotPresent Notes The .components.runner.imagePullPolicy allows developers to specify the pull policy for chaos-runner. Set to Always during debug/test.

    Field .spec.components.runner.imagePullSecrets Description Flag to specify imagePullSecrets for the ChaosRunner Type Optional Range user-defined (type: []corev1.LocalObjectReference) Default n/a Notes The .components.runner.imagePullSecrets allows developers to specify the imagePullSecret name for ChaosRunner.

Field .spec.components.runner.runnerAnnotations Description Annotations that need to be provided in the pod which will be created (runner-pod) Type Optional Range user-defined (type: map[string]string) Default n/a Notes The .components.runner.runnerAnnotations allows developers to specify custom annotations for the runner pod.

    Field .spec.components.runner.args Description Specify the args for the ChaosRunner Pod Type Optional Range user-defined (type: []string) Default n/a Notes The .components.runner.args allows developers to specify their own debug runner args.

    Field .spec.components.runner.command Description Specify the commands for the ChaosRunner Pod Type Optional Range user-defined (type: []string) Default n/a Notes The .components.runner.command allows developers to specify their own debug runner commands.

Field .spec.components.runner.configMaps Description Configmaps passed to the chaos runner pod Type Optional Range user-defined (type: {name: string, mountPath: string}) Default n/a Notes The .spec.components.runner.configMaps provides a means to insert config information into the runner pod.

Field .spec.components.runner.secrets Description Kubernetes secrets passed to the chaos runner pod. Type Optional Range user-defined (type: {name: string, mountPath: string}) Default n/a Notes The .spec.components.runner.secrets provides a means to push secrets (typically project IDs, access credentials, etc.) into the chaos runner pod. These are especially useful in case of platform-level/infra-level chaos experiments.

Field .spec.components.runner.nodeSelector Description Node selectors for the runner pod Type Optional Range Labels in the form of label key=value Default n/a Notes The .spec.components.runner.nodeSelector contains labels of the node on which the runner pod should be scheduled. Typically used in case of infra/node level chaos.

    Field .spec.components.runner.resources Description Specify the resource requirements for the ChaosRunner pod Type Optional Range user-defined (type: corev1.ResourceRequirements) Default n/a Notes The .spec.components.runner.resources contains the resource requirements for the ChaosRunner Pod, where we can provide resource requests and limits for the pod.

Field .spec.components.runner.tolerations Description Toleration for the runner pod Type Optional Range user-defined (type: []corev1.Toleration) Default n/a Notes The .spec.components.runner.tolerations provides tolerations for the runner pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node level chaos.

    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-annotations","title":"ChaosRunner Annotations","text":"

    It allows developers to specify the custom annotations for the runner pod. It can be tuned via runnerAnnotations field.

    Use the following example to tune this:

    # contains annotations for the chaos runner pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n     # annotations for the chaos-runner\n     runnerAnnotations:\n       name: chaos-runner\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-args-and-command","title":"ChaosRunner Args And Command","text":"

The args and command of the chaos-runner can be set via the args and command fields respectively.

    • args: It allows developers to specify their own debug runner args.
    • command: It allows developers to specify their own debug runner commands.

    Use the following example to tune this:

# contains args and command for the chaos runner\n# it will be useful for the cases where a custom image of the chaos-runner is used, which supports args and commands\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    # override the args and command for the chaos-runner\n    runner:\n      # name of the custom image\n      image: \"<your repo>/chaos-runner:ci\"\n      # command for the image\n      command:\n      - \"/bin/sh\"\n      - \"-c\"\n      # args for the image\n      args:\n      - \"<custom-command>\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-configmaps-and-secrets","title":"ChaosRunner Configmaps And Secrets","text":"

The configmaps and secrets mounted to the chaos-runner can be set via the configMaps and secrets fields respectively.

• configMaps: It provides a means to insert config information into the runner pod.
• secrets: It provides a means to push secrets (typically project IDs, access credentials, etc.) into the chaos runner pod. These are especially useful in the case of platform-level/infra-level chaos experiments.

    Use the following example to tune this:

    # contains configmaps and secrets for the chaos-runner\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n     # configmaps details mounted to the runner pod\n     configMaps:\n     - name: \"configmap-01\"\n       mountPath: \"/mnt\"\n     # secrets details mounted to the runner pod\n     secrets:\n     - name: \"secret-01\"\n       mountPath: \"/tmp\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
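The referenced configmap and secret must exist in the application namespace before the run; a minimal sketch of such objects, assuming illustrative names and contents, is shown below:

# illustrative configmap and secret consumed by the chaos-runner (contents are assumptions)\napiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: configmap-01\n  namespace: default\ndata:\n  config.properties: |\n    mode=chaos\n---\napiVersion: v1\nkind: Secret\nmetadata:\n  name: secret-01\n  namespace: default\ntype: Opaque\nstringData:\n  token: <placeholder-token>\n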
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-image-and-imagepullpoicy","title":"ChaosRunner Image and ImagePullPoicy","text":"

The image and imagePullPolicy of the chaos-runner can be set via the image and imagePullPolicy fields respectively.

    • image: It allows developers to specify their own debug runner images. Defaults for the runner image can be enforced via the operator env CHAOS_RUNNER_IMAGE.
    • imagePullPolicy: It allows developers to specify the pull policy for chaos-runner. Set to Always during debug/test.

    Use the following example to tune this:

# contains the image and imagePullPolicy of the chaos-runner\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n      # override the image of the chaos-runner\n      # by default, the image matching the litmus version is used\n      image: \"litmuschaos/chaos-runner:latest\"\n      # imagePullPolicy for the runner image\n      # supports: Always, IfNotPresent. default: IfNotPresent\n      imagePullPolicy: \"Always\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-imagepullsecrets","title":"ChaosRunner ImagePullSecrets","text":"

    It allows developers to specify the imagePullSecret name for the ChaosRunner. It can be tuned via imagePullSecrets field.

    Use the following example to tune this:

    # contains the imagePullSecrets for the chaos-runner\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n      # secret name for the runner image, if using private registry\n      imagePullSecrets:\n      - name: regcred\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
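The referenced secret must already exist in the application namespace; for a private registry it is typically created via kubectl create secret docker-registry regcred --docker-server=<registry> --docker-username=<user> --docker-password=<password>.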
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-nodeselectors","title":"ChaosRunner NodeSelectors","text":"

The nodeSelector contains labels of the node on which the runner pod should be scheduled. Typically used in case of infra/node level chaos. It can be tuned via nodeSelector field.

    Use the following example to tune this:

# contains the node-selector for the chaos-runner\n# it will schedule the chaos-runner on the corresponding node with matching labels\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n      # nodeselector for the runner pod\n      nodeSelector:\n        context: chaos\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-resource-requirements","title":"ChaosRunner Resource Requirements","text":"

    It contains the resource requirements for the ChaosRunner Pod, where we can provide resource requests and limits for the pod. It can be tuned via resources field.

    Use the following example to tune this:

# contains the resource requirements for the runner pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n      # resource requirements for the runner pod\n      resources:\n        requests:\n          cpu: \"250m\"\n          memory: \"64Mi\"\n        limits:\n          cpu: \"500m\"\n          memory: \"128Mi\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-tolerations","title":"ChaosRunner Tolerations","text":"

    It provides tolerations for the runner pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node level chaos. It can be tuned via tolerations field.

    Use the following example to tune this:

# contains the tolerations for the chaos-runner\n# it will schedule the chaos-runner on the tainted node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n      # tolerations for the runner pod\n      tolerations:\n      - key: \"key1\"\n        operator: \"Equal\"\n        value: \"value1\"\n        effect: \"NoSchedule\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runtime-details/","title":"Runtime Specifications","text":"

    It contains runtime details of the chaos experiments provided at .spec inside chaosengine.

    View the runtime specification schema

Field .spec.annotationCheck Description Flag to control annotationChecks on applications as prerequisites for chaos Type Optional Range true, false Default false Notes The annotationCheck in the spec controls whether or not the operator checks for the annotation \"litmuschaos.io/chaos\" to be set against the application under test (AUT). Setting it to true ensures the check is performed, with chaos being skipped if the app is not annotated, while setting it to false suppresses this check and proceeds with chaos injection.

Field .spec.terminationGracePeriodSeconds Description Flag to control terminationGracePeriodSeconds for the chaos pods (abort case) Type Optional Range integer value Default 30 Notes The terminationGracePeriodSeconds in the spec controls the terminationGracePeriodSeconds for the chaos resources in the abort case. Chaos pods contain chaos-revert-upon-abort steps, which continuously look for termination signals. The terminationGracePeriodSeconds should be set such that the chaos pods get enough time to revert before being completely terminated.

Field .spec.jobCleanUpPolicy Description Flag to control cleanup of chaos experiment job post execution of chaos Type Optional Range delete, retain Default retain Notes The jobCleanUpPolicy controls whether or not the experiment pods are removed once execution completes. Set to retain for debug purposes (in the absence of standard logging mechanisms).

    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runtime-details/#annotation-check","title":"Annotation Check","text":"

It controls whether or not the operator checks for the annotation litmuschaos.io/chaos to be set against the application under test (AUT). Setting it to true ensures the check is performed, with chaos being skipped if the app is not annotated, while setting it to false suppresses this check and proceeds with chaos injection. It can be tuned via annotationCheck field. It supports boolean values and the default value is false.

    Use the following example to tune this:

# checks the AUT for the annotations. The AUT should be annotated with `litmuschaos.io/chaos: true` if provided as true\n# supports: true, false. default: false\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  # annotationCheck details\n  annotationCheck: \"true\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
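The annotation itself is typically applied on the AUT via kubectl annotate deploy/nginx litmuschaos.io/chaos=true -n default.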
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runtime-details/#jobcleanup-policy","title":"Jobcleanup Policy","text":"

It controls whether or not the experiment pods are removed once execution completes. Set to retain for debug purposes (in the absence of standard logging mechanisms). It can be tuned via jobCleanUpPolicy field. It supports retain and delete. The default value is retain.

    Use the following example to tune this:

    # flag to delete or retain the chaos resources after completions of chaosengine\n# supports: delete, retain. default: retain\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  jobCleanUpPolicy: \"delete\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runtime-details/#termination-grace-period-seconds","title":"Termination Grace Period Seconds","text":"

It controls the terminationGracePeriodSeconds for the chaos resources in the abort case. Chaos pods contain chaos-revert-upon-abort steps, which continuously look for termination signals. The terminationGracePeriodSeconds should be set such that the chaos pods get enough time to revert before being completely terminated. It can be tuned via terminationGracePeriodSeconds field.

    Use the following example to tune this:

    # contains flag to control the terminationGracePeriodSeconds for the chaos pod(abort case)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  # contains terminationGracePeriodSeconds for the chaos pods\n  terminationGracePeriodSeconds: 100\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
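As a usage note, an abort is typically triggered by patching the engine state to stop, e.g. kubectl patch chaosengine engine-nginx -n default --type merge --patch '{\"spec\":{\"engineState\":\"stop\"}}'; the grace period above bounds how long the revert logic gets before the chaos pods are force-terminated.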
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/component-specification/","title":"Component Specification","text":"

It contains component details provided at spec.definition inside chaosexperiment.

    View the component specification schema

    Field .spec.definition.image Description Flag to specify the image to run the ChaosExperiment Type Mandatory Range user-defined (type: string) Default n/a (refer Notes) Notes The .spec.definition.image allows the developers to specify their experiment images. Typically set to the Litmus go-runner or the ansible-runner. This feature of the experiment enables BYOC (BringYourOwnChaos), where developers can implement their own variants of a standard chaos experiment

    Field .spec.definition.imagePullPolicy Description Flag that helps the developers to specify imagePullPolicy for the ChaosExperiment Type Mandatory Range IfNotPresent, Always (type: string) Default Always Notes The .spec.definition.imagePullPolicy allows developers to specify the pull policy for ChaosExperiment image. Set to Always during debug/test

Field .spec.definition.args Description Flag to specify the entrypoint for the ChaosExperiment Type Mandatory Range user-defined (type: list of string) Default n/a Notes The .spec.definition.args specifies the entrypoint for the ChaosExperiment. It depends on the language used in the experiment. For litmus-go, the .spec.definition.args points to a single binary containing all experiments, managed via the -name flag to indicate the experiment to run (-name <exp-name>).

Field .spec.definition.command Description Flag to specify the shell on which the ChaosExperiment will execute Type Mandatory Range user-defined (type: list of string) Default /bin/bash Notes The .spec.definition.command specifies the shell used to run the experiment; /bin/bash is the most commonly used shell.

    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/component-specification/#image","title":"Image","text":"

    It allows the developers to specify their experiment images. Typically set to the Litmus go-runner or the ansible-runner. This feature of the experiment enables BYOC (BringYourOwnChaos), where developers can implement their own variants of a standard chaos experiment. It can be tuned via image field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    # image of the chaosexperiment\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/component-specification/#imagepullpolicy","title":"ImagePullPolicy","text":"

    It allows developers to specify the pull policy for ChaosExperiment image. Set to Always during debug/test. It can be tuned via imagePullPolicy field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    # imagePullPolicy of the chaosexperiment\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/component-specification/#args","title":"Args","text":"

It specifies the entrypoint for the ChaosExperiment. It depends on the language used in the experiment. For litmus-go, the .spec.definition.args points to a single binary containing all experiments, managed via the -name flag to indicate the experiment to run (-name <exp-name>). It can be tuned via args field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    # it contains args of the experiment\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/component-specification/#command","title":"Command","text":"

It specifies the shell used to run the experiment; /bin/bash is the most commonly used shell. It can be tuned via command field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    # it contains command of the experiment\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/","title":"Configuration Specification","text":"

It contains configuration details provided at spec.definition inside chaosexperiment.

    View the configuration specification schema

    Field .spec.definition.labels Description Flag to specify the label for the ChaosPod Type Optional Range user-defined (type:map[string]string) Default n/a Notes The .spec.definition.labels allow developers to specify the ChaosPod label for an experiment.

    Field .spec.definition.securityContext.podSecurityContext Description Flag to specify security context for ChaosPod Type Optional Range user-defined (type:corev1.PodSecurityContext) Default n/a Notes The .spec.definition.securityContext.podSecurityContext allows the developers to specify the security context for the ChaosPod which applies to all containers inside the Pod.

Field .spec.definition.securityContext.containerSecurityContext.privileged Description Flag to specify the security context for the ChaosExperiment pod Type Optional Range true, false (type:bool) Default n/a Notes The .spec.definition.securityContext.containerSecurityContext.privileged specifies the privileged securityContext param for the experiment container.

    Field .spec.definition.configMaps Description Flag to specify the configmap for ChaosPod Type Optional Range user-defined Default n/a Notes The .spec.definition.configMaps allows the developers to mount the ConfigMap volume into the experiment pod.

Field .spec.definition.secrets Description Flag to specify the secrets for ChaosPod Type Optional Range user-defined Default n/a Notes The .spec.definition.secrets specifies the secret data to be passed to the ChaosPod. The secrets typically contain confidential information like credentials.

Field .spec.definition.experimentAnnotations Description Flag to specify custom annotations to the ChaosPod Type Optional Range user-defined (type:map[string]string) Default n/a Notes The .spec.definition.experimentAnnotations allows the developer to specify custom annotations for the chaos pod.

    Field .spec.definition.hostFileVolumes Description Flag to specify the host file volumes to the ChaosPod Type Optional Range user-defined (type:map[string]string) Default n/a Notes The .spec.definition.hostFileVolumes allows the developer to specify the host file volumes to the ChaosPod.

    Field .spec.definition.hostPID Description Flag to specify the host PID for the ChaosPod Type Optional Range true, false (type:bool) Default n/a Notes The .spec.definition.hostPID allows the developer to specify the host PID for the ChaosPod.

    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#labels","title":"Labels","text":"

    It allows developers to specify the ChaosPod label for an experiment. It can be tuned via labels field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains experiment labels\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#podsecuritycontext","title":"PodSecurityContext","text":"

    It allows the developers to specify the security context for the ChaosPod which applies to all containers inside the Pod. It can be tuned via podSecurityContext field.

    Use the following example to tune this:

apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'\n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains pod security context; fields must be valid corev1.PodSecurityContext fields\n    securityContext:\n      podSecurityContext:\n        runAsUser: 1000\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#container-security-context","title":"Container Security Context","text":"

    It allows the developers to specify the security context for the container inside ChaosPod. It can be tuned via containerSecurityContext field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains container security context\n    securityContext:\n      containerSecurityContext:\n        privileged: true\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#configmaps","title":"ConfigMaps","text":"

It allows the developers to mount the ConfigMap volume into the experiment pod. It can be tuned via configMaps field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains configmaps details\n    configMaps:\n      - name: experiment-data\n        mountPath: \"/mnt\"\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#secrets","title":"Secrets","text":"

It specifies the secret data to be passed to the ChaosPod. The secrets typically contain confidential information like credentials. It can be tuned via secrets field.

    Use the following example to tune this:

apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'\n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains secret details\n    secrets:\n      - name: auth-credentials\n        mountPath: \"/tmp\"\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#experiment-annotations","title":"Experiment Annotations","text":"

It allows the developer to specify custom annotations for the chaos pod. It can be tuned via experimentAnnotations field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains experiment annotations\n    experimentAnnotations:\n      context: chaos\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#host-file-volumes","title":"Host File Volumes","text":"

    It allows the developer to specify the host file volumes to the ChaosPod. It can be tuned via hostFileVolumes field.

    Use the following example to tune this:

apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'\n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains host file volumes\n    hostFileVolumes:\n      - name: socket-file\n        mountPath: \"/run/containerd/containerd.sock\"\n        nodePath: \"/run/containerd/containerd.sock\"\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#host-pid","title":"Host PID","text":"

    It allows the developer to specify the host PID for the ChaosPod. It can be tuned via hostPID field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it allows hostPID\n    hostPID: true\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/contents/","title":"Chaos Experiment Specifications","text":"

Granular definition of chaos intent, specified via image, library, necessary permissions, and low-level chaos parameters (default values).

    This section describes the fields in the ChaosExperiment and the possible values that can be set against the same.

    Field Name Description User Guide Scope Specification It defines scope of the chaosexperiment Scope Specifications Component Specification It defines component details of the chaosexperiment Component Specifications Experiment Tunables Specification It defines tunables of the chaosexperiment Experiment Tunables Specification Configuration Specification It defines configuration details of the chaosexperiment Configuration Specification"},{"location":"experiments/concepts/chaos-resources/chaos-experiment/experiment-tunable-specification/","title":"Experiment Tunables Specification","text":"

It contains the array of tunables passed to the experiment pods as environment variables, used to manage the experiment execution. Default values for all the tunables can be set here and overridden by the ChaosEngine from .spec.experiments[].spec.components.env if required. To know which variables need to be overridden, check the list of \"mandatory\" & \"optional\" env for an experiment as provided within the respective experiment documentation. It can be provided at spec.definition.env inside chaosexperiment.

    View the experiment tunables specification

Field .spec.definition.env Description Flag to specify env used for ChaosExperiment Type Mandatory Range user-defined (type: {name: string, value: string}) Default n/a Notes The .spec.definition.env specifies the array of tunables passed to the experiment pods as environment variables, used to manage the experiment execution. Default values for all the tunables can be set here and overridden by the ChaosEngine from .spec.experiments[].spec.components.env if required. To know which variables need to be overridden, check the list of \"mandatory\" & \"optional\" env for an experiment as provided within the respective experiment documentation.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    # permissions for the chaosexperiment\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    # it contains experiment tunables\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
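These defaults can then be overridden per run from the ChaosEngine; a minimal sketch, assuming the pod-delete experiment and illustrative values, is shown below:

# overrides the experiment defaults from the chaosengine\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # overrides the defaults set at .spec.definition.env of the chaosexperiment\n        env:\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: CHAOS_INTERVAL\n          value: '10'\n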
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/scope-specification/","title":"Scope Specification","text":"

It contains the scope and permissions details, provided at spec.definition.scope and spec.definition.permissions respectively, inside the chaosexperiment.

    View the scope specification schema

Field .spec.definition.scope Description Flag to specify the scope of the ChaosExperiment Type Optional Range Namespaced, Cluster Default n/a (depends on experiment type) Notes The .spec.definition.scope specifies the scope of the experiment. It can be Namespaced for pod-level experiments and Cluster for experiments having a cluster-wide impact.

Field .spec.definition.permissions Description Flag to specify the minimum permissions to run the ChaosExperiment Type Optional Range user-defined (type: list) Default n/a Notes The .spec.definition.permissions specifies the minimum permissions that are required to run the ChaosExperiment. It also helps to estimate the blast radius of the ChaosExperiment.

    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/scope-specification/#experiment-scope","title":"Experiment Scope","text":"

It specifies the scope of the experiment. It can be Namespaced for pod-level experiments and Cluster for experiments having a cluster-wide impact. It can be tuned via the scope field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    # scope of the chaosexperiment\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/scope-specification/#experiment-permissions","title":"Experiment Permissions","text":"

It specifies the minimum permissions that are required to run the ChaosExperiment. It also helps to estimate the blast radius of the ChaosExperiment. It can be tuned via the permissions field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    # permissions for the chaosexperiment\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    ## it defines the sequence of chaos execution for multiple target pods\n    ## supported values: serial, parallel\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
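
To sanity-check that a service account bound to these permissions can perform a given action, kubectl's built-in authorization check can be used; a sketch, assuming the illustrative pod-delete-sa service account in the default namespace:

kubectl auth can-i delete pods --as=system:serviceaccount:default:pod-delete-sa -n default\n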
    "},{"location":"experiments/concepts/chaos-resources/chaos-result/contents/","title":"Chaos Result Specifications","text":"

Holds the engine reference, experiment state, verdict (on completion), salient application/result attributes, and sources for metrics collection

    This section describes the fields in the ChaosResult and the possible values that can be set against the same.

Field Name Description User Guide Spec Specification It defines the spec details of the chaosresult Spec Specification Status Specification It defines the status details of the chaosresult Status Specification Probe Specification It defines the probe details of the chaosresult Probe Specification"},{"location":"experiments/concepts/chaos-resources/chaos-result/probe-specification/","title":"Probe Status","text":"

It contains the probe details provided at status.probeStatus inside the chaosresult. It contains the following fields:

• name: Flag to show the name of the probe used in the experiment
• type: Flag to show the type of the probe used
• status.continuous: Flag to show the result of the probe in continuous mode
• status.prechaos: Flag to show the result of the probe in the pre-chaos check
• status.postchaos: Flag to show the result of the probe in the post-chaos check
    View the probe schema

Field .status.probestatus.name Description Flag to show the name of the probe used in the experiment Range n/a (type: string) Notes The .status.probestatus.name shows the name of the probe used in the experiment.

Field .status.probestatus.type Description Flag to show the type of the probe used Range HTTPProbe, K8sProbe, CmdProbe (type: string) Notes The .status.probestatus.type shows the type of the probe used.

Field .status.probestatus.status.continuous Description Flag to show the result of the probe in continuous mode Range Awaited, Passed, Better Luck Next Time (type: string) Notes The .status.probestatus.status.continuous helps to get the result of the probe in the continuous mode. The httpProbe is better used in the Continuous mode.

Field .status.probestatus.status.postchaos Description Flag to show the probe result post chaos Range Awaited, Passed, Better Luck Next Time (type: map[string]string) Notes The .status.probestatus.status.postchaos shows the result of the probe set up in EOT mode, executed at the End of Test as a post-chaos check.

Field .status.probestatus.status.prechaos Description Flag to show the probe result pre chaos Range Awaited, Passed, Better Luck Next Time (type: string) Notes The .status.probestatus.status.prechaos shows the result of the probe set up in SOT mode, executed at the Start of Test as a pre-chaos check.

view the sample probe status:

    Name:         engine-nginx-pod-delete\nNamespace:    default\nLabels:       app.kubernetes.io/component=experiment-job\n              app.kubernetes.io/part-of=litmus\n              app.kubernetes.io/version=1.13.8\n              chaosUID=aa0a0084-f20f-4294-a879-d6df9aba6f9b\n              controller-uid=6943c955-0154-4542-8745-de991eb47c61\n              job-name=pod-delete-w4p5op\n              name=engine-nginx-pod-delete\nAnnotations:  <none>\nAPI Version:  litmuschaos.io/v1alpha1\nKind:         ChaosResult\nMetadata:\n  Creation Timestamp:  2021-09-29T13:28:59Z\n  Generation:          6\n  Resource Version:    66788\n  Self Link:           /apis/litmuschaos.io/v1alpha1/namespaces/default/chaosresults/engine-nginx-pod-delete\n  UID:                 fe7f01c8-8118-4761-8ff9-0a87824d863f\nSpec:\n  Engine:      engine-nginx\n  Experiment:  pod-delete\nStatus:\n  Experiment Status:\n    Fail Step:                 N/A\n    Phase:                     Completed\n    Probe Success Percentage:  100\n    Verdict:                   Pass\n  History:\n    Failed Runs:   1\n    Passed Runs:   1\n    Stopped Runs:  0\n    Targets:\n      Chaos Status:  targeted\n      Kind:          deployment\n      Name:          hello\n  Probe Status:\n    # name of probe\n    Name:  check-frontend-access-url\n    # status of probe\n    Status:\n      Continuous:  Passed \ud83d\udc4d #Continuous\n    # type of probe\n    Type:          HTTPProbe\n    # name of probe\n    Name:          check-app-cluster-cr-status\n    # status of probe\n    Status:\n      Post Chaos:  Passed \ud83d\udc4d #EoT\n    # type of probe\n    Type:          K8sProbe\n    # name of probe\n    Name:          check-database-integrity\n    # status of probe\n    Status:\n      Post Chaos:  Passed \ud83d\udc4d #Edge\n      Pre Chaos:   Passed \ud83d\udc4d \n    # type of probe\n    Type:          CmdProbe\nEvents:              <none>\n
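
The output above is what kubectl describe renders for the result resource; assuming the names from the sample, it can be reproduced with:

kubectl describe chaosresult engine-nginx-pod-delete -n default\n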
    "},{"location":"experiments/concepts/chaos-resources/chaos-result/spec-specification/","title":"Spec Specification","text":"

It contains the spec details provided at spec inside the chaosresult. The names of the chaosengine and chaosexperiment are present at spec.engine and spec.experiment respectively.

    View the spec details schema

    Field .spec.engine Description Flag to hold the ChaosEngine name for the experiment Range n/a (type: string) Notes The .spec.engine holds the engine name for the current course of the experiment.

    Field .spec.experiment Description Flag to hold the ChaosExperiment name which induces chaos. Range n/a (type: string) Notes The .spec.experiment holds the ChaosExperiment name for the current course of the experiment.

    view the sample chaosresult:

    Name:         engine-nginx-pod-delete\nNamespace:    default\nLabels:       app.kubernetes.io/component=experiment-job\n              app.kubernetes.io/part-of=litmus\n              app.kubernetes.io/version=1.13.8\n              chaosUID=aa0a0084-f20f-4294-a879-d6df9aba6f9b\n              controller-uid=6943c955-0154-4542-8745-de991eb47c61\n              job-name=pod-delete-w4p5op\n              name=engine-nginx-pod-delete\nAnnotations:  <none>\nAPI Version:  litmuschaos.io/v1alpha1\nKind:         ChaosResult\nMetadata:\n  Creation Timestamp:  2021-09-29T13:28:59Z\n  Generation:          6\n  Resource Version:    66788\n  Self Link:           /apis/litmuschaos.io/v1alpha1/namespaces/default/chaosresults/engine-nginx-pod-delete\n  UID:                 fe7f01c8-8118-4761-8ff9-0a87824d863f\nSpec:\n  # name of the chaosengine\n  Engine:      engine-nginx\n  # name of the chaosexperiment\n  Experiment:  pod-delete\nStatus:\n  Experiment Status:\n    Fail Step:                 N/A\n    Phase:                     Completed\n    Probe Success Percentage:  100\n    Verdict:                   Pass\n  History:\n    Failed Runs:   1\n    Passed Runs:   1\n    Stopped Runs:  0\n    Targets:\n      Chaos Status:  targeted\n      Kind:          deployment\n      Name:          hello\nEvents:              <none>\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-result/status-specification/","title":"Status Specification","text":"

    It contains status details provided at status inside chaosresult.

    "},{"location":"experiments/concepts/chaos-resources/chaos-result/status-specification/#experiment-status","title":"Experiment Status","text":"

It contains the experiment status provided at status.experimentStatus inside the chaosresult. It contains the following fields:

    • failStep: Flag to show the failure step of the ChaosExperiment
    • phase: Flag to show the current phase of the experiment
    • probesuccesspercentage: Flag to show the probe success percentage
    • verdict: Flag to show the verdict of the experiment
    View the experiment status

Field .status.experimentStatus.failstep Description Flag to show the failure step of the ChaosExperiment Range n/a (type: string) Notes The .status.experimentStatus.failstep shows the step at which the experiment failed. It helps in faster debugging of failures in the experiment execution.

Field .status.experimentStatus.phase Description Flag to show the current phase of the experiment Range Awaited, Running, Completed, Aborted (type: string) Notes The .status.experimentStatus.phase shows the current phase in which the experiment is. It gets updated as the experiment proceeds. If the experiment is aborted then the status will be Aborted.

Field .status.experimentStatus.probesuccesspercentage Description Flag to show the probe success percentage Range 1 to 100 (type: int) Notes The .status.experimentStatus.probesuccesspercentage shows the probe success percentage, which is the ratio of successful checks vs. total probes (e.g., 3 passed checks out of 4 probes yield 75).

Field .status.experimentStatus.verdict Description Flag to show the verdict of the experiment. Range Awaited, Pass, Fail, Stopped (type: string) Notes The .status.experimentStatus.verdict shows the verdict of the experiment. It is Awaited when the experiment is in progress and ends up with Pass or Fail according to the experiment result.

view the sample experiment status:

    Name:         engine-nginx-pod-delete\nNamespace:    default\nLabels:       app.kubernetes.io/component=experiment-job\n              app.kubernetes.io/part-of=litmus\n              app.kubernetes.io/version=1.13.8\n              chaosUID=aa0a0084-f20f-4294-a879-d6df9aba6f9b\n              controller-uid=6943c955-0154-4542-8745-de991eb47c61\n              job-name=pod-delete-w4p5op\n              name=engine-nginx-pod-delete\nAnnotations:  <none>\nAPI Version:  litmuschaos.io/v1alpha1\nKind:         ChaosResult\nMetadata:\n  Creation Timestamp:  2021-09-29T13:28:59Z\n  Generation:          6\n  Resource Version:    66788\n  Self Link:           /apis/litmuschaos.io/v1alpha1/namespaces/default/chaosresults/engine-nginx-pod-delete\n  UID:                 fe7f01c8-8118-4761-8ff9-0a87824d863f\nSpec:\n  Engine:      engine-nginx\n  Experiment:  pod-delete\nStatus:\n  Experiment Status:\n    # step on which experiment fails\n    Fail Step:                 N/A\n    # phase of the chaos result\n    Phase:                     Completed\n    # Success Percentage of the litmus probes\n    Probe Success Percentage:  100\n    # Verdict of the chaos result\n    Verdict:                   Pass\n  History:\n    Failed Runs:   1\n    Passed Runs:   1\n    Stopped Runs:  0\n    Targets:\n      Chaos Status:  targeted\n      Kind:          deployment\n      Name:          hello\nEvents:              <none>\n
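
For scripting around the verdict (e.g., gating a pipeline step), the field can be read directly with jsonpath; a sketch assuming the sample result above:

kubectl get chaosresult engine-nginx-pod-delete -n default -o jsonpath='{.status.experimentStatus.verdict}'\n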
    "},{"location":"experiments/concepts/chaos-resources/chaos-result/status-specification/#result-history","title":"Result History","text":"

It contains the history of experiment runs, present at status.history. It contains the following fields:

• passedRuns: It contains the cumulative passed run count
• failedRuns: It contains the cumulative failed run count
• stoppedRuns: It contains the cumulative stopped run count
• targets.name: It contains the name of the target application
• targets.kind: It contains the kind of the target application
• targets.chaosStatus: It contains the chaos status
    View the history details

Field .status.history.passedRuns Description It contains the cumulative passed run count Range ANY NON-NEGATIVE INTEGER Notes The .status.history.passedRuns contains the cumulative passed run count for a specific ChaosResult.

Field .status.history.failedRuns Description It contains the cumulative failed run count Range ANY NON-NEGATIVE INTEGER Notes The .status.history.failedRuns contains the cumulative failed run count for a specific ChaosResult.

Field .status.history.stoppedRuns Description It contains the cumulative stopped run count Range ANY NON-NEGATIVE INTEGER Notes The .status.history.stoppedRuns contains the cumulative stopped run count for a specific ChaosResult.

Field .status.history.targets.name Description It contains the name of the target application Range string Notes The .status.history.targets.name contains the name of the target application.

Field .status.history.targets.kind Description It contains the kind of the target application Range string Notes The .status.history.targets.kind contains the kind of the target application.

Field .status.history.targets.chaosStatus Description It contains the status of the chaos Range targeted, injected, reverted Notes The .status.history.targets.chaosStatus contains the status of the chaos.

view the sample history:

    Name:         engine-nginx-pod-delete\nNamespace:    default\nLabels:       app.kubernetes.io/component=experiment-job\n              app.kubernetes.io/part-of=litmus\n              app.kubernetes.io/version=1.13.8\n              chaosUID=aa0a0084-f20f-4294-a879-d6df9aba6f9b\n              controller-uid=6943c955-0154-4542-8745-de991eb47c61\n              job-name=pod-delete-w4p5op\n              name=engine-nginx-pod-delete\nAnnotations:  <none>\nAPI Version:  litmuschaos.io/v1alpha1\nKind:         ChaosResult\nMetadata:\n  Creation Timestamp:  2021-09-29T13:28:59Z\n  Generation:          6\n  Resource Version:    66788\n  Self Link:           /apis/litmuschaos.io/v1alpha1/namespaces/default/chaosresults/engine-nginx-pod-delete\n  UID:                 fe7f01c8-8118-4761-8ff9-0a87824d863f\nSpec:\n  Engine:      engine-nginx\n  Experiment:  pod-delete\nStatus:\n  Experiment Status:\n    Fail Step:                 N/A\n    Phase:                     Completed\n    Probe Success Percentage:  100\n    Verdict:                   Pass\n  History:\n    # fail experiment run count\n    Failed Runs:   1\n    # passed experiment run count\n    Passed Runs:   1\n    # stopped experiment run count\n    Stopped Runs:  0\n    Targets:\n      # status of the chaos\n      Chaos Status:  targeted\n      # kind of the application\n      Kind:          deployment\n      # name of the application\n      Name:          hello\nEvents:              <none>\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/contents/","title":"Chaos Scheduler Specifications","text":"

Holds the attributes for repeated execution (run now, once at a specified timestamp, or between start/end timestamps at a given interval). Embeds the ChaosEngine as a template

    This section describes the fields in the ChaosScheduler and the possible values that can be set against the same.

Parameter Description User Guide Schedule Once Schedule chaos once, at a specified time or immediately Schedule Once Repeat Schedule Schedule chaos in repeat mode Repeat Schedule Schedule State Defines the state of the schedule Schedule State Engine Specifications Defines the chaosengine specifications Engine Specifications"},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/engine-specification/","title":"Engine Specification","text":"

It embeds the ChaosEngine as a template inside the schedule CR, which contains the chaosexperiment and target application details.

    View the engine details

Field .spec.engineTemplateSpec Description Flag to control the chaosengine to be formed Type Mandatory Range n/a Default n/a Notes The engineTemplateSpec is the ChaosEngineSpec of the ChaosEngine that is to be formed.

    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/engine-specification/#engine-specification_1","title":"Engine Specification","text":"

Specify the chaosengine details at spec.engineTemplateSpec inside the schedule CR.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      properties:\n         #format should be like \"10m\" or \"2h\" accordingly for minutes or hours\n        minChaosInterval: \"2m\"  \n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-once/","title":"Schedule Once","text":"

It schedules the chaos once, either at the specified time or immediately after the creation of the schedule CR.

    View the schedule once schema"},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-once/#schedule-now","title":"Schedule NOW","text":"

Field .spec.schedule.now Description Flag to control the type of scheduling Type Mandatory Range true, false Default n/a Notes The now in the spec.schedule ensures the immediate creation of the chaosengine, i.e., immediate injection of chaos."},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-once/#schedule-once_1","title":"Schedule Once","text":"

    Field .spec.schedule.once.executionTime Description Flag to specify execution timestamp at which chaos is injected, when the policy is once. The chaosengine is created exactly at this timestamp. Type Mandatory Range user-defined (type: UTC Timeformat) Default n/a Notes .spec.schedule.once refers to a single-instance execution of chaos at a particular timestamp specified by .spec.schedule.once.executionTime

    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-once/#immediate-chaos","title":"Immediate Chaos","text":"

It schedules the chaos immediately after the creation of the chaos-schedule CR. It can be tuned by setting spec.schedule.now to true.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    now: true\n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-once/#chaos-at-a-specified-timestamp","title":"Chaos at a Specified TimeStamp","text":"

It schedules the chaos once at the specified time. It can be tuned by setting spec.schedule.once.executionTime. The execution time should be in the UTC timezone.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    once:\n      #should be modified according to current UTC Time\n      executionTime: \"2020-05-12T05:47:00Z\"   \n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
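
Since the executionTime must be a future timestamp in the UTC timezone, a correctly formatted current UTC timestamp (to be adjusted forward as needed) can be generated with the standard date utility:

date -u +'%Y-%m-%dT%H:%M:%SZ'\n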
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/","title":"Repeat Schedule","text":"

It schedules the chaos in the repeat mode. There are various ways to set up this type of schedule by varying the fields inside spec.repeat.

Note - Only one field, i.e. minChaosInterval, is mandatory. All other fields are optional and depend entirely on the desired behaviour.

    View the schedule repeat schema

    Field .spec.schedule.repeat.timeRange.startTime Description Flag to specify start timestamp of the range within which chaos is injected, when the policy is repeat. The chaosengine is not created before this timestamp. Type Mandatory Range user-defined (type: UTC Timeformat) Default n/a Notes When startTime is specified against the policy repeat, ChaosEngine will not be formed before this time, no matter when it was created.

    Field .spec.schedule.repeat.timeRange.endTime Description Flag to specify end timestamp of the range within which chaos is injected, when the policy is repeat. The chaosengine is not created after this timestamp. Type Mandatory Range user-defined (type: UTC Timeformat) Default n/a Notes When endTime is specified against the policy repeat, ChaosEngine will not be formed after this time.

    Field .spec.schedule.repeat.properties.minChaosInterval.hour.everyNthHour Description Flag to specify the hours between each successive schedule Type Mandatory Range integer Default n/a Notes The minChaosInterval.hour.everyNthHour in the spec specifies the time interval in hours between each schedule

Field .spec.schedule.repeat.properties.minChaosInterval.hour.minuteOfTheHour Description Flag to specify the minute of the hour for each successive schedule Type Mandatory Range integer Default 0 Notes The minChaosInterval.hour.minuteOfTheHour in the spec specifies the minute of the hour at which each schedule is triggered

Field .spec.schedule.repeat.properties.minChaosInterval.minute.everyNthMinute Description Flag to specify the minutes between each successive schedule Type Mandatory Range integer Default n/a Notes The minChaosInterval.minute.everyNthMinute in the spec specifies the time interval in minutes between each schedule

    Field .spec.schedule.repeat.workDays.includedDays Description Flag to specify the days at which chaos is allowed to take place Type Mandatory Range user-defined (type: string)(pattern: [{day_name},{day_name}...]). Default n/a Notes The includedDays in the spec specifies a (comma-separated) list of days of the week at which chaos is allowed to take place. {day_name} is to be specified with the first 3 letters of the name of day such as Mon, Tue etc.

Field .spec.schedule.repeat.workHours.includedHours Description Flag to specify the hours at which chaos is allowed to take place Type Mandatory Range {hour_number} will range from 0 to 23 (type: string)(pattern: {hour_number}-{hour_number}). Default n/a Notes The includedHours in the spec specifies a range of hours of the day at which chaos is allowed to take place. The 24-hour format is followed"},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#basic-schema-to-execute-repeat-strategy","title":"Basic Schema to Execute Repeat Strategy","text":"

    This will keep executing the schedule and creating engines for an indefinite amount of time.

    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#schedule-chaosengine-at-every-nth-minute","title":"Schedule ChaosEngine at every nth minute","text":"
    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      properties:\n        minChaosInterval:\n          # schedule the chaos at every 5 minutes\n          minute:\n            everyNthMinute: 5  \n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#schedule-chaosengine-at-every-nth-hour","title":"Schedule ChaosEngine at every nth hour","text":"
    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      properties:\n        minChaosInterval:\n          # schedule the chaos every hour at 0th minute\n          hour:\n            everyNthHour: 1\n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#schedule-chaosengine-at-nth-minute-of-every-nth-hour","title":"Schedule ChaosEngine at nth minute of every nth hour","text":"
    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      properties:\n        minChaosInterval:\n          # schedule the chaos every hour at 30th minute\n          hour:\n            everyNthHour: 1\n            minuteOfTheHour: 30\n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#specifying-time-range-for-the-chaos-schedule","title":"Specifying Time Range for the Chaos Schedule","text":"

    This will manipulate the schedule to be started and ended according to our definition.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      timeRange:\n        #should be modified according to current UTC Time\n        startTime: \"2020-05-12T05:47:00Z\"   \n        endTime: \"2020-09-13T02:58:00Z\"   \n      properties:\n        minChaosInterval:\n          minute:\n            everyNthMinute: 5  \n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#specifying-just-the-end-time","title":"Specifying Just the End Time","text":"

Assumes the custom resource creation timestamp as the startTime.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      timeRange:\n        #should be modified according to current UTC Time\n        endTime: \"2020-09-13T02:58:00Z\"   \n      properties:\n        minChaosInterval:\n          minute:\n            everyNthMinute: 5  \n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#specifying-just-the-starttime","title":"Specifying Just the StartTime","text":"

Executes chaos indefinitely (until the ChaosSchedule CR is removed), starting from the specified timestamp.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      timeRange:\n        #should be modified according to current UTC Time\n        startTime: \"2020-05-12T05:47:00Z\"   \n      properties:\n        minChaosInterval:\n          minute:\n            everyNthMinute: 5  \n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    auxiliaryAppInfo: ''\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#specifying-work-hours","title":"Specifying Work Hours","text":"

This ensures chaos execution within the specified hours of the day, every day.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      properties:\n        minChaosInterval:\n          minute:\n            everyNthMinute: 5   \n      workHours:\n        # format should be <starting-hour-number>-<ending-hour-number>(inclusive)\n        includedHours: 0-12\n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    # It can be true/false\n    annotationCheck: 'true'\n    #ex. values: ns1:name=percona,ns2:run=nginx\n    auxiliaryAppInfo: ''\n    chaosServiceAccount: pod-delete-sa\n    # It can be delete/retain\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#specifying-work-days","title":"Specifying work days","text":"

    This executes chaos on specified days of the week, with the specified minimum interval.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      properties:\n        minChaosInterval:\n          minute:\n            everyNthMinute: 5  \n      workDays:\n        includedDays: \"Mon,Tue,Wed,Sat,Sun\"\n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    auxiliaryAppInfo: ''\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/state/","title":"Halt/Resume ChaosSchedule","text":"

Chaos Schedules can be halted or resumed as per need. This can be tuned by setting spec.scheduleState to halt or active respectively.

    View the state schema

Field .spec.scheduleState Description Flag to control the chaosschedule state Type Optional Range active, halt, complete Default active Notes The scheduleState is the current state of the ChaosSchedule. If the schedule is running its state will be active, if the schedule is halted its state will be halt, and if the schedule is completed its state will be complete.

    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/state/#halt-the-schedule","title":"Halt The Schedule","text":"

Follow the steps below to halt an active schedule:

    • Edit the ChaosSchedule CR in your favourite editor
      kubectl edit chaosschedule schedule-nginx\n
    • Change the spec.scheduleState to halt
      spec:\n  scheduleState: halt\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/state/#resume-the-schedule","title":"Resume The Schedule","text":"

Follow the steps below to resume a halted schedule:

    • Edit the chaosschedule
      kubectl edit chaosschedule schedule-nginx\n
    • Change the spec.scheduleState to active
      spec:\n  scheduleState: active\n
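
Alternatively, assuming kubectl access to the cluster, the same state change can be applied non-interactively with a merge patch; a sketch that halts the schedule (use active in place of halt to resume it):

kubectl patch chaosschedule schedule-nginx --type=merge -p '{\"spec\":{\"scheduleState\":\"halt\"}}'\n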
    "},{"location":"experiments/concepts/chaos-resources/probes/cmdProbe/","title":"Command Probe","text":"

    The command probe allows developers to run shell commands and match the resulting output as part of the entry/exit criteria. The intent behind this probe was to allow users to implement a non-standard & imperative way of expressing their hypothesis. For example, the cmdProbe enables you to check for specific data within a database, parse the value out of a JSON blob being dumped into a certain path, or check for the existence of a particular string in the service logs. It can be executed by setting type as cmdProbe inside .spec.experiments[].spec.probe.

    View the command probe schema

Field .name Description Flag to hold the name of the probe Type Mandatory Range n/a (type: string) Notes The .name holds the name of the probe. It can be set based on the use case

Field .type Description Flag to hold the type of the probe Type Mandatory Range httpProbe, k8sProbe, cmdProbe, promProbe Notes The .type supports four types of probes. It can be one of httpProbe, k8sProbe, cmdProbe, promProbe

Field .mode Description Flag to hold the mode of the probe Type Mandatory Range SOT, EOT, Edge, Continuous, OnChaos Notes The .mode supports five modes of probes. It can be one of SOT, EOT, Edge, Continuous, OnChaos

Field .cmdProbe/inputs.command Description Flag to hold the command for the cmdProbe Type Mandatory Range n/a {type: string} Notes The .cmdProbe/inputs.command contains the shell command, which should be run as part of the cmdProbe

Field .cmdProbe/inputs.source Description Flag to hold the source for the cmdProbe Type Mandatory Range It contains the source attributes, i.e., image, imagePullPolicy Notes The .cmdProbe/inputs.source supports an inline mode, in which the command is run within the experiment pod itself; this is chosen by omitting the source field. Otherwise, provide the source details (i.e., image), which are used to launch an external pod where the command execution is carried out.

Field .cmdProbe/inputs.comparator.type Description Flag to hold the type of the data used for comparison Type Mandatory Range string, int, float Notes The .cmdProbe/inputs.comparator.type contains the type of data, which should be compared as part of the comparison operation

Field .cmdProbe/inputs.comparator.criteria Description Flag to hold the criteria for the comparison Type Mandatory Range It supports {>=, <=, ==, >, <, !=, oneOf, between} for int & float types, and {equal, notEqual, contains, matches, notMatches, oneOf} for the string type. Notes The .cmdProbe/inputs.comparator.criteria contains the criteria of the comparison, which should be fulfilled as part of the comparison operation.

Field .cmdProbe/inputs.comparator.value Description Flag to hold the value for the comparison Type Mandatory Range n/a {type: string} Notes The .cmdProbe/inputs.comparator.value contains the value of the comparison, which should follow the given criteria as part of the comparison operation.

    Field .runProperties.probeTimeout Description Flag to hold the timeout for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.probeTimeout represents the time limit for the probe to execute the specified check and return the expected data

    Field .runProperties.retry Description Flag to hold the retry count for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.retry contains the number of times a check is re-run upon failure in the first attempt before declaring the probe status as failed.

Field .runProperties.interval Description Flag to hold the interval for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.interval contains the interval for which the probe waits between subsequent retries

Field .runProperties.probePollingInterval Description Flag to hold the polling interval for the probes (applicable for Continuous mode only) Type Optional Range n/a {type: integer} Notes The .runProperties.probePollingInterval contains the time interval for which a continuous probe should sleep after each iteration

Field .runProperties.initialDelaySeconds Description Flag to hold the initial delay interval for the probes Type Optional Range n/a {type: integer} Notes The .runProperties.initialDelaySeconds represents the initial waiting time interval for the probes.

Field .runProperties.stopOnFailure Description Flag to stop or continue the experiment on probe failure Type Optional Range true, false {type: boolean} Notes The .runProperties.stopOnFailure can be set to true/false to stop or continue the experiment execution after a probe failure

    "},{"location":"experiments/concepts/chaos-resources/probes/cmdProbe/#common-probe-tunables","title":"Common Probe Tunables","text":"

Refer to the common attributes to tune the common tunables for all the probes.

    "},{"location":"experiments/concepts/chaos-resources/probes/cmdProbe/#inline-mode","title":"Inline Mode","text":"

In the inline mode, the command probe is executed from within the experiment pod. It is preferred for simple shell commands. It is the default mode, and it can be chosen by omitting the source field.

    Use the following example to tune this:

# execute the command inside the experiment pod itself\n# for cases where the command doesn't need any extra binaries that are not available in the litmuschaos/go-runner image\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-database-integrity\"\n        type: \"cmdProbe\"\n        cmdProbe/inputs:\n          # command which needs to run in cmdProbe\n          command: \"<command>\"\n          comparator:\n            # output type for the above command\n            # supports: string, int, float\n            type: \"string\"\n            # criteria which should be followed by the actual output and the expected output\n            # supports [>=, <=, >, <, ==, !=] for int and float\n            # supports [contains, equal, notEqual, matches, notMatches] for string values\n            criteria: \"contains\"\n            # expected value, which should follow the specified criteria\n            value: \"<value-for-criteria-match>\"\n        mode: \"Edge\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n          initialDelaySeconds: 5\n
    "},{"location":"experiments/concepts/chaos-resources/probes/cmdProbe/#source-mode","title":"Source Mode","text":"

    In source mode, the command execution is carried out from within a new pod whose image can be specified. It can be used when application-specific binaries are required.

    View the source probe schema

Field .image Description Flag to hold the image of the source pod Type Mandatory Range n/a (type: string) Notes The .image holds the image of the source pod

Field .hostNetwork Description Flag to enable the hostNetwork for the source pod Type Optional Range (type: boolean) Notes The .hostNetwork flag enables the hostNetwork for the source pod. It supports boolean values and the default value is false

Field .args Description Flag to hold the args for the source pod Type Optional Range (type: []string) Notes The .args flag holds the args for the source pod

Field .env Description Flag to hold the envs for the source pod Type Optional Range (type: []corev1.EnvVar) Notes The .env flag holds the envs for the source pod

Field .labels Description Flag to hold the labels for the source pod Type Optional Range (type: map[string]string) Notes The .labels flag holds the labels for the source pod

Field .annotations Description Flag to hold the annotations for the source pod Type Optional Range (type: map[string]string) Notes The .annotations flag holds the annotations for the source pod

Field .command Description Flag to hold the command for the source pod Type Optional Range (type: []string) Notes The .command flag holds the command for the source pod

Field .imagePullPolicy Description Flag to set the imagePullPolicy for the source pod Type Optional Range (type: corev1.PullPolicy) Notes The .imagePullPolicy flag sets the imagePullPolicy for the source pod

Field .privileged Description Flag to set the privileged mode for the source pod Type Optional Range (type: boolean) Notes The .privileged flag sets the privileged mode for the source pod. The default value is false

Field .nodeSelector Description Flag to hold the node selectors for the probe pod Type Optional Range (type: map[string]string) Notes The .nodeSelector flag holds the node selectors for the probe pod

Field .tolerations Description Flag to hold the tolerations for the probe pod Type Optional Range (type: []corev1.Toleration) Notes The .tolerations flag holds the tolerations for the probe pod

Field .volumes Description Flag to hold the volumes for the source pod Type Optional Range (type: []corev1.Volume) Notes The .volumes flag holds the volumes for the source pod

Field .volumeMount Description Flag to hold the volume mounts for the source pod Type Optional Range (type: []corev1.VolumeMount) Notes The .volumeMount flag holds the volume mounts for the source pod

Field .imagePullSecrets Description Flag to set the imagePullSecrets for the source pod Type Optional Range (type: []corev1.LocalObjectReference) Notes The .imagePullSecrets flag sets the imagePullSecrets for the source pod

    Use the following example to tune this:

# it launches an external pod with the source image and runs the command inside that pod\n# for cases where the command needs extra binaries that are not available in the litmuschaos/go-runner image\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-database-integrity\"\n        type: \"cmdProbe\"\n        cmdProbe/inputs:\n          # command which needs to run in cmdProbe\n          command: \"<command>\"\n          comparator:\n            # output type for the above command\n            # supports: string, int, float\n            type: \"string\"\n            # criteria which should be followed by the actual output and the expected output\n            # supports [>=, <=, >, <, ==, !=, oneOf, between] for int and float\n            # supports [contains, equal, notEqual, matches, notMatches, oneOf] for string values\n            criteria: \"contains\"\n            # expected value, which should follow the specified criteria\n            value: \"<value-for-criteria-match>\"\n          # source for the cmdProbe\n          source:\n            image: \"<source-image>\"\n            imagePullPolicy: Always\n            privileged: true\n            hostNetwork: false\n        mode: \"Edge\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n          initialDelaySeconds: 5\n
    "},{"location":"experiments/concepts/chaos-resources/probes/contents/","title":"Probes Specifications","text":"

    Litmus probes are pluggable checks that can be defined within the ChaosEngine for any chaos experiment. The experiment pods execute these checks based on the mode they are defined in & factor their success as necessary conditions in determining the verdict of the experiment (along with the standard \u201cin-built\u201d checks).

Probe Name Description User Guide Command Probe It defines the command probes Command Probe HTTP Probe It defines the http probes HTTP Probe K8S Probe It defines the k8s probes K8S Probe Prometheus Probe It defines the prometheus probes Prometheus Probe Probe Chaining It chains the litmus probes Probe Chaining"},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/","title":"HTTP Probe","text":"

    The http probe allows developers to specify a URL which the experiment uses to gauge health/service availability (or other custom conditions) as part of the entry/exit criteria. The received status code is mapped against an expected status. It supports http Get and Post methods. It can be executed by setting type as httpProbe inside .spec.experiments[].spec.probe.

    View the http probe schema

Field .name Description Flag to hold the name of the probe Type Mandatory Range n/a (type: string) Notes The .name holds the name of the probe. It can be set based on the use case

Field .type Description Flag to hold the type of the probe Type Mandatory Range httpProbe, k8sProbe, cmdProbe, promProbe Notes The .type supports four types of probes. It can be one of httpProbe, k8sProbe, cmdProbe, promProbe

Field .mode Description Flag to hold the mode of the probe Type Mandatory Range SOT, EOT, Edge, Continuous, OnChaos Notes The .mode supports five modes of probes. It can be one of SOT, EOT, Edge, Continuous, OnChaos

    Field .httpProbe/inputs.url Description Flag to hold the URL for the httpProbe Type Mandatory Range n/a {type: string} Notes The .httpProbe/inputs.url contains the URL which the experiment uses to gauge health/service availability (or other custom conditions) as part of the entry/exit criteria.

Field .httpProbe/inputs.insecureSkipVerify Description Flag to skip certificate checks for the httpProbe Type Optional Range true, false Notes The .httpProbe/inputs.insecureSkipVerify contains the flag to skip certificate checks.

Field .httpProbe/inputs.responseTimeout Description Flag to hold the response timeout for the httpProbe Type Optional Range n/a {type: integer} Notes The .httpProbe/inputs.responseTimeout contains the response timeout for the http Get/Post requests.

Field .httpProbe/inputs.method.get.criteria Description Flag to hold the criteria for the http get request Type Mandatory Range ==, !=, oneOf Notes The .httpProbe/inputs.method.get.criteria contains the criteria to match the http get request's response code with the expected responseCode, which needs to be fulfilled as part of the httpProbe run

Field .httpProbe/inputs.method.get.responseCode Description Flag to hold the expected response code for the get request Type Mandatory Range HTTP_RESPONSE_CODE Notes The .httpProbe/inputs.method.get.responseCode contains the expected response code for the http get request as part of the httpProbe run

Field .httpProbe/inputs.method.post.contentType Description Flag to hold the content type of the post request Type Mandatory Range n/a {type: string} Notes The .httpProbe/inputs.method.post.contentType contains the content type of the http body data, which needs to be passed for the http post request

Field .httpProbe/inputs.method.post.body Description Flag to hold the body of the http post request Type Mandatory Range n/a {type: string} Notes The .httpProbe/inputs.method.post.body contains the http body, which is required for the http post request. It is used for a simple http body. If the http body is complex then use the .httpProbe/inputs.method.post.bodyPath field.

Field .httpProbe/inputs.method.post.bodyPath Description Flag to hold the path of the http body, required for the http post request Type Optional Range n/a {type: string} Notes The .httpProbe/inputs.method.post.bodyPath is used in the case of a complex POST request in which the body spans multiple lines; the bodyPath attribute can be used to provide the path to a file consisting of the same. This file can be made available to the experiment pod via a ConfigMap resource, with the ConfigMap name being defined in the ChaosEngine OR the ChaosExperiment CR.

Field .httpProbe/inputs.method.post.criteria Description Flag to hold the criteria for the http post request Type Mandatory Range ==, !=, oneOf Notes The .httpProbe/inputs.method.post.criteria contains the criteria to match the http post request's response code with the expected responseCode, which needs to be fulfilled as part of the httpProbe run

Field .httpProbe/inputs.method.post.responseCode Description Flag to hold the expected response code for the post request Type Mandatory Range HTTP_RESPONSE_CODE Notes The .httpProbe/inputs.method.post.responseCode contains the expected response code for the http post request as part of the httpProbe run

    Field .runProperties.probeTimeout Description Flag to hold the timeout for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.probeTimeout represents the time limit for the probe to execute the specified check and return the expected data

    Field .runProperties.retry Description Flag to hold the retry count for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.retry contains the number of times a check is re-run upon failure in the first attempt before declaring the probe status as failed.

    Field .runProperties.interval Description Flag to hold the interval for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.interval contains the interval for which the probe waits between subsequent retries

    Field .runProperties.probePollingInterval Description Flag to hold the polling interval for the probes (applicable for Continuous mode only) Type Optional Range n/a {type: integer} Notes The .runProperties.probePollingInterval contains the time interval for which the continuous probe should sleep after each iteration

    Field .runProperties.initialDelaySeconds Description Flag to hold the initial delay interval for the probes Type Optional Range n/a {type: integer} Notes The .runProperties.initialDelaySeconds represents the initial waiting time interval for the probes.

    Field .runProperties.stopOnFailure Description Flag to stop or continue the experiment on probe failure Type Optional Range false {type: boolean} Notes The .runProperties.stopOnFailure can be set to true/false to stop or continue the experiment execution after the probe fails

    "},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/#common-probe-tunables","title":"Common Probe Tunables","text":"

    Refer to the common attributes to tune the common tunables for all the probes.

    "},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/#http-get-request","title":"HTTP Get Request","text":"

    The HTTP Get method sends an http GET request to the provided URL and matches the response code against the given criteria (==, !=, oneOf). It can be executed by setting the httpProbe/inputs.method.get field.

    Use the following example to tune this:

    # contains the http probes with get method and verify the response code\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          method:\n            # call http get method and verify the response code\n            get: \n              # criteria which should be matched\n              criteria: == # ==, !=, oneOf\n              # expected response code for the http request, which should follow the specified criteria\n              responseCode: \"<response code>\"\n        mode: \"Continuous\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n
    "},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/#http-post-requesthttp-body-is-a-simple","title":"HTTP Post Request(http body is a simple)","text":"

    It contains the http body required for the http post request and is used for a simple http body. The body can be provided in the body field, i.e., by setting the httpProbe/inputs.method.post.body field.

    Use the following example to tune this:

    # contains the http probes with post method and verify the response code\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          method:\n            # call http post method and verify the response code\n            post: \n              # value of the http body, used for the post request\n              body: \"<http-body>\"\n              # http body content type\n              contentType: \"application/json; charset=UTF-8\"\n              # criteria which should be matched\n              criteria: \"==\" # ==, !=, oneOf\n              # expected response code for the http request, which should follow the specified criteria\n              responseCode: \"200\"\n        mode: \"Continuous\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n
    "},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/#http-post-requesthttp-body-is-a-complex","title":"HTTP Post Request(http body is a complex)","text":"

    In the case of a complex POST request in which the body spans multiple lines, the bodyPath attribute can be used to provide the path to a file containing the body. This file can be made available to the experiment pod via a ConfigMap resource, with the ConfigMap name being defined in the ChaosEngine OR the ChaosExperiment CR. It can be executed by setting the httpProbe/inputs.method.post.bodyPath field.

    NOTE: It is mutually exclusive with the body field. If body is set, the body field is used for the post request; otherwise, the bodyPath field is used.

    Use the following example to tune this:

    # contains the http probes with post method and verify the response code\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          method:\n            # call http post method and verify the response code\n            post: \n              # the configMap should be mounted to the experiment which contains http body\n              # use the mounted path here\n              bodyPath: \"/mnt/body.yml\"\n              # http body content type\n              contentType: \"application/json; charset=UTF-8\"\n              # criteria which should be matched\n              criteria: \"==\" # ==, !=, oneOf\n              # expected response code for the http request, which should follow the specified criteria\n              responseCode: \"200\"\n        mode: \"Continuous\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n
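    For reference, a minimal sketch of a ConfigMap carrying the http body is shown below; the ConfigMap name (http-probe-body), file name (body.yml), and payload are illustrative assumptions, and the actual mount into the experiment pod is configured via the ChaosEngine or ChaosExperiment CR, with bodyPath pointing at the mounted file.

    # illustrative ConfigMap carrying the http body for the post request\n# mount it into the experiment pod and set bodyPath to the mounted file path\napiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: http-probe-body\n  namespace: default\ndata:\n  body.yml: |\n    {\n      \"key\": \"value\"\n    }\n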
    "},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/#response-timout","title":"Response Timout","text":"

    It contains a flag to provide the response timeout for the http Get/Post request. It can be tuned via the .httpProbe/inputs.responseTimeout field. It is an optional field and its unit is milliseconds.

    Use the following example to tune this:

    # defines the response timeout for the http probe\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          # timeout for the http requests\n          responseTimeout: 100 #in ms\n          method:\n            get: \n              criteria: == # ==, !=, oneOf\n              responseCode: \"<response code>\"\n        mode: \"Continuous\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n
    "},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/#skip-certification-check","title":"Skip Certification Check","text":"

    It contains a flag to skip certificate checks. It can be tuned via the .httpProbe/inputs.insecureSkipVerify field. It supports boolean values; set it to true to skip the certificate checks. Its default value is false.

    Use the following example to tune this:

    # skip the certificate checks for the httpProbe\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          # skip certificate checks for the httpProbe\n          # supports: true, false. default: false\n          insecureSkipVerify: \"true\"\n          method:\n            get: \n              criteria: == \n              responseCode: \"<response code>\"\n        mode: \"Continuous\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n
    "},{"location":"experiments/concepts/chaos-resources/probes/k8sProbe/","title":"K8S Probe","text":"

    With the proliferation of custom resources & operators, especially in the case of stateful applications, the steady-state is manifested as status parameters/flags within Kubernetes resources. k8sProbe addresses verification of the desired resource state by allowing users to define the Kubernetes GVR (group-version-resource) with appropriate filters (field selectors/label selectors). The experiment makes use of the Kubernetes Dynamic Client to achieve this. It supports the create, delete, present, and absent operations, which can be defined at probe.k8sProbe/inputs.operation. It can be executed by setting type as k8sProbe inside .spec.experiments[].spec.probe.

    View the k8s probe schema

    Field .name Description Flag to hold the name of the probe Type Mandatory Range n/a (type: string) Notes The .name holds the name of the probe. It can be set based on the use case

    Field .type Description Flag to hold the type of the probe Type Mandatory Range httpProbe, k8sProbe, cmdProbe, promProbe Notes The .type supports four types of probes. It can be one of httpProbe, k8sProbe, cmdProbe, promProbe

    Field .mode Description Flag to hold the mode of the probe Type Mandatory Range SOT, EOT, Edge, Continuous, OnChaos Notes The .mode supports five probe modes. It can be one of SOT, EOT, Edge, Continuous, OnChaos

    Field .k8sProbe/inputs.group Description Flag to hold the group of the kubernetes resource for the k8sProbe Type Mandatory Range n/a {type: string} Notes The .k8sProbe/inputs.group contains the group of the kubernetes resource on which the k8sProbe performs the specified operation

    Field .k8sProbe/inputs.version Description Flag to hold the apiVersion of the kubernetes resource for the k8sProbe Type Mandatory Range n/a {type: string} Notes The .k8sProbe/inputs.version contains the apiVersion of the kubernetes resource on which the k8sProbe performs the specified operation

    Field .k8sProbe/inputs.resource Description Flag to hold the kubernetes resource name for the k8sProbe Type Mandatory Range n/a {type: string} Notes The .k8sProbe/inputs.resource contains the kubernetes resource name on which the k8sProbe performs the specified operation

    Field .k8sProbe/inputs.namespace Description Flag to hold the namespace of the kubernetes resource for the k8sProbe Type Mandatory Range n/a {type: string} Notes The .k8sProbe/inputs.namespace contains the namespace of the kubernetes resource on which the k8sProbe performs the specified operation

    Field .k8sProbe/inputs.fieldSelector Description Flag to hold the fieldSelectors of the kubernetes resource for the k8sProbe Type Optional Range n/a {type: string} Notes The .k8sProbe/inputs.fieldSelector contains the fieldSelector to derive the kubernetes resource on which the k8sProbe performs the specified operation

    Field .k8sProbe/inputs.labelSelector Description Flag to hold the labelSelectors of the kubernetes resource for the k8sProbe Type Optional Range n/a {type: string} Notes The .k8sProbe/inputs.labelSelector contains the labelSelector to derive the kubernetes resource on which the k8sProbe performs the specified operation

    Field .k8sProbe/inputs.operation Description Flag to hold the operation type for the k8sProbe Type Mandatory Range create, delete, present, absent Notes The .k8sProbe/inputs.operation contains the operation which should be applied on the kubernetes resource as part of the k8sProbe. It supports four types of operations. It can be one of create, delete, present, absent.

    Field .runProperties.probeTimeout Description Flag to hold the timeout for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.probeTimeout represents the time limit for the probe to execute the specified check and return the expected data

    Field .runProperties.retry Description Flag to hold the retry count for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.retry contains the number of times a check is re-run upon failure in the first attempt before declaring the probe status as failed.

    Field .runProperties.interval Description Flag to hold the interval for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.interval contains the interval for which the probe waits between subsequent retries

    Field .runProperties.probePollingInterval Description Flag to hold the polling interval for the probes (applicable for Continuous mode only) Type Optional Range n/a {type: integer} Notes The .runProperties.probePollingInterval contains the time interval for which the continuous probe should sleep after each iteration

    Field .runProperties.initialDelaySeconds Description Flag to hold the initial delay interval for the probes Type Optional Range n/a {type: integer} Notes The .runProperties.initialDelaySeconds represents the initial waiting time interval for the probes.

    Field .runProperties.stopOnFailure Description Flag to stop or continue the experiment on probe failure Type Optional Range false {type: boolean} Notes The .runProperties.stopOnFailure can be set to true/false to stop or continue the experiment execution after the probe fails

    "},{"location":"experiments/concepts/chaos-resources/probes/k8sProbe/#common-probe-tunables","title":"Common Probe Tunables","text":"

    Refer to the common attributes to tune the common tunables for all the probes.

    "},{"location":"experiments/concepts/chaos-resources/probes/k8sProbe/#create-operation","title":"Create Operation","text":"

    It creates a kubernetes resource based on the manifest provided inside the probe.data field. It can be defined by setting the operation field to create.

    Use the following example to tune this:

    # create the given resource provided inside data field\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"create-percona-pvc\"\n        type: \"k8sProbe\"\n        k8sProbe/inputs:\n          # group of the resource\n          group: \"\"\n          # version of the resource\n          version: \"v1\"\n          # name of the resource\n          resource: \"persistentvolumeclaims\"\n          # namespace where the instance of resource should be created\n          namespace: \"default\"\n          # type of operation\n          # supports: create, delete, present, absent\n          operation: \"create\"\n        mode: \"SOT\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n        # contains manifest, which can be used to create the resource\n        data: |\n          kind: PersistentVolumeClaim\n          apiVersion: v1\n          metadata:\n            name: percona-mysql-claim\n            labels:\n              openebs.io/target-affinity: percona\n          spec:\n            storageClassName: standard\n            accessModes:\n            - ReadWriteOnce\n            resources:\n              requests:\n                storage: 100Mi\n
    "},{"location":"experiments/concepts/chaos-resources/probes/k8sProbe/#delete-operation","title":"Delete Operation","text":"

    It deletes matching kubernetes resources via the GVR and filters (field selectors/label selectors) provided at probe.k8sProbe/inputs. It can be defined by setting the operation field to delete.

    Use the following example to tune this:

    # delete the resource matched with the given inputs\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"delete-percona-pvc\"\n        type: \"k8sProbe\"\n        k8sProbe/inputs:\n          # group of the resource\n          group: \"\"\n          # version of the resource\n          version: \"v1\"\n          # name of the resource\n          resource: \"persistentvolumeclaims\"\n          # namespace of the instance, which needs to be deleted\n          namespace: \"default\"\n          # label selectors for the k8s resource, which needs to be deleted\n          labelSelector: \"openebs.io/target-affinity=percona\"\n          # field selector for the k8s resource, which needs to be deleted\n          fieldSelector: \"\"\n          # type of operation\n          # supports: create, delete, present, absent\n          operation: \"delete\"\n        mode: \"EOT\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n
    "},{"location":"experiments/concepts/chaos-resources/probes/k8sProbe/#present-operation","title":"Present Operation","text":"

    It checks for the presence of a kubernetes resource based on the GVR and filters (field selectors/label selectors) provided at probe.k8sProbe/inputs. It can be defined by setting the operation field to present.

    Use the following example to tune this:

    # verify the existence of the resource matched with the given inputs inside the cluster\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-percona-pvc-presence\"\n        type: \"k8sProbe\"\n        k8sProbe/inputs:\n          # group of the resource\n          group: \"\"\n          # version of the resource\n          version: \"v1\"\n          # name of the resource\n          resource: \"persistentvolumeclaims\"\n          # namespace where the instance of the resource resides\n          namespace: \"default\"\n          # label selectors for the k8s resource\n          labelSelector: \"openebs.io/target-affinity=percona\"\n          # field selector for the k8s resource\n          fieldSelector: \"\"\n          # type of operation\n          # supports: create, delete, present, absent\n          operation: \"present\"\n        mode: \"SOT\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n
    "},{"location":"experiments/concepts/chaos-resources/probes/k8sProbe/#absent-operation","title":"Absent Operation","text":"

    It checks for the absence of a kubernetes resource based on the GVR and filters (field selectors/label selectors) provided at probe.k8sProbe/inputs. It can be defined by setting the operation field to absent.

    Use the following example to tune this:

    # verify that no resource matching the given inputs is present in the cluster\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-percona-pvc-absence\"\n        type: \"k8sProbe\"\n        k8sProbe/inputs:\n          # group of the resource\n          group: \"\"\n          # version of the resource\n          version: \"v1\"\n          # name of the resource\n          resource: \"persistentvolumeclaims\"\n          # namespace where the instance of the resource resides\n          namespace: \"default\"\n          # label selectors for the k8s resource\n          labelSelector: \"openebs.io/target-affinity=percona\"\n          # field selector for the k8s resource\n          fieldSelector: \"\"\n          # type of operation\n          # supports: create, delete, present, absent\n          operation: \"absent\"\n        mode: \"EOT\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/","title":"Introduction","text":"

    Litmus probes are pluggable checks that can be defined within the ChaosEngine for any chaos experiment. The experiment pods execute these checks based on the mode they are defined in & factor their success as necessary conditions in determining the verdict of the experiment (along with the standard \u201cin-built\u201d checks). They can be provided at .spec.experiments[].spec.probe inside the ChaosEngine. Four types are supported: cmdProbe, k8sProbe, httpProbe, and promProbe.

    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#probe-modes","title":"Probe Modes","text":"

    The probes can be set up to run in five different modes, which can be tuned via the mode field.

    • SOT: Executed at the Start of the Test as a pre-chaos check
    • EOT: Executed at the End of the Test as a post-chaos check
    • Edge: Executed both, before and after the chaos
    • Continuous: The probe is executed continuously, with a specified polling interval during the chaos injection.
    • OnChaos: The probe is executed continuously, with a specified polling interval, strictly for the duration of chaos

    Use the following example to tune this:

    # contains the common attributes or run properties\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          insecureSkipVerify: false\n          responseTimeout: <value>\n          method:\n            get: \n              criteria: ==\n              responseCode: \"<response code>\"\n        # modes for the probes\n        # supports: [SOT, EOT, Edge, Continuous, OnChaos]\n        mode: \"Continuous\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#run-properties","title":"Run Properties","text":"

    All probes share some common attributes, which can be tuned via the runProperties field.

    • probeTimeout: Represents the time limit for the probe to execute the check specified and return the expected data.
    • retry: The number of times a check is re-run upon failure in the first attempt before declaring the probe status as failed.
    • interval: The period between subsequent retries
    • probePollingInterval: The time interval for which continuous/onchaos probes should sleep after each iteration.

    Use the following example to tune this:

    # contains the common attributes or run properties\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          insecureSkipVerify: false\n          responseTimeout: <value>\n          method:\n            get: \n              criteria: ==\n              responseCode: \"<response code>\"\n        mode: \"Continuous\"\n        # contains runProperties for the probes\n        runProperties:\n          # time limit for the probe to execute the specified check\n          probeTimeout: 5 #in seconds\n          # the time period between subsequent retries\n          interval: 2 #in seconds\n          # number of times a check is re-run upon failure before declaring the probe status as failed\n          retry: 1\n          #time interval for which continuous probe should wait after each iteration\n          # applicable for onChaos and Continuous probes\n          probePollingInterval: 2\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#initial-delay-seconds","title":"Initial Delay Seconds","text":"

    It represents the initial waiting time interval for the probes. It can be tuned via the initialDelaySeconds field.

    Use the following example to tune this:

    # contains the initial delay seconds for the probes\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          insecureSkipVerify: false\n          responseTimeout: <value>\n          method:\n            get: \n              criteria: ==\n              responseCode: \"<response code>\"\n        mode: \"Continuous\"\n        # contains runProperties for the probes\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n          #initial waiting time interval for the probes\n          initialDelaySeconds: 30 #in seconds\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#stopcontinue-experiment-on-probe-failure","title":"Stop/Continue Experiment On Probe Failure","text":"

    It can be set to true/false to stop or continue the experiment execution after the probe fails. It can be tuned via the stopOnFailure field. It supports boolean values. The default value is false.

    Use the following example to tune this:

    # contains the flag to stop/continue experiment based on the specified flag\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          insecureSkipVerify: false\n          responseTimeout: <value>\n          method:\n            get: \n              criteria: ==\n              responseCode: \"<response code>\"\n        mode: \"Continuous\"\n        # contains runProperties for the probes\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n          #it can be set to true/false to stop or continue the experiment execution after probe fails\n          # supports: true, false. default: false\n          stopOnFailure: true\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#comparator","title":"Comparator","text":"

    The comparator is used to validate the SLO by comparing the probe's actual and expected values against the specified criteria.

    View the comparator's supported fields

    Field .type Description Flag to hold the type of the probe's output Type Mandatory Range {int, float, string} (type: string) Notes The .type holds the type of the probe's output

    Field .criteria Description Flag to hold the criteria, which should be followed by the actual and expected probe outputs Type Mandatory Range Float & Int type: {>, <, <=, >=, ==, !=, oneOf, between}, String type: {equal, notEqual, contains, matches, notMatches, oneOf} Notes The .criteria holds the criteria, which should be followed by the actual and expected probe outputs

    Field .value Description Flag to hold the probe's expected value, which should follow the specified criteria Type Mandatory Range value can be of int, float, string, slice type Notes The .value holds the probe's expected value, which should follow the specified criteria

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-database-integrity\"\n        type: \"cmdProbe\"\n        cmdProbe/inputs:\n          command: \"<command>\"\n          comparator:\n            # output type for the above command\n            # supports: string, int, float\n            type: \"string\"\n            # criteria which should be followed by the actual output and the expected output\n            # supports [>=, <=, >, <, ==, !=, oneOf, between] for int and float\n            # supports [contains, equal, notEqual, matches, notMatches, oneOf] for string values\n            criteria: \"contains\"\n            # expected value, which should follow the specified criteria\n            value: \"<value-for-criteria-match>\"\n          source:\n            image: \"<source-image>\"\n        mode: \"Edge\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n          initialDelaySeconds: 5\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#arithmetic-criteria","title":"Arithmetic criteria:","text":"

    It is used to compare numeric values (int, float) for arithmetic comparisons. It consists of the >, <, >=, <=, ==, != criteria

    comparator:\n  type: int\n  criteria: \">\" \n  value: \"20\"\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#oneof-criteria","title":"OneOf criteria:","text":"

    It is used to compare numeric or string values, checking whether the actual value lies in the expected slice. Here the expected slice consists of int/float/string values

    comparator:\n  type: int\n  criteria: \"oneOf\"\n  value: \"[400,404,405]\"\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#between-criteria","title":"Between criteria:","text":"

    It is used to compare numeric (int, float) values, checking whether the actual value lies between the given lower and upper bound range [a, b]

    comparator:\n  type: int\n  criteria: \"between\"\n  value: \"[1000,5000]\"\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#equal-and-notequal-criteria","title":"Equal and NotEqual criteria:","text":"

    It is used to compare string values; it checks whether the actual value is equal/notEqual to the expected value

    comparator:\n  type: string\n  criteria: \"equal\" #equal or notEqual\n  value: \"<string value>\"\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#contains-criteria","title":"Contains criteria:","text":"

    It is used to compare string values; it checks whether the expected value is a substring of the actual value

    comparator:\n  type: string\n  criteria: \"contains\" \n  value: \"<string value>\"\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#matches-and-notmatches-criteria","title":"Matches and NotMatches criteria:","text":"

    It is used to compare string values; it checks whether the actual value matches/notMatches the regex (provided as the expected value)

    comparator:\n  type: string\n  criteria: \"matches\" #matches or notMatches\n  value: \"<regex>\"\n
    "},{"location":"experiments/concepts/chaos-resources/probes/probe-chaining/","title":"Probe Chaining","text":"

    Probe chaining enables reuse of a probe's result (represented by the template function {{ .<probeName>.probeArtifact.Register}}) in subsequent \"downstream\" probes defined in the ChaosEngine. Note: The order of execution of probes in the experiment depends purely on the order in which they are defined in the ChaosEngine.

    Use the following example to tune this:

    # chaining enables reuse of a probe's result (represented by the template function {{ .<probeName>.probeArtifact.Register }}) \n#-- in subsequent \"downstream\" probes defined in the ChaosEngine.\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"probe1\"\n        type: \"cmdProbe\"\n        cmdProbe/inputs:\n          command: \"<command>\"\n          comparator:\n            type: \"string\"\n            criteria: \"equal\"\n            value: \"<value-for-criteria-match>\"\n          source: \"inline\"\n        mode: \"SOT\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n      - name: \"probe2\"\n        type: \"cmdProbe\"\n        cmdProbe/inputs:\n          ## probe1's result being used as one of the args in probe2\n          command: \"<command> {{ .probe1.ProbeArtifacts.Register }} <arg2>\"\n          comparator:\n            type: \"string\"\n            criteria: \"equal\"\n            value: \"<value-for-criteria-match>\"\n          source: \"inline\"\n        mode: \"SOT\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n
    "},{"location":"experiments/concepts/chaos-resources/probes/promProbe/","title":"Prometheus Probe","text":"

    The prometheus probe allows users to run Prometheus queries and match the resulting output against specific conditions. The intent behind this probe is to allow users to define metrics-based SLOs in a declarative way and determine the experiment verdict based on its success. The probe runs the query on a Prometheus server defined by the endpoint, and checks whether the output satisfies the specified criteria. It can be executed by setting type as promProbe inside .spec.experiments[].spec.probe.

    View the prometheus probe schema

    Field .name Description Flag to hold the name of the probe Type Mandatory Range n/a (type: string) Notes The .name holds the name of the probe. It can be set based on the use case

    Field .type Description Flag to hold the type of the probe Type Mandatory Range httpProbe, k8sProbe, cmdProbe, promProbe Notes The .type supports four types of probes. It can be one of httpProbe, k8sProbe, cmdProbe, promProbe

    Field .mode Description Flag to hold the mode of the probe Type Mandatory Range SOT, EOT, Edge, Continuous, OnChaos Notes The .mode supports five probe modes. It can be one of SOT, EOT, Edge, Continuous, OnChaos

    Field .promProbe/inputs.endpoint Description Flag to hold the prometheus endpoint for the promProbe Type Mandatory Range n/a {type: string} Notes The .promProbe/inputs.endpoint contains the prometheus endpoint

    Field .promProbe/inputs.query Description Flag to hold the promql query for the promProbe Type Mandatory Range n/a {type: string} Notes The .promProbe/inputs.query contains the promql query used to extract the desired prometheus metrics by running it on the given prometheus endpoint

    Field .promProbe/inputs.queryPath Description Flag to hold the path of the promql query for the promProbe Type Optional Range n/a {type: string} Notes The .promProbe/inputs.queryPath field is used in case of complex queries that span multiple lines; the queryPath attribute provides the path to a file containing the query. This file can be made available to the experiment pod via a ConfigMap resource, with the ConfigMap name being defined in the ChaosEngine OR the ChaosExperiment CR.

    Field .promProbe/inputs.comparator.criteria Description Flag to hold the criteria for the comparison Type Mandatory Range it supports {>=, <=, ==, >, <, !=, oneOf, between} criteria Notes The .promProbe/inputs.comparator.criteria contains the criteria of the comparison, which should be fulfilled as part of the comparison operation.

    Field .promProbe/inputs.comparator.value Description Flag to hold the value for the comparison Type Mandatory Range n/a {type: string} Notes The .promProbe/inputs.comparator.value contains the value of the comparison, which should follow the given criteria as part of the comparison operation.

    Field .runProperties.probeTimeout Description Flag to hold the timeout for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.probeTimeout represents the time limit for the probe to execute the specified check and return the expected data

    Field .runProperties.retry Description Flag to hold the retry count for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.retry contains the number of times a check is re-run upon failure in the first attempt before declaring the probe status as failed.

    Field .runProperties.interval Description Flag to hold the interval for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.interval contains the interval for which the probe waits between subsequent retries

    Field .runProperties.probePollingInterval Description Flag to hold the polling interval for the probes (applicable for Continuous mode only) Type Optional Range n/a {type: integer} Notes The .runProperties.probePollingInterval contains the time interval for which the continuous probe should sleep after each iteration

    Field .runProperties.initialDelaySeconds Description Flag to hold the initial delay interval for the probes Type Optional Range n/a {type: integer} Notes The .runProperties.initialDelaySeconds represents the initial waiting time interval for the probes.

    Field .runProperties.stopOnFailure Description Flag to stop or continue the experiment on probe failure Type Optional Range false {type: boolean} Notes The .runProperties.stopOnFailure can be set to true/false to stop or continue the experiment execution after the probe fails

    "},{"location":"experiments/concepts/chaos-resources/probes/promProbe/#common-probe-tunables","title":"Common Probe Tunables","text":"

    Refer to the common attributes to tune the common tunables for all the probes.

    "},{"location":"experiments/concepts/chaos-resources/probes/promProbe/#prometheus-queryquery-is-a-simple","title":"Prometheus Query(query is a simple)","text":"

    It contains the promql query used to extract the desired prometheus metrics by running it on the given prometheus endpoint. The prometheus query can be provided in the query field, i.e., by setting the .promProbe/inputs.query field.

    Use the following example to tune this:

    # contains the prom probe which execute the query and match for the expected criteria\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-probe-success\"\n        type: \"promProbe\"\n        promProbe/inputs:\n          # endpoint for the prometheus service\n          endpoint: \"<prometheus-endpoint>\"\n          # promql query, which should be executed\n          query: \"<promql-query>\"\n          comparator:\n            # criteria which should be followed by the actual output and the expected output\n            # supports >=, <=, >, <, ==, != comparison\n            criteria: \"==\" \n            # expected value, which should follow the specified criteria\n            value: \"<value-for-criteria-match>\"\n        mode: \"Edge\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n
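    As a concrete illustration, assuming a blackbox-exporter style probe_success metric is scraped by the cluster's Prometheus, the inputs might look like the following sketch (the endpoint, metric, and threshold are assumptions, not prescriptions):

    promProbe/inputs:\n  endpoint: \"http://prometheus.monitoring.svc.cluster.local:9090\"\n  query: \"avg_over_time(probe_success{job='my-app'}[60s])\"\n  comparator:\n    criteria: \">=\"\n    value: \"0.95\"\n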
    "},{"location":"experiments/concepts/chaos-resources/probes/promProbe/#prometheus-queryquery-is-a-complex","title":"Prometheus Query(query is a complex","text":"

    In the case of complex queries that span multiple lines, the queryPath attribute can be used to provide the path to a file containing the query. This file can be made available to the experiment pod via a ConfigMap resource, with the ConfigMap name being defined in the ChaosEngine OR the ChaosExperiment CR. It can be executed by setting the promProbe/inputs.queryPath field.

    NOTE: It is mutually exclusive with the query field. If query is set, the query field is used; otherwise, the queryPath field is used.

    Use the following example to tune this:

    # contains the prom probe which execute the query and match for the expected criteria\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-probe-success\"\n        type: \"promProbe\"\n        promProbe/inputs:\n          # endpoint for the prometheus service\n          endpoint: \"<prometheus-endpoint>\"\n          # the configMap should be mounted to the experiment which contains promql query\n          # use the mounted path here\n          queryPath: \"<path of the query>\"\n          comparator:\n            # criteria which should be followed by the actual output and the expected output\n            # supports >=, <=, >, <, ==, != comparison\n            criteria: \"==\" \n            # expected value, which should follow the specified criteria\n            value: \"<value-for-criteria-match>\"\n        mode: \"Edge\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n
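    A minimal sketch of a ConfigMap carrying such a multi-line query is shown below; the ConfigMap name (prom-probe-query), file name, and query itself are illustrative assumptions, and the file's mounted path is what goes into queryPath:

    # illustrative ConfigMap carrying a multi-line promql query\napiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: prom-probe-query\n  namespace: default\ndata:\n  query.promql: |\n    sum(\n      rate(http_requests_total{job=\"my-app\",code=~\"5..\"}[5m])\n    )\n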
    "},{"location":"experiments/concepts/security/kyverno-policies/","title":"Kyverno Policies","text":"

    Kyverno policies block configurations that don't match a policy (enforce mode) or generate policy violations (audit mode). Kyverno scans existing configurations and reports violations in the cluster. Litmus recommends using the provided policy configuration to enable the execution of all supported (out-of-the-box) experiments listed in the chaoshub. Having said that, this is only a recommendation and is left to the user's discretion depending upon the experiments desired.

    The details listed here are expected to aid users of Kyverno. If you are using alternate means to enforce runtime security, such as native Kubernetes PSPs (pod security policies), refer to this section.

    "},{"location":"experiments/concepts/security/kyverno-policies/#policies-in-litmus","title":"Policies in Litmus","text":"

    Litmus recommends using the following policies:

    1. Add Capabilities: It restricts added capabilities except NET_ADMIN and SYS_ADMIN for the pods that use the runtime API
    2. Host Namespaces: It validates the following host namespaces for the pods that use the runtime API.
      1. HostPID: It allows hostPID. It should be set to true.
      2. HostIPC: It restricts the host IPC. It should be set to false.
      3. HostNetwork: It restricts the hostNetwork. It should be set to false.
    3. Host Paths: It restricts hostPath except the socket-path & container-path host paths for the pods that use the runtime API. It allows hostPaths for service-kill experiments.
    4. Privilege Escalation: It restricts privilege escalation except for the pods that use the runtime API
    5. Privilege Container: It restricts privileged containers except for the pods that use the runtime API
    6. User Groups: It allows user groups for all the experiment pods
    "},{"location":"experiments/concepts/security/kyverno-policies/#install-policies","title":"Install Policies","text":"

    These Kyverno policies are based on the Kubernetes Pod Security Standards definitions. To apply all pod security policies (recommended), install Kyverno and kustomize, then run:

    kustomize build https://github.com/litmuschaos/chaos-charts/security/kyverno-policies | kubectl apply -f -\n
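    To verify that the policies were created (assuming Kyverno is already installed in the cluster), the Kyverno ClusterPolicy resources can be listed:

    kubectl get clusterpolicy\n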
    "},{"location":"experiments/concepts/security/kyverno-policies/#pod-security-policies-in-restricted-setup","title":"Pod Security Policies in restricted setup","text":"

    A restricted setup may contain policies that don't allow execution of litmus experiments by default. For example, the deny-privilege-escalation policy denies privilege escalation for all pods.

    To allow litmus pods to use privilege escalation, add the litmus serviceAccount or ClusterRole/Role inside the exclude block as shown:

    apiVersion: kyverno.io/v1\nkind: ClusterPolicy\nmetadata:\n  name: deny-privilege-escalation\n  annotations:\n    policies.kyverno.io/category: Pod Security Standards (Restricted)\n    policies.kyverno.io/severity: medium\n    policies.kyverno.io/subject: Pod\n    policies.kyverno.io/description: >-\n      Privilege escalation, such as via set-user-ID or set-group-ID file mode, should not be allowed.\n      This policy ensures the `allowPrivilegeEscalation` fields are either undefined\n      or set to `false`.      \nspec:\n  background: true\n  validationFailureAction: enforce\n  rules:\n  - name: deny-privilege-escalation\n    match:\n      resources:\n        kinds:\n        - Pod\n    exclude:\n      clusterRoles:\n      # add litmus cluster roles here\n      - litmus-admin\n      roles:\n      # add litmus roles here\n      - litmus-roles\n      subjects:\n      # add serviceAccount name here\n      - kind: ServiceAccount\n        name: pod-network-loss-sa\n    validate:\n      message: >-\n        Privilege escalation is disallowed. The fields\n        spec.containers[*].securityContext.allowPrivilegeEscalation, and\n        spec.initContainers[*].securityContext.allowPrivilegeEscalation must\n        be undefined or set to `false`.        \n      pattern:\n        spec:\n          =(initContainers):\n          - =(securityContext):\n              =(allowPrivilegeEscalation): \"false\"\n          containers:\n          - =(securityContext):\n              =(allowPrivilegeEscalation): \"false\"\n
    "},{"location":"experiments/concepts/security/openshift-scc/","title":"OpenShift Security Context Constraint (SCC)","text":"

    Security context constraints allow administrators to control permissions for pods in a cluster. A service account provides an identity for processes that run in a pod. Applications within a project usually run as the default service account. If you run other applications in the same project and don't want to override the privileges used for all of them, create a new service account that can be granted the special rights in the project where the application is to run. For example, install the litmus-admin service account:

    $ oc apply -f https://litmuschaos.github.io/litmus/litmus-admin-rbac.yaml\n\nserviceaccount/litmus-admin created\nclusterrole.rbac.authorization.k8s.io/litmus-admin created\nclusterrolebinding.rbac.authorization.k8s.io/litmus-admin created\n

    The next step, which must be run as a cluster administrator, is granting the appropriate rights to the service account. This is done by specifying that the service account should run with a specific security context constraint (SCC).

    As an administrator, you can see the list of SCCs that are defined in the cluster by running the oc get scc command.

    $ oc get scc --as system:admin\n\nNAME               PRIV      CAPS      SELINUX     RUNASUSER          FSGROUP     SUPGROUP    PRIORITY   READONLYROOTFS   VOLUMES\nanyuid             false     []        MustRunAs   RunAsAny           RunAsAny    RunAsAny    10         false            [configMap downwardAPI emptyDir persistentVolumeClaim projected secret]\nhostaccess         false     []        MustRunAs   MustRunAsRange     MustRunAs   RunAsAny    <none>     false            [configMap downwardAPI emptyDir hostPath persistentVolumeClaim projected secret]\nhostmount-anyuid   false     []        MustRunAs   RunAsAny           RunAsAny    RunAsAny    <none>     false            [configMap downwardAPI emptyDir hostPath nfs persistentVolumeClaim projected secret]\nhostnetwork        false     []        MustRunAs   MustRunAsRange     MustRunAs   MustRunAs   <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim projected secret]\nnonroot            false     []        MustRunAs   MustRunAsNonRoot   RunAsAny    RunAsAny    <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim projected secret]\nprivileged         true      [*]       RunAsAny    RunAsAny           RunAsAny    RunAsAny    <none>     false            [*]\nrestricted         false     []        MustRunAs   MustRunAsRange     MustRunAs   RunAsAny    <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim projected secret]\n

    By default, applications run under the restricted SCC. We can make use of the default SCC or create our own SCC that allows the litmus experiment service account (here litmus-admin) to run all the experiments. Here is one such SCC that can be used:

    litmus-scc.yaml

    apiVersion: security.openshift.io/v1\nkind: SecurityContextConstraints\n# To mount the socket path directory in helper pod\nallowHostDirVolumePlugin: true\nallowHostIPC: false\nallowHostNetwork: false\n# To run fault injection on a target container using pid namespace.\n# It is used in stress, network, dns and http experiments. \nallowHostPID: true\nallowHostPorts: false\nallowPrivilegeEscalation: true\n# To run some privileged modules in dns, stress and network chaos\nallowPrivilegedContainer: true\n# NET_ADMIN & SYS_ADMIN: used in network chaos experiments to perform\n# network operations (running tc command in network ns of target container). \n# SYS_ADMIN: used in stress chaos experiment to perform cgroup operations.\nallowedCapabilities:\n- 'NET_ADMIN'\n- 'SYS_ADMIN'\ndefaultAddCapabilities: null\nfsGroup:\n  type: MustRunAs\ngroups: []\nmetadata:\n  name: litmus-scc\npriority: null\nreadOnlyRootFilesystem: false\nrequiredDropCapabilities: null\nrunAsUser:\n  type: RunAsAny\nseLinuxContext:\n  type: MustRunAs\nsupplementalGroups:\n  type: RunAsAny\nusers:\n- system:serviceaccount:litmus:argo\nvolumes:\n# To allow configmaps mounts on upload scripts or envs.\n- configMap\n# To derive the experiment pod name in the experiment.\n- downwardAPI\n# used for chaos injection like io chaos.\n- emptyDir\n- hostPath\n- persistentVolumeClaim\n- projected\n# To authenticate with different cloud providers\n- secret\n

    Install the SCC

    $ oc create -f litmus-scc.yaml\nsecuritycontextconstraints.security.openshift.io/litmus-scc created\n

    Now, to associate the new service account with the SCC, run the following command:

    $ oc adm policy add-scc-to-user litmus-scc -z litmus-admin --as system:admin -n litmus\nclusterrole.rbac.authorization.k8s.io/system:openshift:scc:litmus-scc added: \"litmus-admin\"\n

    The -z option indicates that the command applies to the service account in the current project. Pass the name of the SCC to add-scc-to-user, and provide the namespace of the target service account after -n.
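    To confirm that the experiment pods were actually admitted under the new SCC, one option is to inspect the openshift.io/scc annotation that OpenShift sets on running pods (the pod name below is a placeholder):

    oc get pod <experiment-pod> -n litmus -o jsonpath='{.metadata.annotations.openshift\\.io/scc}'\n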

    "},{"location":"experiments/concepts/security/psp/","title":"Using Pod Security Policies with Litmus","text":"

    While working in environments (clusters) that have restrictive security policies, the default litmuschaos experiment execution procedure may be inhibited. This is mainly due to the fact that the experiment pods run the chaos injection tasks in privileged mode. This, in turn, is necessitated by the mounting of container runtime-specific socket files from the Kubernetes nodes in order to invoke runtime APIs. While this is not needed for all experiments (a considerable number of them use purely the K8s API), those involving injection of chaos processes into the network/process namespaces of other containers have this requirement (ex: netem, stress).

    The restrictive policies are often enforced via pod security policies (PSP) today, with organizations opting for the default \"restricted\" policy.

    "},{"location":"experiments/concepts/security/psp/#applying-pod-security-policies-to-litmus-chaos-pods","title":"Applying Pod Security Policies to Litmus Chaos Pods","text":"
    • To run the litmus pods with the operating characteristics described above, first create a custom PodSecurityPolicy that allows the same:

      apiVersion: policy/v1beta1\nkind: PodSecurityPolicy\nmetadata:\n  name: litmus\n  annotations:\n    seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'\nspec:\n  privileged: true\n  # Privilege escalation is allowed, as some experiments require it.\n  allowPrivilegeEscalation: true\n  # Allow core volume types.\n  volumes:\n    # To mount script files/templates like ssm-docs in experiment\n    - 'configMap'\n    # Used for chaos injection like io chaos\n    - 'emptyDir'\n    - 'projected'\n    # To authenticate with different cloud providers\n    - 'secret'\n    # To derive the experiment pod name in the experiment\n    - 'downwardAPI'\n    # Assume that persistentVolumes set up by the cluster admin are safe to use.\n    - 'persistentVolumeClaim'\n    # To mount the socket path directory used to perform container runtime operations\n    - 'hostPath'\n  allowedHostPaths:\n    # substitute this path with an appropriate socket path\n    # ex: '/var/run/docker.sock', '/run/containerd/containerd.sock', '/run/crio/crio.sock'\n    - pathPrefix: \"/run/containerd/containerd.sock\"\n    # substitute this path with an appropriate container path\n    # ex: '/var/lib/docker/containers', '/var/lib/containerd/io.containerd.runtime.v1.linux/k8s.io', '/var/lib/containers/storage/overlay/'\n    - pathPrefix: \"/var/lib/containerd/io.containerd.runtime.v1.linux/k8s.io\"\n  allowedCapabilities:\n    # NET_ADMIN & SYS_ADMIN: used in network chaos experiments to perform\n    # network operations (running tc command in network ns of target container).\n    - \"NET_ADMIN\"\n    # SYS_ADMIN: used in stress chaos experiment to perform cgroup operations.\n    - \"SYS_ADMIN\"\n  hostNetwork: false\n  hostIPC: false\n  # To run fault injection on a target container using pid namespace.\n  # It is used in stress, network, dns and http experiments.\n  hostPID: true\n  seLinux:\n    # This policy assumes the nodes are using AppArmor rather than SELinux.\n    rule: 'RunAsAny'\n  supplementalGroups:\n    rule: 'MustRunAs'\n    ranges:\n      # Forbid adding the root group.\n      - min: 1\n        max: 65535\n  fsGroup:\n    rule: 'MustRunAs'\n    ranges:\n      # Forbid adding the root group.\n      - min: 1\n        max: 65535\n  readOnlyRootFilesystem: false\n

      Note: This PodSecurityPolicy is a sample configuration which works for a majority of use cases. It is left to the user's discretion to modify it based on the environment. For example, if the experiment doesn't need the socket file to be mounted, allowedHostPaths can be excluded from the psp spec. On the other hand, in the case of the CRI-O runtime, network-chaos tests need the chaos pods executed in privileged mode. It is also possible to use different PSP configs in different namespaces based on the ChaosExperiments installed/executed in them.

    • Subscribe to the created PSP in the experiment RBAC (or in the admin-mode RBAC, as applicable). For example, the pod-delete experiment RBAC instrumented with the PSP is shown below:

      ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-delete-sa\n  namespace: default\n  labels:\n    name: pod-delete-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-delete-sa\n  namespace: default\n  labels:\n    name: pod-delete-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n- apiGroups: [\"\"]\n  resources: [\"pods\",\"events\"]\n  verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\",\"deletecollection\"]\n- apiGroups: [\"\"]\n  resources: [\"pods/exec\",\"pods/log\",\"replicationcontrollers\"]\n  verbs: [\"create\",\"list\",\"get\"]\n- apiGroups: [\"batch\"]\n  resources: [\"jobs\"]\n  verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n- apiGroups: [\"apps\"]\n  resources: [\"deployments\",\"statefulsets\",\"daemonsets\",\"replicasets\"]\n  verbs: [\"list\",\"get\"]\n- apiGroups: [\"apps.openshift.io\"]\n  resources: [\"deploymentconfigs\"]\n  verbs: [\"list\",\"get\"]\n- apiGroups: [\"argoproj.io\"]\n  resources: [\"rollouts\"]\n  verbs: [\"list\",\"get\"]\n- apiGroups: [\"litmuschaos.io\"]\n  resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n  verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\"]\n- apiGroups: [\"policy\"]\n  resources: [\"podsecuritypolicies\"]\n  verbs: [\"use\"]\n  resourceNames: [\"litmus\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-delete-sa\n  namespace: default\n  labels:\n    name: pod-delete-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-delete-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-delete-sa\n  namespace: default\n
    • Execute the ChaosEngine and verify that the litmus experiment pods are created successfully.
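
      For example, assuming the default namespace, the newly created chaos pods can be watched with:

      kubectl get pods -n default -w\n

      The chaos-runner pod, the experiment pod and (depending on the experiment) helper pods should appear shortly after the ChaosEngine is created.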

    "},{"location":"experiments/faq/ci-cd/","title":"CI/CD","text":""},{"location":"experiments/faq/ci-cd/#table-of-contents","title":"Table of Contents","text":"
    1. Is there any use case to integrate Litmus into CI? Which experiment have you integrated as part of the CI? And what would you do if a microservice fails an experiment in the CI?

    2. Is there any way to use Litmus within GitHub? When someone submits a k8s deployment for a PR, we want to run a chaos experiment on that to see whether it passes or not

    3. How can users integrate Litmuschaos in their environment with Gitops?

    4. How can we use Litmus in our DevOps pipeline/cycle?

    "},{"location":"experiments/faq/ci-cd/#is-there-any-use-case-to-integrate-litmus-into-ci-which-experiment-have-you-integrated-as-part-of-the-ci-and-what-would-you-do-if-a-microservice-fails-an-experiment-in-the-ci","title":"Is there any use case to integrate Litmus into CI? Which experiment have you integrated as part of the CI? And what would you do if a microservice fails an experiment in the CI?","text":"

    We have integrated Litmus with several CI tools; the major ones are:

    • GitHub Actions using litmuschaos actions
    • GitLab using remote templates
    • Keptn
    • Spinnaker templates

    With these, we induce chaos as part of the CI stage, since continuous chaos allows us to automatically identify application failures during the development phase.

    Failure of an experiment in CI should invariably fail the pipeline. A pass is more subjective: it depends on the nature of the CI pipeline and the kind of tests being carried out. If you are doing a simple pod-delete or cpu-hog on a microservice pod without traffic, or without running it in an environment where it interacts with other services, then the insights are limited.

    "},{"location":"experiments/faq/ci-cd/#is-there-any-way-to-use-litmus-within-github-when-someone-submits-a-k8s-deployment-for-a-pr-we-want-to-run-a-chaos-experiment-on-that-to-see-whether-it-passes-or-not","title":"Is there any way to use Litmus within GitHub? When someone submits a k8s deployment for a PR , We want to run a chaos Experiment on that to see whether it passes or not.","text":"

    Yes, with the help of the GitHub chaos actions we can automate chaos execution on an application in the same place where the code is stored. We can write individual tasks along with chaos actions and combine them to create a custom GitHub workflow. GitHub workflows are custom automated processes that we can set up in our repository to build, test, package, or deploy any code project on GitHub. By including the GitHub chaos actions in our workflow YAML, we can test the performance/resiliency of our application in a much simpler and better way. To know more, visit our GitHub chaos action repository.
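
    As a minimal sketch of such a gate (assuming the workflow runner already has kubectl access to the test cluster and a ChaosEngine manifest chaosengine.yaml is kept in the repository; the litmuschaos GitHub actions can be substituted for the raw kubectl steps):

    name: chaos-on-pr\non: pull_request\njobs:\n  chaos:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v2\n      # trigger the experiment by creating the ChaosEngine\n      - name: Run chaos\n        run: kubectl apply -f chaosengine.yaml\n      # wait for the run, then gate the PR on the experiment verdict\n      - name: Check verdict\n        run: |\n          sleep 120\n          verdict=$(kubectl get chaosresult nginx-chaos-pod-delete -n default -o jsonpath='{.status.experimentStatus.verdict}')\n          test \"$verdict\" = \"Pass\"\n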

    "},{"location":"experiments/faq/ci-cd/#how-can-users-integrate-litmuschaos-in-their-environment-with-gitops","title":"How can users integrate Litmuschaos in their environment with Gitops?","text":"

    The GitOps feature in Litmus enables users to sync workflows from a configured git repo; any workflow inserts/updates made to the repo will be monitored and picked up by the Litmus portal and executed on the target cluster. Litmus portal GitOps also includes an event-driven chaos injection feature, where users can annotate an application to be watched for changes; if and when a change happens, chaos workflows can be triggered automatically. This integrates with other GitOps tools like Flux/Argo CD and enables users to automatically run chaos workflows whenever a new release happens or a particular change occurs in the application. To configure a git repo, the user must provide the Git URL of the repository, the branch name, and the authentication credentials, which are of two types:

    1. Access Token
    2. SSH Key

    Once GitOps is enabled, any new workflows created will be stored in the configured repo in the path litmus/<project-id>/<workflow-name>.yaml

    "},{"location":"experiments/faq/ci-cd/#how-can-we-use-litmus-in-our-devops-pipelinecycle","title":"How can we use Litmus in our DevOps pipeline/cycle?","text":"

    You can add Litmus to CI/CD pipelines as part of an end-to-end testing approach due to its minimal prerequisites and simple result mechanisms. It also provides utilities for quick setup of Kubernetes clusters on different platforms as well as installation of storage provider control plane components (operators). Openebs.ci is a reference implementation of how Litmus can be used in the DevOps pipeline.

    "},{"location":"experiments/faq/content/","title":"Litmus FAQ","text":""},{"location":"experiments/faq/content/#faq","title":"FAQ","text":"Category Description References Install Questions related to litmus installation Install Experiments Questions related to litmus experiments Experiments Portal Questions related to litmus portal Portal Scheduler Questions related to litmus scheduler Scheduler Security Questions related to litmus security Security CI/CD Questions related to litmus CI/CD integration CI/CD"},{"location":"experiments/faq/content/#troubleshooting","title":"Troubleshooting","text":"Category Description References Install Troubleshooting related to litmus installation Install Experiments Troubleshooting related to litmus experiments Experiments Portal Troubleshooting related to litmus portal Portal Scheduler Troubleshooting related to litmus scheduler Scheduler"},{"location":"experiments/faq/experiments/","title":"Litmus Experiments","text":""},{"location":"experiments/faq/experiments/#table-of-contents","title":"Table of Contents","text":"
    1. Node memory hog experiment's pod OOM Killed even before the kubelet sees the memory stress?

    2. Pod-network-corruption and pod-network-loss both experiments force network packet loss - is it worthwhile trying out both experiments in a scheduled chaos test?

    3. How is the packet loss achieved in pod-network loss and corruption experiments? What are the internals of it?

    4. What's the difference between pod-memory/cpu-hog vs pod-memory/cpu-hog-exec?

    5. What are the typical probes used for pod-network related experiments?

    6. Litmus provides multiple libs to run some chaos experiments like stress-chaos and network chaos so which library should be preferred to use?

    7. How to run chaos experiment programmatically using APIs?

    8. Kubernetes by default has built-in features like replicaset/deployment to prevent service unavailability (continuous curl from the httpProbe on litmus should not fail) in case of container kill, pod delete and OOM due to pod-memory-hog then why do we need CPU, IO and network related chaos experiments?

    9. The experiment is not targeting all pods with the given label, it just selects only one pod by default

    10. Do we have a way to see what pods are targeted when users use percentages?

    11. What is the function of spec.definition.scope of a ChaosExperiment CR?

    12. Pod network latency -- I have pod A talking to Pod B over Service B. and I want to introduce latency between Pod A and Service B. What would go into spec.appInfo section? Pod A namespace, label selector and kind? What will go into DESTINATION_IP and DESTINATION_HOST? Service B details? What are the TARGET_PODS?

    13. How to check the NETWORK_INTERFACE and SOCKET_PATH variable?

    14. What are the different ways to target the pods and nodes for chaos?

    15. Does the pod affected perc select the random set of pods from the total pods under chaos?

    16. How to extract the chaos start time and end time?

    17. How do we check the MTTR (Mean time to recovery) for an application post chaos?

    18. What is the difference between Ramp Time and Chaos Interval?

    19. Can the appkind be a pod?

    20. What type of chaos experiments are supported by Litmus?

    21. What are the permissions required to run Litmus Chaos Experiments?

    22. What is the scope of a Litmus Chaos Experiment?

    23. To get started with running chaos experiments using Litmus?

    24. How to view and interpret the results of a chaos experiment?

    25. Do chaos experiments run as a standard set of pods?

    26. Is it mandatory to annotate application deployments for chaos?

    27. How to add Custom Annotations as chaos filters?

    28. Is it mandatory for the chaosengine and chaos experiment resources to exist in the same namespace?

    29. How to get the chaos logs in Litmus?

    30. Does Litmus support generation of events during chaos?

    31. How to stop or abort a chaos experiment?

    32. Can a chaos experiment be resumed once stopped or aborted?

    33. How to restart chaosengine after graceful completion?

    34. Does Litmus support any chaos metrics for experiments?

    35. Does Litmus track any usage metrics on the test clusters?

    36. What to choose between minChaosInterval and instanceCount?

    "},{"location":"experiments/faq/experiments/#node-memory-hog-experiments-pod-oom-killed-even-before-the-kubelet-sees-the-memory-stress","title":"Node memory hog experiment's pod OOM Killed even before the kubelet sees the memory stress?","text":"

    The experiment takes a percentage of the total memory capacity of the node. The helper pod runs on the target node to stress that node's resources. So the experiment will not consume/hog memory resources greater than the total memory available on the node. In other words, there is always an upper limit on the amount of memory to be consumed, which equals the total available memory. Please refer to this blog for more details.

    "},{"location":"experiments/faq/experiments/#pod-network-corruption-and-pod-network-loss-both-experiments-force-network-packet-loss-is-it-worthwhile-trying-out-both-experiments-in-a-scheduled-chaos-test","title":"Pod-network-corruption and pod-network-loss both experiments force network packet loss - is it worthwhile trying out both experiments in a scheduled chaos test?","text":"

    Yes; ultimately these are different ways to simulate a degraded network. Both cases are typically expected to cause retransmissions (for TCP). The extent of degradation depends on the percentage of loss/corruption.

    "},{"location":"experiments/faq/experiments/#how-is-the-packet-loss-achieved-in-pod-network-loss-and-corruption-experiments-what-are-the-internals-of-it","title":"How is the packet loss achieved in pod-network loss and corruption experiments? What are the internals of it?","text":"

    The experiment causes network degradation without the pod being marked unhealthy/unworthy of traffic by kube-proxy (unless you have a liveness probe of sorts that measures latency and restarts/crashes the container). The idea of this experiment is to simulate issues within your pod network, or microservice communication across services in different availability zones/regions. Mitigation (in this case, keeping the timeout, i.e. access latency, low) could be via some middleware that can switch traffic based on some SLOs/perf parameters. If such an arrangement is not available, the next best thing would be to verify whether such a degradation is highlighted via notifications/alerts, so the admin/SRE has the opportunity to investigate and fix things. Another utility of the test would be to see the extent of impact caused to the end user, or the last point in the app stack, on account of degradation in access to a downstream/dependent microservice, and whether it is acceptable or breaks the system to an unacceptable degree.

    The args passed to the tc netem command run against the target container change depending on the type of network fault.
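
    For illustration, these are representative tc netem invocations (run inside the network namespace of the target container; the percentages/durations are example values):

    # packet loss\ntc qdisc replace dev eth0 root netem loss 10%\n# packet corruption\ntc qdisc replace dev eth0 root netem corrupt 10%\n# latency\ntc qdisc replace dev eth0 root netem delay 2000ms\n# duplication\ntc qdisc replace dev eth0 root netem duplicate 10%\n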

    "},{"location":"experiments/faq/experiments/#whats-the-difference-between-pod-memorycpu-hog-vs-pod-memorycpu-hog-exec","title":"What's the difference between pod-memory/cpu-hog vs pod-memory/cpu-hog-exec?","text":"

    The pod cpu and memory chaos experiments, up to version 1.13.7, used an exec mode of execution, which means we were exec'ing into the specified target container and launching processes like md5sum and dd to consume cpu and memory respectively. This is done by providing CHAOS_INJECT_COMMAND and CHAOS_KILL_COMMAND in the chaosengine CR. But this method has some limitations:

    • The chaos inject and kill commands are highly dependent on the base image of the target container; they may work for some images, while for others you may have to derive them manually.
    • For scratch images that don't expose a shell, the chaos could not be executed.

    To overcome this, the stress-chaos experiments (cpu, memory and io) have been enhanced to use a non-exec mode of chaos execution. This mode makes use of the target container's cgroup for resource allocation and the container's pid namespace, so that the stress-ng process shows up in the target container. The stress-ng process consumes resources in the target container without doing an exec. The enhanced experiments are available from litmus version 1.13.8.
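
    A minimal sketch of the exec-mode tunables in the ChaosEngine (the md5sum-based inject command and the pkill-based kill command are illustrative only; both depend on the tools present in the target container's base image):

    experiments:\n  - name: pod-cpu-hog-exec\n    spec:\n      components:\n        env:\n          # command launched inside the target container to consume CPU\n          - name: CHAOS_INJECT_COMMAND\n            value: \"md5sum /dev/zero\"\n          # command used to stop the injected process on completion/abort\n          # (illustrative; requires pkill in the base image)\n          - name: CHAOS_KILL_COMMAND\n            value: \"pkill md5sum\"\n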

    "},{"location":"experiments/faq/experiments/#what-are-the-typical-probes-used-for-pod-network-related-experiments","title":"What are the typical probes used for pod-network related experiments?","text":"

    This is precisely the role of the experiment: to cause network degradation without the pod being marked unhealthy/unworthy of traffic by kube-proxy (unless you have a liveness probe of sorts that measures latency and restarts/crashes the container). The idea of this experiment is to simulate issues within your pod network, or microservice communication across services in different availability zones/regions. A typical choice here is a continuous httpProbe (or a cmdProbe wrapping curl) against the service endpoint, to measure availability and latency over the chaos window.

    Mitigation (in this case, keeping the timeout, i.e. access latency, low) could be via some middleware that can switch traffic based on some SLOs/perf parameters. If such an arrangement is not available, the next best thing would be to verify whether such a degradation is highlighted via notifications/alerts, so the admin/SRE has the opportunity to investigate and fix things.

    Another utility of the test would be to see the extent of impact caused to the end user, or the last point in the app stack, on account of degradation in access to a downstream/dependent microservice, and whether it is acceptable or breaks the system to an unacceptable degree.

    "},{"location":"experiments/faq/experiments/#litmus-provides-multiple-libs-to-run-some-chaos-experiments-like-stress-chaos-and-network-chaos-so-which-library-should-be-preferred-to-use","title":"Litmus provides multiple libs to run some chaos experiments like stress-chaos and network chaos so which library should be preferred to use?","text":"

    The optional libs (like Pumba) are more of an illustration of how you can use 3rd-party tools with litmus, called BYOC (Bring Your Own Chaos). The preferred LIB is litmus.

    "},{"location":"experiments/faq/experiments/#how-to-run-chaos-experiment-programatically-using-apis","title":"How to run chaos experiment programatically using apis?","text":"

    To directly consume/manipulate the chaos resources (i.e., chaosexperiment, chaosengine or chaosresults) via API, you can use the Kubernetes API directly. The CRDs by default provide us with an API endpoint. You can use any generic client implementation (go/python are the most used ones) to access them. In case you use Go, there is a clientset available as well: go-client

    Here are some simple CRUD ops against chaos resources that you could construct with curl, just for illustration purposes (kubectl proxy is used here; one could use an auth token instead).
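
    To follow along, the proxy can be started with:

    kubectl proxy --port=8001 &\n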

    "},{"location":"experiments/faq/experiments/#create-chaosengine","title":"Create ChaosEngine:","text":"

    For example, assume this is the engine spec

    curl -s http://localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines -XPOST -H 'Content-Type: application/json' -d@pod-delete-chaosengine-trigger.json\n
    "},{"location":"experiments/faq/experiments/#read-chaosengine-status","title":"Read ChaosEngine status:","text":"
    curl -s http://localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines/nginx-chaos | jq '.status.engineStatus, .status.experiments[].verdict'\n
    "},{"location":"experiments/faq/experiments/#update-chaosengine-spec","title":"Update ChaosEngine Spec:","text":"

    (say, this is the patch: https://gist.github.com/ksatchit/be54955a1f4231314797f25361ac488d)

    curl --header \"Content-Type: application/json-patch+json\" --request PATCH --data '[{\"op\": \"replace\", \"path\": \"/spec/engineState\", \"value\": \"stop\"}]' http://localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines/nginx-chaos\n
    "},{"location":"experiments/faq/experiments/#delete-the-chaosengine-resource","title":"Delete the ChaosEngine resource:","text":"
    curl -X DELETE localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines/nginx-chaos \\\n-d '{\"kind\":\"DeleteOptions\",\"apiVersion\":\"v1\",\"propagationPolicy\":\"Foreground\"}' \\\n-H \"Content-Type: application/json\"\n
    "},{"location":"experiments/faq/experiments/#similarly-to-check-the-resultsverdict-of-the-experiment-from-chaosresult-you-could-use","title":"Similarly, to check the results/verdict of the experiment from ChaosResult, you could use:","text":"
    curl -s http://localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosresults/nginx-chaos-pod-delete | jq '.status.experimentStatus.verdict, .status.experimentStatus.probeSuccessPercentage'\n
    "},{"location":"experiments/faq/experiments/#kubernetes-by-default-has-built-in-features-like-replicasetdeployment-to-prevent-service-unavailability-continuous-curl-from-the-httpprobe-on-litmus-should-not-fail-in-case-of-container-kill-pod-delete-and-oom-due-to-pod-memory-hog-then-why-do-we-need-cpu-io-and-network-related-chaos-experiments","title":"Kubernetes by default has built-in features like replicaset/deployment to prevent service unavailability (continuous curl from the httpProbe on litmus should not fail) in case of container kill, pod delete and OOM due to pod-memory-hog then why do we need CPU, IO and network related chaos experiments?","text":"

    There are some scenarios that can still occur despite whatever availability aids K8s provides. For example, take disk usage or CPU hogs, problems you would generally refer to as \"Noisy Neighbour\" problems. Stressing the disk with continuous and heavy I/O, for example, can cause degradation in reads and writes performed by other microservices that use this shared disk (modern storage solutions for Kubernetes use the concept of storage pools out of which virtual volumes/devices are carved out). Another issue is the amount of scratch space eaten up on a node, leading to a lack of space for newer containers to get scheduled (Kubernetes too gives up by applying an \"eviction\" taint like \"disk-pressure\"), causing a wholesale movement of all pods to other nodes. Similarly with CPU chaos: by injecting a rogue process into a target container, we starve the main microservice process (typically pid 1) of the resources allocated to it (where limits are defined), causing slowness in app traffic; in other cases, unrestrained use can cause the node to exhaust resources, leading to eviction of all pods.

    "},{"location":"experiments/faq/experiments/#the-experiment-is-not-targeting-all-pods-with-the-given-label-it-just-selects-only-one-pod-by-default","title":"The experiment is not targeting all pods with the given label, it just selects only one pod by default.","text":"

    Yes. You can use either the PODS_AFFECTED_PERC or TARGET_PODS env to select multiple pods. Refer: experiment tunable envs.
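
    A minimal sketch of both options in the ChaosEngine experiment spec (values are illustrative):

    experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n          # target 50% of the pods matching the appinfo labels\n          - name: PODS_AFFECTED_PERC\n            value: '50'\n          # alternatively, name the target pods explicitly\n          # - name: TARGET_PODS\n          #   value: 'pod1,pod2'\n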

    "},{"location":"experiments/faq/experiments/#do-we-have-a-way-to-see-what-pods-are-targeted-when-users-use-percentages","title":"Do we have a way to see what pods are targeted when users use percentages?","text":"

    We can view the target pods from the experiment logs or inside chaos results.

    "},{"location":"experiments/faq/experiments/#what-is-the-function-of-specdefinitionscope-of-a-chaosexperiment-cr","title":"What is the function of spec.definition.scope of a ChaosExperiment CR?","text":"

    The spec.definition.scope and spec.definition.permissions are mostly for indicative/illustration purposes (for external tools to identify and validate the permissions needed to run the experiment). By themselves, they don't influence how and where an experiment can be used. One could remove these fields if needed (of course, along with the CRD validation) and store these manifests if desired.

    "},{"location":"experiments/faq/experiments/#in-pod-network-latency-i-have-pod-a-talking-to-pod-b-over-service-b-and-i-want-to-introduce-latency-between-pod-a-and-service-b-what-would-go-into-specappinfo-section-pod-a-namespace-label-selector-and-kind-what-will-go-into-destination_ip-and-destination_host-service-b-details-what-are-the-target_pods","title":"In Pod network latency - I have pod A talking to Pod B over Service B. and I want to introduce latency between Pod A and Service B. What would go into spec.appInfo section? Pod A namespace, label selector and kind? What will go into DESTINATION_IP and DESTINATION_HOST? Service B details? What are the TARGET_PODS?","text":"

    It will target [1:total_replicas] (based on PODS_AFFECTED_PERC) random pods with matching labels (appinfo.applabel) and namespace (appinfo.appns). If you want to target specific pods, you can provide their names as a comma-separated list in TARGET_PODS. Yes, you can provide Service B's details inside DESTINATION_IPS or DESTINATION_HOSTS. The NETWORK_INTERFACE should be eth0.
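
    Putting that together, a sketch of the relevant ChaosEngine fields (the labels, IPs and hostnames are hypothetical placeholders):

    appinfo:\n  # Pod A's namespace, label and kind\n  appns: 'default'\n  applabel: 'app=pod-a'\n  appkind: 'deployment'\nexperiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n          - name: NETWORK_INTERFACE\n            value: 'eth0'\n          # latency to inject, in ms\n          - name: NETWORK_LATENCY\n            value: '2000'\n          # restrict the latency to traffic towards Service B\n          - name: DESTINATION_IPS\n            value: '<service-b-cluster-ip>'\n          # or use resolvable names instead of IPs\n          # - name: DESTINATION_HOSTS\n          #   value: 'service-b.default.svc.cluster.local'\n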

    "},{"location":"experiments/faq/experiments/#how-to-check-the-network_interface-and-socket_path-variable","title":"How to check the NETWORK_INTERFACE and SOCKET_PATH variable?","text":"

    The NETWORK_INTERFACE is the interface name inside the pod/container that needs to be targeted. You can find it by exec'ing into the target pod and checking the available interfaces. You can try ip link, iwconfig, or ifconfig; depending on the tools installed in the pod, any of these could work.

    The SOCKET_PATH defaults to the containerd socket path. If you are using something else like docker or crio, or have a different socket path, you can specify it. This is required to communicate with the container runtime of your cluster. In addition, if the container runtime is different, provide its name in the CONTAINER_RUNTIME env. It supports the docker, containerd, and crio runtimes.
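
    For example, the interfaces can be listed with (pod name and namespace are placeholders):

    kubectl exec -it <target-pod> -n <namespace> -- ip link\n

    and, assuming a containerd-based cluster, the runtime tunables would look like:

    - name: CONTAINER_RUNTIME\n  value: 'containerd'\n- name: SOCKET_PATH\n  value: '/run/containerd/containerd.sock'\n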

    "},{"location":"experiments/faq/experiments/#what-are-the-different-ways-to-target-the-pods-and-nodes-for-chaos","title":"What are the different ways to target the pods and nodes for chaos?","text":"

    The different ways are:

    Pod Chaos:

    • Appinfo: Provide the target pod labels in the chaos engine appinfo section.
    • TARGET_PODS: You can provide the target pod names as a comma-separated list, like pod1,pod2.

    Node Chaos:

    • TARGET_NODE or TARGET_NODES: Provide the target node or nodes in these envs.
    • NODE_LABEL: Provide the label of the target nodes.
    "},{"location":"experiments/faq/experiments/#does-the-pod-affected-percentage-select-the-random-set-of-pods-from-the-total-pods-under-chaos","title":"Does the pod affected percentage select the random set of pods from the total pods under chaos?","text":"

    Yes, it selects random pods based on the PODS_AFFECTED_PERC env. In the pod-delete experiment, it selects random pods for each iteration of chaos. For the rest of the experiments (if iterations are supported), it selects random pods once and uses the same set of pods for the remaining iterations.

    "},{"location":"experiments/faq/experiments/#how-to-extract-the-chaos-start-time-and-end-time","title":"How to extract the chaos start time and end time?","text":"

    We can use the chaos exporter metrics for this. One can also visualise these events, along with their timestamps, in the ChaosEngine events.

    "},{"location":"experiments/faq/experiments/#how-do-we-check-the-mttr-mean-time-to-recovery-for-an-application-post-chaos","title":"How do we check the MTTR (Mean time to recovery) for an application post chaos?","text":"

    The MTTR can be validated by using the status check timeout in the chaos engine. By default, its value is 180 seconds. We can overwrite this via the ChaosEngine. For more details, refer to this
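
    As a sketch, assuming the statusCheckTimeouts tunable exposed under the ChaosEngine's spec.components in recent Litmus releases (verify the exact field against your installed CRD):

    spec:\n  components:\n    statusCheckTimeouts:\n      delay: 2\n      # application status checks must pass within this window (seconds)\n      timeout: 180\n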

    "},{"location":"experiments/faq/experiments/#what-is-the-difference-between-ramp-time-and-chaos-interval","title":"What is the difference between Ramp Time and Chaos Interval?","text":"

    The ramp time is the duration to wait before and after injection of chaos, in seconds, while the chaos interval is the time interval (in seconds) between successive chaos iterations.
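
    For example (values illustrative), a 60s chaos window with injections every 10s and a 10s ramp on either side:

    env:\n  # wait 10s before and after chaos injection\n  - name: RAMP_TIME\n    value: '10'\n  - name: TOTAL_CHAOS_DURATION\n    value: '60'\n  # gap between successive chaos iterations\n  - name: CHAOS_INTERVAL\n    value: '10'\n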

    "},{"location":"experiments/faq/experiments/#can-the-appkind-be-a-pod","title":"Can the appkind be a pod?","text":"

    The appkind as pod is not supported explicitly. The supported appkinds are deployment, statefulset, replicaset, daemonset, rollout, and deploymentconfig. But we can target the pods in the following ways:

    • provide labels and namespace at spec.appinfo.applabel and spec.appinfo.appns respectively and provide spec.appinfo.appkind as empty.
    • provide pod names at TARGET_PODS ENV and provide spec.appinfo as nil

    NOTE: The annotationCheck should be provided as false. A minimal sketch of the first option is shown below.
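
    spec:\n  annotationCheck: 'false'\n  appinfo:\n    appns: 'default'\n    applabel: 'app=nginx'\n    # left empty so that pods are selected directly by label\n    appkind: ''\n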

    "},{"location":"experiments/faq/experiments/#what-type-of-chaos-experiments-are-supported-by-litmus","title":"What type of chaos experiments are supported by Litmus?","text":"

    Litmus broadly defines Kubernetes chaos experiments in two categories: application or pod-level chaos experiments, and platform or infra-level chaos experiments. The former includes pod-delete, container-kill, pod-cpu-hog, pod-network-loss etc., while the latter includes node-drain, disk-loss, node-cpu-hog etc. The infra chaos experiments typically have a higher blast radius, impacting more than one application deployed on the Kubernetes cluster. Litmus also categorizes experiments on the basis of the applications, with the experiments consisting of app-specific health checks. For a full list of supported chaos experiments, visit: https://hub.litmuschaos.io

    "},{"location":"experiments/faq/experiments/#what-are-the-permissions-required-to-run-litmus-chaos-experiments","title":"What are the permissions required to run Litmus Chaos Experiments?","text":"

    By default, the Litmus operator uses the \u201clitmus\u201d serviceaccount that is bound to a ClusterRole, in order to watch for the ChaosEngine resource across namespaces. However, the experiments themselves are associated with \u201cchaosServiceAccounts\u201d which are created by the developers with bare-minimum permissions necessary to execute the experiment in question. Visit the chaos-charts repo to view the experiment-specific rbac permissions. For example, here are the permissions for container-kill chaos.

    "},{"location":"experiments/faq/experiments/#what-is-the-scope-of-a-litmus-chaos-experiment","title":"What is the scope of a Litmus Chaos Experiment?","text":"

    The chaos CRs (chaosexperiment, chaosengine, chaosresults) themselves are namespace scoped and are installed in the same namespace as that of the target application. While most of the experiments can be executed with service accounts mapped to namespaced roles, some infra chaos experiments typically perform health checks of applications across namespaces & therefore need their serviceaccounts mapped to ClusterRoles.

    "},{"location":"experiments/faq/experiments/#to-get-started-with-running-chaos-experiments-using-litmus","title":"To get started with running chaos experiments using Litmus?","text":"

    Litmus has a low entry barrier and is easy to install/use. Typically, it involves installing the chaos-operator, chaos experiment CRs from the charthub, annotating an application for chaos and creating a chaosengine CR to map your application instance with a desired chaos experiment. Refer to the getting started documentation to learn more on how to run a simple chaos experiment.

    "},{"location":"experiments/faq/experiments/#how-to-view-and-interpret-the-results-of-a-chaos-experiment","title":"How to view and interpret the results of a chaos experiment?","text":"

    The results of a chaos experiment can be obtained from the verdict property of the chaosresult custom resource. If the verdict is Pass, it means that the application under test is resilient to the chaos injected. Alternatively, Fail reflects that the application is not resilient enough to the injected chaos, and indicates the need for a relook into the deployment sanity or possible application bugs/issues.

    kubectl describe chaosresult <chaosengine-name>-<chaos-experiment> -n <namespace>\n
    The status of the experiment can also be gauged by the \u201cstatus\u201d property of the ChaosEngine.

    kubectl describe chaosengine <chaosengine-name> -n <namespace>\n
    "},{"location":"experiments/faq/experiments/#do-chaos-experiments-run-as-a-standard-set-of-pods","title":"Do chaos experiments run as a standard set of pods?","text":"

    The chaos experiment (triggered after creation of the ChaosEngine resource) workflow consists of launching the \u201cchaos-runner\u201d pod, which is an umbrella executor of the different chaos experiments listed in the engine. The chaos-runner creates one pod (job) per experiment to run the actual experiment business logic, and also manages the lifecycle of these experiment pods (it performs functions such as experiment dependency validation, job cleanup, patching of status back into the ChaosEngine etc.). Optionally, a monitor pod is created to export the chaos metrics. Together, these 3 pods are a standard set created upon execution of the experiment. The experiment job, in turn, may spawn dependent (helper) resources if necessary to run the experiments, but this depends on the experiment selected, the chaos libraries chosen etc.

    "},{"location":"experiments/faq/experiments/#is-it-mandatory-to-annotate-application-deployments-for-chaos","title":"Is it mandatory to annotate application deployments for chaos?","text":"

    Typically, applications are expected to be annotated with litmuschaos.io/chaos=\"true\" to lend themselves to chaos. This is in order to support selection of the right applications among those with similar labels in a namespace, thereby isolating the application under test (AUT) and reducing the blast radius. It is also helpful for supporting automated execution (say, via cron) as a background service. However, in cases where the app deployment specifications are sacrosanct and not expected to be modified, or where annotating a single application for chaos doesn\u2019t make sense because the experiment is known to have a higher blast radius (ex: infra chaos), the annotation check can be disabled via the ChaosEngine tunable annotationCheck (.spec.annotationCheck: false).

    "},{"location":"experiments/faq/experiments/#how-to-add-custom-annotations-as-chaos-filters","title":"How to add Custom Annotations as chaos filters?","text":"

    Currently Litmus allows you to set your own/custom keys for Annotation filters, the value being true/false. To use your custom annotation, add this key under an ENV named as CUSTOM_ANNOTATION in ChaosOperator deployment. A sample chaos-operator deployment spec is provided here for reference:

    ---\napiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: chaos-operator-ce\n  namespace: litmus\nspec:\n  replicas: 1\n  selector:\n    matchLabels:\n      name: chaos-operator\n  template:\n    metadata:\n      labels:\n        name: chaos-operator\n    spec:\n      serviceAccountName: litmus\n      containers:\n        - name: chaos-operator\n          # 'latest' tag corresponds to the latest released image\n          image: litmuschaos/chaos-operator:latest\n          command:\n            - chaos-operator\n          imagePullPolicy: Always\n          env:\n            - name: CUSTOM_ANNOTATION\n              value: \"mayadata.io/chaos\"\n            - name: CHAOS_RUNNER_IMAGE\n              value: \"litmuschaos/chaos-runner:latest\"\n            - name: WATCH_NAMESPACE\n              value: \"\"\n            - name: POD_NAME\n              valueFrom:\n                fieldRef:\n                  fieldPath: metadata.name\n            - name: OPERATOR_NAME\n              value: \"chaos-operator\"\n
    "},{"location":"experiments/faq/experiments/#is-it-mandatory-for-the-chaosengine-and-chaos-experiment-resources-to-exist-in-the-same-namespace","title":"Is it mandatory for the chaosengine and chaos experiment resources to exist in the same namespace?","text":"

    Yes. As of today, the chaos resources are expected to co-exist in the same namespace, which typically is also the application's (AUT) namespace.

    "},{"location":"experiments/faq/experiments/#how-to-get-the-chaos-logs-in-litmus","title":"How to get the chaos logs in Litmus?","text":"

    The chaos logs can be viewed in the following manner. To view the successful launch/removal of chaos resources upon engine creation, for identification of application under test (AUT) etc., view the chaos-operator logs:

    kubectl logs -f <chaos-operator-(hash)-(hash)> -n <chaos_namespace>\n
    To view lifecycle management logs of a given (or set of) chaos experiments, view the chaos-runner logs:
    kubectl logs -f <chaosengine_name>-runner -n <chaos_namespace>\n
    To view the chaos logs itself (details of experiment chaos injection, application health checks et al), view the experiment pod logs:
    kubectl logs -f <experiment_name_(hash)_(hash)> -n <chaos_namespace>\n

    "},{"location":"experiments/faq/experiments/#does-litmus-support-generation-of-events-during-chaos","title":"Does Litmus support generation of events during chaos?","text":"

    The chaos-operator generates Kubernetes events to signify the creation and removal of chaos resources over the course of a chaos experiment, which can be obtained by running the following command:

    kubectl describe chaosengine <chaosengine-name> -n <namespace>\n
    Note: Efforts are underway to add more events around chaos injection in subsequent releases.

    "},{"location":"experiments/faq/experiments/#how-to-stop-or-abort-a-chaos-experiment","title":"How to stop or abort a chaos experiment?","text":"

    A chaos experiment can be stopped/aborted in flight by patching the .spec.engineState property of the chaosengine to stop. This will delete all the chaos resources associated with the engine/experiment at once.

    kubectl patch chaosengine <chaosengine-name> -n <namespace> --type merge --patch '{\"spec\":{\"engineState\":\"stop\"}}'\n
    The same effect will be caused by deleting the respective chaosengine resource.

    "},{"location":"experiments/faq/experiments/#can-a-chaos-experiment-be-resumed-once-stopped-or-aborted","title":"Can a chaos experiment be resumed once stopped or aborted?","text":"

    Once stopped/aborted, patching the chaosengine .spec.engineState to active causes the experiment to be re-executed. Another way is to re-apply the ChaosEngine YAML; this will delete all stale chaos resources and restart the ChaosEngine lifecycle. However, support is yet to be added for saving state and resuming an in-flight experiment (i.e., executing pending iterations etc.).

    kubectl patch chaosengine <chaosengine-name> -n <namespace> --type merge --patch '{\"spec\":{\"engineState\":\"active\"}}'\n

    "},{"location":"experiments/faq/experiments/#how-to-restart-chaosengine-after-graceful-completion","title":"How to restart chaosengine after graceful completion?","text":"

    To restart a chaosengine, check the .spec.engineState: it should be equal to stop, which means your chaosengine has gracefully completed or been forcefully aborted. In this case, restarting is quite easy: re-apply the chaosengine YAML. This will remove all stale chaos resources linked to this chaosengine and restart its own lifecycle.

    "},{"location":"experiments/faq/experiments/#does-litmus-support-any-chaos-metrics-for-experiments","title":"Does Litmus support any chaos metrics for experiments?","text":"

    Litmus provides a basic set of prometheus metrics indicating the total count of chaos experiments, passed/failed experiments and individual status of experiments specified in the ChaosEngine, which can be queried against the monitor pod. Work to enhance and improve this is underway.

    "},{"location":"experiments/faq/experiments/#does-litmus-track-any-usage-metrics-on-the-test-clusters","title":"Does Litmus track any usage metrics on the test clusters?","text":"

    By default, the installation count of chaos-operator & run count of a given chaos experiment is collected as part of general analytics to gauge user adoption & chaos trends. However, if you wish to inhibit this, please use the following ENV setting on the chaos-operator deployment:

    env:\n  - name: ANALYTICS\n    value: 'FALSE'\n

    "},{"location":"experiments/faq/experiments/#what-to-choose-between-minchaosinterval-and-instancecount","title":"What to choose between minChaosInterval and instanceCount?","text":"

    Ideally, only one of minChaosInterval and instanceCount should be chosen. However, if both are specified, minChaosInterval is given priority. minChaosInterval specifies the minimum interval that should be present between the launches of two chaosengines, while instanceCount specifies the exact number of chaosengines to be launched within the range (start and end time). So we can choose depending on our requirements.

    "},{"location":"experiments/faq/install/","title":"Install","text":""},{"location":"experiments/faq/install/#table-of-contents","title":"Table of Contents","text":"
    1. I encountered the concept of namespace and cluster scope during the installation. What is meant by the scopes, and how does it affect experiments to be performed outside or inside the litmus Namespace?

    2. Does Litmus 2.0 maintain backward compatibility with Kubernetes?

    3. Can I run LitmusChaos Outside of my Kubernetes clusters?

    4. What is the minimum system requirement to run Portal and agent together?

    5. Can I use LitmusChaos in Production?

    6. Why should I use Litmus? What is its distinctive feature?

    7. What licensing model does Litmus use?

    8. What are the prerequisites to get started with Litmus?

    9. How to Install Litmus on the Kubernetes Cluster?

    "},{"location":"experiments/faq/install/#i-encountered-the-concept-of-namespace-and-cluster-scope-during-the-installation-what-is-meant-by-the-scopes-and-how-does-it-affect-experiments-to-be-performed-outside-or-inside-the-litmus-namespace","title":"I encountered the concept of namespace and cluster scope during the installation. What is meant by the scopes, and how does it affect experiments to be performed outside or inside the litmus Namespace?","text":"

    The scope of the control plane (portal) installation can be tuned via the env 'PORTAL_SCOPE' in the 'litmusportal-server' deployment. Its value can be set to \u201cnamespace\u201d if you want to provide restricted access to litmus. This is useful in strictly multi-tenant environments in which users have namespace-level permissions and need to set up their own chaos-center instances. This is also the case in certain popular SaaS environments like Okteto cloud.

    This setting can be used in combination with a flag, 'AGENT_SCOPE' in the 'litmus-portal-admin-config' ConfigMap to limit the purview of the corresponding self-agent (the execution plane pods on the cluster/namespace where the control plane is installed) to the current namespace, which means the user can perform chaos experiments only in chosen installation namespace. By default, both are set up for cluster-wide access, by which microservices across the cluster can be subjected to chaos.
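
    For example (assuming the control plane is installed in the litmus namespace), the portal scope can be switched to namespace mode with:

    kubectl set env deployment/litmusportal-server PORTAL_SCOPE=namespace -n litmus\n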

    In case of external-agents, i.e., the targets being connected to the chaos-center, you can choose the agent\u2019s scope to either cluster or namespace via a 'litmusctl' flag (when using it in non-interactive mode) or by providing the appropriate input (in interactive mode).

    "},{"location":"experiments/faq/install/#does-litmus-20-maintain-backward-compatibility-with-kubernetes","title":"Does Litmus 2.0 maintain backward compatibility with Kubernetes?","text":"

    Yes, Litmus maintains a separate CRD manifest to support backward compatibility.

    "},{"location":"experiments/faq/install/#can-i-run-litmuschaos-outside-of-my-kubernetes-clusters","title":"Can I run LitmusChaos Outside of my Kubernetes clusters?","text":"

    You can run the chaos experiments outside of the k8s cluster as dockerized containers. However, other components such as the chaos-operator, chaos-exporter, and runner are Kubernetes-native; they require a k8s cluster to run on.

    "},{"location":"experiments/faq/install/#what-is-the-minimum-system-requirement-to-run-portal-and-agent-together","title":"What is the minimum system requirement to run Portal and agent together?","text":"

    To run LitmusPortal you need to have a minimum of 1 GiB memory and 1 core of CPU free.

    "},{"location":"experiments/faq/install/#can-i-use-litmuschaos-in-production","title":"Can I use LitmusChaos in Production?","text":"

    Yes, you can use Litmuschaos in production. Litmus has a wide variety of experiments and is designed according to the principles of chaos engineering. However, if you are new to chaos engineering, we recommend first trying Litmus in your dev environment and then, after gaining confidence, using it in production.

    "},{"location":"experiments/faq/install/#why-should-i-use-litmus-what-is-its-distinctive-feature","title":"Why should I use Litmus? What is its distinctive feature?","text":"

    Litmus is a toolset for performing cloud-native Chaos Engineering. Litmus provides tools to orchestrate chaos on Kubernetes to help developers and SREs find weaknesses in their application deployments. Litmus can be used to run chaos experiments initially in the staging environment and eventually in production to find bugs and vulnerabilities. Fixing the weaknesses leads to increased resilience of the system. Litmus adopts a \u201cKubernetes-native\u201d approach to define chaos intent in a declarative manner via custom resources.

    "},{"location":"experiments/faq/install/#what-licensing-model-does-litmus-use","title":"What licensing model does Litmus use?","text":"

    Litmus is developed under Apache License 2.0 license at the project level. Some components of the projects are derived from the other Open Source projects and are distributed under their respective licenses.

    "},{"location":"experiments/faq/install/#what-are-the-prerequisites-to-get-started-with-litmus","title":"What are the prerequisites to get started with Litmus?","text":"

    To get started with Litmus, the only prerequisite is to have a Kubernetes 1.11+ cluster. While most pod/container-level experiments are supported on any Kubernetes platform, some of the infrastructure chaos experiments are supported on specific platforms. To find the list of supported platforms for an experiment, view the \"Platforms\" section on the sidebar of the experiment page.

    "},{"location":"experiments/faq/install/#how-to-install-litmus-on-the-kubernetes-cluster","title":"How to Install Litmus on the Kubernetes Cluster?","text":"

    You can install/deploy stable litmus using this command:

    kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-latest.yaml\n
    "},{"location":"experiments/faq/portal/","title":"Litmus Portal","text":""},{"location":"experiments/faq/portal/#table-of-contents","title":"Table of Contents","text":"
    1. Can we host MongoDB outside the cluster? What connection string is supported? Is SSL connection supported?

    2. What does failed status of workflow means in LitmusPortal?

    3. How can I setup a chaoshub of my gitlab repo in Litmus Portal?

    4. How to achieve High Availability of MongoDB and how can we add persistence to MongoDB?

    5. Can I create workflows without using a dashboard?

    6. Does Litmusctl support actions that are currently performed from the portal dashboard?

    7. How is the resilience score calculated?

    "},{"location":"experiments/faq/portal/#can-we-host-mongodb-outside-the-cluster-what-connection-string-is-supported-is-ssl-connection-supported","title":"Can we host MongoDB outside the cluster? What connection string is supported? Is SSL connection supported?","text":"

    Yes, we can host MongoDB outside the cluster; the mongo connection string can be updated accordingly, e.g. DataBaseServer: \"mongodb://mongo-service:27017\". We use the same connection string for both the authentication server and graphql server containers in the litmus portal-server deployment. There are also db user and db password keys that can be tuned in the secrets, like DB_USER: \"admin\" and DB_PASSWORD: \"1234\". We can connect with SSL if the certificate is optional. If our requirement is ca.cert auth for the SSL connection, then this is not available on the portal.

    "},{"location":"experiments/faq/portal/#what-does-failed-status-of-workflow-means-in-litmusportal","title":"What does failed status of workflow means in LitmusPortal?","text":"

    Failed status indicates that either there is some misconfiguration in the workflow, or the default hypothesis of the experiment was disproved and some of the experiments in the workflow failed. In such cases, the resiliency score will be less than 100.

    "},{"location":"experiments/faq/portal/#how-can-i-setup-a-chaoshub-of-my-gitlab-repo-in-litmus-portal","title":"How can I setup a chaoshub of my gitlab repo in Litmus Portal?","text":"

    In the litmus portal, when you go to the chaoshub section and click on the connect new hub button, you can see that there are two modes of authentication, i.e. public mode and private mode. For public mode, you only have to provide the git URL and branch name. For private mode, there are two types of authentication: access token and SSH key. For the access token, go to the settings of GitLab, and in the access token section, add a token with read-repository permission. After getting the token, go to the Litmus portal and provide the GitLab URL and branch name along with the access token. After submitting, your own chaos hub is connected to the Litmus portal. For the second mode of authentication, i.e. SSH key, once you click on SSH, it will generate a public key. Take the public key and put it in the GitLab settings: go to the SSH key section and add your public key. After adding the public key, get the ssh-type URL of the git repository and put it in the Litmus portal along with the branch. After submitting, your chaoshub is connected to the Litmus portal.

    "},{"location":"experiments/faq/portal/#how-to-achieve-high-availability-of-mongodb-and-how-can-we-add-persistence-to-mongodb","title":"How to achieve High Availability of MongoDB and how can we add persistence to MongoDB?","text":"

    Currently, the MongoDB instance is not HA; we can install the MongoDB operator along with mongo to achieve HA. The MongoDB CRD allows specifying the desired size and version as well as several other advanced options. Along with the MongoDB operator, we will use the MongoDB StatefulSet with PVs to add persistence.

    "},{"location":"experiments/faq/portal/#can-i-create-workflows-without-using-a-dashboard","title":"Can I create workflows without using a dashboard?","text":"

    Currently, you can\u2019t, but we are working on it. Shortly, we will publish samples for doing this via API/SDK and litmusctl.

    "},{"location":"experiments/faq/portal/#does-litmusctl-support-actions-that-are-currently-performed-from-the-portal-dashboard","title":"Does Litmusctl support actions that are currently performed from the portal dashboard?","text":"

    For now, you can create agents and projects, and also fetch agent and project details, using litmusctl. To know more about litmusctl, please refer to its documentation.

    "},{"location":"experiments/faq/portal/#how-is-resilience-score-is-calculated","title":"How is resilience score is Calculated?","text":"

    The resilience score is calculated on the basis of the weightage and the Probe Success Percentage of each experiment. The resilience score for one single experiment is the product of the weight given to that experiment and its Probe Success Percentage. We then get the total test result by adding up the resilience scores of all the experiments. The final resilience score is calculated by dividing the total test result by the sum of the weights of all the experiments combined in the single workflow. For more detail, refer to this blog.
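
    A worked illustration with hypothetical numbers: for two experiments of weight 10 each, one with a Probe Success Percentage of 100 and the other of 50:

    resilience score = (10*100 + 10*50) / (10 + 10) = 75\n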

    "},{"location":"experiments/faq/scheduler/","title":"Chaos Scheduler","text":""},{"location":"experiments/faq/scheduler/#table-of-contents","title":"Table of Contents","text":"
    1. What is ChaosScheduler?

    2. How is ChaosScheduler different from ChaosOperator?

    3. What are the pre-requisites for ChaosScheduler?

    4. How to install ChaosScheduler?

    5. How to schedule the chaos using ChaosScheduler?

    6. What are the different techniques of scheduling the chaos?

    7. What fields of spec.schedule are to be specified with spec.schedule.type=now?

    8. What fields of spec.schedule are to be specified with spec.schedule.type=once?

    9. What fields of spec.schedule are to be specified with spec.schedule.type=repeat?

    10. How to run ChaosScheduler in Namespaced mode?

    "},{"location":"experiments/faq/scheduler/#what-is-chaosscheduler","title":"What is ChaosScheduler?","text":"

    ChaosScheduler is an operator built on top of the operator-sdk framework. It keeps watching resources of kind ChaosSchedule and, based on the scheduling parameters, automates the formation of ChaosEngines (to be observed by ChaosOperator), instead of us manually forming a ChaosEngine every time we wish to inject chaos into the cluster.

    "},{"location":"experiments/faq/scheduler/#how-is-chaosscheduler-different-from-chaosoperator","title":"How is ChaosScheduler different from ChaosOperator?","text":"

    ChaosOperator operates on chaosengines, while ChaosScheduler operates on chaosschedules, which in turn form chaosengines, through some scheduling technique, to be observed by ChaosOperator. So ChaosOperator is a basic building block used to inject chaos into a cluster, while ChaosScheduler is just a scheduling strategy that injects chaos in some pattern using ChaosOperator. ChaosScheduler cannot be used independently of ChaosOperator.

    "},{"location":"experiments/faq/scheduler/#what-are-the-pre-requisites-for-chaosscheduler","title":"What are the pre-requisites for ChaosScheduler?","text":"

    For getting started with ChaosScheduler, we should just have ChaosOperator and all the litmus infrastructure components installed in the cluster beforehand.

    "},{"location":"experiments/faq/scheduler/#how-to-install-chaosscheduler","title":"How to install ChaosScheduler?","text":"

    Firstly install the rbac and crd -

    kubectl apply -f https://raw.githubusercontent.com/litmuschaos/chaos-scheduler/master/deploy/rbac.yaml\nkubectl apply -f https://raw.githubusercontent.com/litmuschaos/chaos-scheduler/master/deploy/crds/chaosschedule_crd.yaml\n

    Install ChaosScheduler operator afterwards -

    kubectl apply -f https://raw.githubusercontent.com/litmuschaos/chaos-scheduler/master/deploy/chaos-scheduler.yaml\n

    "},{"location":"experiments/faq/scheduler/#how-to-schedule-the-chaos-using-chaosscheduler","title":"How to schedule the chaos using ChaosScheduler?","text":"

    This depends on which type of schedule we want to use for injecting chaos. For a basic understanding, refer to constructing schedule

    "},{"location":"experiments/faq/scheduler/#what-are-the-different-techniques-of-scheduling-the-chaos","title":"What are the different techniques of scheduling the chaos?","text":"

    As of now, there are 3 scheduling techniques which can be selected based on the parameter passed to spec.schedule.type

    • type=now
    • type=once
    • type=repeat
    "},{"location":"experiments/faq/scheduler/#what-fields-of-specschedule-are-to-be-specified-with-specscheduletypenow","title":"What fields of spec.schedule are to be specified with spec.schedule.type=now?","text":"

    No fields need to be specified for this, as it launches the desired chaosengine immediately.

    "},{"location":"experiments/faq/scheduler/#what-fields-of-specschedule-are-to-be-specified-with-specscheduletypeonce","title":"What fields of spec.schedule are to be specified with spec.schedule.type=once?","text":"

    We just need to pass spec.schedule.executionTime. The scheduler will launch the chaosengine exactly at the point of time mentioned in this parameter.

    "},{"location":"experiments/faq/scheduler/#what-fields-of-specschedule-are-to-be-specified-with-specscheduletyperepeat","title":"What fields of spec.schedule are to be specified with spec.schedule.type=repeat?","text":"

    All the fields of spec.schedule except spec.schedule.executionTime need to be specified.

    • startTime
    • endTime
    • minChaosInterval
    • includedHours
    • includedDays

    It schedules chaosengines to be launched according to the parameters passed. It works just as a cronjob does, with additional functionality such as controlling when the schedule will start and end.

    "},{"location":"experiments/faq/scheduler/#how-to-run-chaosscheduler-in-namespaced-mode","title":"How to run ChaosScheduler in Namespaced mode?","text":"

    Firstly install the crd -

    kubectl apply -f https://github.com/litmuschaos/litmus/tree/master/mkdocs/docs/litmus-namespaced-scope/litmus-scheduler-namespaced-crd.yaml\n

    Secondly install the rbac in the desired Namespace -

    kubectl apply -f https://github.com/litmuschaos/litmus/tree/master/mkdocs/docs/litmus-namespaced-scope/litmus-scheduler-ns-rbac.yaml -n <namespace>\n

Then install the ChaosScheduler operator in the desired namespace:

    kubectl apply -f https://github.com/litmuschaos/litmus/tree/master/mkdocs/docs/litmus-namespaced-scope/litmus-namespaced-scheduler.yaml -n <namespace>\n

Finally, execute the ChaosScheduler with an experiment in the desired namespace.

Note: The ChaosServiceAccount used within the embedded ChaosEngine template needs to be chosen appropriately depending on the experiment scope.

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosSchedule
metadata:
  name: schedule-nginx
  namespace: <namespace>
spec:
  schedule:
    repeat:
      timeRange:
        startTime: "2020-05-12T05:47:00Z"  # should be modified according to current UTC Time, for type=repeat
        endTime: "2020-09-13T02:58:00Z"    # should be modified according to current UTC Time, for type=repeat
      properties:
        minChaosInterval: "2m"  # format should be like "10m" or "2h" accordingly for minutes and hours, for type=repeat
      workHours:
        includedHours: 0-12
      workDays:
        includedDays: "Mon,Tue,Wed,Sat,Sun"  # should be set for type=repeat
  engineTemplateSpec:
    appinfo:
      appns: 'default'
      applabel: 'app=nginx'
      appkind: 'deployment'
    # It can be true/false
    annotationCheck: 'false'
    # It can be active/stop
    engineState: 'active'
    # ex. values: ns1:name=percona,ns2:run=nginx
    auxiliaryAppInfo: ''
    chaosServiceAccount: pod-delete-sa
    # It can be delete/retain
    jobCleanUpPolicy: 'delete'
    experiments:
    - name: pod-delete
      spec:
        components:
          env:
          # set chaos duration (in sec) as desired
          - name: TOTAL_CHAOS_DURATION
            value: '30'
          # set chaos interval (in sec) as desired
          - name: CHAOS_INTERVAL
            value: '10'
          # pod failures without '--force' & default terminationGracePeriodSeconds
          - name: FORCE
            value: 'false'
    "},{"location":"experiments/troubleshooting/experiments/","title":"Litmus Experiments","text":""},{"location":"experiments/troubleshooting/experiments/#table-of-contents","title":"Table of Contents","text":"
    1. When I\u2019m executing an experiment the experiment's pod failed with the exec format error

    2. Nothing happens (no pods created) when the chaosengine resource is created?

    3. The chaos-runner pod enters completed state seconds after getting created. No experiment jobs are created?

    4. The experiment pod enters completed state w/o the desired chaos being injected?

    5. Observing experiment results using describe chaosresult is showing NotFound error?

    6. The helper pod is getting in a failed state due to container runtime issue

    7. Disk Fill fail with the error message

    8. Disk Fill failed with error

    9. Disk fill experiment fails with an error pointing to the helper pods being unable to finish in the given duration

    10. The infra experiments like node drain, node taint, kubelet service kill to act on the litmus pods only

    11. AWS experiments failed with the following error

    12. In AWS SSM Chaos I have provided the aws in secret but still not able to inject the SSM chaos on the target instance

    13. GCP VM Disk Loss experiment fails unexpectedly where the disk gets detached successfully but fails to attach back to the instance. What can be the reason?

    14. In pod level stress chaos experiments like pod memory hog or pod io stress after the chaos is injected successfully the helper fails with an error message

    15. Experiment failed for the istio enabled namespaces

    "},{"location":"experiments/troubleshooting/experiments/#when-im-executing-an-experiment-the-experiments-pod-failed-with-the-exec-format-error","title":"When I\u2019m executing an experiment the experiment's pod failed with the exec format error","text":"View the error message

    standard_init_linux.go:211: exec user process caused \"exec format error\":

There could be multiple reasons for this. The most common one is a mismatch between the image binary and the platform on which it is running; check that the image you're using is built for the platform (architecture/OS) on which you're trying to run the experiment.
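
To verify, a quick sketch (the image name is a placeholder): compare the architecture/OS of your nodes with the platforms the experiment image was built for:

# architecture & OS of the cluster nodes
kubectl get nodes -o custom-columns=NAME:.metadata.name,ARCH:.status.nodeInfo.architecture,OS:.status.nodeInfo.operatingSystem

# platforms supported by the experiment image (requires the docker CLI)
docker manifest inspect <experiment-image>:<tag> | grep -A2 '"platform"'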

    "},{"location":"experiments/troubleshooting/experiments/#nothing-happens-no-pods-created-when-the-chaosengine-resource-is-created","title":"Nothing happens (no pods created) when the chaosengine resource is created?","text":"

    If the ChaosEngine creation results in no action at all, perform the following checks:

    • Check the Kubernetes events generated against the chaosengine resource.

      kubectl describe chaosengine <chaosengine-name> -n <namespace>\n
      Specifically look for the event reason ChaosResourcesOperationFailed. Typically, these events consist of messages pointing to the problem. Some of the common messages include:

      • Unable to filter app by specified info
      • Unable to get chaos resources
      • Unable to update chaosengine
• Check the logs of the chaos-operator pod using the following command to get more details (on failed creation of chaos resources). The example below uses the litmus namespace, which is the default for installation; provide the namespace into which the operator has been deployed:

kubectl logs -f <chaos-operator-(hash)-(hash)> -n litmus\n

    "},{"location":"experiments/troubleshooting/experiments/#some-of-the-possible-reasons-for-these-errors-include","title":"Some of the possible reasons for these errors include:","text":"
    • The annotationCheck is set to true in the ChaosEngine spec, but the application deployment (AUT) has not been annotated for chaos. If so, please add it using the following command:

      kubectl annotate <deploy-type>/<application_name> litmuschaos.io/chaos=\"true\"\n

    • The annotationCheck is set to true in the ChaosEngine spec and there are multiple chaos candidates that share the same label (as provided in the .spec.appinfo of the ChaosEngine) and are also annotated for chaos. If so, please provide a unique label for the AUT, or remove annotations on other applications with the same label. Litmus, by default, doesn't allow selection of multiple applications. If this is a requirement, set the annotationCheck to false.

      kubectl annotate <deploy-type>/<application_name> litmuschaos.io/chaos-\n

• The ChaosEngine has the .spec.engineState set to stop, which causes the operator to refrain from creating chaos resources. While it is an unlikely scenario, a previously modified ChaosEngine manifest may have been reused.

    • Verify if the service account used by the Litmus ChaosOperator has enough permissions to launch pods/services (this is available by default if the manifests suggested by the docs have been used).
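
One way to verify the permissions, as a sketch (assuming the operator runs with the default litmus service account in the litmus namespace):

kubectl auth can-i create pods --as=system:serviceaccount:litmus:litmus -n <namespace>
kubectl auth can-i create services --as=system:serviceaccount:litmus:litmus -n <namespace>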

    "},{"location":"experiments/troubleshooting/experiments/#the-chaos-runner-pod-enters-completed-state-seconds-after-getting-created-no-experiment-jobs-are-created","title":"The chaos-runner pod enters completed state seconds after getting created. No experiment jobs are created?","text":"

    If the chaos-runner enters completed state immediately post creation, i.e., the creation of experiment resources is unsuccessful, perform the following checks:

    • Check the Kubernetes events generated against the chaosengine resource.
      kubectl describe chaosengine <chaosengine-name> -n <namespace>\n

    Look for one of these events: ExperimentNotFound, ExperimentDependencyCheck, EnvParseError

• Check the logs of the chaos-runner pod.
      kubectl logs -f <chaosengine_name>-runner -n <namespace>\n
    "},{"location":"experiments/troubleshooting/experiments/#some-of-the-possible-reasons-may-include","title":"Some of the possible reasons may include:","text":"
    • The ChaosExperiment CR for the experiment (name) specified in the ChaosEngine .spec.experiments list is not installed. If so, please install the desired experiment from the chaoshub

    • The dependent resources for the ChaosExperiment, such as ConfigMap & secret volumes (as specified in the ChaosExperiment CR or the ChaosEngine CR) may not be present in the cluster (or in the desired namespace). The runner pod doesn\u2019t proceed with creation of experiment resources if the dependencies are unavailable.

• The values provided for the ENV variables in the ChaosExperiment or the ChaosEngine might be invalid.

    • The chaosServiceAccount specified in the ChaosEngine CR doesn\u2019t have sufficient permissions to create the experiment resources (For existing experiments, appropriate rbac manifests are already provided in chaos-charts/docs).

    "},{"location":"experiments/troubleshooting/experiments/#the-experiment-pod-enters-completed-state-wo-the-desired-chaos-being-injected","title":"The experiment pod enters completed state w/o the desired chaos being injected?","text":"

    If the experiment pod enters completed state immediately (or in a few seconds) after creation w/o injecting the desired chaos, perform the following checks:

    • Check the Kubernetes events generated against the ChaosEngine resource
    kubectl describe chaosengine <chaosengine-name> -n <namespace>\n

    Look for the event with reason Summary with message experiment has been failed

    • Check the logs of the chaos-experiment pod.
    kubectl logs -f <experiment_name_(hash)_(hash)> -n <namespace>\n
    "},{"location":"experiments/troubleshooting/experiments/#some-of-the-possible-reasons-may-include_1","title":"Some of the possible reasons may include:","text":"
    • The ChaosExperiment CR or the ChaosEngine CR doesn\u2019t include mandatory ENVs (or consists of incorrect values/info) needed by the experiment. Note that each experiment (see docs) specifies a mandatory set of ENVs along with some optional ones, which are necessary for successful execution of the experiment.

• The chaosServiceAccount specified in the ChaosEngine CR doesn't have sufficient permissions to create the experiment helper-resources (some experiments in turn create other K8s resources like Jobs/DaemonSets/Deployments; for existing experiments, appropriate RBAC manifests are already provided in chaos-charts/docs).

• The application's (AUT) unique label provided in the ChaosEngine is set only at the parent resource metadata and is not propagated to the pod template spec. Note that the operator uses this label to filter chaos candidates at the parent resource level (deployment/statefulset/daemonset), while the experiment pod uses it to pick the application pods into which the chaos is injected; see the sketch after this list.

    • The experiment pre-chaos checks have failed on account of application (AUT) or auxiliary application unavailability
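
A sketch of a correctly labelled Deployment (name and image are placeholders); note that the label appears both at the parent metadata and in the pod template:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx            # parent-level label: used by the operator to filter chaos candidates
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx        # pod-template label: used by the experiment pod to pick target pods
    spec:
      containers:
      - name: nginx
        image: nginx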

    "},{"location":"experiments/troubleshooting/experiments/#observing-experiment-results-using-describe-chaosresult-is-showing-notfound-error","title":"Observing experiment results using describe chaosresult is showing NotFound error?","text":"

Observing the ChaosResult by executing the describe command given below may give a NotFound error.

    kubectl describe chaosresult <chaos-engine-name>-<chaos-experiment-name>  -n <namespace>\n

Alternatively, running the describe command without specifying the expected ChaosResult name might execute successfully, but may not show any output.

kubectl describe chaosresult -n <namespace>\n

This can sometimes occur due to the time taken to pull the image and start the experiment pod (note that the ChaosResult resource is generated by the experiment). For the above commands to execute successfully, simply wait for the experiment pod to be created. The waiting time depends on the resources available (network bandwidth, space availability on the node filesystem, etc.).

    "},{"location":"experiments/troubleshooting/experiments/#the-helper-pod-is-getting-in-a-failed-state-due-to-container-runtime-issue","title":"The helper pod is getting in a failed state due to container runtime issue","text":"View the error message

    time=\"2021-07-15T10:26:04Z\" level=fatal msg=\"helper pod failed, err: Unable to run command, err: exit status 1; error output: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?\"

    OR

    time=\"2021-07-16T22:21:02Z\" level=error msg=\"[docker]: Failed to run docker inspect: []\\nError: No such object: 1807fec21ccad1101bbb63a7d412be15414f807316572f9e043b9f4a3e7c4acc\\n\" time=\"2021-07-16T22:21:02Z\" level=fatal msg=\"helper pod failed, err: exit status 1\"

The default values of the CONTAINER_RUNTIME & SOCKET_PATH ENVs are for the docker runtime. If the cluster runtime is other than docker (e.g., containerd or CRI-O), update the above ENVs as follows:

    • For containerd runtime:

      • CONTAINER_RUNTIME: containerd
      • SOCKET_PATH: /run/containerd/containerd.sock
    • For CRIO runtime:

      • CONTAINER_RUNTIME: crio
      • SOCKET_PATH: /run/crio/crio.sock

    NOTE: The above values are the common ones and may vary based on the cluster you\u2019re using.
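
For instance, a minimal sketch of overriding these ENVs for a containerd cluster in the ChaosEngine (the experiment name and app details are illustrative):

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: 'default'
    applabel: 'app=nginx'
    appkind: 'deployment'
  chaosServiceAccount: container-kill-sa
  experiments:
  - name: container-kill
    spec:
      components:
        env:
        # match these to the cluster's container runtime
        - name: CONTAINER_RUNTIME
          value: 'containerd'
        - name: SOCKET_PATH
          value: '/run/containerd/containerd.sock'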

    "},{"location":"experiments/troubleshooting/experiments/#disk-fill-fail-with-the-error-message","title":"Disk Fill fail with the error message","text":"View the error message

    time=\"2021-08-12T05:27:39Z\" level=fatal msg=\"helper pod failed, err: either provide ephemeral storage limit inside target container or define EPHEMERAL_STORAGE_MEBIBYTES ENV\"

The disk fill experiment requires either an ephemeral storage limit defined on the target application, or a value in mebibytes provided via the EPHEMERAL_STORAGE_MEBIBYTES ENV in the ChaosEngine. One of the two is required. For more details refer to FILL_PERCENTAGE and EPHEMERAL_STORAGE_MEBIBYTES.
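
For example, a sketch of defining an ephemeral storage limit on the target container in the application's pod spec (the value is illustrative):

containers:
- name: app
  image: nginx
  resources:
    limits:
      # gives the disk fill experiment a limit to fill against
      ephemeral-storage: "1Gi"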

    "},{"location":"experiments/troubleshooting/experiments/#disk-fill-failed-with-error","title":"Disk Fill failed with error:","text":"View the error message

    time=\"2021-08-12T05:41:45Z\" level=error msg=\"du: /diskfill/8a1088e3fd50a31d5f0d383ae2258d9975f1df152ff92b3efd570a44e952a732: No such file or directory\\n\" time=\"2021-08-12T05:41:45Z\" level=fatal msg=\"helper pod failed, err: exit status 1\"

This could be due to multiple issues while filling the disk of a container; the most common one is an invalid CONTAINER_PATH ENV set in the ChaosEngine. The default container path, /var/lib/docker/containers, is suitable for most use-cases.

    "},{"location":"experiments/troubleshooting/experiments/#disk-fill-experiment-fails-with-an-error-pointing-to-the-helper-pods-being-unable-to-finish-in-the-given-duration","title":"Disk fill experiment fails with an error pointing to the helper pods being unable to finish in the given duration.","text":"

This can happen when the provided block size is quite small and the ephemeral storage value is high. In that case, filling the disk may need more time than the given chaos duration.

    "},{"location":"experiments/troubleshooting/experiments/#the-infra-experiments-like-node-drain-node-taint-kubelet-service-kill-to-act-on-the-litmus-pods-only","title":"The infra experiments like node drain, node taint, kubelet service kill to act on the litmus pods only.","text":"

These are infra-level experiments: cordon the target node so that the application pods don't get scheduled on it, and use a node selector in the ChaosEngine to specify the nodes for the experiment pods. Refer to this to learn how to schedule experiments on a certain node.
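
A sketch of the approach (node names are placeholders; the nodeSelector support under components is assumed per the common experiment tunables):

# cordon the target node so that application pods are not scheduled on it
kubectl cordon <target-node>

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  chaosServiceAccount: node-drain-sa
  experiments:
  - name: node-drain
    spec:
      components:
        # schedule the experiment pod on a node other than the target
        nodeSelector:
          kubernetes.io/hostname: '<non-target-node>'
        env:
        - name: TARGET_NODE
          value: '<target-node>'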

    "},{"location":"experiments/troubleshooting/experiments/#aws-experiments-failed-with-the-following-error","title":"AWS experiments failed with the following error","text":"View the error message

    time=\"2021-08-12T10:25:57Z\" level=error msg=\"failed perform ssm api calls, err: UnrecognizedClientException: The security token included in the request is invalid.\\n\\tstatus code: 400, request id: 68f0c2e8-a7ed-4576-8c75-0a3ed497efb9\"

The AWS experiments need authentication to connect to & perform actions on the AWS services. This can be provided with the help of a secret, as shown below:

    View the secret manifest
    apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  cloud_config.yml: |-\n    # Add the cloud AWS credentials respectively\n    [default]\n    aws_access_key_id = XXXXXXXXXXXXXXXXXXX\n    aws_secret_access_key = XXXXXXXXXXXXXXX\n

Make sure all the required permissions are attached to your IAM user/role to perform the chaos operation on the given service. If you are running the experiment in an EKS cluster, you have one more option besides creating a secret: you can map the IAM role to the service account; refer to this for more details.
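
A sketch of that option (account ID and role name are placeholders): annotate the chaos service account with the IAM role so the experiment pod inherits the AWS permissions:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: ec2-terminate-by-tag-sa
  namespace: default
  annotations:
    # standard EKS IRSA annotation
    eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/<chaos-role>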

    "},{"location":"experiments/troubleshooting/experiments/#in-aws-ssm-chaos-i-have-provided-the-aws-in-secret-but-still-not-able-to-inject-the-ssm-chaos-on-the-target-instance","title":"In AWS SSM Chaos I have provided the aws in secret but still not able to inject the SSM chaos on the target instance","text":"View the error message

    time='2021-08-13T09:30:47Z' level=error msg='failed perform ssm api calls, err: error: the instance id-qqw2-123-12- might not have suitable permission or IAM attached to it. use \\'aws ssm describe-instance-information\\' to check the available instances'

Ensure that you have the required AWS access and that your target EC2 instances have an IAM instance profile attached. To know more, check out the Systems Manager docs.
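
As the error message itself suggests, you can list the instances that are registered with SSM (and therefore have a suitable instance profile attached) using the AWS CLI:

aws ssm describe-instance-information --region <region>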

    "},{"location":"experiments/troubleshooting/experiments/#gcp-vm-disk-loss-experiment-fails-unexpectedly-where-the-disk-gets-detached-successfully-but-fails-to-attach-back-to-the-instance-what-can-be-the-reason","title":"GCP VM Disk Loss experiment fails unexpectedly where the disk gets detached successfully but fails to attach back to the instance. What can be the reason?","text":"

The GCP VM Disk Loss experiment requires a GCP service account with Project Editor or higher permission to execute. The failure could be due to an issue in the GCP Golang Compute Engine API, which fails to attach the disk using the attachDisk method when only Compute Admin or lower permission is granted.

    "},{"location":"experiments/troubleshooting/experiments/#in-pod-level-stress-chaos-experiments-like-pod-memory-hog-or-pod-io-stress-after-the-chaos-is-injected-successfully-the-helper-fails-with-an-error-message","title":"In pod level stress chaos experiments like pod memory hog or pod io stress after the chaos is injected successfully the helper fails with an error message","text":"View the error message

    Error: process exited before the actual cleanup

The error message indicates that the stress process inside the target container was somehow removed before the actual cleanup. There could be multiple reasons for this; most commonly, the target container got restarted due to excessive load that it couldn't handle, in which case the kubelet terminates that replica, launches a new one (if applicable), and reports an OOM event on the older one.
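
To confirm this, a sketch of checking the target pod for restarts and OOM events (names are placeholders):

# last terminated state of the containers (look for OOMKilled)
kubectl describe pod <target-pod> -n <namespace> | grep -i -A3 'last state'

# events recorded against the target pod
kubectl get events -n <namespace> --field-selector involvedObject.name=<target-pod>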

    "},{"location":"experiments/troubleshooting/experiments/#experiment-failed-for-the-istio-enabled-namespaces","title":"Experiment failed for the istio enabled namespaces","text":"View the error message

    W0817 06:32:26.531145 1 client_config.go:541] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work. time=\"2021-08-17T06:32:26Z\" level=error msg=\"unable to get ChaosEngineUID, error: unable to get ChaosEngine name: pod-delete-chaos, in namespace: default, error: Get \\\"https://10.100.0.1:443/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines/pod-delete-chaos\\\": dial tcp 10.100.0.1:443: connect: connection refused\"

If istio is enabled for the chaos namespace, the chaos-runner and chaos-experiment pods are launched with the istio sidecar, which may block/delay the external traffic of those pods for the initial few seconds and can thereby fail the experiment.

We can fix the above failure by avoiding the istio sidecar for the chaos pods. Refer to the following manifest:

    View the ChaosEngine manifest with the required annotations
    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  components:\n    runner:\n      # annotation for the chaos-runner\n      runnerAnnotations:\n        sidecar.istio.io/inject: \"false\"\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        #annotations for the experiment pod \n        experimentAnnotations:\n          sidecar.istio.io/inject: \"false\"\n        env:\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/troubleshooting/install/","title":"Install","text":""},{"location":"experiments/troubleshooting/install/#table-of-contents","title":"Table of Contents","text":"
    1. The Litmus ChaosOperator is seen to be in CrashLoopBackOff state immediately after installation?

    2. Litmus uninstallation is not successful and namespace is stuck in terminating state?

    "},{"location":"experiments/troubleshooting/install/#the-litmus-chaosoperator-is-seen-to-be-in-crashloopbackoff-state-immediately-after-installation","title":"The Litmus ChaosOperator is seen to be in CrashLoopBackOff state immediately after installation?","text":"

    Verify if the ChaosEngine custom resource definition (CRD) has been installed in the cluster. This can be verified with the following commands:

    kubectl get crds | grep chaos\n
    kubectl api-resources | grep chaos\n

    If not created, install it from here

    "},{"location":"experiments/troubleshooting/install/#litmus-uninstallation-is-not-successful-and-namespace-is-stuck-in-terminating-state","title":"Litmus uninstallation is not successful and namespace is stuck in terminating state?","text":"

    Under typical operating conditions, the ChaosOperator makes use of finalizers to ensure that the ChaosEngine is deleted only after chaos resources (chaos-runner, experiment pod, any other helper pods) are removed.

When uninstalling Litmus via the operator manifest (which contains the namespace, operator, and CRD specifications in a single YAML) without deleting the existing chaosengine resources first, the ChaosOperator deployment may get deleted before the CRD removal is attempted. Since the stale chaosengines still carry the finalizer, their deletion (triggered by the CRD delete), and by consequence the deletion of the chaosengine CRD itself, gets "stuck".

In such cases, manually remove the finalizer entries on the stale chaosengines to facilitate their successful deletion. To get the chaosengines, run:

    kubectl get chaosengine -n <namespace>

    followed by:

    kubectl edit chaosengine <chaosengine-name> -n <namespace> and remove the finalizer entry chaosengine.litmuschaos.io/finalizer
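
Alternatively, a sketch of clearing the finalizers non-interactively with kubectl patch:

kubectl patch chaosengine <chaosengine-name> -n <namespace> --type merge -p '{"metadata":{"finalizers":[]}}'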

Repeat this on all the stale chaosengine CRs to remove the CRDs successfully & complete the uninstallation process.

    If however, the litmus namespace deletion remains stuck despite the above actions, follow the procedure described here to complete the uninstallation.

    "},{"location":"experiments/troubleshooting/portal/","title":"Litmus Portal","text":""},{"location":"experiments/troubleshooting/portal/#table-of-contents","title":"Table of Contents","text":"
    1. We were setting up a Litmus Portal, however, Self-Agent status is showing pending. Any idea why is happening?

    2. After logging in for the first time to the portal, /get-started page kept loading after I provided the new password

    3. Subscriber is crashing with the error dial:websocket: bad handshake

    4. Not able to connect to the LitmusChaos Control Plane hosted on GKE cluster

    5. I forgot my Litmus portal password. How can I reset my credentials?

    6. While Uninstalling Litmus portal using helm, some components like subscriber, exporter, event, workflows, etc, are not removed

    7. Unable to Install Litmus portal using helm. Server pod and mongo pod are in CrashLoopBackOff state. Got this error while checking the logs of mongo container chown: changing ownership of '/data/db/.snapshot': Read-only file system

    8. Pre-defined workflow Bank Of Anthos showing bus error for accounts-db or ledger-db pod?

    "},{"location":"experiments/troubleshooting/portal/#we-were-setting-up-a-litmus-portal-however-self-agent-status-is-showing-pending-any-idea-why-is-happening","title":"We were setting up a Litmus Portal, however, Self-Agent status is showing pending. Any idea why is happening?","text":"

The litmusportal-server-service might not be reachable due to inbound rules. On GKE/EKS/AKS, you can enable traffic to it by adding the port to the inbound rules. Check the logs of the subscriber pod and expose the port mentioned there for communication with the server.

    "},{"location":"experiments/troubleshooting/portal/#after-logging-in-for-the-first-time-to-the-portal-get-started-page-kept-loading-after-i-provided-the-new-password","title":"After logging in for the first time to the portal, /get-started page kept loading after I provided the new password.","text":"

First, try clearing the browser cache and cookies and refresh the page; this might solve your problem. If the problem persists, delete all the cluster role bindings, PVs, and PVCs used by Litmus and try to reinstall Litmus again.

    "},{"location":"experiments/troubleshooting/portal/#subscriber-is-crashing-with-the-error-dialwebsocket-bad-handshake","title":"Subscriber is crashing with the error dial:websocket: bad handshake","text":"

It is a network issue: the subscriber is unable to access the server. While installing the agent, a config called agent-config is created to store some metadata like the server endpoint, access key, etc. That server endpoint can be generated in several ways:

    • Ingress (If INGRESS=true in server deployment envs)
    • Loadbalancer (it generates lb type of IP based on the server svc type)
    • NodePort (it generates nodeport type of IP based on the server svc type)
    • ClusterIP (it generates clusterip type of IP based on the server svc type)

You can edit the agent-config and update the node IP; once edited, restart the subscriber. We suggest using ingress, so that a change in the endpoint IP won't affect your agent.
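
A sketch of the edit-and-restart flow, assuming agent-config is a ConfigMap, the subscriber is a Deployment, and the agent components run in the litmus namespace (adjust if your installation differs):

kubectl edit configmap agent-config -n litmus
kubectl rollout restart deployment subscriber -n litmus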

    "},{"location":"experiments/troubleshooting/portal/#not-able-to-connect-to-the-litmuschaos-control-plane-hosted-on-gke-cluster","title":"Not able to connect to the LitmusChaos Control Plane hosted on GKE cluster.","text":"

In GKE, you have to set up a firewall rule to allow TCP traffic on the node port. You can use the following command:

gcloud compute firewall-rules create test-node-port --allow tcp:<port>

Once this firewall rule is set up, the portal may be accessible on nodeIp:port, where nodeIp is the external IP address of your node.

    "},{"location":"experiments/troubleshooting/portal/#i-forgot-my-litmus-portal-password-how-can-i-reset-my-credentials","title":"I forgot my Litmus portal password. How can I reset my credentials?","text":"

You can reset it by running the following command:

    kubectl exec -it mongo-0 -n litmus -- mongo -u admin -p 1234 <<< $'use auth\\ndb.usercredentials.update({username:\"admin\"},{$set:{password:\"$2a$15$sNuQl9y/Ok92N19UORcro.3wulEyFi0FfJrnN/akOQe3uxTZAzQ0C\"}})\\nexit\\n'\n
Make sure to update the namespace and mongo pod name according to your setup; the rest should remain the same. This command will update the password to litmus.

    "},{"location":"experiments/troubleshooting/portal/#while-uninstalling-litmus-portal-using-helm-some-components-like-subscriber-exporter-event-workflows-etc-are-not-removed","title":"While Uninstalling Litmus portal using helm, some components like subscriber, exporter, event, workflows, etc, are not removed.","text":"

These are agent components, which are launched by the control plane server. First disconnect the agent from the portal, then uninstall the portal using helm.

    "},{"location":"experiments/troubleshooting/portal/#unable-to-install-litmus-portal-using-helm-server-pod-and-mongo-pod-are-in-crashloopbackoff-state-got-this-error-while-checking-the-logs-of-mongo-container-chown-changing-ownership-of-datadbsnapshot-read-only-file-system","title":"Unable to Install Litmus portal using helm. Server pod and mongo pod are in CrashLoopBackOff state. Got this error while checking the logs of mongo container chown: changing ownership of '/data/db/.snapshot': Read-only file system","text":"

It seems the directory existed before the Litmus installation and might be in use by some other application. Change the mount path from /consul/config to /consul/myconfig in the mongo statefulset; then you can deploy Litmus successfully.

    "},{"location":"experiments/troubleshooting/portal/#pre-defined-workflow-bank-of-anthos-showing-bus-error-for-accounts-db-or-ledger-db-pod","title":"Pre-defined workflow Bank Of Anthos showing bus error for accounts-db or ledger-db pod?","text":"

Bank of Anthos uses PostgreSQL, which doesn't fall back properly to not using huge pages. If the same scenario occurs, it can be resolved with one of the following possible solutions:

• Modify the docker image to be able to set huge_pages = off in /usr/share/postgresql/postgresql.conf.sample before initdb is run (this is what I did).
• Turn off huge page support on the system (vm.nr_hugepages = 0 in /etc/sysctl.conf).
• Fix Postgres's fallback mechanism when huge_pages = try is set (the default).
    • Modify the k8s manifest to enable huge page support (https://kubernetes.io/docs/tasks/manage-hugepages/scheduling-hugepages/).
    • Modify k8s to show that huge pages are not supported on the system, when they are not enabled for a specific container.
    "},{"location":"experiments/troubleshooting/scheduler/","title":"Chaos Scheduler","text":""},{"location":"experiments/troubleshooting/scheduler/#table-of-contents","title":"Table of Contents","text":"
    1. Scheduler not creating chaosengines for type=repeat?
    "},{"location":"experiments/troubleshooting/scheduler/#scheduler-not-creating-chaosengines-for-typerepeat","title":"Scheduler not creating chaosengines for type=repeat?","text":"

If the ChaosSchedule has been created successfully in the cluster but no ChaosEngine is being formed, the most common problem is that either the start or the end time has been wrongly specified; verify the times. To identify whether this is the problem, change to type=now. If the ChaosEngine is then formed successfully, the problem lies with the specified time ranges; if the ChaosEngine is still not formed, the problem lies with the engineTemplateSpec.

    "}]} \ No newline at end of file +{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"ROADMAP/","title":"Roadmap","text":""},{"location":"ROADMAP/#litmus-roadmap","title":"LITMUS ROADMAP","text":"

    This document captures only the high level roadmap items. For the detailed backlog, see issues list.

    "},{"location":"ROADMAP/#completed","title":"Completed","text":"
    • Declarative Chaos Intent via custom resources
    • Chaos Operator to orchestrate chaos experiments
    • Off the shelf / ready chaos experiments for general Kubernetes chaos
    • Self sufficient, Centralized Hub for chaos experiments
    • Per-experiment minimal RBAC permissions definition
    • Creation of 'scenarios' involving multiple faults via Argo-based Chaos Workflows (with examples for microservices apps like podtato-head and sock-shop)
    • Cross-Cloud Control Plane (Litmus Portal) to perform chaos against remote clusters
    • Helm charts for LitmusChaos control plane
    • Helm Chart for LitmusChaos execution Plane
    • Support for admin mode (centralized chaos management) as well as namespaced mode (multi-tenant clusters)
    • Continuous chaos via flexible schedules, with support to halt/resume or (manual/conditional) abort experiments
    • Provide complete workflow termination/abort capability
    • Generation of observability data via Prometheus metrics and Kubernetes chaos events for experiments
    • Steady-State hypothesis validation before, during and after chaos injection via different probe types
    • Support for Docker, Containerd & CRI-O runtime
    • Support for scheduling policies (nodeSelector, tolerations) and resource definitions for chaos pods
    • ChaosHub refactor for 2.x user flow
    • Support for ARM64 nodes
    • Minimized role permissions for Chaos Service Accounts
    • Scaffolding scripts (SDK) to help bootstrap a new chaos experiment in Go, Python, Ansible
    • Support orchestration of non-native chaos libraries via the BYOC (Bring-Your-Own-Chaos) model
    • Support for OpenShift platform
    • Workflow YAML linter addition
    • Integration tests & e2e framework creation for control plane components and chaos experiments
    • Documentation (usage guide for chaos operator, resources & developer guide for new experiment creation)
    • Improved documentation and tutorials for Litmus Portal based execution flow
    • Add architecture details & design resources
    • Define community sync up cadence and structure
    "},{"location":"ROADMAP/#in-progress-under-design-or-active-development","title":"In-Progress (Under Design OR Active Development)","text":"
    • Native Chaos Workflows with redesigned subscriber to improve resource delegation, enabling seamless and efficient execution of chaos workflows within Kubernetes clusters.
    • Introduce transient runners to improve resource efficiency during chaos experiments by dynamically creating and cleaning up chaos runner instances.
    • Implement Kubernetes connectors to enable streamlined integration with Kubernetes clusters, providing simplified authentication and configuration management.
    • Integrate with tools like K8sGPT to generate insightful reports that identify potential weaknesses in your Kubernetes environment before executing chaos experiments.
    • Add Terraform support for defining and executing chaos experiments on infrastructure components, enabling infrastructure-as-code-based chaos engineering.
    • Add SDK support for Python and Java, with potential extensions to other programming languages based on community interest.
    • Include in-product documentation, such as tooltips, to improve user experience and ease of adoption.
    • Implement the litmus-java-sdk with a targeted v1.0.0 release by Q1.
    • Integrate distributed tracing by adding attributes or events to spans, and create an OpenTelemetry demo showcasing chaos engineering observability.
    • Enhance the exporter to function as an OpenTelemetry collector, providing compatibility with existing observability pipelines.
    • Add support for DocumentDB by replacing certain MongoDB operations, improving flexibility for database chaos.
    • Upgrade Kubernetes SDK from version 1.21 to 1.26 to stay aligned with the latest Kubernetes features and enhancements.
    • Refactor the chaos charts to:
    • Replace latest tags with specific, versioned image tags.
    • Consolidate multiple images into a single optimized image.
    • Update GraphQL and authentication API documentation for improved clarity and user guidance.
    • Add comprehensive unit and fuzz tests to enhance code reliability and robustness.
    • Implement out-of-the-box Slack integration for better collaboration and monitoring during chaos experiments.
    "},{"location":"ROADMAP/#backlog","title":"Backlog","text":"
    • Validation support for all ChaosEngine schema elements within workflow wizard
    • Chaos-center users account to chaosService account map
    • Cross-hub experiment support within a Chaos Workflow
    • Enhanced CRD schema for ChaosEngine to support advanced CommandProbe configuration
    • Support for S3 artifact sink (helps performance/benchmark runs)
    • Chaos experiments against virtual machines and cloud infrastructure (AWS, GCP, Azure, VMWare, Baremetal)
    • Off the shelf chaos-integrated monitoring dashboards for application chaos categories
    • Support for user defined chaos experiment result definition
    • Increased fault injection types (IOChaos, HTTPChaos, JVMChaos)
    • Special Interest Groups (SIGs) around specific areas in the project to take the roadmap forward
    "},{"location":"experiments/api/contents/","title":"Litmus API Documentation","text":"Name Description References AUTH Server Contains AUTH Server API documentation AUTH Server GraphQL Server Contains GraphQL Server API documentation GraphQL Server"},{"location":"experiments/categories/contents/","title":"Experiments","text":"

    The experiment execution is triggered upon creation of the ChaosEngine resource (various examples of which are provided under the respective experiments). Typically, these chaosengines are embedded within the 'steps' of a Litmus Chaos Workflow here. However, one may also create the chaos engines directly by hand, and the chaos-operator reconciles this resource and triggers the experiment execution.

    Provided below are tables with links to the individual experiment docs for easy navigation

    "},{"location":"experiments/categories/contents/#kubernetes-experiments","title":"Kubernetes Experiments","text":"

    It contains chaos experiments which apply on the resources, which are running on the kubernetes cluster. It contains Generic experiments.

    Following Kubernetes Chaos experiments are available:

    "},{"location":"experiments/categories/contents/#generic","title":"Generic","text":"

    Chaos actions that apply to generic Kubernetes resources are classified into this category. Following chaos experiments are supported under Generic Chaos Chart

    "},{"location":"experiments/categories/contents/#pod-chaos","title":"Pod Chaos","text":"Experiment Name Description User Guide Container Kill Kills the container in the application pod container-kill Disk Fill Fillup Ephemeral Storage of a Resourced disk-fill Pod Autoscaler Scales the application replicas and test the node autoscaling on cluster pod-autoscaler Pod CPU Hog Exec Consumes CPU resources on the application container by invoking a utility within the app container base image pod-cpu-hog-exec Pod CPU Hog Consumes CPU resources on the application container pod-cpu-hog Pod Delete Deletes the application pod pod-delete Pod DNS Error Disrupt dns resolution in kubernetes po pod-dns-error Pod DNS Spoof Spoof dns resolution in kubernetes pod pod-dns-spoof Pod IO Stress Injects IO stress resources on the application container pod-io-stress Pod Memory Hog Exec Consumes Memory resources on the application container by invoking a utility within the app container base image pod-memory-hog-exec Pod Memory Hog Consumes Memory resources on the application container pod-memory-hog Pod Network Corruption Injects Network Packet Corruption into Application Pod pod-network-corruption Pod Network Duplication Injects Network Packet Duplication into Application Pod pod-network-duplication Pod Network Latency Injects Network latency into Application Pod pod-network-latency Pod Network Loss Injects Network loss into Application Pod pod-network-loss Pod HTTP Latency Injects HTTP latency into Application Pod pod-http-latency Pod HTTP Reset Peer Injects HTTP reset peer into Application Pod pod-http-reset-peer Pod HTTP Status Code Injects HTTP status code chaos into Application Pod pod-http-status-code Pod HTTP Modify Body Injects HTTP modify body into Application Pod pod-http-modify-body Pod HTTP Modify Header Injects HTTP Modify Header into Application Pod pod-http-modify-header"},{"location":"experiments/categories/contents/#node-chaos","title":"Node Chaos","text":"Experiment Name Description User Guide Docker Service Kill Kills the docker service on the application node docker-service-kill Kubelet Service Kill Kills the kubelet service on the application node kubelet-service-kill Node CPU Hog Exhaust CPU resources on the Kubernetes Node node-cpu-hog Node Drain Drains the target node node-drain Node IO Stress Injects IO stress resources on the application node node-io-stress Node Memory Hog Exhaust Memory resources on the Kubernetes Node node-memory-hog Node Restart Restarts the target node node-restart Node Taint Taints the target node node-taint"},{"location":"experiments/categories/contents/#application-chaos","title":"Application Chaos","text":"

While chaos experiments under the Generic category offer the ability to induce chaos into Kubernetes resources, it is difficult to analyze and conclude whether the induced chaos found a weakness in a given application. The application-specific chaos experiments are built with checks on pre-conditions and expected outcomes after the chaos injection. The result of the chaos experiment is determined by matching the actual outcome with the expected outcome.

    Experiment Name Description User Guide Spring Boot App Kill Kill the spring boot application spring-boot-app-kill Spring Boot CPU Stress Stress the CPU of the spring boot application spring-boot-cpu-stress Spring Boot Memory Stress Stress the memory of the spring boot application spring-boot-memory-stress Spring Boot Latency Inject latency to the spring boot application network spring-boot-latency Spring Boot Exception Raise exceptions to the spring boot application spring-boot-exceptions Spring Boot Faults It injects the multiple spring boot faults simultaneously on the target pods spring-boot-faults"},{"location":"experiments/categories/contents/#load-chaos","title":"Load Chaos","text":"

The Load Chaos category contains chaos experiments to test app/platform service availability. It installs all the experiments that can be used to inject load into services such as VMs, Pods, and so on.

    Experiment Name Description User Guide k6 Load Generator Generate load using single js script k6-loadgen"},{"location":"experiments/categories/contents/#cloud-infrastructure","title":"Cloud Infrastructure","text":"

Chaos experiments that inject chaos into the platform resources of Kubernetes are classified into this category. Since the management of platform resources varies significantly across providers, Chaos Charts may be maintained separately for each platform (for example, AWS, GCP, Azure, etc.).

    Following Platform Chaos experiments are available:

    "},{"location":"experiments/categories/contents/#aws","title":"AWS","text":"Experiment Name Description User Guide EC2 Stop By ID Stop the EC2 instance matched by instance id ec2-stop-by-id EC2 Stop By Tag Stop the EC2 instance matched by instance tag ec2-stop-by-tag EBS Loss By ID Detach the EBS volume matched by volume id ebs-loss-by-id EBS Loss By Tag Detach the EBS volume matched by volume tag ebs-loss-by-tag"},{"location":"experiments/categories/contents/#gcp","title":"GCP","text":"Experiment Name Description User Guide GCP VM Instance Stop Stop the gcp vm instance gcp-vm-instance-stop GCP VM Disk Loss Detach the gcp disk gcp-vm-disk-loss"},{"location":"experiments/categories/contents/#azure","title":"Azure","text":"Experiment Name Description User Guide Azure Instance Stop Stop the azure instance azure-instance-stop Azure Disk Loss Detach azure disk from instance azure-disk-loss"},{"location":"experiments/categories/contents/#vmware","title":"VMWare","text":"Experiment Name Description User Guide VM Poweroff Poweroff the vmware VM vm-poweroff"},{"location":"experiments/categories/aws/AWS-experiments-tunables/","title":"AWS experiments tunables","text":"

    It contains the AWS specific experiment tunables.

    "},{"location":"experiments/categories/aws/AWS-experiments-tunables/#managed-nodegroup","title":"Managed Nodegroup","text":"

It specifies whether the AWS instances are part of managed nodegroups. If the instances belong to managed nodegroups, provide MANAGED_NODEGROUP as enable, else provide it as disable. The default value is disable.

    Use the following example to tune this:

# it is provided as enable if instances are part of managed node groups\n# it is applicable for [ec2-terminate-by-id, ec2-terminate-by-tag]\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ec2-terminate-by-tag-sa\n  experiments:\n  - name: ec2-terminate-by-tag\n    spec:\n      components:\n        env:\n        # if instance is part of a managed node-group\n        # supports enable and disable values, default value: disable\n        - name: MANAGED_NODEGROUP\n          value: 'enable'\n        # region for the ec2 instance\n        - name: REGION\n          value: '<region for instances>'\n        # tag of the ec2 instance\n        - name: EC2_INSTANCE_TAG\n          value: 'key:value'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws/AWS-experiments-tunables/#mutiple-iterations-of-chaos","title":"Mutiple Iterations Of Chaos","text":"

Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ec2-terminate-by-tag-sa\n  experiments:\n  - name: ec2-terminate-by-tag\n    spec:\n      components:\n        env:\n         # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '15'\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: REGION\n          value: '<region for instances>'\n        - name: EC2_INSTANCE_TAG\n          value: 'key:value'\n
    "},{"location":"experiments/categories/aws/ebs-loss-by-id/","title":"EBS Loss By ID","text":""},{"location":"experiments/categories/aws/ebs-loss-by-id/#introduction","title":"Introduction","text":"
• It causes chaos to disrupt the state of an EBS volume by detaching it from the node/EC2 instance for a certain chaos duration, using the volume ID.
• In case of EBS persistent volumes, the volumes can get self-attached, in which case the experiment skips the re-attachment step. It tests deployment sanity (replica availability & uninterrupted service) and the recovery workflows of the application pod.

    Scenario: Detach EBS Volume

    "},{"location":"experiments/categories/aws/ebs-loss-by-id/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/aws/ebs-loss-by-id/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the ebs-loss-by-id experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that you have sufficient AWS access to attach or detach an ebs volume for the instance.
• Ensure to create a Kubernetes secret having the AWS access configuration (key) in the CHAOS_NAMESPACE. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  cloud_config.yml: |-\n    # Add the cloud AWS credentials respectively\n    [default]\n    aws_access_key_id = XXXXXXXXXXXXXXXXXXX\n    aws_secret_access_key = XXXXXXXXXXXXXXX\n
• If you change the secret key name (from cloud_config.yml), please also update the AWS_SHARED_CREDENTIALS_FILE ENV value in experiment.yaml with the same name.

    "},{"location":"experiments/categories/aws/ebs-loss-by-id/#default-validations","title":"Default Validations","text":"View the default validations
    • EBS volume is attached to the instance.
    "},{"location":"experiments/categories/aws/ebs-loss-by-id/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a Litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: ebs-loss-by-id-sa\n  namespace: default\n  labels:\n    name: ebs-loss-by-id-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: ebs-loss-by-id-sa\n  labels:\n    name: ebs-loss-by-id-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: ebs-loss-by-id-sa\n  labels:\n    name: ebs-loss-by-id-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: ebs-loss-by-id-sa\nsubjects:\n- kind: ServiceAccount\n  name: ebs-loss-by-id-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/aws/ebs-loss-by-id/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes EBS_VOLUME_ID Comma separated list of volume IDs subjected to ebs detach chaos REGION The region name for the target volumes

    Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The time duration between the attachment and detachment of the volumes (sec) Defaults to 30s SEQUENCE It defines sequence of chaos execution for multiple volumes Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/aws/ebs-loss-by-id/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/aws/ebs-loss-by-id/#common-and-aws-specific-tunables","title":"Common and AWS specific tunables","text":"

Refer to the common attributes and the AWS-specific tunables to tune the common tunables for all experiments and the AWS-specific ones.

    "},{"location":"experiments/categories/aws/ebs-loss-by-id/#detach-volumes-by-id","title":"Detach Volumes By ID","text":"

It contains a comma-separated list of volume IDs subjected to EBS detach chaos. It can be tuned via the EBS_VOLUME_ID ENV.

    Use the following example to tune this:

    # contains ebs volume id \napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ebs-loss-by-id-sa\n  experiments:\n  - name: ebs-loss-by-id\n    spec:\n      components:\n        env:\n        # id of the ebs volume\n        - name: EBS_VOLUME_ID\n          value: 'ebs-vol-1'\n        # region for the ebs volume\n        - name: REGION\n          value: '<region for EBS_VOLUME_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/","title":"EBS Loss By Tag","text":""},{"location":"experiments/categories/aws/ebs-loss-by-tag/#introduction","title":"Introduction","text":"
• It causes chaos to disrupt the state of an EBS volume by detaching it from the node/EC2 instance for a certain chaos duration, using volume tags.
• In case of EBS persistent volumes, the volumes can get self-attached, in which case the experiment skips the re-attachment step. It tests deployment sanity (replica availability & uninterrupted service) and the recovery workflows of the application pod.

    Scenario: Detach EBS Volume

    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the ebs-loss-by-tag experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that you have sufficient AWS access to attach or detach an ebs volume for the instance.
• Ensure to create a Kubernetes secret having the AWS access configuration (key) in the CHAOS_NAMESPACE. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  cloud_config.yml: |-\n    # Add the cloud AWS credentials respectively\n    [default]\n    aws_access_key_id = XXXXXXXXXXXXXXXXXXX\n    aws_secret_access_key = XXXXXXXXXXXXXXX\n
• If you change the secret key name (from cloud_config.yml), please also update the AWS_SHARED_CREDENTIALS_FILE ENV value in experiment.yaml with the same name.

    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#default-validations","title":"Default Validations","text":"View the default validations
    • EBS volume is attached to the instance.
    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a Litmus workflow scheduled/constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: ebs-loss-by-tag-sa\n  namespace: default\n  labels:\n    name: ebs-loss-by-tag-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: ebs-loss-by-tag-sa\n  labels:\n    name: ebs-loss-by-tag-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: ebs-loss-by-tag-sa\n  labels:\n    name: ebs-loss-by-tag-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: ebs-loss-by-tag-sa\nsubjects:\n- kind: ServiceAccount\n  name: ebs-loss-by-tag-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes EBS_VOLUME_TAG Provide the common tag for the target volumes. It should be in the form key:value (e.g. 'team:devops') REGION The region name for the target volumes

    Variables Description Notes VOLUME_AFFECTED_PERC The Percentage of total ebs volumes to target Defaults to 0 (corresponds to 1 volume), provide numeric value only TOTAL_CHAOS_DURATION The time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The time duration between the attachment and detachment of the volumes (sec) Defaults to 30s SEQUENCE It defines sequence of chaos execution for multiple volumes Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/aws/ebs-loss-by-tag/#common-and-aws-specific-tunables","title":"Common and AWS specific tunables","text":"

    Refer to the common attributes and AWS-specific tunables to tune the common tunables for all experiments and the AWS-specific tunables.

    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#target-single-volume","title":"Target single volume","text":"

    It will detach a random single ebs volume with the given EBS_VOLUME_TAG tag and REGION region.

    Use the following example to tune this:

    # contains the tags for the ebs volumes \napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ebs-loss-by-tag-sa\n  experiments:\n  - name: ebs-loss-by-tag\n    spec:\n      components:\n        env:\n        # tag of the ebs volume\n        - name: EBS_VOLUME_TAG\n          value: 'key:value'\n        # region for the ebs volume\n        - name: REGION\n          value: '<region for EBS_VOLUME_TAG>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws/ebs-loss-by-tag/#target-percent-of-volumes","title":"Target Percent of volumes","text":"

    It will detach the VOLUME_AFFECTED_PERC percentage of ebs volumes with the given EBS_VOLUME_TAG tag and REGION region.

    Use the following example to tune this:

    # target percentage of the ebs volumes with the provided tag\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ebs-loss-by-tag-sa\n  experiments:\n  - name: ebs-loss-by-tag\n    spec:\n      components:\n        env:\n        # percentage of ebs volumes filter by tag\n        - name: VOLUME_AFFECTED_PERC\n          value: '100'\n        # tag of the ebs volume\n        - name: EBS_VOLUME_TAG\n          value: 'key:value'\n        # region for the ebs volume\n        - name: REGION\n          value: '<region for EBS_VOLUME_TAG>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
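    The optional SEQUENCE and RAMP_TIME tunables can be set in the same way. A minimal sketch, assuming serial detachment of the targeted volumes with a 10s wait before and after chaos injection (both values are illustrative):

    # detach the targeted volumes one by one with a ramp period\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ebs-loss-by-tag-sa\n  experiments:\n  - name: ebs-loss-by-tag\n    spec:\n      components:\n        env:\n        # order of chaos execution for multiple volumes (serial or parallel)\n        - name: SEQUENCE\n          value: 'serial'\n        # period (sec) to wait before and after injection of chaos\n        - name: RAMP_TIME\n          value: '10'\n        # tag of the ebs volume\n        - name: EBS_VOLUME_TAG\n          value: 'key:value'\n        # region for the ebs volume\n        - name: REGION\n          value: '<region for EBS_VOLUME_TAG>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n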
    "},{"location":"experiments/categories/aws/ec2-stop-by-id/","title":"EC2 Stop By ID","text":""},{"location":"experiments/categories/aws/ec2-stop-by-id/#introduction","title":"Introduction","text":"
    • It stops an EC2 instance, identified by an instance ID or a list of instance IDs, before bringing it back to the running state after the specified chaos duration.
    • It helps to check the performance of the application/process running on the ec2 instance. When MANAGED_NODEGROUP is set to enable, the experiment will not try to start the instance post chaos; instead, it will check for the addition of a new node instance to the cluster (see the example under Experiment Examples below).

    Scenario: Stop EC2 Instance

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in operator namespace (typically, litmus). If not, install from here
    • Ensure that the ec2-stop-by-id experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that you have sufficient AWS access to stop and start an ec2 instance.
    • Ensure that you have created a Kubernetes secret containing the AWS access configuration (key) in the CHAOS_NAMESPACE. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  cloud_config.yml: |-\n    # Add the cloud AWS credentials respectively\n    [default]\n    aws_access_key_id = XXXXXXXXXXXXXXXXXXX\n    aws_secret_access_key = XXXXXXXXXXXXXXX\n
    • If you change the secret key name (from cloud_config.yml), please also update the AWS_SHARED_CREDENTIALS_FILE ENV value in experiment.yaml with the same name.

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#warning","title":"WARNING","text":"

    If the target EC2 instance is part of a self-managed nodegroup: make sure to drain the target node if any application is running on it, and cordon the target node before running the experiment so that the experiment pods are not scheduled on it.
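    For example, assuming a worker node named <node-name> (the drain flags below are typical but may need adjusting for your workloads):

    kubectl cordon <node-name>\nkubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data\n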

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#default-validations","title":"Default Validations","text":"View the default validations
    • EC2 instance should be in a healthy state.
    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: ec2-stop-by-id-sa\n  namespace: default\n  labels:\n    name: ec2-stop-by-id-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: ec2-stop-by-id-sa\n  labels:\n    name: ec2-stop-by-id-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: ec2-stop-by-id-sa\n  labels:\n    name: ec2-stop-by-id-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: ec2-stop-by-id-sa\nsubjects:\n- kind: ServiceAccount\n  name: ec2-stop-by-id-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes EC2_INSTANCE_ID Instance ID of the target ec2 instance. Multiple IDs can also be provided as comma(,)-separated values Multiple IDs can be provided as id1,id2 REGION The region name of the target instance

    Variables Description Notes TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive instance stops. Defaults to 30s MANAGED_NODEGROUP Set to enable if the target instance is part of a self-managed nodegroup Defaults to disable SEQUENCE It defines the sequence of chaos execution for multiple instances Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/aws/ec2-stop-by-id/#common-and-aws-specific-tunables","title":"Common and AWS specific tunables","text":"

    Refer to the common attributes and AWS-specific tunables to tune the common tunables for all experiments and the AWS-specific tunables.

    "},{"location":"experiments/categories/aws/ec2-stop-by-id/#stop-instances-by-id","title":"Stop Instances By ID","text":"

    It contains a comma-separated list of instance IDs subjected to ec2 stop chaos. It can be tuned via the EC2_INSTANCE_ID ENV.

    Use the following example to tune this:

    # contains the instance id to be stopped\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ec2-stop-by-id-sa\n  experiments:\n  - name: ec2-stop-by-id\n    spec:\n      components:\n        env:\n        # id of the ec2 instance\n        - name: EC2_INSTANCE_ID\n          value: 'instance-1'\n        # region for the ec2 instance\n        - name: REGION\n          value: '<region for EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
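    For instances that belong to a self-managed nodegroup, MANAGED_NODEGROUP can be set to enable so that the experiment checks for the addition of a replacement node post chaos instead of restarting the stopped instances. A minimal sketch, assuming two hypothetical instance IDs:

    # stop instances that are part of a self-managed nodegroup\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ec2-stop-by-id-sa\n  experiments:\n  - name: ec2-stop-by-id\n    spec:\n      components:\n        env:\n        # the instances are part of a self-managed nodegroup\n        - name: MANAGED_NODEGROUP\n          value: 'enable'\n        # comma separated list of ec2 instance ids\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01,instance-02'\n        # region for the ec2 instances\n        - name: REGION\n          value: '<region for EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n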
    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/","title":"EC2 Stop By Tag","text":""},{"location":"experiments/categories/aws/ec2-stop-by-tag/#introduction","title":"Introduction","text":"
    • It stops an EC2 instance, identified by tag, before bringing it back to the running state after the specified chaos duration.
    • It helps to check the performance of the application/process running on the ec2 instance. When MANAGED_NODEGROUP is set to enable, the experiment will not try to start the instance post chaos; instead, it will check for the addition of a new node instance to the cluster.

    Scenario: Stop EC2 Instance

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in operator namespace (typically, litmus). If not, install from here
    • Ensure that the ec2-stop-by-tag experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that you have sufficient AWS access to stop and start an ec2 instance.
    • Ensure that you have created a Kubernetes secret containing the AWS access configuration (key) in the CHAOS_NAMESPACE. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  cloud_config.yml: |-\n    # Add the cloud AWS credentials respectively\n    [default]\n    aws_access_key_id = XXXXXXXXXXXXXXXXXXX\n    aws_secret_access_key = XXXXXXXXXXXXXXX\n
    • If you change the secret key name (from cloud_config.yml), please also update the AWS_SHARED_CREDENTIALS_FILE ENV value in experiment.yaml with the same name.

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#warning","title":"WARNING","text":"

    If the target EC2 instance is part of a self-managed nodegroup: make sure to drain the target node if any application is running on it, and cordon the target node before running the experiment so that the experiment pods are not scheduled on it.

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#default-validations","title":"Default Validations","text":"View the default validations
    • EC2 instance should be in a healthy state.
    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: ec2-stop-by-tag-sa\n  namespace: default\n  labels:\n    name: ec2-stop-by-tag-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: ec2-stop-by-tag-sa\n  labels:\n    name: ec2-stop-by-tag-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: ec2-stop-by-tag-sa\n  labels:\n    name: ec2-stop-by-tag-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: ec2-stop-by-tag-sa\nsubjects:\n- kind: ServiceAccount\n  name: ec2-stop-by-tag-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes EC2_INSTANCE_TAG Instance Tag to filter the target ec2 instance. The EC2_INSTANCE_TAG should be provided as key:value, e.g. team:devops REGION The region name of the target instance

    Variables Description Notes INSTANCE_AFFECTED_PERC The Percentage of total ec2 instances to target Defaults to 0 (corresponds to 1 instance), provide numeric value only TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive instance stops. Defaults to 30s MANAGED_NODEGROUP Set to enable if the target instance is part of a self-managed nodegroup Defaults to disable SEQUENCE It defines the sequence of chaos execution for multiple instances Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/aws/ec2-stop-by-tag/#common-and-aws-specific-tunables","title":"Common and AWS specific tunables","text":"

    Refer to the common attributes and AWS-specific tunables to tune the common tunables for all experiments and the AWS-specific tunables.

    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#target-single-instance","title":"Target single instance","text":"

    It will stop a random single ec2 instance with the given EC2_INSTANCE_TAG tag and the REGION region.

    Use the following example to tune this:

    # target the ec2 instances with matching tag\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ec2-stop-by-tag-sa\n  experiments:\n  - name: ec2-stop-by-tag\n    spec:\n      components:\n        env:\n        # tag of the ec2 instance\n        - name: EC2_INSTANCE_TAG\n          value: 'key:value'\n        # region for the ec2 instance\n        - name: REGION\n          value: '<region for instance>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws/ec2-stop-by-tag/#target-percent-of-instances","title":"Target Percent of instances","text":"

    It will stop the INSTANCE_AFFECTED_PERC percentage of ec2 instances with the given EC2_INSTANCE_TAG tag and REGION region.

    Use the following example to tune this:

    # percentage of ec2 instances to stop, filtered by the provided tag\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: ec2-stop-by-tag-sa\n  experiments:\n  - name: ec2-stop-by-tag\n    spec:\n      components:\n        env:\n        # percentage of ec2 instances filtered by tags\n        - name: INSTANCE_AFFECTED_PERC\n          value: '100'\n        # tag of the ec2 instance\n        - name: EC2_INSTANCE_TAG\n          value: 'key:value'\n        # region for the ec2 instance\n        - name: REGION\n          value: '<region for instance>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/AWS-SSM-experiments-tunables/","title":"AWS SSM experiments tunables","text":"

    It contains the aws-ssm specific experiment tunables.

    "},{"location":"experiments/categories/aws-ssm/AWS-SSM-experiments-tunables/#cpu-cores","title":"CPU Cores","text":"

    It stresses CPU_CORE cpu cores of the EC2_INSTANCE_ID ec2 instance in the REGION region for the TOTAL_CHAOS_DURATION duration.

    Use the following example to tune this:

    # provide the cpu cores to stress the ec2 instance\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # cpu cores for the stress\n        - name: CPU_CORE\n          value: '1'\n        # id of the ec2 instance\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/AWS-SSM-experiments-tunables/#memory-percentage","title":"Memory Percentage","text":"

    It stresses MEMORY_PERCENTAGE percent of the free memory of the EC2_INSTANCE_ID ec2 instance in the REGION region for the TOTAL_CHAOS_DURATION duration.

    Use the following example to tune this:

    # provide the memory percentage to stress the instance memory\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # memory percentage for the stress\n        - name: MEMORY_PERCENTAGE\n          value: '80'\n        # id of the ec2 instance\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/AWS-SSM-experiments-tunables/#ssm-docs","title":"SSM Docs","text":"

    It contains the details of the SSM docs, i.e., the name, type, format, and path of the ssm docs.

    Use the following example to tune this:

    ## provide the details of the ssm document details\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # name of the ssm docs\n        - name: DOCUMENT_NAME\n          value: 'AWS-SSM-Doc'\n        # format of the ssm docs\n        - name: DOCUMENT_FORMAT\n          value: 'YAML'\n        # type of the ssm docs\n        - name: DOCUMENT_TYPE\n          value: 'command'\n        # path of the ssm docs\n        - name: DOCUMENT_PATH\n          value: ''\n        # id of the ec2 instance\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/AWS-SSM-experiments-tunables/#workers-count","title":"Workers Count","text":"

    It contains the NUMBER_OF_WORKERS workers for the stress.

    Use the following example to tune this:

    # workers details used to stress the instance\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # number of workers used for stress\n        - name: NUMBER_OF_WORKERS\n          value: '1'\n        # id of the ec2 instance\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
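    The default stress docs install their dependencies (stress-ng) on the instance when INSTALL_DEPENDENCIES is True, which is the default. A minimal sketch, assuming the dependencies are already baked into the instance image so the installation step can be skipped:

    # skip installing the stress-ng dependencies on the instance\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # whether to install the dependencies for the default ssm docs (True/False)\n        - name: INSTALL_DEPENDENCIES\n          value: 'False'\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01'\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n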
    "},{"location":"experiments/categories/aws-ssm/AWS-SSM-experiments-tunables/#mutiple-iterations-of-chaos","title":"Mutiple Iterations Of Chaos","text":"

    Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between each iteration of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '15'\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: CPU_CORE\n          value: '1'\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01'\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n
    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/","title":"AWS SSM Chaos By ID","text":""},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#introduction","title":"Introduction","text":"
    • AWS SSM Chaos By ID contains chaos to disrupt the state of infra resources. The experiment can induce chaos on an AWS EC2 instance using Amazon SSM Run Command. This is carried out by using SSM Docs that define the actions performed by Systems Manager on your managed instances (having the SSM agent installed), which lets us perform chaos experiments on the instances.
    • It causes chaos (like stress, network, disk or IO) on AWS EC2 instances with the given instance ID(s) using SSM docs for a certain chaos duration.
    • For the default execution, the experiment uses SSM docs for stress-chaos, while you can add your own SSM docs using a configMap (.spec.definition.configMaps) in the ChaosExperiment CR (see the sketch after this list).
    • It tests deployment sanity (replica availability & uninterrupted service) and recovery workflows of the target application pod (if provided).
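    A minimal sketch of wiring a custom SSM doc into the experiment via a configMap; the configMap name custom-ssm-docs, the mount path /mnt, and the file name custom-chaos-doc.yml are hypothetical, and only the relevant excerpt of the ChaosExperiment CR is shown (verify the field names against your installed chaosexperiment manifest):

    # excerpt of the chaosexperiment CR (sketch): mount a custom ssm doc\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosExperiment\nmetadata:\n  name: aws-ssm-chaos-by-id\nspec:\n  definition:\n    # the configmap, created beforehand from the custom ssm doc file,\n    # gets mounted into the experiment pod at the given path\n    configMaps:\n    - name: custom-ssm-docs\n      mountPath: /mnt\n

    With this in place, the DOCUMENT_PATH ENV would be /mnt/custom-chaos-doc.yml, and DOCUMENT_NAME, DOCUMENT_TYPE & DOCUMENT_FORMAT should describe the custom doc (see the SSM Docs example on the AWS SSM experiments tunables page).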

    Scenario: AWS SSM Chaos

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in operator namespace (typically, litmus). If not, install from here
    • Ensure that the aws-ssm-chaos-by-id experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that you have the required AWS access and that your target EC2 instances have an IAM instance profile attached. To know more, check out the Systems Manager Docs.
    • Ensure that you have created a Kubernetes secret containing the AWS access configuration (key) in the CHAOS_NAMESPACE. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  cloud_config.yml: |-\n    # Add the cloud AWS credentials respectively\n    [default]\n    aws_access_key_id = XXXXXXXXXXXXXXXXXXX\n    aws_secret_access_key = XXXXXXXXXXXXXXX\n
    • If you change the secret key name (from cloud_config.yml), please also update the AWS_SHARED_CREDENTIALS_FILE ENV value in experiment.yaml with the same name.

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#default-validations","title":"Default Validations","text":"View the default validations
    • EC2 instance should be in a healthy state.
    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: aws-ssm-chaos-by-id-sa\n  namespace: default\n  labels:\n    name: aws-ssm-chaos-by-id-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: aws-ssm-chaos-by-id-sa\n  labels:\n    name: aws-ssm-chaos-by-id-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n# Create and monitor the experiment & helper pods\n- apiGroups: [\"\"]\n  resources: [\"pods\"]\n  verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n# Performs CRUD operations on the events inside chaosengine and chaosresult\n- apiGroups: [\"\"]\n  resources: [\"events\"]\n  verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n# Fetch configmaps & secrets details and mount them to the experiment pod (if specified)\n- apiGroups: [\"\"]\n  resources: [\"secrets\",\"configmaps\"]\n  verbs: [\"get\",\"list\"]\n# Track and get the runner, experiment, and helper pods logs\n- apiGroups: [\"\"]\n  resources: [\"pods/log\"]\n  verbs: [\"get\",\"list\",\"watch\"]\n# for creating and managing pods/exec to execute commands inside the target container\n- apiGroups: [\"\"]\n  resources: [\"pods/exec\"]\n  verbs: [\"get\",\"list\",\"create\"]\n# for configuring and monitoring the experiment job by the chaos-runner pod\n- apiGroups: [\"batch\"]\n  resources: [\"jobs\"]\n  verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n# for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n- apiGroups: [\"litmuschaos.io\"]\n  resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n  verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: aws-ssm-chaos-by-id-sa\n  labels:\n    name: aws-ssm-chaos-by-id-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: aws-ssm-chaos-by-id-sa\nsubjects:\n- kind: ServiceAccount\n  name: aws-ssm-chaos-by-id-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes EC2_INSTANCE_ID Instance ID of the target ec2 instance. Multiple IDs can also be provided as comma(,)-separated values Multiple IDs can be provided as id1,id2 REGION The region name of the target instance

    Variables Description Notes TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive chaos injections Defaults to 60s AWS_SHARED_CREDENTIALS_FILE Provide the path for aws secret credentials Defaults to /tmp/cloud_config.yml DOCUMENT_NAME Provide the name of added ssm docs (if not using the default docs) Defaults to LitmusChaos-AWS-SSM-Doc DOCUMENT_FORMAT Provide the format of the ssm docs. It can be YAML or JSON Defaults to YAML DOCUMENT_TYPE Provide the document type of added ssm docs (if not using the default docs) Defaults to Command DOCUMENT_PATH Provide the document path if added using configmaps Defaults to the litmus ssm docs path INSTALL_DEPENDENCIES Select to install dependencies used to run stress-ng with default docs. It can be either True or False Defaults to True NUMBER_OF_WORKERS Provide the number of workers to run stress-chaos with default ssm docs Defaults to 1 MEMORY_PERCENTAGE Provide the memory consumption in percentage on the instance for default ssm docs Defaults to 80 CPU_CORE Provide the number of cpu cores to run stress-chaos on EC2 with default ssm docs Defaults to 0. It means it'll consume all the available cpu cores on the instance SEQUENCE It defines the sequence of chaos execution for multiple instances Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#common-and-aws-ssm-specific-tunables","title":"Common and AWS-SSM specific tunables","text":"

    Refer to the common attributes and AWS-SSM-specific tunables to tune the common tunables for all experiments and the AWS-SSM-specific tunables.

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-id/#stress-instances-by-id","title":"Stress Instances By ID","text":"

    It contains a comma-separated list of instance IDs subjected to AWS SSM chaos. It can be tuned via the EC2_INSTANCE_ID ENV.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-id-sa\n  experiments:\n  - name: aws-ssm-chaos-by-id\n    spec:\n      components:\n        env:\n        # comma separated list of ec2 instance id(s)\n        # all instances should belong to the same region (REGION)\n        - name: EC2_INSTANCE_ID\n          value: 'instance-01,instance-02'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the EC2_INSTANCE_ID>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/","title":"AWS SSM Chaos By Tag","text":""},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#introduction","title":"Introduction","text":"
    • AWS SSM Chaos By Tag contains chaos to disrupt the state of infra resources. The experiment can induce chaos on an AWS EC2 instance using Amazon SSM Run Command. This is carried out by using SSM Docs that define the actions performed by Systems Manager on your managed instances (having the SSM agent installed), which lets you perform chaos experiments on the instances.
    • It causes chaos (like stress, network, disk or IO) on AWS EC2 instances with the given instance Tag using SSM docs for a certain chaos duration.
    • For the default execution, the experiment uses SSM docs for stress-chaos, while you can add your own SSM docs using a configMap (.spec.definition.configMaps) in the ChaosExperiment CR.
    • It tests deployment sanity (replica availability & uninterrupted service) and recovery workflows of the target application pod (if provided).

    Scenario: AWS SSM Chaos

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in operator namespace (typically, litmus). If not, install from here
    • Ensure that the aws-ssm-chaos-by-tag experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that you have the required AWS access and that your target EC2 instances have an IAM instance profile attached. To know more, check out the Systems Manager Docs.
    • Ensure that you have created a Kubernetes secret containing the AWS access configuration (key) in the CHAOS_NAMESPACE. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  cloud_config.yml: |-\n    # Add the cloud AWS credentials respectively\n    [default]\n    aws_access_key_id = XXXXXXXXXXXXXXXXXXX\n    aws_secret_access_key = XXXXXXXXXXXXXXX\n
    • If you change the secret key name (from cloud_config.yml), please also update the AWS_SHARED_CREDENTIALS_FILE ENV value in experiment.yaml with the same name.

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#default-validations","title":"Default Validations","text":"View the default validations
    • EC2 instance should be in a healthy state.
    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: aws-ssm-chaos-by-tag-sa\n  namespace: default\n  labels:\n    name: aws-ssm-chaos-by-tag-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: aws-ssm-chaos-by-tag-sa\n  labels:\n    name: aws-ssm-chaos-by-tag-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n# Create and monitor the experiment & helper pods\n- apiGroups: [\"\"]\n  resources: [\"pods\"]\n  verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n# Performs CRUD operations on the events inside chaosengine and chaosresult\n- apiGroups: [\"\"]\n  resources: [\"events\"]\n  verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n# Fetch configmaps & secrets details and mount them to the experiment pod (if specified)\n- apiGroups: [\"\"]\n  resources: [\"secrets\",\"configmaps\"]\n  verbs: [\"get\",\"list\"]\n# Track and get the runner, experiment, and helper pods logs\n- apiGroups: [\"\"]\n  resources: [\"pods/log\"]\n  verbs: [\"get\",\"list\",\"watch\"]\n# for creating and managing pods/exec to execute commands inside the target container\n- apiGroups: [\"\"]\n  resources: [\"pods/exec\"]\n  verbs: [\"get\",\"list\",\"create\"]\n# for configuring and monitoring the experiment job by the chaos-runner pod\n- apiGroups: [\"batch\"]\n  resources: [\"jobs\"]\n  verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n# for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n- apiGroups: [\"litmuschaos.io\"]\n  resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n  verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: aws-ssm-chaos-by-tag-sa\n  labels:\n    name: aws-ssm-chaos-by-tag-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: aws-ssm-chaos-by-tag-sa\nsubjects:\n- kind: ServiceAccount\n  name: aws-ssm-chaos-by-tag-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes EC2_INSTANCE_TAG Instance Tag to filter the target ec2 instance. The EC2_INSTANCE_TAG should be provided as key:value, e.g. chaos:ssm REGION The region name of the target instance

    Variables Description Notes INSTANCE_AFFECTED_PERC The Percentage of total ec2 instances to target Defaults to 0 (corresponds to 1 instance), provide numeric value only TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive chaos injections Defaults to 60s AWS_SHARED_CREDENTIALS_FILE Provide the path for aws secret credentials Defaults to /tmp/cloud_config.yml DOCUMENT_NAME Provide the name of added ssm docs (if not using the default docs) Defaults to LitmusChaos-AWS-SSM-Doc DOCUMENT_FORMAT Provide the format of the ssm docs. It can be YAML or JSON Defaults to YAML DOCUMENT_TYPE Provide the document type of added ssm docs (if not using the default docs) Defaults to Command DOCUMENT_PATH Provide the document path if added using configmaps Defaults to the litmus ssm docs path INSTALL_DEPENDENCIES Select to install dependencies used to run stress-ng with default docs. It can be either True or False Defaults to True NUMBER_OF_WORKERS Provide the number of workers to run stress-chaos with default ssm docs Defaults to 1 MEMORY_PERCENTAGE Provide the memory consumption in percentage on the instance for default ssm docs Defaults to 80 CPU_CORE Provide the number of cpu cores to run stress-chaos on EC2 with default ssm docs Defaults to 0. It means it'll consume all the available cpu cores on the instance SEQUENCE It defines the sequence of chaos execution for multiple instances Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#common-and-aws-ssm-specific-tunables","title":"Common and AWS-SSM specific tunables","text":"

    Refer to the common attributes and AWS-SSM-specific tunables to tune the common tunables for all experiments and the AWS-SSM-specific tunables.

    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#target-single-instance","title":"Target single instance","text":"

    It will stress a random single ec2 instance with the given EC2_INSTANCE_TAG tag and REGION region.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-tag-sa\n  experiments:\n  - name: aws-ssm-chaos-by-tag\n    spec:\n      components:\n        env:\n        # tag of the ec2 instances\n        - name: EC2_INSTANCE_TAG\n          value: 'key:value'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the ec2 instances>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/aws-ssm/aws-ssm-chaos-by-tag/#target-percent-of-instances","title":"Target Percent of instances","text":"

    It will stress the INSTANCE_AFFECTED_PERC percentage of ec2 instances with the given EC2_INSTANCE_TAG tag and REGION region.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: aws-ssm-chaos-by-tag-sa\n  experiments:\n  - name: aws-ssm-chaos-by-tag\n    spec:\n      components:\n        env:\n        # percentage of the ec2 instances filtered by tags\n        - name: INSTANCE_AFFECTED_PERC\n          value: '100'\n        # tag of the ec2 instances\n        - name: EC2_INSTANCE_TAG\n          value: 'key:value'\n        # region of the ec2 instance\n        - name: REGION\n          value: '<region of the ec2 instances>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/azure/azure-disk-loss/","title":"Azure Disk Loss","text":""},{"location":"experiments/categories/azure/azure-disk-loss/#introduction","title":"Introduction","text":"
    • It causes detachment of a virtual disk from an Azure instance before re-attaching it to the instance after the specified chaos duration.
    • It helps to check the performance of the application/process running on the instance.

    Scenario: Detach the virtual disk from instance

    "},{"location":"experiments/categories/azure/azure-disk-loss/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/azure/azure-disk-loss/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in operator namespace (typically, litmus). If not, install from here
    • Ensure that the azure-disk-loss experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that you have sufficient Azure access to detach and attach a disk.
    • We will use azure file-based authentication to connect with the instance using the azure GO SDK in the experiment. To generate the auth file, run the Azure CLI command az ad sp create-for-rbac --sdk-auth > azure.auth.
    • Ensure that you have created a Kubernetes secret containing the auth file generated in the previous step in the CHAOS_NAMESPACE. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  azure.auth: |-\n    {\n      \"clientId\": \"XXXXXXXXX\",\n      \"clientSecret\": \"XXXXXXXXX\",\n      \"subscriptionId\": \"XXXXXXXXX\",\n      \"tenantId\": \"XXXXXXXXX\",\n      \"activeDirectoryEndpointUrl\": \"XXXXXXXXX\",\n      \"resourceManagerEndpointUrl\": \"XXXXXXXXX\",\n      \"activeDirectoryGraphResourceId\": \"XXXXXXXXX\",\n      \"sqlManagementEndpointUrl\": \"XXXXXXXXX\",\n      \"galleryEndpointUrl\": \"XXXXXXXXX\",\n      \"managementEndpointUrl\": \"XXXXXXXXX\"\n    }\n
    • If you change the secret key name (from azure.auth), please also update the AZURE_AUTH_LOCATION ENV value in experiment.yaml with the same name.
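    Alternatively, the same secret can be created directly from the generated auth file; a one-liner sketch, assuming litmus is the CHAOS_NAMESPACE:

    kubectl create secret generic cloud-secret --from-file=azure.auth -n litmus\n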

    "},{"location":"experiments/categories/azure/azure-disk-loss/#default-validations","title":"Default Validations","text":"View the default validations
    • Azure Disk should be connected to an instance.
    "},{"location":"experiments/categories/azure/azure-disk-loss/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: azure-disk-loss-sa\n  namespace: default\n  labels:\n    name: azure-disk-loss-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: azure-disk-loss-sa\n  namespace: default\n  labels:\n    name: azure-disk-loss-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: azure-disk-loss-sa\n  namespace: default\n  labels:\n    name: azure-disk-loss-sa\n    app.kubernetes.io/part-of: litmus\n# binds the above role to the experiment service account\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: azure-disk-loss-sa\nsubjects:\n- kind: ServiceAccount\n  name: azure-disk-loss-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/azure/azure-disk-loss/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes VIRTUAL_DISK_NAMES Names of the virtual disks to target. Provide comma-separated names for multiple disks RESOURCE_GROUP The resource group of the target disk(s)

    Variables Description Notes SCALE_SET Whether the disk is connected to a Scale Set instance Accepts \"enable\"/\"disable\". Default is \"disable\" TOTAL_CHAOS_DURATION The total time duration for chaos insertion (sec) Defaults to 30s CHAOS_INTERVAL The interval (in sec) between successive disk detachments. Defaults to 30s SEQUENCE It defines the sequence of chaos execution for multiple disks Default value: parallel. Supported: serial, parallel RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/azure/azure-disk-loss/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/azure/azure-disk-loss/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/azure/azure-disk-loss/#detach-virtual-disks-by-name","title":"Detach Virtual Disks By Name","text":"

    It contains a comma-separated list of disk names subjected to disk loss chaos. It can be tuned via the VIRTUAL_DISK_NAMES ENV.

    Use the following example to tune this:

    # detach multiple azure disks by their names \napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: azure-disk-loss-sa\n  experiments:\n  - name: azure-disk-loss\n    spec:\n      components:\n        env:\n        # comma separated names of the azure disks attached to VMs\n        - name: VIRTUAL_DISK_NAMES\n          value: 'disk-01,disk-02'\n        # name of the resource group\n        - name: RESOURCE_GROUP\n          value: '<resource group of VIRTUAL_DISK_NAMES>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/azure/azure-disk-loss/#detach-virtual-disks-attached-to-scale-set-instances-by-name","title":"Detach Virtual Disks Attached to Scale Set Instances By Name","text":"

    It contains a comma-separated list of disk names attached to scale set instances subjected to disk loss chaos. It can be tuned via the VIRTUAL_DISK_NAMES and SCALE_SET ENVs.

    Use the following example to tune this:

    # detach multiple azure disks attached to scale set VMs by their names\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: azure-disk-loss-sa\n  experiments:\n  - name: azure-disk-loss\n    spec:\n      components:\n        env:\n        # comma separated names of the azure disks attached to scaleset VMs\n        - name: VIRTUAL_DISK_NAMES\n          value: 'disk-01,disk-02'\n        # name of the resource group\n        - name: RESOURCE_GROUP\n          value: '<resource group of VIRTUAL_DISK_NAMES>'\n        # VM belongs to scaleset or not\n        - name: SCALE_SET\n          value: 'enable'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/azure/azure-disk-loss/#multiple-iterations-of-chaos","title":"Multiple Iterations Of Chaos","text":"

    Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between each iteration of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: azure-disk-loss-sa\n  experiments:\n  - name: azure-disk-loss\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '10'\n         # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: VIRTUAL_DISK_NAMES\n          value: 'disk-01,disk-02'\n        - name: RESOURCE_GROUP\n          value: '<resource group of VIRTUAL_DISK_NAMES>'\n
    "},{"location":"experiments/categories/azure/azure-instance-stop/","title":"Azure Instance Stop","text":""},{"location":"experiments/categories/azure/azure-instance-stop/#introduction","title":"Introduction","text":"
    • It causes power-off of an Azure instance before bringing it back to the running state after the specified chaos duration.
    • It helps to check the performance of the application/process running on the instance.

    Scenario: Stop the azure instance

    "},{"location":"experiments/categories/azure/azure-instance-stop/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/azure/azure-instance-stop/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in operator namespace (typically, litmus). If not, install from here
    • Ensure that the azure-instance-stop experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that you have sufficient Azure access to stop and start an instance.
    • We will use azure file-based authentication to connect with the instance using the azure GO SDK in the experiment. To generate the auth file, run the Azure CLI command az ad sp create-for-rbac --sdk-auth > azure.auth.
    • Ensure that you have created a Kubernetes secret containing the auth file generated in the previous step in the CHAOS_NAMESPACE. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  azure.auth: |-\n    {\n      \"clientId\": \"XXXXXXXXX\",\n      \"clientSecret\": \"XXXXXXXXX\",\n      \"subscriptionId\": \"XXXXXXXXX\",\n      \"tenantId\": \"XXXXXXXXX\",\n      \"activeDirectoryEndpointUrl\": \"XXXXXXXXX\",\n      \"resourceManagerEndpointUrl\": \"XXXXXXXXX\",\n      \"activeDirectoryGraphResourceId\": \"XXXXXXXXX\",\n      \"sqlManagementEndpointUrl\": \"XXXXXXXXX\",\n      \"galleryEndpointUrl\": \"XXXXXXXXX\",\n      \"managementEndpointUrl\": \"XXXXXXXXX\"\n    }\n
    • If you change the secret key name (from azure.auth), please also update the AZURE_AUTH_LOCATION ENV value in experiment.yaml with the same name.

    "},{"location":"experiments/categories/azure/azure-instance-stop/#default-validations","title":"Default Validations","text":"View the default validations
    • Azure instance should be in a healthy state.
    "},{"location":"experiments/categories/azure/azure-instance-stop/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: azure-instance-stop-sa\n  namespace: default\n  labels:\n    name: azure-instance-stop-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: azure-instance-stop-sa\n  labels:\n    name: azure-instance-stop-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: azure-instance-stop-sa\n  labels:\n    name: azure-instance-stop-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: azure-instance-stop-sa\nsubjects:\n- kind: ServiceAccount\n  name: azure-instance-stop-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/azure/azure-instance-stop/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Mandatory Fields:

    • AZURE_INSTANCE_NAMES: Instance name(s) of the target Azure instance(s). For AKS nodes, the instance name is taken from the scale set section in Azure and not from the node name in the AKS node pool (see the CLI sketch after the optional fields below).
    • RESOURCE_GROUP: The resource group of the target instance.

    Optional Fields:

    • SCALE_SET: Whether the instance is part of a scale set. Accepts enable/disable; default is disable.
    • TOTAL_CHAOS_DURATION: The total time duration for chaos insertion (sec). Defaults to 30s.
    • CHAOS_INTERVAL: The interval (in sec) between successive instance power-offs. Defaults to 30s.
    • SEQUENCE: It defines the sequence of chaos execution for multiple instances. Default value: parallel. Supported: serial, parallel.
    • RAMP_TIME: Period to wait before and after injection of chaos (sec).
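    Regarding the AKS note under AZURE_INSTANCE_NAMES above, the scale set instance names can be listed with the az CLI. A minimal sketch, where the resource group and scale set names are placeholders:

    az vmss list-instances \\\n  --resource-group MC_myResourceGroup_myAKSCluster_eastus \\\n  --name aks-nodepool1-12345678-vmss \\\n  --query '[].name' --output tsv\n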

    "},{"location":"experiments/categories/azure/azure-instance-stop/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/azure/azure-instance-stop/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/azure/azure-instance-stop/#stop-instances-by-name","title":"Stop Instances By Name","text":"

    It contains a comma-separated list of instance names subjected to the instance stop chaos. It can be tuned via the AZURE_INSTANCE_NAMES ENV.

    Use the following example to tune this:

    ## contains the azure instance details\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: azure-instance-stop-sa\n  experiments:\n  - name: azure-instance-stop\n    spec:\n      components:\n        env:\n        # comma separated list of azure instance names\n        - name: AZURE_INSTANCE_NAMES\n          value: 'instance-01,instance-02'\n        # name of the resource group\n        - name: RESOURCE_GROUP\n          value: '<resource group of AZURE_INSTANCE_NAME>'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/azure/azure-instance-stop/#stop-scale-set-instances","title":"Stop Scale Set Instances","text":"

    It contains a comma-separated list of instance names, belonging to a scale set or AKS, subjected to the instance stop chaos. Scale set membership can be tuned via the SCALE_SET ENV.

    Use the following example to tune this:

    ## contains the azure instance details for scale set instances or AKS nodes\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: azure-instance-stop-sa\n  experiments:\n  - name: azure-instance-stop\n    spec:\n      components:\n        env:\n        # comma separated list of azure instance names\n        - name: AZURE_INSTANCE_NAMES\n          value: 'instance-01,instance-02'\n        # name of the resource group\n        - name: RESOURCE_GROUP\n          value: '<resource group of Scale set>'\n        # accepts enable/disable value. default is disable\n        - name: SCALE_SET\n          value: 'enable'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/azure/azure-instance-stop/#multiple-iterations-of-chaos","title":"Multiple Iterations Of Chaos","text":"

    Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: azure-instance-stop-sa\n  experiments:\n  - name: azure-instance-stop\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '10'\n         # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: AZURE_INSTANCE_NAMES\n          value: 'instance-01,instance-02'\n        - name: RESOURCE_GROUP\n          value: '<resource group of AZURE_INSTANCE_NAME>'\n
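    With the values above, chaos runs for a total of 60s with a 10s gap between power-offs, i.e. roughly 60/10 = 6 iterations, assuming the usual Litmus convention that the iteration count is TOTAL_CHAOS_DURATION divided by CHAOS_INTERVAL.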
    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/","title":"Common tunables for all experiments","text":"

    It contains tunables that are common to all the experiments. These tunables can be provided at .spec.experiment[*].spec.components.env in the chaosengine.

    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/#duration-of-the-chaos","title":"Duration of the chaos","text":"

    It defines the total time duration of the chaos injection. It can be tuned with the TOTAL_CHAOS_DURATION ENV and is provided in seconds.

    Use the following example to tune this:

    # define the total chaos duration\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/#ramp-time","title":"Ramp Time","text":"

    It defines the period to wait before and after the injection of chaos. It can be tuned with the RAMP_TIME ENV and is provided in seconds.

    Use the following example to tune this:

    # waits for the ramp time before and after injection of chaos \napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # waits for the time interval before and after injection of chaos\n        - name: RAMP_TIME\n          value: '10' # in seconds\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/#sequence-of-chaos-execution","title":"Sequence of chaos execution","text":"

    It defines the sequence of the chaos execution in the case of multiple targets. It can be tuned with the SEQUENCE ENV. It supports the following modes:

    • parallel: The chaos is injected in all the targets at once.
    • serial: The chaos is injected in all the targets one by one.

    The default value of SEQUENCE is parallel.

    Use the following example to tune this:

    # define the order of execution of chaos in case of multiple targets\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # define the sequence of execution of chaos in case of multiple targets\n        # supports: serial, parallel. default: parallel\n        - name: SEQUENCE\n          value: 'parallel'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/#name-of-chaos-library","title":"Name of chaos library","text":"

    It defines the name of the chaos library used for the chaos injection. It can be tuned with the LIB ENV.

    Use the following example to tune this:

    # lib for the chaos injection\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # defines the name of the chaoslib used for the experiment\n        - name: LIB\n          value: 'litmus'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/#instance-id","title":"Instance ID","text":"

    It is a user-defined string that holds metadata/info about the current run/instance of chaos, e.g. 04-05-2020-9-00. This string is appended as a suffix to the chaosresult CR name. It can be tuned with the INSTANCE_ID ENV.

    Use the following example to tune this:

    # provide to append user-defined suffix in the end of chaosresult name\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # user-defined string appended as suffix in the chaosresult name\n        - name: INSTANCE_ID\n          value: '123'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
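    With the above engine, the chaosresult CR would typically be named engine-nginx-pod-delete-123, assuming the usual <engine-name>-<experiment-name> naming with the INSTANCE_ID appended as the suffix.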
    "},{"location":"experiments/categories/common/common-tunables-for-all-experiments/#image-used-by-the-helper-pod","title":"Image used by the helper pod","text":"

    It defines the image used to launch the helper pod, if applicable. It can be tuned with the LIB_IMAGE ENV. It is supported by the [container-kill, network-experiments, stress-experiments, dns-experiments, disk-fill, kubelet-service-kill, docker-service-kill, node-restart] experiments.

    Use the following example to tune this:

    # it contains the lib image used for the helper pod\n# it supports [container-kill, network-experiments, stress-experiments, dns-experiments, disk-fill,\n# kubelet-service-kill, docker-service-kill, node-restart] experiments\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        env:\n        # name of the lib image\n        - name: LIB_IMAGE\n          value: 'litmuschaos/go-runner:latest'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/","title":"GCP VM Disk Loss By Label","text":""},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#introduction","title":"Introduction","text":"
    • It causes chaos to disrupt the state of GCP persistent disk volumes, filtered using a label, by detaching them from their VM instances for a certain chaos duration.

    Scenario: detach the gcp disk

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.17
    • Ensure that the Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the gcp-vm-disk-loss-by-label experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that your service account has editor or owner access for the GCP project.
    • Ensure that the target disk volume is not a boot disk of any VM instance.
    • Ensure that you create a Kubernetes secret containing the GCP service account credentials in the default namespace. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  type: \n  project_id: \n  private_key_id: \n  private_key: \n  client_email: \n  client_id: \n  auth_uri: \n  token_uri: \n  auth_provider_x509_cert_url: \n  client_x509_cert_url: \n
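    The values for these stringData fields come from a GCP service account key file. As a sketch (the service account email is a placeholder), such a key file can be generated with the gcloud CLI, and the fields of the resulting key.json map one-to-one onto the entries above:

    gcloud iam service-accounts keys create key.json \\\n  --iam-account=litmus-sa@my-project-4513.iam.gserviceaccount.com\n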
    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#default-validations","title":"Default Validations","text":"View the default validations
    • All the disk volumes having the target label are attached to their respective instances
    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: gcp-vm-disk-loss-by-label-sa\n  namespace: default\n  labels:\n    name: gcp-vm-disk-loss-by-label-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: gcp-vm-disk-loss-by-label-sa\n  labels:\n    name: gcp-vm-disk-loss-by-label-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap & secret details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: gcp-vm-disk-loss-by-label-sa\n  labels:\n    name: gcp-vm-disk-loss-by-label-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: gcp-vm-disk-loss-by-label-sa\nsubjects:\n- kind: ServiceAccount\n  name: gcp-vm-disk-loss-by-label-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Mandatory Fields:

    • GCP_PROJECT_ID: The ID of the GCP project that the disk volumes are a part of. All the target disk volumes should belong to a single GCP project.
    • DISK_VOLUME_LABEL: Label of the targeted non-boot persistent disk volume(s). Should be provided as key:value, or key if the corresponding value is empty, e.g. disk:target-disk.
    • ZONES: The zone of the target disk volumes. Only one zone can be provided, i.e. all target disks should lie in the same zone.

    Optional Fields:

    • TOTAL_CHAOS_DURATION: The total time duration for chaos insertion (sec). Defaults to 30s.
    • CHAOS_INTERVAL: The interval (in sec) between successive chaos iterations. Defaults to 30s.
    • DISK_AFFECTED_PERC: The percentage of the total disks filtered using the label to target. Defaults to 0 (corresponds to 1 disk); provide numeric values only.
    • SEQUENCE: It defines the sequence of chaos execution for multiple disks. Default value: parallel. Supported: serial, parallel.
    • RAMP_TIME: Period to wait before and after injection of chaos (sec).

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#detach-volumes-by-label","title":"Detach Volumes By Label","text":"

    It contains the label of the disk volumes to be subjected to disk loss chaos. It will detach all the disks with the label DISK_VOLUME_LABEL in the ZONES zone within the GCP_PROJECT_ID project. It re-attaches the disk volumes after waiting for the specified TOTAL_CHAOS_DURATION.

    NOTE: The DISK_VOLUME_LABEL accepts only one label and ZONES also accepts only one zone name. Therefore, all the disks must lie in the same zone.
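    As a sketch of how such a label can be attached to a target disk beforehand (the disk name, label, and zone are placeholders):

    gcloud compute disks add-labels target-disk \\\n  --labels=disk=target-disk \\\n  --zone=us-east1-b\n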

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-disk-loss-by-label-sa\n  experiments:\n  - name: gcp-vm-disk-loss-by-label\n    spec:\n      components:\n        env:\n        - name: DISK_VOLUME_LABEL\n          value: 'disk:target-disk'\n\n        - name: ZONES\n          value: 'us-east1-b'\n\n        - name: GCP_PROJECT_ID\n          value: 'my-project-4513'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss-by-label/#mutiple-iterations-of-chaos","title":"Mutiple Iterations Of Chaos","text":"

    Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-disk-loss-by-label-sa\n  experiments:\n  - name: gcp-vm-disk-loss-by-label\n    spec:\n      components:\n        env:\n        - name: CHAOS_INTERVAL\n          value: '15'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n\n        - name: DISK_VOLUME_LABEL\n          value: 'disk:target-disk'\n\n        - name: ZONES\n          value: 'us-east1-b'\n\n        - name: GCP_PROJECT_ID\n          value: 'my-project-4513'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/","title":"GCP VM Disk Loss","text":""},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#introduction","title":"Introduction","text":"
    • It causes chaos to disrupt the state of a GCP persistent disk volume by detaching it from its VM instance, identified by the disk name, for a certain chaos duration.

    Scenario: detach the gcp disk

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the gcp-vm-disk-loss experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that your service account has editor or owner access for the GCP project.
    • Ensure that the target disk volume is not a boot disk of any VM instance.
    • Ensure that you create a Kubernetes secret containing the GCP service account credentials in the default namespace. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  type: \n  project_id: \n  private_key_id: \n  private_key: \n  client_email: \n  client_id: \n  auth_uri: \n  token_uri: \n  auth_provider_x509_cert_url: \n  client_x509_cert_url: \n
    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#default-validations","title":"Default Validations","text":"View the default validations
    • Disk volumes are attached to their respective instances
    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: gcp-vm-disk-loss-sa\n  namespace: default\n  labels:\n    name: gcp-vm-disk-loss-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: gcp-vm-disk-loss-sa\n  labels:\n    name: gcp-vm-disk-loss-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap & secret details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: gcp-vm-disk-loss-sa\n  labels:\n    name: gcp-vm-disk-loss-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: gcp-vm-disk-loss-sa\nsubjects:\n- kind: ServiceAccount\n  name: gcp-vm-disk-loss-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Mandatory Fields:

    • GCP_PROJECT_ID: The ID of the GCP project that the disk volumes are a part of. All the target disk volumes should belong to a single GCP project.
    • DISK_VOLUME_NAMES: Target non-boot persistent disk volume names. Multiple disk volume names can be provided as disk1,disk2,...
    • ZONES: The zones of the respective target disk volumes. Provide the zone for every target disk name as zone1,zone2,... in the respective order of DISK_VOLUME_NAMES.
    • DEVICE_NAMES: The device names of the respective target disk volumes. Provide the device name for every target disk name as deviceName1,deviceName2,... in the respective order of DISK_VOLUME_NAMES.

    Optional Fields:

    • TOTAL_CHAOS_DURATION: The total time duration for chaos insertion (sec). Defaults to 30s.
    • CHAOS_INTERVAL: The interval (in sec) between successive chaos iterations. Defaults to 30s.
    • SEQUENCE: It defines the sequence of chaos execution for multiple disks. Default value: parallel. Supported: serial, parallel.
    • RAMP_TIME: Period to wait before and after injection of chaos (sec).

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#detach-volumes-by-names","title":"Detach Volumes By Names","text":"

    It contains a comma-separated list of volume names subjected to disk loss chaos. It will detach all the disks with the given DISK_VOLUME_NAMES disk names, the corresponding ZONES zone names, and the DEVICE_NAMES device names in the GCP_PROJECT_ID project. It re-attaches the volumes after waiting for the specified TOTAL_CHAOS_DURATION.

    NOTE: The DISK_VOLUME_NAMES contains multiple comma-separated disk names. The comma-separated zone names should be provided in the same order as disk names.

    Use the following example to tune this:

    ## details of the gcp disk\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-disk-loss-sa\n  experiments:\n  - name: gcp-vm-disk-loss\n    spec:\n      components:\n        env:\n        # comma separated list of disk volume names\n        - name: DISK_VOLUME_NAMES\n          value: 'disk-01,disk-02'\n        # comma separated list of zone names corresponds to the DISK_VOLUME_NAMES\n        # it should be provided in same order of DISK_VOLUME_NAMES\n        - name: ZONES\n          value: 'zone-01,zone-02'\n        # comma separated list of device names corresponds to the DISK_VOLUME_NAMES\n        # it should be provided in same order of DISK_VOLUME_NAMES\n        - name: DEVICE_NAMES\n          value: 'device-01,device-02'\n        # gcp project id to which disk volume belongs\n        - name: GCP_PROJECT_ID\n          value: 'project-id'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-disk-loss/#mutiple-iterations-of-chaos","title":"Mutiple Iterations Of Chaos","text":"

    Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-disk-loss-sa\n  experiments:\n  - name: gcp-vm-disk-loss\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '15'\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: DISK_VOLUME_NAMES\n          value: 'disk-01,disk-02'\n        - name: ZONES\n          value: 'zone-01,zone-02'\n        - name: DEVICE_NAMES\n          value: 'device-01,device-02'\n        - name: GCP_PROJECT_ID\n          value: 'project-id'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/","title":"GCP VM Instance Stop By Label","text":""},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#introduction","title":"Introduction","text":"
    • It causes a power-off of the GCP VM instances filtered by a label, before bringing them back to the running state after the specified chaos duration.
    • It helps to check the performance of the application/process running on the VM instance.
    • When MANAGED_INSTANCE_GROUP is set to enable, the experiment will not try to start the instances post-chaos; instead, it will check for the addition of new instances to the instance group.

    Scenario: stop the gcp vm

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the gcp-vm-instance-stop-by-label experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that you have sufficient GCP permissions to stop and start the GCP VM instances.
    • Ensure that you create a Kubernetes secret containing the GCP service account credentials in the default namespace. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  type: \n  project_id: \n  private_key_id: \n  private_key: \n  client_email: \n  client_id: \n  auth_uri: \n  token_uri: \n  auth_provider_x509_cert_url: \n  client_x509_cert_url: \n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#default-validations","title":"Default Validations","text":"View the default validations
    • All the VM instances having the target label are in a healthy state
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: gcp-vm-instance-stop-by-label-sa\n  namespace: default\n  labels:\n    name: gcp-vm-instance-stop-by-label-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: gcp-vm-instance-stop-by-label-sa\n  labels:\n    name: gcp-vm-instance-stop-by-label-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap & secret details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for the experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: gcp-vm-instance-stop-by-label-sa\n  labels:\n    name: gcp-vm-instance-stop-by-label-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: gcp-vm-instance-stop-by-label-sa\nsubjects:\n- kind: ServiceAccount\n  name: gcp-vm-instance-stop-by-label-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Mandatory Fields:

    • GCP_PROJECT_ID: GCP project ID to which the VM instances belong. All the VM instances must belong to a single GCP project.
    • INSTANCE_LABEL: Label of the target VM instances. Should be provided as key:value, or key if the corresponding value is empty, e.g. vm:target-vm.
    • ZONES: The zone of the target VM instances. Only one zone can be provided, i.e. all target instances should lie in the same zone.

    Optional Fields:

    • TOTAL_CHAOS_DURATION: The total time duration for chaos insertion (sec). Defaults to 30s.
    • CHAOS_INTERVAL: The interval (in sec) between successive instance terminations. Defaults to 30s.
    • MANAGED_INSTANCE_GROUP: Set to enable if the target instances are part of a managed instance group. Defaults to disable.
    • INSTANCE_AFFECTED_PERC: The percentage of the total VMs filtered using the label to target. Defaults to 0 (corresponds to 1 instance); provide numeric values only.
    • SEQUENCE: It defines the sequence of chaos execution for multiple instances. Default value: parallel. Supported: serial, parallel.
    • RAMP_TIME: Period to wait before and after injection of chaos (sec).

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#target-gcp-instances","title":"Target GCP Instances","text":"

    It will stop all the instances filtered by the label INSTANCE_LABEL in the ZONES zone within the GCP_PROJECT_ID project.

    NOTE: The INSTANCE_LABEL accepts only one label and ZONES also accepts only one zone name. Therefore, all the instances must lie in the same zone.
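    As a sketch of how such a label can be attached to a target VM instance beforehand (the instance name, label, and zone are placeholders):

    gcloud compute instances add-labels target-vm-instance \\\n  --labels=vm=target-vm \\\n  --zone=us-east1-b\n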

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-by-label-sa\n  experiments:\n  - name: gcp-vm-instance-stop-by-label\n    spec:\n      components:\n        env:\n        - name: INSTANCE_LABEL\n          value: 'vm:target-vm'\n\n        - name: ZONES\n          value: 'us-east1-b'\n\n        - name: GCP_PROJECT_ID\n          value: 'my-project-4513'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#manged-instance-group","title":"Manged Instance Group","text":"

    If the VM instances belong to a managed instance group, provide MANAGED_INSTANCE_GROUP as enable; otherwise provide it as disable, which is the default value.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-by-label-sa\n  experiments:\n  - name: gcp-vm-instance-stop-by-label\n    spec:\n      components:\n        env:\n        - name: MANAGED_INSTANCE_GROUP\n          value: 'enable'\n\n        - name: INSTANCE_LABEL\n          value: 'vm:target-vm'\n\n        - name: ZONES\n          value: 'us-east1-b'\n\n        - name: GCP_PROJECT_ID\n          value: 'my-project-4513'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop-by-label/#mutiple-iterations-of-chaos","title":"Mutiple Iterations Of Chaos","text":"

    Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-by-label-sa\n  experiments:\n  - name: gcp-vm-instance-stop-by-label\n    spec:\n      components:\n        env:\n        - name: CHAOS_INTERVAL\n          value: '15'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n\n        - name: INSTANCE_LABEL\n          value: 'vm:target-vm'\n\n        - name: ZONES\n          value: 'us-east1-b'\n\n        - name: GCP_PROJECT_ID\n          value: 'my-project-4513'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/","title":"GCP VM Instance Stop","text":""},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#introduction","title":"Introduction","text":"
    • It causes a power-off of GCP VM instances, specified by an instance name or a list of instance names, before bringing them back to the running state after the specified chaos duration.
    • It helps to check the performance of the application/process running on the VM instance.
    • When MANAGED_INSTANCE_GROUP is set to enable, the experiment will not try to start the instances post-chaos; instead, it will check for the addition of new instances to the instance group.

    Scenario: stop the gcp vm

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the gcp-vm-instance-stop experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that you have sufficient GCP permissions to stop and start the GCP VM instances.
    • Ensure that you create a Kubernetes secret containing the GCP service account credentials in the default namespace. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  type: \n  project_id: \n  private_key_id: \n  private_key: \n  client_email: \n  client_id: \n  auth_uri: \n  token_uri: \n  auth_provider_x509_cert_url: \n  client_x509_cert_url: \n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#default-validations","title":"Default Validations","text":"View the default validations
    • VM instance should be in a healthy state.
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: gcp-vm-instance-stop-sa\n  namespace: default\n  labels:\n    name: gcp-vm-instance-stop-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: gcp-vm-instance-stop-sa\n  labels:\n    name: gcp-vm-instance-stop-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap & secret details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for the experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: gcp-vm-instance-stop-sa\n  labels:\n    name: gcp-vm-instance-stop-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: gcp-vm-instance-stop-sa\nsubjects:\n- kind: ServiceAccount\n  name: gcp-vm-instance-stop-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Mandatory Fields:

    • GCP_PROJECT_ID: GCP project ID to which the VM instances belong. All the VM instances must belong to a single GCP project.
    • VM_INSTANCE_NAMES: Names of the target VM instances. Multiple instance names can be provided as instance1,instance2,...
    • ZONES: The zones of the target VM instances. The zone for every instance name has to be provided as zone1,zone2,... in the same order as VM_INSTANCE_NAMES.

    Optional Fields:

    • TOTAL_CHAOS_DURATION: The total time duration for chaos insertion (sec). Defaults to 30s.
    • CHAOS_INTERVAL: The interval (in sec) between successive instance terminations. Defaults to 30s.
    • MANAGED_INSTANCE_GROUP: Set to enable if the target instances are part of a managed instance group. Defaults to disable.
    • SEQUENCE: It defines the sequence of chaos execution for multiple instances. Default value: parallel. Supported: serial, parallel.
    • RAMP_TIME: Period to wait before and after injection of chaos (sec).

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

    Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#target-gcp-instances","title":"Target GCP Instances","text":"

    It will stop all the instances with the given VM_INSTANCE_NAMES instance names and the corresponding ZONES zone names in the GCP_PROJECT_ID project.

    NOTE: The VM_INSTANCE_NAMES contains multiple comma-separated vm instances. The comma-separated zone names should be provided in the same order as instance names.

    Use the following example to tune this:

    ## details of the gcp instance\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-sa\n  experiments:\n  - name: gcp-vm-instance-stop\n    spec:\n      components:\n        env:\n        # comma separated list of vm instance names\n        - name: VM_INSTANCE_NAMES\n          value: 'instance-01,instance-02'\n        # comma separated list of zone names corresponds to the VM_INSTANCE_NAMES\n        # it should be provided in same order of VM_INSTANCE_NAMES\n        - name: ZONES\n          value: 'zone-01,zone-02'\n        # gcp project id to which vm instance belongs\n        - name: GCP_PROJECT_ID\n          value: 'project-id'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#managed-instance-group","title":"Managed Instance Group","text":"

    If the VM instances belong to a managed instance group, provide MANAGED_INSTANCE_GROUP as enable; otherwise provide it as disable, which is the default value.

    Use the following example to tune this:

    ## scale up and down to maintain the available instance counts\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-sa\n  experiments:\n  - name: gcp-vm-instance-stop\n    spec:\n      components:\n        env:\n        # tells if instances are part of managed instance group\n        # supports: enable, disable. default: disable\n        - name: MANAGED_INSTANCE_GROUP\n          value: 'enable'\n        # comma separated list of vm instance names\n        - name: VM_INSTANCE_NAMES\n          value: 'instance-01,instance-02'\n        # comma separated list of zone names corresponds to the VM_INSTANCE_NAMES\n        # it should be provided in same order of VM_INSTANCE_NAMES\n        - name: ZONES\n          value: 'zone-01,zone-02'\n        # gcp project id to which vm instance belongs\n        - name: GCP_PROJECT_ID\n          value: 'project-id'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/gcp/gcp-vm-instance-stop/#mutiple-iterations-of-chaos","title":"Mutiple Iterations Of Chaos","text":"

    Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: gcp-vm-instance-stop-sa\n  experiments:\n  - name: gcp-vm-instance-stop\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '15'\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: VM_INSTANCE_NAMES\n          value: 'instance-01,instance-02'\n        - name: ZONES\n          value: 'zone-01,zone-02'\n        - name: GCP_PROJECT_ID\n          value: 'project-id'\n
    "},{"location":"experiments/categories/load/k6-loadgen/","title":"k6 Load Generator","text":""},{"location":"experiments/categories/load/k6-loadgen/#introduction","title":"Introduction","text":"

    The k6 loadgen fault simulates load generation on the target hosts for a specific chaos duration. This fault:
    • Slows down or makes the target host unavailable due to heavy load.
    • Checks the performance of the application or process running on the instance.
    • Supports various types of load testing (e.g. spike, smoke, stress).

    Scenario: Load generating with k6

    "},{"location":"experiments/categories/load/k6-loadgen/#uses","title":"Uses","text":"View the uses of the experiment

    Introduction to k6 Load Chaos in LitmusChaos

    "},{"location":"experiments/categories/load/k6-loadgen/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in operator namespace (typically, litmus). If not, install from here
    • Ensure that you create a Kubernetes secret containing the JS script file in the Chaos Infrastructure's namespace (litmus by default). The simplest way to create the secret object looks like this:
      kubectl create secret generic k6-script \\\n    --from-file=<<script-path>> -n <<chaos_infrastructure_namespace>>\n
    "},{"location":"experiments/categories/load/k6-loadgen/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: k6-loadgen-sa\n  namespace: default\n  labels:\n    name: k6-loadgen-sa\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: k6-loadgen-sa\n  namespace: default\n  labels:\n    name: k6-loadgen-sa\nrules:\n- apiGroups: [\"\",\"litmuschaos.io\",\"batch\",\"apps\"]\n  resources: [\"pods\",\"configmaps\",\"jobs\",\"pods/exec\",\"pods/log\",\"events\",\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n  verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\",\"deletecollection\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: k6-loadgen-sa\n  namespace: default\n  labels:\n    name: k6-loadgen-sa\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: k6-loadgen-sa\nsubjects:\n- kind: ServiceAccount\n  name: k6-loadgen-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/load/k6-loadgen/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

    Optional Fields:

    • TOTAL_CHAOS_DURATION: The time duration for chaos injection (seconds). Defaults to 20s.
    • CHAOS_INTERVAL: Time interval between two successive k6-loadgen runs (sec). Defaults to 10s if not provided.
    • RAMP_TIME: Period to wait before injection of chaos (sec).
    • LIB_IMAGE: LIB image used to execute the k6 engine. Defaults to ghcr.io/grafana/k6-operator:latest-runner.
    • LIB_IMAGE_PULL_POLICY: LIB image pull policy. Defaults to Always.
    • SCRIPT_SECRET_NAME: The k8s secret name of the JS script to run k6. Default value: k6-script.
    • SCRIPT_SECRET_KEY: The key within the k8s secret named SCRIPT_SECRET_NAME. Default value: script.js.

    "},{"location":"experiments/categories/load/k6-loadgen/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/load/k6-loadgen/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

    Refer to the common attributes and the pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/load/k6-loadgen/#custom-k6-configuration","title":"Custom k6 configuration","text":"

    You can add k6 options (e.g. hosts, thresholds) in the script's options object. More details can be found here
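    For instance, a minimal sketch of an options object carrying thresholds and host overrides (the values are illustrative, not defaults):

    import http from 'k6/http';\n\nexport const options = {\n    vus: 50,\n    duration: '30s',\n    // fail the run if the 95th-percentile request duration exceeds 500ms\n    thresholds: {\n        http_req_duration: ['p(95)<500'],\n    },\n    // resolve the test hostname to a fixed IP for this run\n    hosts: {\n        'test.k6.io': '1.2.3.4',\n    },\n};\n\nexport default function () {\n    http.get('http://test.k6.io/');\n}\n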

    "},{"location":"experiments/categories/load/k6-loadgen/#custom-secret-name-and-secret-key","title":"Custom Secret Name and Secret Key","text":"

    You can provide the secret name and secret key of the JS script to be used for k6-loadgen. The secret should be created in the same namespace where the chaos infrastructure is created. For example, if the chaos infrastructure is created in the litmus namespace, then the secret should also be created in the litmus namespace.

    You can write a JS script like the one below. If you want to know more about the script, check out this documentation.

    import http from 'k6/http';\nimport { sleep } from 'k6';\nexport const options = {\n    vus: 100,\n    duration: '30s',\n};\nexport default function () {\n    http.get('http://<<target_domain_name>>/');\n    sleep(1);\n}\n

    Then create a secret with the above script.

    kubectl create secret generic custom-k6-script \\\n  --from-file=custom-script.js -n <<chaos_infrastructure_namespace>>\n

    If we want to use the custom-k6-script secret and custom-script.js as the secret key, then the experiment tunables will look like this:

    ---\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: nginx-chaos\n  namespace: default\nspec:\n  engineState: 'active'\n  chaosServiceAccount: litmus-admin\n  experiments:\n    - name: k6-loadgen\n      spec:\n        components:\n          env:\n            # set chaos duration (in sec) as desired\n            - name: TOTAL_CHAOS_DURATION\n              value: \"30\"\n\n            # Interval between chaos injection in sec\n            - name: CHAOS_INTERVAL\n              value: \"30\"\n\n            # Period to wait before and after injection of chaos in sec\n            - name: RAMP_TIME\n              value: \"0\"\n\n            # Provide the secret name of the JS script\n            - name: SCRIPT_SECRET_NAME\n              value: \"custom-k6-script\"\n\n            # Provide the secret key of the JS script\n            - name: SCRIPT_SECRET_KEY\n              value: \"custom-script.js\"\n\n            # Provide the image name of the helper pod\n            - name: LIB_IMAGE\n              value: \"ghcr.io/grafana/k6-operator:latest-runner\"\n\n            # Provide the image pull policy of the helper pod\n            - name: LIB_IMAGE_PULL_POLICY\n              value: \"Always\"\n
    "},{"location":"experiments/categories/nodes/common-tunables-for-node-experiments/","title":"Common tunables for node experiments","text":"

    It contains tunables that are common to all the node experiments. These tunables can be provided at .spec.experiment[*].spec.components.env in the chaosengine.

    "},{"location":"experiments/categories/nodes/common-tunables-for-node-experiments/#target-single-node","title":"Target Single Node","text":"

    It defines the name of the target node subjected to chaos. The target node can be tuned via the TARGET_NODE ENV, which contains only a single node name. NOTE: It is supported by the [node-drain, node-taint, node-restart, kubelet-service-kill, docker-service-kill] experiments.

    Use the following example to tune this:

    ## provide the target node name\n## it is applicable for the [node-drain, node-taint, node-restart, kubelet-service-kill, docker-service-kill]\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-drain-sa\n  experiments:\n  - name: node-drain\n    spec:\n      components:\n        env:\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/common-tunables-for-node-experiments/#target-multiple-nodes","title":"Target Multiple Nodes","text":"

    It defines the comma-separated names of the target nodes subjected to chaos, tunable via the TARGET_NODES ENV. NOTE: It is supported by the [node-cpu-hog, node-memory-hog, node-io-stress] experiments.

    Use the following example to tune this:

    ## provide the comma separated target node names\n## it is applicable for the [node-cpu-hog, node-memory-hog, node-io-stress]\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-cpu-hog-sa\n  experiments:\n  - name: node-cpu-hog\n    spec:\n      components:\n        env:\n        # comma separated target node names\n        - name: TARGET_NODES\n          value: 'node01,node02'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/common-tunables-for-node-experiments/#target-nodes-with-labels","title":"Target Nodes With Labels","text":"

    It defines the labels of the targeted node(s) subjected to chaos. The node labels can be tuned via the NODE_LABEL ENV. It is mutually exclusive with the TARGET_NODE(S) ENV: if the TARGET_NODE(S) ENV is set then it uses the nodes provided there; otherwise, it derives the node name(s) from the matching node labels.

    Use the following example to tune this:

    ## provide the labels of the targeted nodes\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-cpu-hog-sa\n  experiments:\n  - name: node-cpu-hog\n    spec:\n      components:\n        env:\n        # labels of the targeted node\n        # it will derive the target nodes if TARGET_NODE(S) ENV is not set\n        - name: NODE_LABEL\n          value: 'key=value'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
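
    To preview which nodes match the label before running the experiment, a quick check with standard kubectl (key=value follows the example above):

    kubectl get nodes -l key=value\n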
    "},{"location":"experiments/categories/nodes/common-tunables-for-node-experiments/#node-affected-percentage","title":"Node Affected Percentage","text":"

    It defines the percentage of nodes subjected to chaos with matching node labels. It can be tuned with the NODES_AFFECTED_PERC ENV. If NODES_AFFECTED_PERC is provided as empty or 0, it targets a minimum of one node. It is supported by the [node-cpu-hog, node-memory-hog, node-io-stress] experiments; the remaining experiments select only a single node for the chaos.

    Use the following example to tune this:

    ## provide the percentage of nodes to be targeted with matching labels\n## it is applicable for the [node-cpu-hog, node-memory-hog, node-io-stress]\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-cpu-hog-sa\n  experiments:\n  - name: node-cpu-hog\n    spec:\n      components:\n        env:\n        # percentage of nodes to be targeted with matching node labels\n        - name: NODES_AFFECTED_PERC\n          value: '100'\n        # labels of the targeted node\n        # it will derive the target nodes if TARGET_NODE(S) ENV is not set\n        - name: NODE_LABEL\n          value: 'key=value'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/docker-service-kill/","title":"Docker Service Kill","text":""},{"location":"experiments/categories/nodes/docker-service-kill/#introduction","title":"Introduction","text":"
    • This experiment causes the application to become unreachable on account of the node turning unschedulable (NotReady) due to a docker service kill.
    • The docker service is stopped/killed on a node to make it unschedulable for a certain duration, i.e., TOTAL_CHAOS_DURATION. The application node should be healthy after the chaos injection and the services should be accessible again.
    • Here, the application implies services. The experiment can be reframed as: test application resiliency when a replica becomes unreachable due to the docker service being down.

    Scenario: Kill the docker service of the node

    "},{"location":"experiments/categories/nodes/docker-service-kill/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/docker-service-kill/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the docker-service-kill experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that the node specified in the experiment ENV variable TARGET_NODE (the node for which the docker service needs to be killed) is cordoned before execution of the chaos experiment (before applying the chaosengine manifest), so that the litmus experiment runner pods are not scheduled on it / subjected to eviction. This can be achieved with the following steps (see the combined sketch after these steps):
      • Get node names against the application pods: kubectl get pods -o wide
      • Cordon the node: kubectl cordon <nodename>
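
    A minimal shell sketch of the above steps (node01 is an assumed node name hosting the application pods):

    # find the node(s) hosting the application pods\nkubectl get pods -o wide\n# cordon the target node so the experiment runner pods are not scheduled on it\nkubectl cordon node01\n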
    "},{"location":"experiments/categories/nodes/docker-service-kill/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/docker-service-kill/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: docker-service-kill-sa\n  namespace: default\n  labels:\n    name: docker-service-kill-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: docker-service-kill-sa\n  labels:\n    name: docker-service-kill-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: docker-service-kill-sa\n  labels:\n    name: docker-service-kill-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: docker-service-kill-sa\nsubjects:\n- kind: ServiceAccount\n  name: docker-service-kill-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.
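
    A hedged usage sketch, assuming the manifest above is saved as rbac.yaml:

    kubectl apply -f rbac.yaml\n# confirm the service account was created\nkubectl get serviceaccount docker-service-kill-sa -n default\n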

    "},{"location":"experiments/categories/nodes/docker-service-kill/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    | Variables | Description | Notes |
    | --- | --- | --- |
    | TARGET_NODE | Name of the target node | |
    | NODE_LABEL | It contains node label, which will be used to filter the target node if TARGET_NODE ENV is not set | It is mutually exclusive with the TARGET_NODE ENV. If both are provided then it will use the TARGET_NODE |

    | Variables | Description | Notes |
    | --- | --- | --- |
    | TOTAL_CHAOS_DURATION | The time duration for chaos insertion (seconds) | Defaults to 60s |
    | LIB | The chaos lib used to inject the chaos | Defaults to litmus |
    | RAMP_TIME | Period to wait before injection of chaos in sec | |

    "},{"location":"experiments/categories/nodes/docker-service-kill/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/docker-service-kill/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

    Refer to the common attributes and node-specific tunables to tune the tunables common to all experiments as well as the node-specific ones.

    "},{"location":"experiments/categories/nodes/docker-service-kill/#kill-docker-service","title":"Kill Docker Service","text":"

    It contains the name of the target node subjected to chaos. It can be tuned via the TARGET_NODE ENV.

    Use the following example to tune this:

    # kill the docker service of the target node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: docker-service-kill-sa\n  experiments:\n  - name: docker-service-kill\n    spec:\n      components:\n        env:\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/kubelet-service-kill/","title":"Kubelet Service Kill","text":""},{"location":"experiments/categories/nodes/kubelet-service-kill/#introduction","title":"Introduction","text":"
    • This experiment causes the application to become unreachable on account of the node turning unschedulable (NotReady) due to a kubelet service kill.
    • The kubelet service is stopped/killed on a node to make it unschedulable for a certain duration, i.e., TOTAL_CHAOS_DURATION. The application node should be healthy after the chaos injection and the services should be accessible again.
    • Here, the application implies services. The experiment can be reframed as: test application resiliency when a replica becomes unreachable due to the kubelet service being down.

    Scenario: Kill the kubelet service of the node

    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the kubelet-service-kill experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that the node specified in the experiment ENV variable TARGET_NODE (the node for which the kubelet service needs to be killed) is cordoned before execution of the chaos experiment (before applying the chaosengine manifest), so that the litmus experiment runner pods are not scheduled on it / subjected to eviction. This can be achieved with the following steps:
      • Get node names against the application pods: kubectl get pods -o wide
      • Cordon the node: kubectl cordon <nodename>
    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: kubelet-service-kill-sa\n  namespace: default\n  labels:\n    name: kubelet-service-kill-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: kubelet-service-kill-sa\n  labels:\n    name: kubelet-service-kill-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: kubelet-service-kill-sa\n  labels:\n    name: kubelet-service-kill-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: kubelet-service-kill-sa\nsubjects:\n- kind: ServiceAccount\n  name: kubelet-service-kill-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    | Variables | Description | Notes |
    | --- | --- | --- |
    | TARGET_NODE | Name of the target node | |
    | NODE_LABEL | It contains node label, which will be used to filter the target node if TARGET_NODE ENV is not set | It is mutually exclusive with the TARGET_NODE ENV. If both are provided then it will use the TARGET_NODE |

    | Variables | Description | Notes |
    | --- | --- | --- |
    | TOTAL_CHAOS_DURATION | The time duration for chaos insertion (seconds) | Defaults to 60s |
    | LIB | The chaos lib used to inject the chaos | Defaults to litmus |
    | LIB_IMAGE | The lib image used to inject kubelet kill chaos; the image should have systemd installed in it | Defaults to ubuntu:16.04 |
    | RAMP_TIME | Period to wait before injection of chaos in sec | |

    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/kubelet-service-kill/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

    Refer to the common attributes and node-specific tunables to tune the tunables common to all experiments as well as the node-specific ones.

    "},{"location":"experiments/categories/nodes/kubelet-service-kill/#kill-kubelet-service","title":"Kill Kubelet Service","text":"

    It contains the name of the target node subjected to chaos. It can be tuned via the TARGET_NODE ENV.

    Use the following example to tune this:

    # kill the kubelet service of the target node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: kubelet-service-kill-sa\n  experiments:\n  - name: kubelet-service-kill\n    spec:\n      components:\n        env:\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
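
    During the chaos window the target node is expected to transition to NotReady; this can be watched with standard kubectl (an observational check, not part of the experiment itself):

    kubectl get nodes -w\n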
    "},{"location":"experiments/categories/nodes/node-cpu-hog/","title":"Node CPU Hog","text":""},{"location":"experiments/categories/nodes/node-cpu-hog/#introduction","title":"Introduction","text":"
    • This experiment causes CPU resource exhaustion on the Kubernetes node. The experiment aims to verify the resiliency of applications whose replicas may be evicted on account of nodes turning unschedulable (NotReady) due to lack of CPU resources.
    • The CPU chaos is injected using a helper pod running the linux stress tool (a workload generator). The chaos is effected for a period equalling the TOTAL_CHAOS_DURATION. Here, the application implies services; the experiment can be reframed as: tests application resiliency upon replica evictions caused due to lack of CPU resources.

    Scenario: Stress the CPU of node

    "},{"location":"experiments/categories/nodes/node-cpu-hog/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/node-cpu-hog/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the node-cpu-hog experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/nodes/node-cpu-hog/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/node-cpu-hog/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: node-cpu-hog-sa\n  namespace: default\n  labels:\n    name: node-cpu-hog-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: node-cpu-hog-sa\n  labels:\n    name: node-cpu-hog-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: node-cpu-hog-sa\n  labels:\n    name: node-cpu-hog-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: node-cpu-hog-sa\nsubjects:\n- kind: ServiceAccount\n  name: node-cpu-hog-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/node-cpu-hog/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    | Variables | Description | Notes |
    | --- | --- | --- |
    | TARGET_NODES | Comma separated list of nodes, subjected to node cpu hog chaos | |
    | NODE_LABEL | It contains node label, which will be used to filter the target nodes if TARGET_NODES ENV is not set | It is mutually exclusive with the TARGET_NODES ENV. If both are provided then it will use the TARGET_NODES |

    | Variables | Description | Notes |
    | --- | --- | --- |
    | TOTAL_CHAOS_DURATION | The time duration for chaos insertion (seconds) | Defaults to 60 |
    | LIB | The chaos lib used to inject the chaos | Defaults to litmus |
    | LIB_IMAGE | Image used to run the stress command | Defaults to litmuschaos/go-runner:latest |
    | RAMP_TIME | Period to wait before & after injection of chaos in sec | Optional |
    | NODE_CPU_CORE | Number of cores of node CPU to be consumed | Defaults to 2 |
    | NODES_AFFECTED_PERC | The percentage of total nodes to target | Defaults to 0 (corresponds to 1 node), provide numeric value only |
    | SEQUENCE | It defines the sequence of chaos execution for multiple target pods | Default value: parallel. Supported: serial, parallel |

    "},{"location":"experiments/categories/nodes/node-cpu-hog/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/node-cpu-hog/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

    Refer to the common attributes and node-specific tunables to tune the tunables common to all experiments as well as the node-specific ones.

    "},{"location":"experiments/categories/nodes/node-cpu-hog/#node-cpu-cores","title":"Node CPU Cores","text":"

    It contains the number of node CPU cores to be consumed. It can be tuned via the NODE_CPU_CORE ENV.

    Use the following example to tune this:

    # stress the cpu of the targeted nodes\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-cpu-hog-sa\n  experiments:\n  - name: node-cpu-hog\n    spec:\n      components:\n        env:\n        # number of cpu cores to be stressed\n        - name: NODE_CPU_CORE\n          value: '2'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-cpu-hog/#node-cpu-load","title":"Node CPU Load","text":"

    It contains the percentage of node CPU to be consumed. It can be tuned via the CPU_LOAD ENV.

    Use the following example to tune this:

    # stress the cpu of the targeted nodes by load percentage\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-cpu-hog-sa\n  experiments:\n  - name: node-cpu-hog\n    spec:\n      components:\n        env:\n        # percentage of cpu to be stressed\n        - name: CPU_LOAD\n          value: \"100\"\n        # node cpu core should be provided as 0 for cpu load\n        # to work otherwise it will take cpu core as priority\n        - name: NODE_CPU_CORE\n          value: '0'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
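
    While either variant is running, node CPU consumption can be observed via the metrics API (assuming metrics-server is installed in the cluster):

    kubectl top nodes\n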
    "},{"location":"experiments/categories/nodes/node-drain/","title":"Node Drain","text":""},{"location":"experiments/categories/nodes/node-drain/#introduction","title":"Introduction","text":"
    • It drains the node. The resources running on the target node should be rescheduled on other nodes.

    Scenario: Drain the node

    "},{"location":"experiments/categories/nodes/node-drain/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/node-drain/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the node-drain experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Ensure that the node specified in the experiment ENV variable TARGET_NODE (the node which will be drained) is cordoned before execution of the chaos experiment (before applying the chaosengine manifest), so that the litmus experiment runner pods are not scheduled on it / subjected to eviction. This can be achieved with the following steps (remember to uncordon the node after the run; see the note after these steps):
      • Get node names against the application pods: kubectl get pods -o wide
      • Cordon the node: kubectl cordon <nodename>
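
    After the experiment completes, the manually cordoned node can be brought back into scheduling rotation (a follow-up step outside the experiment itself):

    kubectl uncordon <nodename>\n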
    "},{"location":"experiments/categories/nodes/node-drain/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/node-drain/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: node-drain-sa\n  namespace: default\n  labels:\n    name: node-drain-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: node-drain-sa\n  labels:\n    name: node-drain-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n# Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\",\"pods/eviction\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # ignore daemonsets while draining the node\n  - apiGroups: [\"apps\"]\n    resources: [\"daemonsets\"]\n    verbs: [\"list\",\"get\",\"delete\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\",\"patch\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: node-drain-sa\n  labels:\n    name: node-drain-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: node-drain-sa\nsubjects:\n- kind: ServiceAccount\n  name: node-drain-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/node-drain/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    | Variables | Description | Notes |
    | --- | --- | --- |
    | TARGET_NODE | Name of the node to be drained | |
    | NODE_LABEL | It contains node label, which will be used to filter the target node if TARGET_NODE ENV is not set | It is mutually exclusive with the TARGET_NODE ENV. If both are provided then it will use the TARGET_NODE |

    | Variables | Description | Notes |
    | --- | --- | --- |
    | TOTAL_CHAOS_DURATION | The time duration for chaos insertion (seconds) | Defaults to 60s |
    | LIB | The chaos lib used to inject the chaos | Defaults to litmus |
    | RAMP_TIME | Period to wait before injection of chaos in sec | |

    "},{"location":"experiments/categories/nodes/node-drain/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/node-drain/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

    Refer to the common attributes and node-specific tunables to tune the tunables common to all experiments as well as the node-specific ones.

    "},{"location":"experiments/categories/nodes/node-drain/#drain-node","title":"Drain Node","text":"

    It contains the name of the target node subjected to chaos. It can be tuned via the TARGET_NODE ENV.

    Use the following example to tune this:

    # drain the targeted node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-drain-sa\n  experiments:\n  - name: node-drain\n    spec:\n      components:\n        env:\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
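
    To watch the pods being rescheduled off the drained node while the chaos runs (an observational check with standard kubectl):

    kubectl get pods -o wide -w\n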
    "},{"location":"experiments/categories/nodes/node-io-stress/","title":"Node IO Stress","text":""},{"location":"experiments/categories/nodes/node-io-stress/#introduction","title":"Introduction","text":"
    • This experiment causes io stress on the Kubernetes node. The experiment aims to verify the resiliency of applications that share this disk resource for ephemeral or persistent storage purposes.
    • The amount of io stress can be specified either as a percentage of the total free space on the file system or in Gigabytes (GB). When both are provided, the utilization percentage takes precedence; when neither is provided, it defaults to 10%.
    • It tests application resiliency upon replica evictions caused due to IO stress on the available disk space.

    Scenario: Stress the IO of Node

    "},{"location":"experiments/categories/nodes/node-io-stress/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/node-io-stress/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the node-io-stress experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/nodes/node-io-stress/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/node-io-stress/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: node-io-stress-sa\n  namespace: default\n  labels:\n    name: node-io-stress-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: node-io-stress-sa\n  labels:\n    name: node-io-stress-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: node-io-stress-sa\n  labels:\n    name: node-io-stress-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: node-io-stress-sa\nsubjects:\n- kind: ServiceAccount\n  name: node-io-stress-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/node-io-stress/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    | Variables | Description | Notes |
    | --- | --- | --- |
    | TARGET_NODES | Comma separated list of nodes, subjected to node io stress chaos | |
    | NODE_LABEL | It contains node label, which will be used to filter the target nodes if TARGET_NODES ENV is not set | It is mutually exclusive with the TARGET_NODES ENV. If both are provided then it will use the TARGET_NODES |

    | Variables | Description | Notes |
    | --- | --- | --- |
    | TOTAL_CHAOS_DURATION | The time duration for chaos (seconds) | Defaults to 120 |
    | FILESYSTEM_UTILIZATION_PERCENTAGE | Specify the size as a percentage of free space on the file system | Defaults to 10% |
    | FILESYSTEM_UTILIZATION_BYTES | Specify the size in Gigabytes (GB) | FILESYSTEM_UTILIZATION_PERCENTAGE & FILESYSTEM_UTILIZATION_BYTES are mutually exclusive. If both are provided, FILESYSTEM_UTILIZATION_PERCENTAGE is prioritized |
    | CPU | Number of CPU cores to be used | Defaults to 1 |
    | NUMBER_OF_WORKERS | The number of IO workers involved in IO disk stress | Defaults to 4 |
    | VM_WORKERS | The number of VM workers involved in IO disk stress | Defaults to 1 |
    | LIB | The chaos lib used to inject the chaos | Defaults to litmus |
    | LIB_IMAGE | Image used to run the stress command | Defaults to litmuschaos/go-runner:latest |
    | RAMP_TIME | Period to wait before and after injection of chaos in sec | |
    | NODES_AFFECTED_PERC | The percentage of total nodes to target | Defaults to 0 (corresponds to 1 node), provide numeric value only |
    | SEQUENCE | It defines the sequence of chaos execution for multiple target pods | Default value: parallel. Supported: serial, parallel |
    "},{"location":"experiments/categories/nodes/node-io-stress/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/node-io-stress/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

    Refer to the common attributes and node-specific tunables to tune the tunables common to all experiments as well as the node-specific ones.

    "},{"location":"experiments/categories/nodes/node-io-stress/#filesystem-utilization-percentage","title":"Filesystem Utilization Percentage","text":"

    It stresses FILESYSTEM_UTILIZATION_PERCENTAGE percent of the total free space available on the node.

    Use the following example to tune this:

    # stress the i/o of the targeted node with FILESYSTEM_UTILIZATION_PERCENTAGE of total free space \n# it is mutually exclusive with the FILESYSTEM_UTILIZATION_BYTES.\n# if both are provided then it will use FILESYSTEM_UTILIZATION_PERCENTAGE for stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-io-stress-sa\n  experiments:\n  - name: node-io-stress\n    spec:\n      components:\n        env:\n        # percentage of total free space of file system\n        - name: FILESYSTEM_UTILIZATION_PERCENTAGE\n          value: '10' # in percentage\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-io-stress/#filesystem-utilization-bytes","title":"Filesystem Utilization Bytes","text":"

    It stresses FILESYSTEM_UTILIZATION_BYTES GB of the i/o of the targeted node. It is mutually exclusive with the FILESYSTEM_UTILIZATION_PERCENTAGE ENV. If the FILESYSTEM_UTILIZATION_PERCENTAGE ENV is set then it uses the percentage for the stress; otherwise, it stresses the i/o based on the FILESYSTEM_UTILIZATION_BYTES ENV.

    Use the following example to tune this:

    # stress the i/o of the targeted node with given FILESYSTEM_UTILIZATION_BYTES\n# it is mutually exclusive with the FILESYSTEM_UTILIZATION_PERCENTAGE.\n# if both are provided then it will use FILESYSTEM_UTILIZATION_PERCENTAGE for stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-io-stress-sa\n  experiments:\n  - name: node-io-stress\n    spec:\n      components:\n        env:\n        # file system to be stress in GB\n        - name: FILESYSTEM_UTILIZATION_BYTES\n          value: '500' # in GB\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-io-stress/#limit-cpu-utilization","title":"Limit CPU Utilization","text":"

    The number of CPU cores used while performing the io stress can be limited. It can be tuned via the CPU ENV.

    Use the following example to tune this:

    # limit the cpu uses to the provided value while performing io stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-io-stress-sa\n  experiments:\n  - name: node-io-stress\n    spec:\n      components:\n        env:\n        # number of cpu cores to be stressed\n        - name: CPU\n          value: '1' \n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-io-stress/#workers-for-stress","title":"Workers For Stress","text":"

    The i/o and VM worker counts for the stress can be tuned with the NUMBER_OF_WORKERS and VM_WORKERS ENVs respectively.

    Use the following example to tune this:

    # define the workers count for the i/o and vm\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-io-stress-sa\n  experiments:\n  - name: node-io-stress\n    spec:\n      components:\n        env:\n        # total number of io workers involved in stress\n        - name: NUMBER_OF_WORKERS\n          value: '4' \n          # total number of vm workers involved in stress\n        - name: VM_WORKERS\n          value: '1'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-memory-hog/","title":"Node Memory Hog","text":""},{"location":"experiments/categories/nodes/node-memory-hog/#introduction","title":"Introduction","text":"
    • This experiment causes memory resource exhaustion on the Kubernetes node. The experiment aims to verify the resiliency of applications whose replicas may be evicted on account of nodes turning unschedulable (NotReady) due to lack of memory resources.
    • The memory chaos is injected using a helper pod running the linux stress-ng tool (a workload generator). The chaos is effected for a period equalling the TOTAL_CHAOS_DURATION, consuming up to MEMORY_CONSUMPTION_PERCENTAGE (out of 100) or MEMORY_CONSUMPTION_MEBIBYTES (in Mebibytes, out of the total available memory).
    • Here, the application implies services; the experiment can be reframed as: tests application resiliency upon replica evictions caused due to lack of memory resources.

    Scenario: Stress the memory of node

    "},{"location":"experiments/categories/nodes/node-memory-hog/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/node-memory-hog/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the node-memory-hog experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/nodes/node-memory-hog/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/node-memory-hog/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: node-memory-hog-sa\n  namespace: default\n  labels:\n    name: node-memory-hog-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: node-memory-hog-sa\n  labels:\n    name: node-memory-hog-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: node-memory-hog-sa\n  labels:\n    name: node-memory-hog-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: node-memory-hog-sa\nsubjects:\n- kind: ServiceAccount\n  name: node-memory-hog-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/node-memory-hog/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    | Variables | Description | Notes |
    | --- | --- | --- |
    | TARGET_NODES | Comma separated list of nodes, subjected to node memory hog chaos | |
    | NODE_LABEL | It contains node label, which will be used to filter the target nodes if TARGET_NODES ENV is not set | It is mutually exclusive with the TARGET_NODES ENV. If both are provided then it will use the TARGET_NODES |

    | Variables | Description | Notes |
    | --- | --- | --- |
    | TOTAL_CHAOS_DURATION | The time duration for chaos insertion (in seconds) | Optional. Defaults to 120 |
    | LIB | The chaos lib used to inject the chaos | Optional. Defaults to litmus |
    | LIB_IMAGE | Image used to run the stress command | Optional. Defaults to litmuschaos/go-runner:latest |
    | MEMORY_CONSUMPTION_PERCENTAGE | Percent of the total node memory capacity | Optional. Defaults to 30 |
    | MEMORY_CONSUMPTION_MEBIBYTES | The size in Mebibytes of the total available memory. When using this, keep MEMORY_CONSUMPTION_PERCENTAGE empty as the percentage has more precedence | Optional |
    | NUMBER_OF_WORKERS | The number of VM workers involved in the stress | Optional. Defaults to 1 |
    | RAMP_TIME | Period to wait before and after injection of chaos in sec | Optional |
    | NODES_AFFECTED_PERC | The percentage of total nodes to target | Optional. Defaults to 0 (corresponds to 1 node), provide numeric value only |
    | SEQUENCE | It defines the sequence of chaos execution for multiple target pods | Default value: parallel. Supported: serial, parallel |

    "},{"location":"experiments/categories/nodes/node-memory-hog/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/node-memory-hog/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

    Refer to the common attributes and node-specific tunables to tune the tunables common to all experiments as well as the node-specific ones.

    "},{"location":"experiments/categories/nodes/node-memory-hog/#memory-consumption-percentage","title":"Memory Consumption Percentage","text":"

    It stresses MEMORY_CONSUMPTION_PERCENTAGE percent of the total memory capacity of the targeted node.

    Use the following example to tune this:

    # stress the memory of the targeted node with MEMORY_CONSUMPTION_PERCENTAGE of node capacity\n# it is mutually exclusive with the MEMORY_CONSUMPTION_MEBIBYTES.\n# if both are provided then it will use MEMORY_CONSUMPTION_PERCENTAGE for stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-memory-hog-sa\n  experiments:\n  - name: node-memory-hog\n    spec:\n      components:\n        env:\n        # percentage of total node capacity to be stressed\n        - name: MEMORY_CONSUMPTION_PERCENTAGE\n          value: '10' # in percentage\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-memory-hog/#memory-consumption-mebibytes","title":"Memory Consumption Mebibytes","text":"

    It stresses MEMORY_CONSUMPTION_MEBIBYTES MiB of the memory of the targeted node. It is mutually exclusive with the MEMORY_CONSUMPTION_PERCENTAGE ENV. If the MEMORY_CONSUMPTION_PERCENTAGE ENV is set then it uses the percentage for the stress; otherwise, it stresses the memory based on the MEMORY_CONSUMPTION_MEBIBYTES ENV.

    Use the following example to tune this:

    # stress the memory of the targeted node with given MEMORY_CONSUMPTION_MEBIBYTES\n# it is mutually exclusive with the MEMORY_CONSUMPTION_PERCENTAGE.\n# if both are provided then it will use MEMORY_CONSUMPTION_PERCENTAGE for stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-memory-hog-sa\n  experiments:\n  - name: node-memory-hog\n    spec:\n      components:\n        env:\n        # node memory to be stressed\n        - name: MEMORY_CONSUMPTION_MEBIBYTES\n          value: '500' # in MiBi\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-memory-hog/#workers-for-stress","title":"Workers For Stress","text":"

    The worker count for the stress can be tuned with the NUMBER_OF_WORKERS ENV.

    Use the following example to tune this:

    # provide for the workers count for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-memory-hog-sa\n  experiments:\n  - name: node-memory-hog\n    spec:\n      components:\n        env:\n        # total number of workers involved in stress\n        - name: NUMBER_OF_WORKERS\n          value: '1' \n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-restart/","title":"Node Restart","text":""},{"location":"experiments/categories/nodes/node-restart/#introduction","title":"Introduction","text":"
    • It causes chaos to disrupt the state of the node by restarting it.
    • It tests deployment sanity (replica availability & uninterrupted service) and recovery workflows of the application pod

    Scenario: Restart the node

    "},{"location":"experiments/categories/nodes/node-restart/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/node-restart/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the node-restart experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Create a Kubernetes secret named id-rsa in the namespace where the experiment will run; its ssh-privatekey field should contain the private SSH key for SSH_USER, used to connect to the node that hosts the target pod. A sample secret is shown below:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: id-rsa\ntype: kubernetes.io/ssh-auth\nstringData:\n  ssh-privatekey: |-\n    # SSH private key for ssh contained here\n

    Creating the RSA key pair for remote SSH access entails the following actions (a trivial exercise for anyone already familiar with an ssh client):

    1. Create a new key pair and store the keys in a file named my-id-rsa-key and my-id-rsa-key.pub for the private and public keys respectively:
      ssh-keygen -f ~/my-id-rsa-key -t rsa -b 4096\n
    2. For each node available, run the following command to copy the public key of my-id-rsa-key:
      ssh-copy-id -i my-id-rsa-key user@node\n

    For further details, please check this documentation. Once you have copied the public key to all nodes and created the secret described earlier (a creation sketch follows below), you are ready to start your experiment.
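
    A hedged sketch of creating the id-rsa secret from the generated private key (the namespace flag and key path are assumptions based on the steps above):

    kubectl create secret generic id-rsa \\\n  --type=kubernetes.io/ssh-auth \\\n  --from-file=ssh-privatekey=$HOME/my-id-rsa-key \\\n  -n <namespace>\n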

    "},{"location":"experiments/categories/nodes/node-restart/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/node-restart/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: node-restart-sa\n  namespace: default\n  labels:\n    name: node-restart-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: node-restart-sa\n  labels:\n    name: node-restart-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\",\"secrets\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for experiment to perform node status checks\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: node-restart-sa\n  labels:\n    name: node-restart-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: node-restart-sa\nsubjects:\n- kind: ServiceAccount\n  name: node-restart-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.
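For instance, after saving the manifest above to a file (the filename here is illustrative), it can be applied as follows:

kubectl apply -f node-restart-rbac.yaml\n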

    "},{"location":"experiments/categories/nodes/node-restart/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes TARGET_NODE Name of the target node, subjected to chaos If not provided, a random node is selected NODE_LABEL It contains the node label, which is used to filter the target node if the TARGET_NODE ENV is not set It is mutually exclusive with the TARGET_NODE ENV. If both are provided then TARGET_NODE takes precedence

Variables Description Notes LIB_IMAGE The image used to restart the node Defaults to litmuschaos/go-runner:latest SSH_USER Name of the ssh user Defaults to root TARGET_NODE_IP Internal IP of the target node, subjected to chaos. If not provided, the experiment will look up the node IP of the TARGET_NODE node Defaults to empty REBOOT_COMMAND Command used for reboot Defaults to sudo systemctl reboot TOTAL_CHAOS_DURATION The time duration for chaos insertion (sec) Defaults to 30s RAMP_TIME Period to wait before and after injection of chaos in sec LIB The chaos lib used to inject the chaos Defaults to litmus; supported: litmus only

    "},{"location":"experiments/categories/nodes/node-restart/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/node-restart/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

Refer to the common attributes and node-specific tunables to tune the common tunables for all experiments and the node-specific tunables.

    "},{"location":"experiments/categories/nodes/node-restart/#reboot-command","title":"Reboot Command","text":"

    It defines the command used to restart the targeted node. It can be tuned via REBOOT_COMMAND ENV.

    Use the following example to tune this:

    # provide the reboot command\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-restart-sa\n  experiments:\n  - name: node-restart\n    spec:\n      components:\n        env:\n        # command used for the reboot\n        - name: REBOOT_COMMAND\n          value: 'sudo systemctl reboot'\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-restart/#ssh-user","title":"SSH User","text":"

    It defines the name of the SSH user for the targeted node. It can be tuned via SSH_USER ENV.

    Use the following example to tune this:

    # name of the ssh user used to ssh into targeted node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-restart-sa\n  experiments:\n  - name: node-restart\n    spec:\n      components:\n        env:\n        # name of the ssh user\n        - name: SSH_USER\n          value: 'root'\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-restart/#target-node-internal-ip","title":"Target Node Internal IP","text":"

It defines the internal IP of the targeted node. It is an optional field; if the internal IP is not provided, it is derived from the targeted node. It can be tuned via the TARGET_NODE_IP ENV.

    Use the following example to tune this:

    # internal ip of the targeted node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-restart-sa\n  experiments:\n  - name: node-restart\n    spec:\n      components:\n        env:\n        # internal ip of the targeted node\n        - name: TARGET_NODE_IP\n          value: '<ip of node01>'\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/nodes/node-taint/","title":"Node Taint","text":""},{"location":"experiments/categories/nodes/node-taint/#introduction","title":"Introduction","text":"
• It taints the node to apply the desired effect. Only the resources which contain the corresponding tolerations can bypass the taints.

    Scenario: Taint the node

    "},{"location":"experiments/categories/nodes/node-taint/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/nodes/node-taint/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the node-taint experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
• Ensure that the node specified in the experiment ENV variable TARGET_NODE (the node which will be tainted) is cordoned before execution of the chaos experiment (before applying the chaosengine manifest), so that the litmus experiment runner pods are not scheduled on it / subjected to eviction. This can be achieved with the following steps:
  • Get the node names for the application pods: kubectl get pods -o wide
  • Cordon the node: kubectl cordon <nodename>
    "},{"location":"experiments/categories/nodes/node-taint/#default-validations","title":"Default Validations","text":"View the default validations

    The target nodes should be in ready state before and after chaos injection.

    "},{"location":"experiments/categories/nodes/node-taint/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions
---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: node-taint-sa\n  namespace: default\n  labels:\n    name: node-taint-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: node-taint-sa\n  labels:\n    name: node-taint-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the logs of the runner, experiment, and helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container and evicting pods\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\",\"pods/eviction\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # ignore daemonsets while draining the node\n  - apiGroups: [\"apps\"]\n    resources: [\"daemonsets\"]\n    verbs: [\"list\",\"get\",\"delete\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n  # for the experiment to perform node status checks and to taint the node\n  - apiGroups: [\"\"]\n    resources: [\"nodes\"]\n    verbs: [\"get\",\"list\",\"patch\",\"update\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: node-taint-sa\n  labels:\n    name: node-taint-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: node-taint-sa\nsubjects:\n- kind: ServiceAccount\n  name: node-taint-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/nodes/node-taint/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes TARGET_NODE Name of the node to be tainted NODE_LABEL It contains the node label, which is used to filter the target node if the TARGET_NODE ENV is not set It is mutually exclusive with the TARGET_NODE ENV. If both are provided then TARGET_NODE takes precedence TAINT_LABEL The label and effect to be applied as a taint on the target node

    Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Defaults to 60s LIB The chaos lib used to inject the chaos Defaults to litmus RAMP_TIME Period to wait before injection of chaos in sec

    "},{"location":"experiments/categories/nodes/node-taint/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/nodes/node-taint/#common-and-node-specific-tunables","title":"Common and Node specific tunables","text":"

Refer to the common attributes and node-specific tunables to tune the common tunables for all experiments and the node-specific tunables.

    "},{"location":"experiments/categories/nodes/node-taint/#taint-label","title":"Taint Label","text":"

It contains the label and effect to be applied as a taint on the targeted node, in the format key=value:effect. It can be tuned via the TAINT_LABEL ENV.

    Use the following example to tune this:

    # node tainted with provided key and effect\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: node-taint-sa\n  experiments:\n  - name: node-taint\n    spec:\n      components:\n        env:\n        # label and effect to be tainted on the targeted node\n        - name: TAINT_LABEL\n          value: 'key=value:effect'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
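For reference, an equivalent taint can be applied and removed manually with kubectl (an illustrative snippet; node01 and app=chaos:NoSchedule are placeholder values):

# apply the taint manually\nkubectl taint nodes node01 app=chaos:NoSchedule\n# remove the same taint (note the trailing '-')\nkubectl taint nodes node01 app=chaos:NoSchedule-\n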
    "},{"location":"experiments/categories/pods/common-tunables-for-pod-experiments/","title":"Common tunables for pod experiments","text":"

It contains the tunables that are common to all pod-level experiments. These tunables can be provided at .spec.experiments[*].spec.components.env in the chaosengine.

    "},{"location":"experiments/categories/pods/common-tunables-for-pod-experiments/#target-specific-pods","title":"Target Specific Pods","text":"

It defines the comma-separated names of the target pods subjected to chaos. The target pods can be tuned via the TARGET_PODS ENV.

    Use the following example to tune this:

    ## it contains comma separated target pod names\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        ## comma separated target pod names\n        - name: TARGET_PODS\n          value: 'pod1,pod2'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/common-tunables-for-pod-experiments/#pod-affected-percentage","title":"Pod Affected Percentage","text":"

It defines the percentage of pods subjected to chaos with matching labels provided at .spec.appinfo.applabel inside the chaosengine. It can be tuned with the PODS_AFFECTED_PERC ENV. If PODS_AFFECTED_PERC is provided as empty or 0 then it will target a minimum of one pod. For example, with 10 matching replicas and PODS_AFFECTED_PERC set to 30, three pods are targeted.

    Use the following example to tune this:

    ## it contains percentage of application pods to be targeted with matching labels or names in the application namespace\n## supported for all pod-level experiment expect pod-autoscaler\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # percentage of application pods\n        - name: PODS_AFFECTED_PERC\n          value: '100'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/common-tunables-for-pod-experiments/#target-specific-container","title":"Target Specific Container","text":"

    It defines the name of the targeted container subjected to chaos. It can be tuned via TARGET_CONTAINER ENV. If TARGET_CONTAINER is provided as empty then it will use the first container of the targeted pod.

    Use the following example to tune this:

    ## name of the target container\n## it will use first container as target container if TARGET_CONTAINER is provided as empty\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # name of the target container\n        - name: TARGET_CONTAINER\n          value: 'nginx'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/common-tunables-for-pod-experiments/#default-application-health-check","title":"Default Application Health Check","text":"

It defines the default application status checks as a tunable. It is helpful for scenarios where you don't want to validate the application status as a mandatory check during pre & post chaos. It can be tuned via the DEFAULT_APP_HEALTH_CHECK ENV. If DEFAULT_APP_HEALTH_CHECK is not provided, it defaults to true.

    Use the following example to tune this:

    ## application status check as tunable\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        - name: DEFAULT_APP_HEALTH_CHECK\n          value: 'false'\n
    "},{"location":"experiments/categories/pods/common-tunables-for-pod-experiments/#node-label-filter-for-selecting-the-target-pods","title":"Node Label Filter For Selecting The Target Pods","text":"

It restricts target application pod selection to a specific node. It is helpful for scenarios where you want to select the pods scheduled on specific nodes as chaos candidates, subject to the pod affected percentage. It can be tuned via the NODE_LABEL ENV.

NOTE: This feature requires a service account with node-level permissions (i.e., a clusterrole) for filtering pods on a specific node; a sketch of the extra rule is shown after the example below.

APP_LABEL TARGET_PODS NODE_LABEL SELECTED PODS Provided Provided Provided The pods that are filtered from the applabel, reside on a node containing the given node label, and are also provided in the TARGET_PODS ENV are selected Provided Not Provided Provided The pods that are filtered from the applabel and reside on a node containing the given node label are selected Not Provided Provided Provided The target pods that reside on a node with the given node label are selected Not Provided Not Provided Provided Invalid Not Provided Not Provided Not Provided Invalid

    Use the following example to tune this:

    ## node label to filter target pods\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        - name: NODE_LABEL\n          value: 'kubernetes.io/hostname=worker-01'\n
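As referenced in the note above, a minimal sketch of the additional rule required for node filtering, following the clusterrole style of the RBAC examples in this document (append it under the rules of the chaos service account's ClusterRole):

# allows the experiment to list nodes so that pods can be filtered by node label\n- apiGroups: [\"\"]\n  resources: [\"nodes\"]\n  verbs: [\"get\",\"list\"]\n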
    "},{"location":"experiments/categories/pods/container-kill/","title":"Container Kill","text":""},{"location":"experiments/categories/pods/container-kill/#introduction","title":"Introduction","text":"
• It causes container failure of specific/random replicas of an application resource.
• It tests deployment sanity (replica availability & uninterrupted service) and the recovery workflow of the application
    • Good for testing recovery of pods having side-car containers

    Scenario: Kill target container

    "},{"location":"experiments/categories/pods/container-kill/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/container-kill/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the container-kill experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/container-kill/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.
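To verify this manually, the pod status can be checked before and after the chaos run (an illustrative command; the namespace and label are placeholders):

kubectl get pods -n default -l app=nginx   # the STATUS column should report Running\n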

    "},{"location":"experiments/categories/pods/container-kill/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: container-kill-sa\n  namespace: default\n  labels:\n    name: container-kill-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: container-kill-sa\n  namespace: default\n  labels:\n    name: container-kill-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the logs of the runner, experiment, and helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, replicaset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\",\"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: container-kill-sa\n  namespace: default\n  labels:\n    name: container-kill-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: container-kill-sa\nsubjects:\n- kind: ServiceAccount\n  name: container-kill-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/container-kill/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes TARGET_CONTAINER The name of the container to be killed inside the pod If TARGET_CONTAINER is not provided, it will kill the first container CHAOS_INTERVAL Time interval b/w two successive container kills (in sec) If CHAOS_INTERVAL is not provided, it will take the default value of 10s TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Defaults to 20s PODS_AFFECTED_PERC The Percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only TARGET_PODS Comma separated list of application pod names subjected to container kill chaos If not provided, it will select target pods randomly based on provided appLabels LIB_IMAGE The LIB image used to kill the container Defaults to litmuschaos/go-runner:latest LIB The category of lib used to inject chaos Default value: litmus, supported values: pumba and litmus RAMP_TIME Period to wait before injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel SIGNAL It contains the termination signal used for container kill Default value: SIGKILL SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock CONTAINER_RUNTIME The container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for the litmus LIB and only docker for the pumba LIB

    "},{"location":"experiments/categories/pods/container-kill/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/container-kill/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/container-kill/#kill-specific-container","title":"Kill Specific Container","text":"

    It defines the name of the targeted container subjected to chaos. It can be tuned via TARGET_CONTAINER ENV. If TARGET_CONTAINER is provided as empty then it will use the first container of the targeted pod.

    # kill the specific target container\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        env:\n        # name of the target container\n        - name: TARGET_CONTAINER\n          value: 'nginx'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/container-kill/#multiple-iterations-of-chaos","title":"Multiple Iterations Of Chaos","text":"

The multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between successive iterations of chaos. For example, with TOTAL_CHAOS_DURATION set to 60 and CHAOS_INTERVAL set to 15, roughly four kill iterations are performed.

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '15'\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/container-kill/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path:

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.
    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/container-kill/#signal-for-kill","title":"Signal For Kill","text":"

It defines the Linux signal passed while killing the container. It can be tuned via the SIGNAL ENV. It defaults to SIGKILL, as noted in the tunables table above.

# specific linux signal passed while killing container\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        env:\n        # signal passed while killing container\n        # defaults to SIGKILL\n        - name: SIGNAL\n          value: 'SIGKILL'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/container-kill/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

It specifies the Pumba chaos library for the chaos injection. It can be tuned via the LIB ENV. The default chaos library is litmus.

# pumba chaoslib used to kill the container\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        env:\n        # name of the lib\n        # supports pumba and litmus\n        - name: LIB\n          value: 'pumba'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/disk-fill/","title":"Disk Fill","text":""},{"location":"experiments/categories/pods/disk-fill/#introduction","title":"Introduction","text":"
    • It causes Disk Stress by filling up the ephemeral storage of the pod on any given node.
    • It causes the application pod to get evicted if the capacity filled exceeds the pod's ephemeral storage limit.
    • It tests the Ephemeral Storage Limits, to ensure those parameters are sufficient.
    • It tests the application's resiliency to disk stress/replica evictions.

    Scenario: Fill ephemeral-storage

    "},{"location":"experiments/categories/pods/disk-fill/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/disk-fill/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the disk-fill experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Appropriate Ephemeral Storage Requests and Limits should be set for the application before running the experiment. An example specification is shown below:
      apiVersion: v1\nkind: Pod\nmetadata:\n  name: frontend\nspec:\n  containers:\n  - name: db\n    image: mysql\n    env:\n    - name: MYSQL_ROOT_PASSWORD\n      value: \"password\"\n    resources:\n      requests:\n        ephemeral-storage: \"2Gi\"\n      limits:\n        ephemeral-storage: \"4Gi\"\n  - name: wp\n    image: wordpress\n    resources:\n      requests:\n        ephemeral-storage: \"2Gi\"\n      limits:\n        ephemeral-storage: \"4Gi\"\n
    "},{"location":"experiments/categories/pods/disk-fill/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/disk-fill/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: disk-fill-sa\n  namespace: default\n  labels:\n    name: disk-fill-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: disk-fill-sa\n  namespace: default\n  labels:\n    name: disk-fill-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the logs of the runner, experiment, and helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, replicaset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\",\"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: disk-fill-sa\n  namespace: default\n  labels:\n    name: disk-fill-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: disk-fill-sa\nsubjects:\n- kind: ServiceAccount\n  name: disk-fill-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/disk-fill/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes FILL_PERCENTAGE Percentage of the ephemeral storage limit to fill Can also be set to more than 100, to force-evict the pod. The ephemeral-storage limit must be set in the targeted pod to use this ENV EPHEMERAL_STORAGE_MEBIBYTES Ephemeral storage which needs to be filled (unit: MiB) It is mutually exclusive with the FILL_PERCENTAGE ENV. If both are provided then it will use the FILL_PERCENTAGE

Variables Description Notes TARGET_CONTAINER Name of the container which is subjected to disk-fill If not provided, the first container in the targeted pod will be subject to chaos CONTAINER_RUNTIME The container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for litmus SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock TOTAL_CHAOS_DURATION The time duration for chaos insertion (sec) Defaults to 60s TARGET_PODS Comma separated list of application pod names subjected to disk fill chaos If not provided, it will select target pods randomly based on provided appLabels DATA_BLOCK_SIZE It contains the data block size used to fill the disk (in KB) Defaults to 256, it supports KB as the unit only PODS_AFFECTED_PERC The Percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only LIB The chaos lib used to inject the chaos Defaults to litmus; supported: litmus only LIB_IMAGE The image used to fill the disk Defaults to litmuschaos/go-runner:latest RAMP_TIME Period to wait before injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/disk-fill/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/disk-fill/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/disk-fill/#disk-fill-percentage","title":"Disk Fill Percentage","text":"

It fills FILL_PERCENTAGE percent of the ephemeral-storage limit specified at resources.limits.ephemeral-storage inside the target application. For example, with a 2Gi ephemeral-storage limit and FILL_PERCENTAGE set to 80, roughly 1.6Gi is written.

    Use the following example to tune this:

## percentage of ephemeral storage limit specified at `resources.limits.ephemeral-storage` inside target application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: disk-fill-sa\n  experiments:\n  - name: disk-fill\n    spec:\n      components:\n        env:\n        ## percentage of ephemeral storage limit, which needs to be filled\n        - name: FILL_PERCENTAGE\n          value: '80' # in percentage\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/disk-fill/#disk-fill-mebibytes","title":"Disk Fill Mebibytes","text":"

It fills EPHEMERAL_STORAGE_MEBIBYTES MiB of ephemeral storage of the targeted pod. It is mutually exclusive with the FILL_PERCENTAGE ENV: if FILL_PERCENTAGE is set then the percentage is used for the fill; otherwise, the ephemeral storage is filled based on the EPHEMERAL_STORAGE_MEBIBYTES ENV.

    Use the following example to tune this:

# ephemeral storage to be filled in the target application\n# useful if the ephemeral-storage limit is not specified inside the target application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: disk-fill-sa\n  experiments:\n  - name: disk-fill\n    spec:\n      components:\n        env:\n        ## ephemeral storage size, which needs to be filled\n        - name: EPHEMERAL_STORAGE_MEBIBYTES\n          value: '256' # in MiB\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/disk-fill/#data-block-size","title":"Data Block Size","text":"

    It defines the size of the data block used to fill the ephemeral storage of the targeted pod. It can be tuned via DATA_BLOCK_SIZE ENV. Its unit is KB. The default value of DATA_BLOCK_SIZE is 256.

    Use the following example to tune this:

    # size of the data block used to fill the disk\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: disk-fill-sa\n  experiments:\n  - name: disk-fill\n    spec:\n      components:\n        env:\n        ## size of data block used to fill the disk\n        - name: DATA_BLOCK_SIZE\n          value: '256' #in KB\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/disk-fill/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

    • CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    # path inside node/vm where containers are present\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: disk-fill-sa\n  experiments:\n  - name: disk-fill\n    spec:\n      components:\n        env:\n        # provide the name of container runtime, it supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # provide the socket file path\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-autoscaler/","title":"Pod Autoscaler","text":""},{"location":"experiments/categories/pods/pod-autoscaler/#introduction","title":"Introduction","text":"
• The experiment aims to check the ability of nodes to accommodate the scaled-up number of replicas of a given application pod.

    • This experiment can be used for other scenarios as well, such as for checking the Node auto-scaling feature. For example, check if the pods are successfully rescheduled within a specified period in cases where the existing nodes are already running at the specified limits.

    Scenario: Scale the replicas

    "},{"location":"experiments/categories/pods/pod-autoscaler/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-autoscaler/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-autoscaler experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-autoscaler/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-autoscaler/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-autoscaler-sa\n  namespace: default\n  labels:\n    name: pod-autoscaler-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: pod-autoscaler-sa\n  labels:\n    name: pod-autoscaler-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the logs of the runner, experiment, and helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # performs CRUD operations on the deployments and statefulsets\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\"]\n    verbs: [\"list\",\"get\",\"patch\",\"update\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: pod-autoscaler-sa\n  labels:\n    name: pod-autoscaler-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: pod-autoscaler-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-autoscaler-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-autoscaler/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes REPLICA_COUNT Number of replicas up to which we want to scale nil

    Variables Description Notes TOTAL_CHAOS_DURATION The timeout for the chaos experiment (in seconds) Defaults to 60 LIB The chaos lib used to inject the chaos Defaults to litmus RAMP_TIME Period to wait before and after injection of chaos in sec

    "},{"location":"experiments/categories/pods/pod-autoscaler/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-autoscaler/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/pod-autoscaler/#replica-counts","title":"Replica counts","text":"

It defines the number of replicas that should be present in the targeted application during the chaos. It can be tuned via the REPLICA_COUNT ENV.

    Use the following example to tune this:

# provide the number of replicas\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-autoscaler-sa\n  experiments:\n  - name: pod-autoscaler\n    spec:\n      components:\n        env:\n        # number of replicas to scale up to\n        - name: REPLICA_COUNT\n          value: '3'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/","title":"Pod CPU Hog Exec","text":""},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#introduction","title":"Introduction","text":"
    • This experiment consumes the CPU resources of the application container

• It simulates conditions where app pods experience CPU spikes either due to expected/undesired processes, thereby testing how the overall application stack behaves when this occurs.

    Scenario: Stress the CPU

    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#uses","title":"Uses","text":"View the uses of the experiment

Disk pressure or CPU hogging is another very common and frequent scenario found in Kubernetes applications; it can result in the eviction of the application replica and impact its delivery. Such scenarios can still occur despite whatever availability aids K8s provides. These problems are generally referred to as \"Noisy Neighbour\" problems.

By injecting a rogue process into a target container, we starve the main microservice process (typically PID 1) of the resources allocated to it (where limits are defined), causing slowness in application traffic; in other cases, unrestrained use can cause the node to exhaust resources, leading to the eviction of all pods. This category of chaos experiments therefore helps build immunity in applications undergoing any such stress scenario.
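Since the blast radius of such a rogue process is bounded by the container's resource limits, it helps to have CPU requests and limits defined on the target container. A generic illustration (the pod name, image, and values are placeholders, not part of the experiment):

apiVersion: v1\nkind: Pod\nmetadata:\n  name: nginx\nspec:\n  containers:\n  - name: nginx\n    image: nginx\n    resources:\n      requests:\n        cpu: \"250m\"\n      limits:\n        cpu: \"500m\"   # the injected cpu stress cannot exceed this limit\n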

    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-cpu-hog-exec experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-cpu-hog-exec-sa\n  namespace: default\n  labels:\n    name: pod-cpu-hog-exec-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-cpu-hog-exec-sa\n  namespace: default\n  labels:\n    name: pod-cpu-hog-exec-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the logs of the runner, experiment, and helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, replicaset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\",\"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-cpu-hog-exec-sa\n  namespace: default\n  labels:\n    name: pod-cpu-hog-exec-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-cpu-hog-exec-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-cpu-hog-exec-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes CPU_CORES Number of cpu cores subjected to CPU stress Defaults to 1 TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Defaults to 60s LIB The chaos lib used to inject the chaos. Available libs are litmus Defaults to litmus TARGET_PODS Comma separated list of application pod names subjected to pod cpu hog chaos If not provided, it will select target pods randomly based on provided appLabels TARGET_CONTAINER Name of the target container under chaos If not provided, it will select the first container of the target pod PODS_AFFECTED_PERC The Percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only CHAOS_INJECT_COMMAND The command to inject the cpu chaos Defaults to md5sum /dev/zero CHAOS_KILL_COMMAND The command to kill the chaos process Defaults to kill $(find /proc -name exe -lname '*/md5sum' 2>&1 | grep -v 'Permission denied' | awk -F/ '{print $(NF-1)}'). Another useful one that generally works (in case the default doesn't) is kill -9 $(ps afx | grep \"[md5sum] /dev/zero\" | awk '{print $1}' | tr '\\n' ' '). In case neither works, check whether the target pod's base image offers a shell; if yes, identify an appropriate shell command to kill the chaos process RAMP_TIME Period to wait before injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#cpu-cores","title":"CPU Cores","text":"

It stresses CPU_CORES cpu cores of the targeted pod for the TOTAL_CHAOS_DURATION duration.

    Use the following example to tune this:

    # cpu cores for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-cpu-hog-exec-sa\n  experiments:\n  - name: pod-cpu-hog-exec\n    spec:\n      components:\n        env:\n        # cpu cores for stress\n        - name: CPU_CORES\n          value: '1'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-cpu-hog-exec/#chaos-inject-and-kill-commands","title":"Chaos Inject and Kill Commands","text":"

    It defines the CHAOS_INJECT_COMMAND and CHAOS_KILL_COMMAND ENV to set the chaos inject and chaos kill commands respectively. Default values of commands:

    • CHAOS_INJECT_COMMAND: \"md5sum /dev/zero\"
    • CHAOS_KILL_COMMAND: \"kill $(find /proc -name exe -lname '*/md5sum' 2>&1 | grep -v 'Permission denied' | awk -F/ '{print $(NF-1)}')\"

    Use the following example to tune this:

    # provide the chaos kill, used to kill the chaos process\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-cpu-hog-exec-sa\n  experiments:\n  - name: pod-cpu-hog-exec\n    spec:\n      components:\n        env:\n        # command to create the md5sum process to stress the cpu\n        - name: CHAOS_INJECT_COMMAND\n          value: 'md5sum /dev/zero'\n        # command to kill the md5sum process\n        # alternative command: \"kill -9 $(ps afx | grep \"[md5sum] /dev/zero\" | awk '{print $1}' | tr '\\n' ' ')\"\n        - name: CHAOS_KILL_COMMAND\n          value: \"kill $(find /proc -name exe -lname '*/md5sum' 2>&1 | grep -v 'Permission denied' | awk -F/ '{print $(NF-1)}')\"\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-cpu-hog/","title":"Pod CPU Hog","text":""},{"location":"experiments/categories/pods/pod-cpu-hog/#introduction","title":"Introduction","text":"
    • This experiment consumes the CPU resources of the application container
• It simulates conditions where app pods experience CPU spikes due to expected or undesired processes, thereby testing how the overall application stack behaves when this occurs.
    • It can test the application's resilience to potential slowness/unavailability of some replicas due to high CPU load

    Scenario: Stress the CPU

    "},{"location":"experiments/categories/pods/pod-cpu-hog/#uses","title":"Uses","text":"View the uses of the experiment

Disk pressure or CPU hogs are very common and frequent scenarios in kubernetes applications that can result in the eviction of application replicas and impact their delivery. Such scenarios can still occur despite whatever availability aids K8s provides. These problems are generally referred to as \"Noisy Neighbour\" problems.

By injecting a rogue process into a target container, we starve the main microservice process (typically PID 1) of the resources allocated to it (where limits are defined), causing slowness in application traffic; in other cases, unrestrained use can cause the node to exhaust resources, leading to eviction of all pods. This category of chaos experiments helps build immunity in applications undergoing any such stress scenario.

    "},{"location":"experiments/categories/pods/pod-cpu-hog/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-cpu-hog experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here (both checks are sketched as commands below)
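A minimal sketch of both checks, assuming the operator runs in the litmus namespace and the experiment CR lives in the default namespace (adjust the namespaces to your setup):

kubectl get pods -n litmus
kubectl get chaosexperiments -n default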
    "},{"location":"experiments/categories/pods/pod-cpu-hog/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-cpu-hog/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.
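In that case no experiment-specific RBAC needs to be created; the ChaosEngine simply references the pre-installed account. A minimal sketch, assuming the litmus-admin ServiceAccount was installed by the agent setup:

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  # reference the pre-installed admin account instead of a per-experiment SA
  chaosServiceAccount: litmus-admin
  experiments:
  - name: pod-cpu-hog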

    View the Minimal RBAC permissions
    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-cpu-hog-sa\n  namespace: default\n  labels:\n    name: pod-cpu-hog-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-cpu-hog-sa\n  namespace: default\n  labels:\n    name: pod-cpu-hog-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)  \n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-cpu-hog-sa\n  namespace: default\n  labels:\n    name: pod-cpu-hog-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-cpu-hog-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-cpu-hog-sa\n  namespace: default\n

    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-cpu-hog/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

• CPU_CORES: Number of CPU cores subjected to CPU stress. Defaults to 1
    • TOTAL_CHAOS_DURATION: The time duration for chaos insertion (seconds). Defaults to 60s
    • LIB: The chaos lib used to inject the chaos. Available libs: litmus and pumba. Defaults to litmus
    • LIB_IMAGE: Image used to run the helper pod. Defaults to litmuschaos/go-runner:1.13.8
    • STRESS_IMAGE: Container run on the node at runtime by the pumba lib to inject stressors. Only used with the pumba LIB. Defaults to alexeiled/stress-ng:latest-ubuntu
    • TARGET_PODS: Comma separated list of application pod names subjected to pod cpu hog chaos. If not provided, it will select target pods randomly based on the provided appLabels
    • TARGET_CONTAINER: Name of the target container under chaos. If not provided, it will select the first container of the target pod
    • PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only
    • CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd and crio for litmus, and only docker for the pumba LIB
    • SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock
    • RAMP_TIME: Period to wait before injection of chaos (in sec)
    • SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel
"},{"location":"experiments/categories/pods/pod-cpu-hog/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-cpu-hog/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod specific tunables.

    "},{"location":"experiments/categories/pods/pod-cpu-hog/#cpu-cores","title":"CPU Cores","text":"

It stresses CPU_CORES cpu cores of the targeted pod for the TOTAL_CHAOS_DURATION duration.

    Use the following example to tune this:

    # cpu cores for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-cpu-hog-sa\n  experiments:\n  - name: pod-cpu-hog\n    spec:\n      components:\n        env:\n        # cpu cores for stress\n        - name: CPU_CORES\n          value: '1'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-cpu-hog/#cpu-load","title":"CPU Load","text":"

It defines the percentage of pod CPU to be consumed. It can be tuned via the CPU_LOAD ENV. Note that CPU_CORES must be set to 0 for CPU_LOAD to take effect; otherwise the core count takes priority.

    Use the following example to tune this:

    # cpu load for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-cpu-hog-sa\n  experiments:\n  - name: pod-cpu-hog\n    spec:\n      components:\n        env:\n        # cpu load in percentage for the stress\n        - name: CPU_LOAD\n          value: '100'\n        # cpu core should be provided as 0 for cpu load\n        # to work, otherwise it will take cpu core as priority\n        - name: CPU_CORES\n          value: '0'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-cpu-hog/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-cpu-hog-sa\n  experiments:\n  - name: pod-cpu-hog\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-cpu-hog/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

It specifies the Pumba chaos library for the chaos injection. It can be tuned via the LIB ENV. The default chaos library is litmus. Provide the stress image via the STRESS_IMAGE ENV for the pumba library.

    Use the following example to tune this:

    # use pumba chaoslib for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-cpu-hog-sa\n  experiments:\n  - name: pod-cpu-hog\n    spec:\n      components:\n        env:\n        # name of chaos lib\n        # supports litmus and pumba\n        - name: LIB\n          value: 'pumba'\n        # stress image - applicable for pumba only\n        - name: STRESS_IMAGE\n          value: 'alexeiled/stress-ng:latest-ubuntu'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-delete/","title":"Pod Delete","text":""},{"location":"experiments/categories/pods/pod-delete/#introduction","title":"Introduction","text":"
• It causes (forced/graceful) pod failure of specific/random replicas of an application resource.
    • It tests deployment sanity (replica availability & uninterrupted service) and the recovery workflow of the application

    Scenario: Deletes kubernetes pod

    "},{"location":"experiments/categories/pods/pod-delete/#uses","title":"Uses","text":"View the uses of the experiment

In a distributed system like kubernetes, it is very likely that your application replicas may not be sufficient to manage the traffic (indicated by SLIs) when some of the replicas are unavailable due to any failure (system or application); to meet the SLO (service level objectives), we need to make sure that the applications have a minimum number of available replicas. Important aspects to test include how the horizontal pod autoscaler scales based on observed resource utilization when the pressure on the other replicas increases, how long PV mounts take upon rescheduling, the MTTR for the application replica, re-election of leaders or followers (such as broker leader election in kafka applications), validation of the minimum quorum needed to run the application (for example, in applications like percona), and resync/redistribution of data.

    This experiment helps to reproduce such a scenario with forced/graceful pod failure on specific or random replicas of an application resource and checks the deployment sanity (replica availability & uninterrupted service) and recovery workflow of the application.

    "},{"location":"experiments/categories/pods/pod-delete/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-delete experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-delete/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-delete/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-delete-sa\n  namespace: default\n  labels:\n    name: pod-delete-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-delete-sa\n  namespace: default\n  labels:\n    name: pod-delete-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log \n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]  \n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)  \n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-delete-sa\n  namespace: default\n  labels:\n    name: pod-delete-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-delete-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-delete-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-delete/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

• TOTAL_CHAOS_DURATION: The time duration for chaos insertion (in sec). Defaults to 15s. NOTE: the overall run duration of the experiment may exceed TOTAL_CHAOS_DURATION by a few minutes
    • CHAOS_INTERVAL: Time interval between two successive pod failures (in sec). Defaults to 5s
    • RANDOMNESS: Introduces randomness to pod deletions with a minimum period defined by CHAOS_INTERVAL. It supports true or false. Default value: false
    • FORCE: Application pod deletion mode. false indicates graceful deletion with the default termination period of 30s; true indicates an immediate forceful deletion with a 0s grace period. Defaults to true, with terminationGracePeriodSeconds=0
    • TARGET_PODS: Comma separated list of application pod names subjected to pod delete chaos. If not provided, it will select target pods randomly based on the provided appLabels
    • PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only
    • RAMP_TIME: Period to wait before and after injection of chaos (in sec)
    • SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/pod-delete/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-delete/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod specific tunables.

    "},{"location":"experiments/categories/pods/pod-delete/#force-delete","title":"Force Delete","text":"

The targeted pod can be deleted forcefully or gracefully. It can be tuned with the FORCE ENV: the pod is deleted forcefully if FORCE is set to true and gracefully if FORCE is set to false.

    Use the following example to tune this:

    # tune the deletion of target pods forcefully or gracefully\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # provided as true for the force deletion of pod\n        # supports true and false value\n        - name: FORCE\n          value: 'true'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-delete/#multiple-iterations-of-chaos","title":"Multiple Iterations Of Chaos","text":"

Multiple iterations of chaos can be tuned by setting the CHAOS_INTERVAL ENV, which defines the delay between each iteration of chaos.

    Use the following example to tune this:

    # defines delay between each successive iteration of the chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # delay between each iteration of chaos\n        - name: CHAOS_INTERVAL\n          value: '15'\n        # time duration for the chaos execution\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-delete/#random-interval","title":"Random Interval","text":"

The randomness in the chaos interval can be enabled by setting the RANDOMNESS ENV to true. It supports boolean values; the default value is false. The chaos interval can be tuned via the CHAOS_INTERVAL ENV.

• If CHAOS_INTERVAL is set in the form l-r, e.g. 5-10, then it will select a random interval between l & r.
    • If CHAOS_INTERVAL is set as a single value, e.g. 10, then it will select a random interval between 0 & that value.

    Use the following example to tune this:

    # contains random chaos interval with lower and upper bound of range i.e [l,r]\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        # randomness enables iterations at random time interval\n        # it supports true and false value\n        - name: RANDOMNESS\n          value: 'true'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        # it will select a random interval within this range\n        # if only one value is provided then it will select a random interval within 0-CHAOS_INTERVAL range\n        - name: CHAOS_INTERVAL\n          value: '5-10' \n
    "},{"location":"experiments/categories/pods/pod-dns-error/","title":"Pod Dns Error","text":""},{"location":"experiments/categories/pods/pod-dns-error/#introduction","title":"Introduction","text":"
    • Pod-dns-error injects chaos to disrupt dns resolution in kubernetes pods.
    • It causes loss of access to services by blocking dns resolution of hostnames/domains

    Scenario: DNS error for the target pod

    "},{"location":"experiments/categories/pods/pod-dns-error/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-dns-error/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-dns-error experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-dns-error/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-dns-error/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-dns-error-sa\n  namespace: default\n  labels:\n    name: pod-dns-error-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-dns-error-sa\n  namespace: default\n  labels:\n    name: pod-dns-error-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-dns-error-sa\n  namespace: default\n  labels:\n    name: pod-dns-error-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-dns-error-sa\nsubjects:\n  - kind: ServiceAccount\n    name: pod-dns-error-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-dns-error/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

• TARGET_CONTAINER: Name of the container subjected to dns-error
    • TOTAL_CHAOS_DURATION: The time duration for chaos insertion (seconds). Defaults to 60s
    • TARGET_HOSTNAMES: List of the target hostnames or keywords, e.g. '[\"litmuschaos\"]'. If not provided, all hostnames/domains will be targeted
    • MATCH_SCHEME: Determines whether the dns query has to match exactly with one of the targets or can have any of the targets as a substring. Can be either exact or substring. If not provided, it will be set to exact
    • PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only
    • CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd
    • SOCKET_PATH: Path of the container runtime socket file. Defaults to /run/containerd/containerd.sock
    • LIB: The chaos lib used to inject the chaos. Default value: litmus; supported values: litmus
    • LIB_IMAGE: Image used to run the netem command. Defaults to litmuschaos/go-runner:latest
    • RAMP_TIME: Period to wait before and after injection of chaos (in sec)
    • SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/pod-dns-error/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-dns-error/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod specific tunables.

    "},{"location":"experiments/categories/pods/pod-dns-error/#target-host-names","title":"Target Host Names","text":"

It defines the comma-separated names of the target hosts subjected to chaos. It can be tuned with the TARGET_HOSTNAMES ENV. If TARGET_HOSTNAMES is not provided, all hostnames/domains will be targeted.

    Use the following example to tune this:

    # contains the target host names for the dns error\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-dns-error-sa\n  experiments:\n  - name: pod-dns-error\n    spec:\n      components:\n        env:\n        ## comma separated list of host names\n        ## if not provided, all hostnames/domains will be targeted\n        - name: TARGET_HOSTNAMES\n          value: '[\"litmuschaos\"]'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-dns-error/#match-scheme","title":"Match Scheme","text":"

    It determines whether the DNS query has to match exactly with one of the targets or can have any of the targets as a substring. It can be tuned with MATCH_SCHEME ENV. It supports exact or substring values.

    Use the following example to tune this:

    # contains match scheme for the dns error\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-dns-error-sa\n  experiments:\n  - name: pod-dns-error\n    spec:\n      components:\n        env:\n        ## it supports 'exact' and 'substring' values\n        - name: MATCH_SCHEME\n          value: 'exact' \n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-dns-error/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports the docker and containerd runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the container runtime socket file by default (/run/containerd/containerd.sock).

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-dns-error-sa\n  experiments:\n  - name: pod-dns-error\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-dns-spoof/","title":"Pod Dns Spoof","text":""},{"location":"experiments/categories/pods/pod-dns-spoof/#introduction","title":"Introduction","text":"
    • Pod-dns-spoof injects chaos to spoof dns resolution in kubernetes pods.
    • It causes dns resolution of target hostnames/domains to wrong IPs as specified by SPOOF_MAP in the engine config.

    Scenario: DNS spoof for the target pod

    "},{"location":"experiments/categories/pods/pod-dns-spoof/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-dns-spoof/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-dns-spoof experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-dns-spoof/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-dns-spoof/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-dns-spoof-sa\n  namespace: default\n  labels:\n    name: pod-dns-spoof-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-dns-spoof-sa\n  namespace: default\n  labels:\n    name: pod-dns-spoof-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n    # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-dns-spoof-sa\n  namespace: default\n  labels:\n    name: pod-dns-spoof-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-dns-spoof-sa\nsubjects:\n  - kind: ServiceAccount\n    name: pod-dns-spoof-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-dns-spoof/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

• TARGET_CONTAINER: Name of the container subjected to dns spoof
    • TOTAL_CHAOS_DURATION: The time duration for chaos insertion (seconds). Defaults to 60s
    • SPOOF_MAP: Map of the target hostnames, e.g. '{\"abc.com\":\"spoofabc.com\"}', where the key is the hostname that needs to be spoofed and the value is the hostname to which it will be spoofed/redirected. If not provided, no hostnames/domains will be spoofed
    • PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only
    • CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd
    • SOCKET_PATH: Path of the container runtime socket file. Defaults to /run/containerd/containerd.sock
    • LIB: The chaos lib used to inject the chaos. Default value: litmus; supported values: litmus
    • LIB_IMAGE: Image used to run the netem command. Defaults to litmuschaos/go-runner:latest
    • RAMP_TIME: Period to wait before and after injection of chaos (in sec)
    • SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/pod-dns-spoof/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-dns-spoof/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod specific tunables.

    "},{"location":"experiments/categories/pods/pod-dns-spoof/#spoof-map","title":"Spoof Map","text":"

It defines the map of the target hostnames, e.g. '{\"abc.com\":\"spoofabc.com\"}', where the key is the hostname that needs to be spoofed and the value is the hostname to which it will be spoofed/redirected. It can be tuned via the SPOOF_MAP ENV.

    Use the following example to tune this:

    # contains the spoof map for the dns spoofing\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-dns-spoof-sa\n  experiments:\n  - name: pod-dns-spoof\n    spec:\n      components:\n        env:\n        # map of host names\n        - name: SPOOF_MAP\n          value: '{\"abc.com\":\"spoofabc.com\"}'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-dns-spoof/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports the docker and containerd runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the container runtime socket file by default (/run/containerd/containerd.sock).

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-dns-spoof-sa\n  experiments:\n  - name: pod-dns-spoof\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        # map of host names\n        - name: SPOOF_MAP\n          value: '{\"abc.com\":\"spoofabc.com\"}'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-http-latency/","title":"Pod HTTP Latency","text":""},{"location":"experiments/categories/pods/pod-http-latency/#introduction","title":"Introduction","text":"
• It injects http response latency on the service whose port is provided as TARGET_SERVICE_PORT by starting a proxy server and then redirecting the traffic through the proxy server.
    • It can test the application's resilience to lossy/flaky http responses.

    Scenario: Add latency to the HTTP request

    "},{"location":"experiments/categories/pods/pod-http-latency/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-http-latency/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.17
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-http-latency experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-http-latency/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-http-latency/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-http-latency-sa\n  namespace: default\n  labels:\n    name: pod-http-latency-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-http-latency-sa\n  namespace: default\n  labels:\n    name: pod-http-latency-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\",]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and managing to execute comands inside target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod(if parent is anyof {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod(if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod(if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitor the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-http-latency-sa\n  namespace: default\n  labels:\n    name: pod-http-latency-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-http-latency-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-http-latency-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-http-latency/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Mandatory Fields:
    • TARGET_SERVICE_PORT: Port of the service to target. Defaults to port 80
    • LATENCY: Latency value in ms to be added to the requests. Defaults to 2000

Optional Fields:
    • PROXY_PORT: Port where the proxy will be listening for requests. Defaults to 20000
    • NETWORK_INTERFACE: Network interface to be used for the proxy. Defaults to eth0
    • TOXICITY: Percentage of HTTP requests to be affected. Defaults to 100
    • CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd and crio for litmus, and only docker for the pumba LIB
    • SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock
    • TOTAL_CHAOS_DURATION: The duration of chaos injection (seconds). Defaults to 60s
    • TARGET_PODS: Comma separated list of application pod names subjected to pod http latency chaos. If not provided, it will select target pods randomly based on the provided appLabels
    • PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only
    • LIB_IMAGE: Image used to run the netem command. Defaults to litmuschaos/go-runner:latest
    • RAMP_TIME: Period to wait before and after injection of chaos (in sec)
    • SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/pod-http-latency/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-http-latency/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod specific tunables.

    "},{"location":"experiments/categories/pods/pod-http-latency/#target-service-port","title":"Target Service Port","text":"

It defines the port of the service being targeted. It can be tuned via the TARGET_SERVICE_PORT ENV.

    Use the following example to tune this:

    ## provide the port of the targeted service\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-latency-sa\n  experiments:\n  - name: pod-http-latency\n    spec:\n      components:\n        env:\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-latency/#proxy-port","title":"Proxy Port","text":"

    It defines the port on which the proxy server will listen for requests. It can be tuned via PROXY_PORT ENV.

    Use the following example to tune this:

    # provide the port for proxy server\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-latency-sa\n  experiments:\n  - name: pod-http-latency\n    spec:\n      components:\n        env:\n        # provide the port for proxy server\n        - name: PROXY_PORT\n          value: '8080'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-latency/#latency","title":"Latency","text":"

It defines the latency value (in ms) to be added to the http requests. It can be tuned via the LATENCY ENV.

    Use the following example to tune this:

    ## provide the latency value\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-latency-sa\n  experiments:\n  - name: pod-http-latency\n    spec:\n      components:\n        env:\n        # provide the latency value\n        - name: LATENCY\n          value: '2000'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-latency/#toxicity","title":"Toxicity","text":"

It defines the toxicity, i.e., the percentage of the total number of http requests to be affected. It can be tuned via the TOXICITY ENV.

    Use the following example to tune this:

    ## provide the toxicity\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-latency-sa\n  experiments:\n  - name: pod-http-latency\n    spec:\n      components:\n        env:\n        # toxicity is the probability of the request to be affected\n        # provide the percentage value in the range of 0-100\n        # 0 means no request will be affected and 100 means all request will be affected\n        - name: TOXICITY\n          value: \"100\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-latency/#network-interface","title":"Network Interface","text":"

    It defines the network interface to be used for the proxy. It can be tuned via NETWORK_INTERFACE ENV.

    Use the following example to tune this:

    ## provide the network interface for proxy\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-latency-sa\n  experiments:\n  - name: pod-http-latency\n    spec:\n      components:\n        env:\n        # provide the network interface for proxy\n        - name: NETWORK_INTERFACE\n          value: \"eth0\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: '80'\n
    "},{"location":"experiments/categories/pods/pod-http-latency/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the containerd socket file by default (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-latency-sa\n  experiments:\n  - name: pod-http-latency\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/","title":"Pod HTTP Modify Body","text":""},{"location":"experiments/categories/pods/pod-http-modify-body/#introduction","title":"Introduction","text":"
• It injects http modify body chaos on the service whose port is provided as TARGET_SERVICE_PORT by starting a proxy server and then redirecting the traffic through it.
    • It can be used to overwrite the http response body by providing the new body value as RESPONSE_BODY (see the sketch below).
    • It can test the application's resilience to erroneous or incorrect http response bodies.
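A minimal sketch of tuning this, following the same ChaosEngine conventions used by the other experiments on this page (the RESPONSE_BODY value below is purely illustrative):

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  annotationCheck: "false"
  appinfo:
    appns: "default"
    applabel: "app=nginx"
    appkind: "deployment"
  chaosServiceAccount: pod-http-modify-body-sa
  experiments:
  - name: pod-http-modify-body
    spec:
      components:
        env:
        # new body used to overwrite the http response body (illustrative value)
        - name: RESPONSE_BODY
          value: '{"status":"injected by chaos"}'
        # provide the port of the targeted service
        - name: TARGET_SERVICE_PORT
          value: "80"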

    Scenario: Modify Body of the HTTP response

    "},{"location":"experiments/categories/pods/pod-http-modify-body/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-http-modify-body/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.17
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-http-modify-body experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-http-modify-body/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.
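    For instance, a minimal sketch of reusing that account (assuming the litmus-admin service account exists in the cluster, as installed with the agent) simply references it from the ChaosEngine instead of creating a dedicated service account:

    ## reference the pre-installed litmus-admin RBAC (sketch, assumes agent setup)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  # assumes the litmus-admin service account was installed with the agent setup\n  chaosServiceAccount: litmus-admin\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n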

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-http-modify-body-sa\n  namespace: default\n  labels:\n    name: pod-http-modify-body-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-http-modify-body-sa\n  namespace: default\n  labels:\n    name: pod-http-modify-body-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods' logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\",\"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-http-modify-body-sa\n  namespace: default\n  labels:\n    name: pod-http-modify-body-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-http-modify-body-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-http-modify-body-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-http-modify-body/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes TARGET_SERVICE_PORT Port of the service to target Defaults to port 80 RESPONSE_BODY Body string to overwrite the http response body If no value is provided, the response will have an empty body. Defaults to an empty body

    Variables Description Notes CONTENT_ENCODING Encoding type to compress/encode the response body Accepted values are: gzip, deflate, br, identity. Defaults to none (no encoding) CONTENT_TYPE Content type of the response body Defaults to text/plain PROXY_PORT Port where the proxy will be listening for requests Defaults to 20000 NETWORK_INTERFACE Network interface to be used for the proxy Defaults to eth0 TOXICITY Percentage of HTTP requests to be affected Defaults to 100 CONTAINER_RUNTIME Container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for litmus and only docker for pumba LIB SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock TOTAL_CHAOS_DURATION The duration of chaos injection (seconds) Default (60s) TARGET_PODS Comma-separated list of application pod names subjected to pod http modify body chaos If not provided, it will select target pods randomly based on provided appLabels PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only LIB_IMAGE Image used to run the helper pod Defaults to litmuschaos/go-runner:latest RAMP_TIME Period to wait before and after injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel"},{"location":"experiments/categories/pods/pod-http-modify-body/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-http-modify-body/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

    Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.
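
    As an illustrative sketch (the specific values below are examples, not defaults), common tunables such as TOTAL_CHAOS_DURATION and PODS_AFFECTED_PERC can be combined with this experiment's tunables in the same env list:

    ## combine common tunables with the experiment tunables (illustrative values)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # total duration of the chaos injection, in seconds (illustrative)\n        - name: TOTAL_CHAOS_DURATION\n          value: '90'\n        # percentage of total pods to target (illustrative)\n        - name: PODS_AFFECTED_PERC\n          value: '50'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n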

    "},{"location":"experiments/categories/pods/pod-http-modify-body/#target-service-port","title":"Target Service Port","text":"

    It defines the port of the service that is to be targeted. It can be tuned via TARGET_SERVICE_PORT ENV.

    Use the following example to tune this:

    ## provide the port of the targeted service\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # provide the body string to overwrite the response body\n        - name: RESPONSE_BODY\n          value: '2000'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#proxy-port","title":"Proxy Port","text":"

    It defines the port on which the proxy server will listen for requests. It can be tuned via PROXY_PORT ENV.

    Use the following example to tune this:

    ## provide the port for proxy server\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # provide the port for proxy server\n        - name: PROXY_PORT\n          value: '8080'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#response-body","title":"RESPONSE BODY","text":"

    It defines the body string that will overwrite the http response body. It can be tuned via RESPONSE_BODY ENV.

    Use the following example to tune this:

    ## provide the response body value\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # provide the body string to overwrite the response body\n        - name: RESPONSE_BODY\n          value: '2000'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#toxicity","title":"Toxicity","text":"

    It defines the toxicity of the chaos, i.e., the percentage of the total number of http requests to be affected. It can be tuned via TOXICITY ENV.

    Use the following example to tune this:

    ## provide the toxicity\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # toxicity is the probability of the request to be affected\n        # provide the percentage value in the range of 0-100\n        # 0 means no request will be affected and 100 means all request will be affected\n        - name: TOXICITY\n          value: \"100\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#content-encoding-and-content-type","title":"Content Encoding and Content Type","text":"

    It defines the content encoding and content type of the response body. It can be tuned via CONTENT_ENCODING and CONTENT_TYPE ENV.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # provide the encoding type for the response body\n        # currently supported value are gzip, deflate\n        # if empty no encoding will be applied\n        - name: CONTENT_ENCODING\n          value: 'gzip'\n        # provide the content type for the response body\n        - name: CONTENT_TYPE\n          value: 'text/html'\n        # provide the body string to overwrite the response body\n        - name: RESPONSE_BODY\n          value: '2000'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#network-interface","title":"Network Interface","text":"

    It defines the network interface to be used for the proxy. It can be tuned via NETWORK_INTERFACE ENV.

    Use the following example to tune this:

    ## provide the network interface for proxy\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # provide the network interface for proxy\n        - name: NETWORK_INTERFACE\n          value: \"eth0\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: '80'\n        # provide the body string to overwrite the response body\n        - name: RESPONSE_BODY\n          value: '2000'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-body/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

    • CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to /run/containerd/containerd.sock (the containerd socket). For other runtimes, provide the appropriate socket path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-body-sa\n  experiments:\n  - name: pod-http-modify-body\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # provide the body string to overwrite the response body\n        - name: RESPONSE_BODY\n          value: '2000'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-header/","title":"Pod HTTP Modify Header","text":""},{"location":"experiments/categories/pods/pod-http-modify-header/#introduction","title":"Introduction","text":"
    • It injects http modify header chaos on the service whose port is provided as TARGET_SERVICE_PORT by starting a proxy server and then redirecting the traffic through the proxy server.
    • It can modify the headers of the requests and responses of the service. This can be used to test the service's resilience to incorrect or incomplete headers.

    Scenario: Modify Header of the HTTP request

    "},{"location":"experiments/categories/pods/pod-http-modify-header/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-http-modify-header/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.17
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-http-modify-header experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-http-modify-header/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-http-modify-header-sa\n  namespace: default\n  labels:\n    name: pod-http-modify-header-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-http-modify-header-sa\n  namespace: default\n  labels:\n    name: pod-http-modify-header-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods' logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\",\"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-http-modify-header-sa\n  namespace: default\n  labels:\n    name: pod-http-modify-header-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-http-modify-header-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-http-modify-header-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-http-modify-header/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes TARGET_SERVICE_PORT Port of the service to target Defaults to port 80 HEADERS_MAP Map of headers to modify/add Eg: {\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}. To remove a header, just set the value to \"\"; Eg: {\"X-Litmus-Test-Header\": \"\"} HEADER_MODE Whether to modify response headers or request headers. Accepted values: request, response Defaults to response

    Variables Description Notes PROXY_PORT Port where the proxy will be listening for requests Defaults to 20000 NETWORK_INTERFACE Network interface to be used for the proxy Defaults to eth0 TOXICITY Percentage of HTTP requests to be affected Defaults to 100 CONTAINER_RUNTIME Container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for litmus and only docker for pumba LIB SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock TOTAL_CHAOS_DURATION The duration of chaos injection (seconds) Default (60s) TARGET_PODS Comma-separated list of application pod names subjected to pod http modify header chaos If not provided, it will select target pods randomly based on provided appLabels PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only LIB_IMAGE Image used to run the helper pod Defaults to litmuschaos/go-runner:latest RAMP_TIME Period to wait before and after injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/pod-http-modify-header/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-http-modify-header/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

    Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.
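
    As an illustrative sketch (assuming the pod names nginx-0 and nginx-1 exist; the values are examples, not defaults), common tunables such as TARGET_PODS and SEQUENCE can be combined with the mandatory fields:

    ## combine common tunables with the experiment tunables (illustrative values)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # comma-separated names of the target pods (assumed pod names)\n        - name: TARGET_PODS\n          value: 'nginx-0,nginx-1'\n        # sequence of chaos execution for multiple target pods\n        - name: SEQUENCE\n          value: 'serial'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # map of headers to modify/add\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n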

    "},{"location":"experiments/categories/pods/pod-http-modify-header/#target-service-port","title":"Target Service Port","text":"

    It defines the port of the service that is to be targeted. It can be tuned via TARGET_SERVICE_PORT ENV.

    Use the following example to tune this:

    ## provide the port of the targeted service\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # map of headers to modify/add\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#proxy-port","title":"Proxy Port","text":"

    It defines the port on which the proxy server will listen for requests. It can be tuned via PROXY_PORT ENV.

    Use the following example to tune this:

    ## provide the port for proxy server\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # provide the port for proxy server\n        - name: PROXY_PORT\n          value: '8080'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # map of headers to modify/add\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#headers-map","title":"Headers Map","text":"

    It is the map of headers that are to be modified or added to the http request/response. It can be tuned via HEADERS_MAP ENV.

    Use the following example to tune this:

    ## provide the headers as a map\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # map of headers to modify/add; Eg: {\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}\n        # to remove a header, just set the value to \"\"; Eg: {\"X-Litmus-Test-Header\": \"\"}\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
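
    To remove a header instead of adding one, a variant of the same sketch sets that header's value to an empty string:

    ## remove a header by setting its value to an empty string\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # an empty value removes the header from the request/response\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"\"}'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n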
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#header-mode","title":"Header Mode","text":"

    It defines whether the request headers or the response headers are to be modified. It can be tuned via HEADER_MODE ENV.

    Use the following example to tune this:

    ## provide the mode of the header modification; request/response\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # whether to modify response headers or request headers. Accepted values: request, response\n        - name: HEADER_MODE\n          value: 'response'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # map of headers to modify/add\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#toxicity","title":"Toxicity","text":"

    It defines the toxicity of the chaos, i.e., the percentage of the total number of http requests to be affected. It can be tuned via TOXICITY ENV.

    Use the following example to tune this:

    ## provide the toxicity\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # toxicity is the probability of the request to be affected\n        # provide the percentage value in the range of 0-100\n        # 0 means no request will be affected and 100 means all request will be affected\n        - name: TOXICITY\n          value: \"100\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#network-interface","title":"Network Interface","text":"

    It defines the network interface to be used for the proxy. It can be tuned via NETWORK_INTERFACE ENV.

    Use the following example to tune this:

    ## provide the network interface for proxy\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # provide the network interface for proxy\n        - name: NETWORK_INTERFACE\n          value: \"eth0\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: '80'\n        # map of headers to modify/add\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n
    "},{"location":"experiments/categories/pods/pod-http-modify-header/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

    • CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to /run/containerd/containerd.sock (the containerd socket). For other runtimes, provide the appropriate socket path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-modify-header-sa\n  experiments:\n  - name: pod-http-modify-header\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # map of headers to modify/add\n        - name: HEADERS_MAP\n          value: '{\"X-Litmus-Test-Header\": \"X-Litmus-Test-Value\"}'\n
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/","title":"Pod HTTP Reset Peer","text":""},{"location":"experiments/categories/pods/pod-http-reset-peer/#introduction","title":"Introduction","text":"
    • It injects http reset peer chaos on the service whose port is provided as TARGET_SERVICE_PORT. It stops outgoing http requests by resetting the TCP connection, achieved by starting a proxy server and then redirecting the traffic through the proxy server.
    • It can test the application's resilience to lossy/flaky http connections.

    Scenario: Add reset peer to the HTTP request

    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.17
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-http-reset-peer experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-http-reset-peer-sa\n  namespace: default\n  labels:\n    name: pod-http-reset-peer-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-http-reset-peer-sa\n  namespace: default\n  labels:\n    name: pod-http-reset-peer-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods' logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\",\"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-http-reset-peer-sa\n  namespace: default\n  labels:\n    name: pod-http-reset-peer-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-http-reset-peer-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-http-reset-peer-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes TARGET_SERVICE_PORT Port of the service to target Defaults to port 80 RESET_TIMEOUT Duration (in ms) after which the TCP connection is reset Defaults to 0

    Variables Description Notes PROXY_PORT Port where the proxy will be listening for requests Defaults to 20000 NETWORK_INTERFACE Network interface to be used for the proxy Defaults to eth0 TOXICITY Percentage of HTTP requests to be affected Defaults to 100 CONTAINER_RUNTIME Container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for litmus and only docker for pumba LIB SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock TOTAL_CHAOS_DURATION The duration of chaos injection (seconds) Default (60s) TARGET_PODS Comma-separated list of application pod names subjected to pod http reset peer chaos If not provided, it will select target pods randomly based on provided appLabels PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only LIB_IMAGE Image used to run the helper pod Defaults to litmuschaos/go-runner:latest RAMP_TIME Period to wait before and after injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-http-reset-peer/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

    Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.
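
    As an illustrative sketch (the values are examples, not defaults), common tunables such as TOTAL_CHAOS_DURATION and RAMP_TIME can be combined with the mandatory TARGET_SERVICE_PORT:

    ## combine common tunables with the experiment tunables (illustrative values)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # total duration of the chaos injection, in seconds (illustrative)\n        - name: TOTAL_CHAOS_DURATION\n          value: '90'\n        # period to wait before and after chaos injection, in seconds (illustrative)\n        - name: RAMP_TIME\n          value: '10'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n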

    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#target-service-port","title":"Target Service Port","text":"

    It defines the port of the service that is to be targeted. It can be tuned via TARGET_SERVICE_PORT ENV.

    Use the following example to tune this:

    ## provide the port of the targeted service\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#proxy-port","title":"Proxy Port","text":"

    It defines the port on which the proxy server will listen for requests. It can be tuned via PROXY_PORT ENV. Use the following example to tune this:

    ## provide the port for proxy server\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # provide the port for proxy server\n        - name: PROXY_PORT\n          value: '8080'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#reset-timeout","title":"RESET TIMEOUT","text":"

    It defines the reset timeout, i.e., the duration (in ms) after which the TCP connection is reset. It can be tuned via RESET_TIMEOUT ENV.

    Use the following example to tune this:

    ## provide the reset timeout value\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # reset timeout specifies after how much duration to reset the connection\n        - name: RESET_TIMEOUT #in ms\n          value: '2000'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#toxicity","title":"Toxicity","text":"

    It defines the toxicity of the chaos, i.e., the percentage of the total number of http requests to be affected. It can be tuned via TOXICITY ENV.

    Use the following example to tune this:

    ## provide the toxicity\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # toxicity is the probability of the request to be affected\n        # provide the percentage value in the range of 0-100\n        # 0 means no request will be affected and 100 means all request will be affected\n        - name: TOXICITY\n          value: \"100\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#network-interface","title":"Network Interface","text":"

    It defines the network interface to be used for the proxy. It can be tuned via NETWORK_INTERFACE ENV.

    Use the following example to tune this:

    ## provide the network interface for proxy\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # provide the network interface for proxy\n        - name: NETWORK_INTERFACE\n          value: \"eth0\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: '80'\n
    "},{"location":"experiments/categories/pods/pod-http-reset-peer/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

    • CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to /run/containerd/containerd.sock (the containerd socket). For other runtimes, provide the appropriate socket path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-reset-peer-sa\n  experiments:\n  - name: pod-http-reset-peer\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/","title":"Pod HTTP Status Code","text":""},{"location":"experiments/categories/pods/pod-http-status-code/#introduction","title":"Introduction","text":"
    • It injects http status code chaos inside the pod, modifying the status code of the responses from the application server to the desired status code provided by the user, on the service whose port is provided as TARGET_SERVICE_PORT. It does so by starting a proxy server and then redirecting the traffic through the proxy server.
    • It can test the application's resilience to error code http responses from the provided application server.

    Scenario: Modify http response status code of the HTTP request

    "},{"location":"experiments/categories/pods/pod-http-status-code/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-http-status-code/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.17
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-http-status-code experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-http-status-code/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-http-status-code/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-http-status-code-sa\n  namespace: default\n  labels:\n    name: pod-http-status-code-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-http-status-code-sa\n  namespace: default\n  labels:\n    name: pod-http-status-code-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount them to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods' logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\",\"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-http-status-code-sa\n  namespace: default\n  labels:\n    name: pod-http-status-code-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-http-status-code-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-http-status-code-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-http-status-code/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

    Variables Description Notes TARGET_SERVICE_PORT Port of the service to target This should be the port on which the application container runs at the pod level, not at the service level. Defaults to port 80 STATUS_CODE Modified status code for the HTTP response If no value is provided, then a random value is selected from the list of supported values. Multiple comma-separated values can also be provided, in which case a random value from the provided list will be selected. Supported values: [200, 201, 202, 204, 300, 301, 302, 304, 307, 400, 401, 403, 404, 500, 501, 502, 503, 504]. Defaults to a random status code MODIFY_RESPONSE_BODY Whether to modify the body as per the status code provided. If true, then the body is replaced by a default template for the status code. Defaults to true

    Variables Description Notes RESPONSE_BODY Body string to overwrite the http response body This will be used only if MODIFY_RESPONSE_BODY is set to true. If no value is provided, the response will have an empty body. Defaults to an empty body CONTENT_ENCODING Encoding type to compress/encode the response body Accepted values are: gzip, deflate, br, identity. Defaults to none (no encoding) CONTENT_TYPE Content type of the response body Defaults to text/plain PROXY_PORT Port where the proxy will be listening for requests Defaults to 20000 NETWORK_INTERFACE Network interface to be used for the proxy Defaults to eth0 TOXICITY Percentage of HTTP requests to be affected Defaults to 100 CONTAINER_RUNTIME Container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for litmus and only docker for pumba LIB SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock TOTAL_CHAOS_DURATION The duration of chaos injection (seconds) Default (60s) TARGET_PODS Comma-separated list of application pod names subjected to pod http status code chaos If not provided, it will select target pods randomly based on provided appLabels PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only LIB_IMAGE Image used to run the helper pod Defaults to litmuschaos/go-runner:latest RAMP_TIME Period to wait before and after injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel"},{"location":"experiments/categories/pods/pod-http-status-code/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-http-status-code/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

    Refer to the common attributes and Pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.
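
    As an illustrative sketch (the values are examples, not defaults), common tunables such as TOTAL_CHAOS_DURATION and PODS_AFFECTED_PERC can be combined with this experiment's mandatory fields:

    ## combine common tunables with the experiment tunables (illustrative values)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # total duration of the chaos injection, in seconds (illustrative)\n        - name: TOTAL_CHAOS_DURATION\n          value: '120'\n        # percentage of total pods to target (illustrative)\n        - name: PODS_AFFECTED_PERC\n          value: '50'\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n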

    "},{"location":"experiments/categories/pods/pod-http-status-code/#target-service-port","title":"Target Service Port","text":"

    It defines the port of the service that is to be targeted. It can be tuned via TARGET_SERVICE_PORT ENV. This should be the port on which the application runs at the pod level, not at the service level. For example, if the application pod serves traffic on port 8080 and a service exposes it at port 80, the target service port should be 8080 (the pod-level port), not 80.

    Use the following example to tune this:

    ## provide the port of the targeted service\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#proxy-port","title":"Proxy Port","text":"

    It defines the port on which the proxy server will listen for requests. It can be tuned via PROXY_PORT ENV.

    Use the following example to tune this:

    ## provide the port for proxy server\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # provide the port for proxy server\n        - name: PROXY_PORT\n          value: '8080'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#status-code","title":"Status Code","text":"

    It defines the status code value for the http response. It can be tuned via STATUS_CODE ENV.

    Use the following example to tune this:

    ## modified status code for the http response\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # modified status code for the http response\n        # if no value is provided, a random status code from the supported code list will selected\n        # if multiple comma separated values are provided, then a random value from the provided list will be selected\n        # if an invalid status code is provided, the experiment will fail\n        # supported status code list: [200, 201, 202, 204, 300, 301, 302, 304, 307, 400, 401, 403, 404, 500, 501, 502, 503, 504]\n        - name: STATUS_CODE\n          value: '500'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
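
    Since multiple comma-separated codes are accepted, a variant of the same sketch can supply a list, from which a random code is selected:

    ## provide multiple status codes; a random one is selected from the list\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # a random code from this comma-separated list is selected\n        - name: STATUS_CODE\n          value: '500,502,503'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n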
    "},{"location":"experiments/categories/pods/pod-http-status-code/#modify-response-body","title":"Modify Response Body","text":"

    It defines whether to modify the response body with a pre-defined template matching the status code of the http response. It can be tuned via MODIFY_RESPONSE_BODY ENV.

    Use the following example to tune this:

    ##  whether to modify the body as per the status code provided\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        #  whether to modify the body as per the status code provided\n        - name: \"MODIFY_RESPONSE_BODY\"\n          value: \"true\"\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#toxicity","title":"Toxicity","text":"

    It defines the toxicity of the chaos, i.e., the percentage of the total number of http requests to be affected. It can be tuned via TOXICITY ENV.

    Use the following example to tune this:

## provide the toxicity\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # toxicity is the probability of the request to be affected\n        # provide the percentage value in the range of 0-100\n        # 0 means no request will be affected and 100 means all requests will be affected\n        - name: TOXICITY\n          value: \"100\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#response-body","title":"RESPONSE BODY","text":"

    It defines the body string that will overwrite the http response body. It can be tuned via RESPONSE_BODY and MODIFY_RESPONSE_BODY ENV. The MODIFY_RESPONSE_BODY ENV should be set to true to enable this feature.

    Use the following example to tune this:

    ## provide the response body value\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # provide the body string to overwrite the response body. This will be used only if MODIFY_RESPONSE_BODY is set to true\n        - name: RESPONSE_BODY\n          value: '<h1>Hello World</h1>'\n        #  whether to modify the body as per the status code provided\n        - name: \"MODIFY_RESPONSE_BODY\"\n          value: \"true\"\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#content-encoding-and-content-type","title":"Content Encoding and Content Type","text":"

    It defines the content encoding and content type of the response body. It can be tuned via CONTENT_ENCODING and CONTENT_TYPE ENV.

    Use the following example to tune this:

## provide the content encoding and content type for the response body\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # provide the encoding type for the response body\n        # currently supported values are gzip and deflate\n        # if empty, no encoding will be applied\n        - name: CONTENT_ENCODING\n          value: 'gzip'\n        # provide the content type for the response body\n        - name: CONTENT_TYPE\n          value: 'text/html'\n        # whether to modify the body as per the status code provided\n        - name: \"MODIFY_RESPONSE_BODY\"\n          value: \"true\"\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#network-interface","title":"Network Interface","text":"

    It defines the network interface to be used for the proxy. It can be tuned via NETWORK_INTERFACE ENV.

    Use the following example to tune this:

    ## provide the network interface for proxy\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # provide the network interface for proxy\n        - name: NETWORK_INTERFACE\n          value: \"eth0\"\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: '80'\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n
    "},{"location":"experiments/categories/pods/pod-http-status-code/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to the containerd socket (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-http-status-code-sa\n  experiments:\n  - name: pod-http-status-code\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        # provide the port of the targeted service\n        - name: TARGET_SERVICE_PORT\n          value: \"80\"\n        # modified status code for the http response\n        - name: STATUS_CODE\n          value: '500'\n
    "},{"location":"experiments/categories/pods/pod-io-stress/","title":"Pod IO Stress","text":""},{"location":"experiments/categories/pods/pod-io-stress/#introduction","title":"Introduction","text":"
• This experiment causes disk stress on the application pod. The experiment aims to verify the resiliency of applications that share this disk resource for ephemeral or persistent storage purposes.

    Scenario: Stress the IO of the target pod

    "},{"location":"experiments/categories/pods/pod-io-stress/#uses","title":"Uses","text":"View the uses of the experiment

Disk pressure or CPU hogs are among the most common and frequent scenarios found in kubernetes applications; they can result in the eviction of the application replica and impact its delivery. Such scenarios can still occur despite whatever availability aids K8s provides. These problems are generally referred to as \"Noisy Neighbour\" problems.

Stressing the disk with continuous and heavy IO can, for example, degrade the reads and writes of other microservices that use this shared disk; modern storage solutions for Kubernetes, for instance, use the concept of storage pools out of which virtual volumes/devices are carved. Another issue is the amount of scratch space eaten up on a node, which can lead to a lack of space for newer containers to get scheduled (kubernetes eventually applies an \"eviction\" taint like \"disk-pressure\"), causing a wholesale movement of all pods to other nodes.

    "},{"location":"experiments/categories/pods/pod-io-stress/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-io-stress experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-io-stress/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-io-stress/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-io-stress-sa\n  namespace: default\n  labels:\n    name: pod-io-stress-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-io-stress-sa\n  namespace: default\n  labels:\n    name: pod-io-stress-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods' logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec subresources to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is replicationController)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-io-stress-sa\n  namespace: default\n  labels:\n    name: pod-io-stress-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-io-stress-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-io-stress-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-io-stress/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes FILESYSTEM_UTILIZATION_PERCENTAGE Specify the size as percentage of free space on the file system Defaults to 10% FILESYSTEM_UTILIZATION_BYTES Specify the size in GigaBytes(GB). FILESYSTEM_UTILIZATION_PERCENTAGE & FILESYSTEM_UTILIZATION_BYTES are mutually exclusive. If both are provided, FILESYSTEM_UTILIZATION_PERCENTAGE is prioritized. NUMBER_OF_WORKERS It is the number of IO workers involved in IO disk stress Defaults to 4 TOTAL_CHAOS_DURATION The time duration for chaos (seconds) Defaults to 120s VOLUME_MOUNT_PATH Fill the given volume mount path LIB The chaos lib used to inject the chaos Defaults to litmus. Available: litmus and pumba. LIB_IMAGE Image used to run the stress command Defaults to litmuschaos/go-runner:latest TARGET_PODS Comma separated list of application pod names subjected to pod io stress chaos If not provided, it will select target pods randomly based on provided appLabels PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only CONTAINER_RUNTIME container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for litmus and only docker for pumba LIB SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock RAMP_TIME Period to wait before and after injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel"},{"location":"experiments/categories/pods/pod-io-stress/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-io-stress/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunable to tune the common tunables for all experiments and the pod specific tunables. A minimal sketch combining two of the common tunables with this experiment is shown below.
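
A minimal sketch (the values shown are illustrative assumptions) combining two of the common tunables, PODS_AFFECTED_PERC and SEQUENCE, with this experiment:

# target 50% of the replicas, one pod at a time\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # percentage of total pods to target\n        - name: PODS_AFFECTED_PERC\n          value: '50'\n        # sequence of chaos execution for multiple target pods\n        # supports serial and parallel\n        - name: SEQUENCE\n          value: 'serial'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n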

    "},{"location":"experiments/categories/pods/pod-io-stress/#filesystem-utilization-percentage","title":"Filesystem Utilization Percentage","text":"

    It stresses the FILESYSTEM_UTILIZATION_PERCENTAGE percentage of total free space available in the pod.

    Use the following example to tune this:

# stress the i/o of the targeted pod with FILESYSTEM_UTILIZATION_PERCENTAGE of total free space\n# it is mutually exclusive with the FILESYSTEM_UTILIZATION_BYTES.\n# if both are provided then it will use FILESYSTEM_UTILIZATION_PERCENTAGE for stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # percentage of free space of the file system that needs to be stressed\n        - name: FILESYSTEM_UTILIZATION_PERCENTAGE\n          value: '10' #in percentage\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-io-stress/#filesystem-utilization-bytes","title":"Filesystem Utilization Bytes","text":"

It stresses the FILESYSTEM_UTILIZATION_BYTES GB of the i/o of the targeted pod. It is mutually exclusive with the FILESYSTEM_UTILIZATION_PERCENTAGE ENV. If the FILESYSTEM_UTILIZATION_PERCENTAGE ENV is set, then it will use the percentage for the stress; otherwise, it will stress the i/o based on the FILESYSTEM_UTILIZATION_BYTES ENV.

    Use the following example to tune this:

    # stress the i/o of the targeted pod with given FILESYSTEM_UTILIZATION_BYTES\n# it is mutually exclusive with the FILESYSTEM_UTILIZATION_PERCENTAGE.\n# if both are provided then it will use FILESYSTEM_UTILIZATION_PERCENTAGE for stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # size of io to be stressed\n        - name: FILESYSTEM_UTILIZATION_BYTES\n          value: '1' #in GB\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-io-stress/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to the containerd socket (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-io-stress/#mount-path","title":"Mount Path","text":"

It defines the volume mount path which needs to be filled. It can be tuned via the VOLUME_MOUNT_PATH ENV.

    Use the following example to tune this:

# provide the volume mount path, which needs to be filled\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # path that needs to be stressed/filled\n        - name: VOLUME_MOUNT_PATH\n          value: '/some-dir-in-container'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-io-stress/#workers-for-stress","title":"Workers For Stress","text":"

The workers count for the stress can be tuned via the NUMBER_OF_WORKERS ENV.

    Use the following example to tune this:

    # number of workers for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # number of io workers \n        - name: NUMBER_OF_WORKERS\n          value: '4'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-io-stress/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

It specifies the Pumba chaos library for the chaos injection. It can be tuned via the LIB ENV. The default chaos library is litmus.

    Use the following example to tune this:

    # use the pumba lib for io stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-io-stress-sa\n  experiments:\n  - name: pod-io-stress\n    spec:\n      components:\n        env:\n        # name of lib\n        # it supports litmus and pumba lib\n        - name: LIB\n          value: 'pumba'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/","title":"Pod Memory Hog Exec","text":""},{"location":"experiments/categories/pods/pod-memory-hog-exec/#introduction","title":"Introduction","text":"
• This experiment consumes memory resources of the application container, based on the specified amount in megabytes.

• It simulates conditions where app pods experience memory spikes due to expected or undesired processes, thereby testing how the overall application stack behaves when this occurs.

    Scenario: Stress the Memory

    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#uses","title":"Uses","text":"View the uses of the experiment

Memory usage within containers is subject to various constraints in Kubernetes. If limits are specified in the container spec, exceeding them can cause termination of the container (due to an OOMKill of the primary process, often pid 1), followed by a restart of the container by the kubelet, subject to the restart policy specified. For containers with no limits placed, memory usage is uninhibited until the node-level OOM behaviour takes over. In this case, containers on the node can be killed based on their oom_score and the QoS class a given pod belongs to (bestEffort ones are the first to be targeted). This evaluation is extended to all pods running on the node, thereby causing a bigger blast radius.

This experiment launches a stress process within the target container, which can either cause the primary process in the container to become resource constrained in cases where limits are enforced, or eat up the available system memory on the node in cases where limits are not specified.
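
For reference, a minimal (hypothetical) pod spec with a memory limit set, which makes the OOMKill path described above applicable to the container:

apiVersion: v1\nkind: Pod\nmetadata:\n  name: nginx\nspec:\n  containers:\n  - name: nginx\n    image: nginx\n    resources:\n      limits:\n        # exceeding this limit leads to an OOMKill of the container's primary process\n        memory: \"256Mi\"\n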

    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-memory-hog-exec experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-memory-hog-exec-sa\n  namespace: default\n  labels:\n    name: pod-memory-hog-exec-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-memory-hog-exec-sa\n  namespace: default\n  labels:\n    name: pod-memory-hog-exec-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods' logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec subresources to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is replicationController)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-memory-hog-exec-sa\n  namespace: default\n  labels:\n    name: pod-memory-hog-exec-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-memory-hog-exec-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-memory-hog-exec-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes MEMORY_CONSUMPTION The amount of memory consumed for hogging a Kubernetes pod (in megabytes) Defaults to 500MB (Up to 2000MB) TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Defaults to 60s LIB The chaos lib used to inject the chaos. Available libs are litmus Defaults to litmus TARGET_PODS Comma separated list of application pod names subjected to pod memory hog chaos If not provided, it will select target pods randomly based on provided appLabels TARGET_CONTAINER Name of the target container under chaos If not provided, it will select the first container of the target pod CHAOS_KILL_COMMAND The command to kill the chaos process Defaults to kill $(find /proc -name exe -lname '*/dd' 2>&1 | grep -v 'Permission denied' | awk -F/ '{print $(NF-1)}' | head -n 1). Another useful one that generally works (in case the default doesn't) is kill -9 $(ps afx | grep \"[dd] if=/dev/zero\" | awk '{print $1}' | tr '\\n' ' '). In case neither works, please check whether the target pod's base image offers a shell. If yes, identify an appropriate shell command to kill the chaos process PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only RAMP_TIME Period to wait before injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-memory-hog-exec/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunable to tune the common tunables for all experiments and the pod specific tunables.

    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#memory-consumption","title":"Memory Consumption","text":"

It stresses MEMORY_CONSUMPTION MB of memory of the targeted pod for the TOTAL_CHAOS_DURATION duration. The memory consumption limit is 2000MB.

    Use the following example to tune this:

# memory to be stressed in MB\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-memory-hog-exec-sa\n  experiments:\n  - name: pod-memory-hog-exec\n    spec:\n      components:\n        env:\n        # memory consumption value in MB\n        # it is limited to 2000MB\n        - name: MEMORY_CONSUMPTION\n          value: '500' #in MB\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-memory-hog-exec/#chaos-kill-commands","title":"Chaos Kill Commands","text":"

It defines the CHAOS_KILL_COMMAND ENV to set the chaos kill command. The default value of CHAOS_KILL_COMMAND is:

    • CHAOS_KILL_COMMAND: \"kill $(find /proc -name exe -lname '*/dd' 2>&1 | grep -v 'Permission denied' | awk -F/ '{print $(NF-1)}' | head -n 1)\"

    Use the following example to tune this:

    # provide the chaos kill command used to kill the chaos process\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-memory-hog-exec-sa\n  experiments:\n  - name: pod-memory-hog-exec\n    spec:\n      components:\n        env:\n        # command to kill the dd process\n        # alternative command: \"kill -9 $(ps afx | grep \"[dd] if=/dev/zero\" | awk '{print $1}' | tr '\\n' ' ')\"\n        - name: CHAOS_KILL_COMMAND\n          value: \"kill $(find /proc -name exe -lname '*/dd' 2>&1 | grep -v 'Permission denied' | awk -F/ '{print $(NF-1)}' | head -n 1)\"\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-memory-hog/","title":"Pod Memory Hog","text":""},{"location":"experiments/categories/pods/pod-memory-hog/#introduction","title":"Introduction","text":"
• This experiment consumes memory resources of the application container, based on the specified amount in megabytes.
• It simulates conditions where app pods experience memory spikes due to expected or undesired processes, thereby testing how the overall application stack behaves when this occurs.

    Scenario: Stress the Memory

    "},{"location":"experiments/categories/pods/pod-memory-hog/#uses","title":"Uses","text":"View the uses of the experiment

Memory usage within containers is subject to various constraints in Kubernetes. If limits are specified in the container spec, exceeding them can cause termination of the container (due to an OOMKill of the primary process, often pid 1), followed by a restart of the container by the kubelet, subject to the restart policy specified. For containers with no limits placed, memory usage is uninhibited until the node-level OOM behaviour takes over. In this case, containers on the node can be killed based on their oom_score and the QoS class a given pod belongs to (bestEffort ones are the first to be targeted). This evaluation is extended to all pods running on the node, thereby causing a bigger blast radius.

This experiment launches a stress process within the target container, which can either cause the primary process in the container to become resource constrained in cases where limits are enforced, or eat up the available system memory on the node in cases where limits are not specified.

    "},{"location":"experiments/categories/pods/pod-memory-hog/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-memory-hog experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-memory-hog/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-memory-hog/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-memory-hog-sa\n  namespace: default\n  labels:\n    name: pod-memory-hog-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-memory-hog-sa\n  namespace: default\n  labels:\n    name: pod-memory-hog-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods' logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec subresources to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is replicationController)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-memory-hog-sa\n  namespace: default\n  labels:\n    name: pod-memory-hog-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-memory-hog-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-memory-hog-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-memory-hog/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes MEMORY_CONSUMPTION The amount of memory consumed for hogging a Kubernetes pod (in megabytes) Defaults to 500MB NUMBER_OF_WORKERS The number of workers used to run the stress process Defaults to 1 TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Defaults to 60s LIB The chaos lib used to inject the chaos. Available libs are litmus and pumba Defaults to litmus LIB_IMAGE Image used to run the helper pod. Defaults to litmuschaos/go-runner:1.13.8 STRESS_IMAGE Container run on the node at runtime by the pumba lib to inject stressors. Only used in LIB pumba Defaults to alexeiled/stress-ng:latest-ubuntu TARGET_PODS Comma separated list of application pod names subjected to pod memory hog chaos If not provided, it will select target pods randomly based on provided appLabels TARGET_CONTAINER Name of the target container under chaos. If not provided, it will select the first container of the target pod CONTAINER_RUNTIME container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for litmus and only docker for pumba LIB SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only RAMP_TIME Period to wait before injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel"},{"location":"experiments/categories/pods/pod-memory-hog/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-memory-hog/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunable to tune the common tunables for all experiments and the pod specific tunables.

    "},{"location":"experiments/categories/pods/pod-memory-hog/#memory-consumption","title":"Memory Consumption","text":"

It stresses MEMORY_CONSUMPTION MB of memory of the targeted pod for the TOTAL_CHAOS_DURATION duration.

    Use the following example to tune this:

    # define the memory consumption in MB\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-memory-hog-sa\n  experiments:\n  - name: pod-memory-hog\n    spec:\n      components:\n        env:\n        # memory consumption value\n        - name: MEMORY_CONSUMPTION\n          value: '500' #in MB\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-memory-hog/#workers-for-stress","title":"Workers For Stress","text":"

The workers count for the stress can be tuned via the NUMBER_OF_WORKERS ENV.

    Use the following example to tune this:

    # number of workers used for the stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-memory-hog-sa\n  experiments:\n  - name: pod-memory-hog\n    spec:\n      components:\n        env:\n        # number of workers for stress\n        - name: NUMBER_OF_WORKERS\n          value: '1'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-memory-hog/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

• CONTAINER_RUNTIME: It supports docker, containerd, and crio runtimes. The default value is containerd.
• SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to the containerd socket (/run/containerd/containerd.sock). For other runtimes, provide the appropriate path.
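
Use the following example to tune this (a sketch mirroring the equivalent examples of the other experiments, using this experiment's service account):

## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-memory-hog-sa\n  experiments:\n  - name: pod-memory-hog\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n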
    "},{"location":"experiments/categories/pods/pod-memory-hog/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

It specifies the Pumba chaos library for the chaos injection. It can be tuned via the LIB ENV. The default chaos library is litmus. Provide the stress image via the STRESS_IMAGE ENV for the pumba library.

    Use the following example to tune this:

    # use the pumba lib for the memory stress\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-memory-hog-sa\n  experiments:\n  - name: pod-memory-hog\n    spec:\n      components:\n        env:\n        # name of chaoslib\n        # it supports litmus and pumba lib\n        - name: LIB\n          value: 'pumba'\n        # stress image - applicable for pumba lib only\n        - name: STRESS_IMAGE\n          value: 'alexeiled/stress-ng:latest-ubuntu'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/","title":"Pod Network Corruption","text":""},{"location":"experiments/categories/pods/pod-network-corruption/#introduction","title":"Introduction","text":"
• It injects packet corruption on the specified container by starting a traffic control (tc) process with netem rules to add egress packet corruption.
• It can test the application's resilience to a lossy/flaky network.

    Scenario: Corrupt the network packets of target pod

    "},{"location":"experiments/categories/pods/pod-network-corruption/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-network-corruption/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-network-corruption experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-network-corruption/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-network-corruption/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-network-corruption-sa\n  namespace: default\n  labels:\n    name: pod-network-corruption-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-network-corruption-sa\n  namespace: default\n  labels:\n    name: pod-network-corruption-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods' logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating pods/exec subresources to execute commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonset})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is replicationController)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-network-corruption-sa\n  namespace: default\n  labels:\n    name: pod-network-corruption-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-network-corruption-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-network-corruption-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-network-corruption/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes NETWORK_INTERFACE Name of ethernet interface considered for shaping traffic TARGET_CONTAINER Name of container which is subjected to network corruption Applicable for containerd & CRI-O runtime only. Even with these runtimes, if the value is not provided, it injects chaos on the first container of the pod NETWORK_PACKET_CORRUPTION_PERCENTAGE Packet corruption in percentage Default (100) CONTAINER_RUNTIME container runtime interface for the cluster Defaults to containerd, supported values: docker, containerd and crio for litmus and only docker for pumba LIB SOCKET_PATH Path of the containerd/crio/docker socket file Defaults to /run/containerd/containerd.sock TOTAL_CHAOS_DURATION The time duration for chaos insertion (seconds) Default (60s) TARGET_PODS Comma separated list of application pod names subjected to pod network corruption chaos If not provided, it will select target pods randomly based on provided appLabels DESTINATION_IPS IP addresses of the services or pods or the CIDR blocks (range of IPs), the accessibility to which is impacted Comma separated IP(S) or CIDR(S) can be provided. If not provided, it will induce network chaos for all ips/destinations DESTINATION_HOSTS DNS Names/FQDN names of the services, the accessibility to which is impacted If not provided, it will induce network chaos for all ips/destinations or DESTINATION_IPS if already defined SOURCE_PORTS ports of the target application, the accessibility to which is impacted Comma separated port(s) can be provided. If not provided, it will induce network chaos for all ports DESTINATION_PORTS ports of the destination services or pods or the CIDR blocks (range of IPs), the accessibility to which is impacted Comma separated port(s) can be provided. If not provided, it will induce network chaos for all ports PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0 (corresponds to 1 replica), provide numeric value only LIB The chaos lib used to inject the chaos Default value: litmus, supported values: pumba and litmus TC_IMAGE Image used for traffic control in linux Default value is gaiadocker/iproute2 LIB_IMAGE Image used to run the netem command Defaults to litmuschaos/go-runner:latest RAMP_TIME Period to wait before and after injection of chaos in sec SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel

    "},{"location":"experiments/categories/pods/pod-network-corruption/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-network-corruption/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and Pod specific tunable to tune the common tunables for all experiments and the pod specific tunables.

    "},{"location":"experiments/categories/pods/pod-network-corruption/#network-packet-corruption","title":"Network Packet Corruption","text":"

    It defines the network packet corruption percentage to be injected in the targeted application. It can be tuned via NETWORK_PACKET_CORRUPTION_PERCENTAGE ENV.

    Use the following example to tune this:

# it injects the network-corruption for the egress traffic\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # network packet corruption percentage\n        - name: NETWORK_PACKET_CORRUPTION_PERCENTAGE\n          value: '100' #in percentage\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/#destination-ips-and-destination-hosts","title":"Destination IPs And Destination Hosts","text":"

    The network experiments interrupt traffic for all the IPs/hosts by default. The interruption of specific IPs/Hosts can be tuned via DESTINATION_IPS and DESTINATION_HOSTS ENV.

• DESTINATION_IPS: It contains the IP addresses of the services or pods or the CIDR blocks (range of IPs), the accessibility to which is impacted.
• DESTINATION_HOSTS: It contains the DNS Names/FQDN names of the services, the accessibility to which is impacted.

    Use the following example to tune this:

# it injects the chaos for the egress traffic for specific ips/hosts\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # supports comma separated destination ips\n        - name: DESTINATION_IPS\n          value: '8.8.8.8,192.168.5.6'\n        # supports comma separated destination hosts\n        - name: DESTINATION_HOSTS\n          value: 'nginx.default.svc.cluster.local,google.com'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/#source-and-destination-ports","title":"Source And Destination Ports","text":"

    The network experiments interrupt traffic for all the source & destination ports by default. The interruption of specific port(s) can be tuned via SOURCE_PORTS and DESTINATION_PORTS ENV.

• SOURCE_PORTS: It contains the ports of the target application, the accessibility to which is impacted.
• DESTINATION_PORTS: It contains the ports of the destination services or pods or the CIDR blocks (range of IPs), the accessibility to which is impacted.

    Use the following example to tune this:

# it injects the chaos for the ingress and egress traffic for specific ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # supports comma separated source ports\n        - name: SOURCE_PORTS\n          value: '80'\n        # supports comma separated destination ports\n        - name: DESTINATION_PORTS\n          value: '8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/#blacklist-source-and-destination-ports","title":"Blacklist Source and Destination Ports","text":"

    By default, the network experiments disrupt traffic for all the source and destination ports. The specific ports can be blacklisted via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: Provide the comma separated source ports preceded by !, that you'd like to blacklist from the chaos.
• DESTINATION_PORTS: Provide the comma separated destination ports preceded by !, that you'd like to blacklist from the chaos.

    Use the following example to tune this:

    # blacklist the source and destination ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # it will blacklist 80 and 8080 source ports\n        - name: SOURCE_PORTS\n          value: '!80,8080'\n        # it will blacklist 8080 and 9000 destination ports\n        - name: DESTINATION_PORTS\n          value: '!8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/#network-interface","title":"Network Interface","text":"

It defines the name of the ethernet interface considered for shaping traffic. It can be tuned via the NETWORK_INTERFACE ENV. Its default value is eth0.

    Use the following example to tune this:

    # provide the network interface\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # name of the network interface\n        - name: NETWORK_INTERFACE\n          value: 'eth0'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

    • CONTAINER_RUNTIME: It supports the docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to the containerd socket (/run/containerd/containerd.sock). For other runtimes, provide the appropriate socket path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-corruption/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

    It specifies the Pumba chaos library for the chaos injection. It can be tuned via LIB ENV. The default chaos library is litmus. Provide the traffic control image via TC_IMAGE ENV for the pumba library.

    Use the following example to tune this:

    # use pumba chaoslib for the network chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-corruption-sa\n  experiments:\n  - name: pod-network-corruption\n    spec:\n      components:\n        env:\n        # name of the chaoslib\n        # supports litmus and pumba lib\n        - name: LIB\n          value: 'pumba'\n        # image used for the traffic control in linux\n        # applicable for pumba lib only\n        - name: TC_IMAGE\n          value: 'gaiadocker/iproute2'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/","title":"Pod Network Duplication","text":""},{"location":"experiments/categories/pods/pod-network-duplication/#introduction","title":"Introduction","text":"
    • It injects chaos to disrupt network connectivity to Kubernetes pods.
    • It injects network packet duplication on the specified container by starting a traffic control (tc) process with netem rules to add egress duplication. It can test the application's resilience to duplicated network packets.

    Scenario: Duplicate the network packets of the target pod

    "},{"location":"experiments/categories/pods/pod-network-duplication/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-network-duplication/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-network-duplication experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-network-duplication/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-network-duplication/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-network-duplication-sa\n  namespace: default\n  labels:\n    name: pod-network-duplication-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-network-duplication-sa\n  namespace: default\n  labels:\n    name: pod-network-duplication-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-network-duplication-sa\n  namespace: default\n  labels:\n    name: pod-network-duplication-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-network-duplication-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-network-duplication-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-network-duplication/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

    • NETWORK_INTERFACE: Name of the ethernet interface considered for shaping traffic
    • TARGET_CONTAINER: Name of the container subjected to network duplication (Optional). Applicable for the containerd & CRI-O runtimes only; even with these runtimes, if the value is not provided, it injects chaos on the first container of the pod
    • NETWORK_PACKET_DUPLICATION_PERCENTAGE: The packet duplication in percentage (Optional). Defaults to 100
    • CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd, and crio for the litmus LIB, and only docker for the pumba LIB
    • SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock
    • TOTAL_CHAOS_DURATION: The time duration for chaos insertion (seconds). Defaults to 60s
    • TARGET_PODS: Comma-separated list of application pod names subjected to pod network duplication chaos. If not provided, it selects target pods randomly based on the provided appLabels
    • DESTINATION_IPS: IP addresses of the services or pods, or the CIDR blocks (range of IPs), whose accessibility is impacted. Comma-separated IP(s) or CIDR(s) can be provided; if not provided, it induces network chaos for all IPs/destinations
    • DESTINATION_HOSTS: DNS names/FQDNs of the services whose accessibility is impacted. If not provided, it induces network chaos for all IPs/destinations, or for DESTINATION_IPS if already defined
    • SOURCE_PORTS: Ports of the target application whose accessibility is impacted. Comma-separated port(s) can be provided; if not provided, it induces network chaos for all ports
    • DESTINATION_PORTS: Ports of the destination services or pods, or of the CIDR blocks (range of IPs), whose accessibility is impacted. Comma-separated port(s) can be provided; if not provided, it induces network chaos for all ports
    • PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only
    • LIB: The chaos lib used to inject the chaos. Default value: litmus; supported values: pumba and litmus
    • TC_IMAGE: Image used for traffic control in Linux. Default value: gaiadocker/iproute2
    • LIB_IMAGE: Image used to run the netem command. Defaults to litmuschaos/go-runner:latest
    • RAMP_TIME: Period to wait before and after injection of chaos (in seconds)
    • SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel; supported: serial and parallel

    "},{"location":"experiments/categories/pods/pod-network-duplication/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-network-duplication/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

    Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod specific tunables.
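    For instance, a minimal sketch (values illustrative) that widens the blast radius from the default single replica to all matching replicas via the PODS_AFFECTED_PERC tunable listed above:

    # target all replicas instead of the default single replica\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # percentage of total pods to target (default 0 corresponds to 1 replica)\n        - name: PODS_AFFECTED_PERC\n          value: '100'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n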

    "},{"location":"experiments/categories/pods/pod-network-duplication/#network-packet-duplication","title":"Network Packet Duplication","text":"

    It defines the network packet duplication percentage to be injected in the targeted application. It can be tuned via NETWORK_PACKET_DUPLICATION_PERCENTAGE ENV.

    Use the following example to tune this:

    # it injects the network-duplication for the egress traffic\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # network packet duplication percentage\n        - name: NETWORK_PACKET_DUPLICATION_PERCENTAGE\n          value: '100'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/#destination-ips-and-destination-hosts","title":"Destination IPs And Destination Hosts","text":"

    The network experiments interrupt traffic for all the IPs/hosts by default. The interruption of specific IPs/Hosts can be tuned via DESTINATION_IPS and DESTINATION_HOSTS ENV.

    • DESTINATION_IPS: It contains the IP addresses of the services or pods, or the CIDR blocks (range of IPs), whose accessibility is impacted.
    • DESTINATION_HOSTS: It contains the DNS names/FQDNs of the services whose accessibility is impacted.

    Use the following example to tune this:

    # it injects the chaos for the egress traffic for specific ips/hosts\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # supports comma separated destination ips\n        - name: DESTINATION_IPS\n          value: '8.8.8.8,192.168.5.6'\n        # supports comma separated destination hosts\n        - name: DESTINATION_HOSTS\n          value: 'nginx.default.svc.cluster.local,google.com'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/#source-and-destination-ports","title":"Source And Destination Ports","text":"

    The network experiments interrupt traffic for all the source & destination ports by default. The interruption of specific port(s) can be tuned via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: It contains the ports of the target application whose accessibility is impacted
    • DESTINATION_PORTS: It contains the ports of the destination services or pods, or of the CIDR blocks (range of IPs), whose accessibility is impacted

    Use the following example to tune this:

    # it injects the chaos for the ingress and egress traffic for specific ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # supports comma separated source ports\n        - name: SOURCE_PORTS\n          value: '80'\n        # supports comma separated destination ports\n        - name: DESTINATION_PORTS\n          value: '8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/#blacklist-source-and-destination-ports","title":"Blacklist Source and Destination Ports","text":"

    By default, the network experiments disrupt traffic for all the source and destination ports. The specific ports can be blacklisted via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: Provide the comma separated source ports preceded by !, that you'd like to blacklist from the chaos.
    • DESTINATION_PORTS: Provide the comma separated destination ports preceded by ! , that you'd like to blacklist from the chaos.

    Use the following example to tune this:

    # blacklist the source and destination ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # it will blacklist 80 and 8080 source ports\n        - name: SOURCE_PORTS\n          value: '!80,8080'\n        # it will blacklist 8080 and 9000 destination ports\n        - name: DESTINATION_PORTS\n          value: '!8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/#network-interface","title":"Network Interface","text":"

    The name of the ethernet interface considered for shaping traffic. It can be tuned via NETWORK_INTERFACE ENV. Its default value is eth0.

    Use the following example to tune this:

    # provide the network interface\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # name of the network interface\n        - name: NETWORK_INTERFACE\n          value: 'eth0'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

    • CONTAINER_RUNTIME: It supports the docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to the containerd socket (/run/containerd/containerd.sock). For other runtimes, provide the appropriate socket path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-duplication/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

    It specifies the Pumba chaos library for the chaos injection. It can be tuned via LIB ENV. The default chaos library is litmus. Provide the traffic control image via TC_IMAGE ENV for the pumba library.

    Use the following example to tune this:

    # use pumba chaoslib for the network chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-duplication-sa\n  experiments:\n  - name: pod-network-duplication\n    spec:\n      components:\n        env:\n        # name of the chaoslib\n        # supports litmus and pumba lib\n        - name: LIB\n          value: 'pumba'\n        # image used for the traffic control in linux\n        # applicable for pumba lib only\n        - name: TC_IMAGE\n          value: 'gaiadocker/iproute2'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/","title":"Pod Network Latency","text":""},{"location":"experiments/categories/pods/pod-network-latency/#introduction","title":"Introduction","text":"
    • It injects latency on the specified container by starting a traffic control (tc) process with netem rules to add egress delays
    • It can test the application's resilience to a lossy/flaky network

    Scenario: Induce latency in the network of the target pod

    "},{"location":"experiments/categories/pods/pod-network-latency/#uses","title":"Uses","text":"View the uses of the experiment

    The experiment causes network degradation without the pod being marked unhealthy/unworthy of traffic by kube-proxy (unless you have a liveness probe of sorts that measures latency and restarts/crashes the container). The idea of this experiment is to simulate issues within your pod network OR microservice communication across services in different availability zones/regions etc.

    Mitigation (in this case, keeping the timeout, i.e., access latency, low) could be via some middleware that can switch traffic based on some SLOs/perf parameters. If such an arrangement is not available, the next best thing would be to verify whether such a degradation is highlighted via notifications/alerts etc., so the admin/SRE has the opportunity to investigate and fix things. Another utility of the test would be to see the extent of impact caused to the end-user OR the last point in the app stack on account of degradation in access to a downstream/dependent microservice, and whether it is acceptable OR breaks the system to an unacceptable degree. The experiment provides DESTINATION_IPS or DESTINATION_HOSTS so that you can control the chaos against specific services within or outside the cluster.

    The applications may stall or get corrupted while they wait endlessly for a packet. The experiment limits the impact (blast radius) to only the traffic you want to test by specifying IP addresses or application information. This experiment will help to improve the resilience of your services over time.

    "},{"location":"experiments/categories/pods/pod-network-latency/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-network-latency experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-network-latency/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-network-latency/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-network-latency-sa\n  namespace: default\n  labels:\n    name: pod-network-latency-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-network-latency-sa\n  namespace: default\n  labels:\n    name: pod-network-latency-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-network-latency-sa\n  namespace: default\n  labels:\n    name: pod-network-latency-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-network-latency-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-network-latency-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-network-latency/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

    • NETWORK_INTERFACE: Name of the ethernet interface considered for shaping traffic
    • TARGET_CONTAINER: Name of the container subjected to network latency. Applicable for the containerd & CRI-O runtimes only; even with these runtimes, if the value is not provided, it injects chaos on the first container of the pod
    • NETWORK_LATENCY: The latency/delay in milliseconds. Defaults to 2000; provide a numeric value only
    • JITTER: The network jitter value in ms. Defaults to 0; provide a numeric value only
    • CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd, and crio for the litmus LIB, and only docker for the pumba LIB
    • SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock
    • TOTAL_CHAOS_DURATION: The time duration for chaos insertion (seconds). Defaults to 60s
    • TARGET_PODS: Comma-separated list of application pod names subjected to pod network latency chaos. If not provided, it selects target pods randomly based on the provided appLabels
    • DESTINATION_IPS: IP addresses of the services or pods, or the CIDR blocks (range of IPs), whose accessibility is impacted. Comma-separated IP(s) or CIDR(s) can be provided; if not provided, it induces network chaos for all IPs/destinations
    • DESTINATION_HOSTS: DNS names/FQDNs of the services whose accessibility is impacted. If not provided, it induces network chaos for all IPs/destinations, or for DESTINATION_IPS if already defined
    • SOURCE_PORTS: Ports of the target application whose accessibility is impacted. Comma-separated port(s) can be provided; if not provided, it induces network chaos for all ports
    • DESTINATION_PORTS: Ports of the destination services or pods, or of the CIDR blocks (range of IPs), whose accessibility is impacted. Comma-separated port(s) can be provided; if not provided, it induces network chaos for all ports
    • PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only
    • LIB: The chaos lib used to inject the chaos. Default value: litmus; supported values: pumba and litmus
    • TC_IMAGE: Image used for traffic control in Linux. Default value: gaiadocker/iproute2
    • LIB_IMAGE: Image used to run the netem command. Defaults to litmuschaos/go-runner:latest
    • RAMP_TIME: Period to wait before and after injection of chaos (in seconds)
    • SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel; supported: serial and parallel

    "},{"location":"experiments/categories/pods/pod-network-latency/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-network-latency/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

    Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod specific tunables.
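    For instance, a minimal sketch (values illustrative) that applies two of the common tunables listed above, adding a ramp window around the injection and switching to serial execution across the target pods:

    # add a ramp window and inject the chaos on the target pods one by one\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # period (in sec) to wait before and after the injection of chaos\n        - name: RAMP_TIME\n          value: '10'\n        # inject the chaos serially instead of the default parallel mode\n        - name: SEQUENCE\n          value: 'serial'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n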

    "},{"location":"experiments/categories/pods/pod-network-latency/#network-latency","title":"Network Latency","text":"

    It defines the network latency (in ms) to be injected in the targeted application. It can be tuned via NETWORK_LATENCY ENV.

    Use the following example to tune this:

    # it injects the network-latency for the egress traffic\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # network latency to be injected\n        - name: NETWORK_LATENCY\n          value: '2000' # in ms\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#destination-ips-and-destination-hosts","title":"Destination IPs And Destination Hosts","text":"

    The network experiments interrupt traffic for all the IPs/hosts by default. The interruption of specific IPs/Hosts can be tuned via DESTINATION_IPS and DESTINATION_HOSTS ENV.

    • DESTINATION_IPS: It contains the IP addresses of the services or pods, or the CIDR blocks (range of IPs), whose accessibility is impacted.
    • DESTINATION_HOSTS: It contains the DNS names/FQDNs of the services whose accessibility is impacted.

    Use the following example to tune this:

    # it injects the chaos for the egress traffic for specific ips/hosts\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # supports comma separated destination ips\n        - name: DESTINATION_IPS\n          value: '8.8.8.8,192.168.5.6'\n        # supports comma separated destination hosts\n        - name: DESTINATION_HOSTS\n          value: 'nginx.default.svc.cluster.local,google.com'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#source-and-destination-ports","title":"Source And Destination Ports","text":"

    The network experiments interrupt traffic for all the source & destination ports by default. The interruption of specific port(s) can be tuned via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: It contains the ports of the target application whose accessibility is impacted
    • DESTINATION_PORTS: It contains the ports of the destination services or pods, or of the CIDR blocks (range of IPs), whose accessibility is impacted

    Use the following example to tune this:

    # it injects the chaos for the ingress and egress traffic for specific ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # supports comma separated source ports\n        - name: SOURCE_PORTS\n          value: '80'\n        # supports comma separated destination ports\n        - name: DESTINATION_PORTS\n          value: '8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#blacklist-source-and-destination-ports","title":"Blacklist Source and Destination Ports","text":"

    By default, the network experiments disrupt traffic for all the source and destination ports. The specific ports can be blacklisted via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: Provide the comma-separated source ports, preceded by !, that you'd like to blacklist from the chaos.
    • DESTINATION_PORTS: Provide the comma-separated destination ports, preceded by !, that you'd like to blacklist from the chaos.

    Use the following example to tune this:

    # blacklist the source and destination ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # it will blacklist 80 and 8080 source ports\n        - name: SOURCE_PORTS\n          value: '!80,8080'\n        # it will blacklist 8080 and 9000 destination ports\n        - name: DESTINATION_PORTS\n          value: '!8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#network-interface","title":"Network Interface","text":"

    The name of the ethernet interface considered for shaping traffic. It can be tuned via NETWORK_INTERFACE ENV. Its default value is eth0.

    Use the following example to tune this:

    # provide the network interface\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # name of the network interface\n        - name: NETWORK_INTERFACE\n          value: 'eth0'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#jitter","title":"Jitter","text":"

    It defines the jitter (in ms), a parameter that introduces variation in the network delay. It can be tuned via JITTER ENV. Its default value is 0.

    Use the following example to tune this:

    # provide the network latency jitter\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # value of the network latency jitter (in ms)\n        - name: JITTER\n          value: '200'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

    • CONTAINER_RUNTIME: It supports the docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to the containerd socket (/run/containerd/containerd.sock). For other runtimes, provide the appropriate socket path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-latency/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

    It specifies the Pumba chaos library for the chaos injection. It can be tuned via LIB ENV. The default chaos library is litmus. Provide the traffic control image via TC_IMAGE ENV for the pumba library.

    Use the following example to tune this:

    # use pumba chaoslib for the network chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n  - name: pod-network-latency\n    spec:\n      components:\n        env:\n        # name of the chaoslib\n        # supports litmus and pumba lib\n        - name: LIB\n          value: 'pumba'\n        # image used for the traffic control in linux\n        # applicable for pumba lib only\n        - name: TC_IMAGE\n          value: 'gaiadocker/iproute2'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/","title":"Pod Network Loss","text":""},{"location":"experiments/categories/pods/pod-network-loss/#introduction","title":"Introduction","text":"
    • It injects packet loss on the specified container by starting a traffic control (tc) process with netem rules to add egress loss
    • It can test the application's resilience to a lossy/flaky network

    Scenario: Induce network loss on the target pod

    "},{"location":"experiments/categories/pods/pod-network-loss/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-network-loss/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-network-loss experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-network-loss/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-network-loss/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-network-loss-sa\n  namespace: default\n  labels:\n    name: pod-network-loss-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-network-loss-sa\n  namespace: default\n  labels:\n    name: pod-network-loss-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # deriving the parent/owner details of the pod (if parent is any of {deployment, statefulset, daemonsets})\n  - apiGroups: [\"apps\"]\n    resources: [\"deployments\",\"statefulsets\",\"replicasets\", \"daemonsets\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"apps.openshift.io\"]\n    resources: [\"deploymentconfigs\"]\n    verbs: [\"list\",\"get\"]\n  # deriving the parent/owner details of the pod (if parent is deploymentConfig)\n  - apiGroups: [\"\"]\n    resources: [\"replicationcontrollers\"]\n    verbs: [\"get\",\"list\"]\n  # deriving the parent/owner details of the pod (if parent is argo-rollouts)\n  - apiGroups: [\"argoproj.io\"]\n    resources: [\"rollouts\"]\n    verbs: [\"list\",\"get\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-network-loss-sa\n  namespace: default\n  labels:\n    name: pod-network-loss-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-network-loss-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-network-loss-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-network-loss/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

    • NETWORK_INTERFACE: Name of the ethernet interface considered for shaping traffic
    • TARGET_CONTAINER: Name of the container subjected to network loss (Optional). Applicable for the containerd & CRI-O runtimes only; even with these runtimes, if the value is not provided, it injects chaos on the first container of the pod
    • NETWORK_PACKET_LOSS_PERCENTAGE: The packet loss in percentage (Optional). Defaults to 100
    • CONTAINER_RUNTIME: Container runtime interface for the cluster. Defaults to containerd; supported values: docker, containerd, and crio for the litmus LIB, and only docker for the pumba LIB
    • SOCKET_PATH: Path of the containerd/crio/docker socket file. Defaults to /run/containerd/containerd.sock
    • TOTAL_CHAOS_DURATION: The time duration for chaos insertion (seconds). Defaults to 60s
    • TARGET_PODS: Comma-separated list of application pod names subjected to pod network loss chaos. If not provided, it selects target pods randomly based on the provided appLabels
    • DESTINATION_IPS: IP addresses of the services or pods, or the CIDR blocks (range of IPs), whose accessibility is impacted. Comma-separated IP(s) or CIDR(s) can be provided; if not provided, it induces network chaos for all IPs/destinations
    • DESTINATION_HOSTS: DNS names/FQDNs of the services whose accessibility is impacted. If not provided, it induces network chaos for all IPs/destinations, or for DESTINATION_IPS if already defined
    • SOURCE_PORTS: Ports of the target application whose accessibility is impacted. Comma-separated port(s) can be provided; if not provided, it induces network chaos for all ports
    • DESTINATION_PORTS: Ports of the destination services or pods, or of the CIDR blocks (range of IPs), whose accessibility is impacted. Comma-separated port(s) can be provided; if not provided, it induces network chaos for all ports
    • PODS_AFFECTED_PERC: The percentage of total pods to target. Defaults to 0 (corresponds to 1 replica); provide a numeric value only
    • LIB: The chaos lib used to inject the chaos. Default value: litmus; supported values: pumba and litmus
    • TC_IMAGE: Image used for traffic control in Linux. Default value: gaiadocker/iproute2
    • LIB_IMAGE: Image used to run the netem command. Defaults to litmuschaos/go-runner:latest
    • RAMP_TIME: Period to wait before and after injection of chaos (in seconds)
    • SEQUENCE: Defines the sequence of chaos execution for multiple target pods. Default value: parallel; supported: serial and parallel

    "},{"location":"experiments/categories/pods/pod-network-loss/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-network-loss/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

    Refer to the common attributes and Pod specific tunables to tune the common tunables for all experiments and the pod specific tunables.
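    For instance, a minimal sketch that pins the chaos to a named container of the target pod via the TARGET_CONTAINER tunable listed above (the container name nginx is an illustrative placeholder; per the notes above, the first container of the pod is targeted when this is unset):

    # pin the chaos to a specific container of the target pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # name of the target container (illustrative placeholder)\n        - name: TARGET_CONTAINER\n          value: 'nginx'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n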

    "},{"location":"experiments/categories/pods/pod-network-loss/#network-packet-loss","title":"Network Packet Loss","text":"

    It defines the network packet loss percentage to be injected in the targeted application. It can be tuned via NETWORK_PACKET_LOSS_PERCENTAGE ENV.

    Use the following example to tune this:

    # it injects the network-loss for the egress traffic\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # network packet loss percentage\n        - name: NETWORK_PACKET_LOSS_PERCENTAGE\n          value: '100'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/#destination-ips-and-destination-hosts","title":"Destination IPs And Destination Hosts","text":"

    The network experiments interrupt traffic for all the IPs/hosts by default. The interruption of specific IPs/Hosts can be tuned via DESTINATION_IPS and DESTINATION_HOSTS ENV.

    • DESTINATION_IPS: It contains the IP addresses of the services or pods, or the CIDR blocks (range of IPs), whose accessibility is impacted.
    • DESTINATION_HOSTS: It contains the DNS names/FQDNs of the services whose accessibility is impacted.

    Use the following example to tune this:

    # it injects the chaos for the egress traffic for specific ips/hosts\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # supports comma separated destination ips\n        - name: DESTINATION_IPS\n          value: '8.8.8.8,192.168.5.6'\n        # supports comma separated destination hosts\n        - name: DESTINATION_HOSTS\n          value: 'nginx.default.svc.cluster.local,google.com'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/#source-and-destination-ports","title":"Source And Destination Ports","text":"

    The network experiments interrupt traffic for all the source & destination ports by default. The interruption of specific port(s) can be tuned via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: It contains the ports of the target application whose accessibility is impacted
    • DESTINATION_PORTS: It contains the ports of the destination services or pods, or of the CIDR blocks (range of IPs), whose accessibility is impacted

    Use the following example to tune this:

    # it injects the chaos for the ingress and egress traffic for specific ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # supports comma separated source ports\n        - name: SOURCE_PORTS\n          value: '80'\n        # supports comma separated destination ports\n        - name: DESTINATION_PORTS\n          value: '8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/#blacklist-source-and-destination-ports","title":"Blacklist Source and Destination Ports","text":"

    By default, the network experiments disrupt traffic for all the source and destination ports. The specific ports can be blacklisted via SOURCE_PORTS and DESTINATION_PORTS ENV.

    • SOURCE_PORTS: Provide the comma-separated source ports, preceded by !, that you'd like to blacklist from the chaos.
    • DESTINATION_PORTS: Provide the comma-separated destination ports, preceded by !, that you'd like to blacklist from the chaos.

    Use the following example to tune this:

    # blacklist the source and destination ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # it will blacklist 80 and 8080 source ports\n        - name: SOURCE_PORTS\n          value: '!80,8080'\n        # it will blacklist 8080 and 9000 destination ports\n        - name: DESTINATION_PORTS\n          value: '!8080,9000'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/#network-interface","title":"Network Interface","text":"

    The name of the ethernet interface considered for shaping traffic. It can be tuned via NETWORK_INTERFACE ENV. Its default value is eth0.

    Use the following example to tune this:

    # provide the network interface\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # name of the network interface\n        - name: NETWORK_INTERFACE\n          value: 'eth0'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/#container-runtime-socket-path","title":"Container Runtime Socket Path","text":"

    It defines the CONTAINER_RUNTIME and SOCKET_PATH ENV to set the container runtime and socket file path.

    • CONTAINER_RUNTIME: It supports the docker, containerd, and crio runtimes. The default value is containerd.
    • SOCKET_PATH: It contains the path of the container runtime socket file, which defaults to the containerd socket (/run/containerd/containerd.sock). For other runtimes, provide the appropriate socket path.

    Use the following example to tune this:

    ## provide the container runtime and socket file path\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # runtime for the container\n        # supports docker, containerd, crio\n        - name: CONTAINER_RUNTIME\n          value: 'containerd'\n        # path of the socket file\n        - name: SOCKET_PATH\n          value: '/run/containerd/containerd.sock'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-loss/#pumba-chaos-library","title":"Pumba Chaos Library","text":"

    It specifies the Pumba chaos library for the chaos injection. It can be tuned via LIB ENV. The default chaos library is litmus. Provide the traffic control image via TC_IMAGE ENV for the pumba library.

    Use the following example to tune this:

    # use pumba chaoslib for the network chaos\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-loss-sa\n  experiments:\n  - name: pod-network-loss\n    spec:\n      components:\n        env:\n        # name of the chaoslib\n        # supports litmus and pumba lib\n        - name: LIB\n          value: 'pumba'\n        # image used for the traffic control in linux\n        # applicable for pumba lib only\n        - name: TC_IMAGE\n          value: 'gaiadocker/iproute2'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-partition/","title":"Pod Network Partition","text":""},{"location":"experiments/categories/pods/pod-network-partition/#introduction","title":"Introduction","text":"
    • It blocks 100% of the ingress and egress traffic of the target application by creating a network policy.
    • It can test the application's resilience to a lossy/flaky network

    Scenario: Partition the network of the target pod
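    Conceptually, the partition is enforced with a deny-all NetworkPolicy selecting the target application's labels. The following is a minimal hand-written sketch of such a policy (illustrative only, not the exact manifest the experiment generates):

    # selecting pods while declaring no ingress/egress rules blocks all\n# of their traffic for as long as the policy exists\napiVersion: networking.k8s.io/v1\nkind: NetworkPolicy\nmetadata:\n  name: deny-all-nginx\n  namespace: default\nspec:\n  podSelector:\n    matchLabels:\n      app: nginx\n  policyTypes:\n  - Ingress\n  - Egress\n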

    "},{"location":"experiments/categories/pods/pod-network-partition/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/pods/pod-network-partition/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
    • Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the pod-network-partition experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    "},{"location":"experiments/categories/pods/pod-network-partition/#default-validations","title":"Default Validations","text":"View the default validations

    The application pods should be in running state before and after chaos injection.

    "},{"location":"experiments/categories/pods/pod-network-partition/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

    If you are using this experiment as part of a litmus workflow scheduled, constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-network-partition-sa\n  namespace: default\n  labels:\n    name: pod-network-partition-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-network-partition-sa\n  namespace: default\n  labels:\n    name: pod-network-partition-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmap details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pods log\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # performs CRUD operations on the network policies\n  - apiGroups: [\"networking.k8s.io\"]\n    resources: [\"networkpolicies\"]\n    verbs: [\"create\",\"delete\",\"list\",\"get\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-network-partition-sa\n  namespace: default\n  labels:\n    name: pod-network-partition-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-network-partition-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-network-partition-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/pods/pod-network-partition/#experiment-tunables","title":"Experiment tunablesOptional Fields","text":"check the experiment tunables

Variables Description Notes TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Default (60s) POLICY_TYPES Contains the type of network policy It supports egress, ingress, and all values POD_SELECTOR Contains labels of the destination pods NAMESPACE_SELECTOR Contains labels of the destination namespaces PORTS Comma-separated list of the targeted ports DESTINATION_IPS IP addresses of the services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted Comma-separated IP(s) or CIDR(s) can be provided. If not provided, it will induce network chaos for all ips/destinations DESTINATION_HOSTS DNS/FQDN names of the services whose accessibility is impacted If not provided, it will induce network chaos for all ips/destinations or DESTINATION_IPS if already defined LIB The chaos lib used to inject the chaos Supported value: litmus RAMP_TIME Period to wait before and after injection of chaos (in sec)

    "},{"location":"experiments/categories/pods/pod-network-partition/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/pods/pod-network-partition/#common-and-pod-specific-tunables","title":"Common and Pod specific tunables","text":"

Refer to the common attributes and pod-specific tunables to tune the common tunables for all experiments and the pod-specific tunables.

    "},{"location":"experiments/categories/pods/pod-network-partition/#destination-ips-and-destination-hosts","title":"Destination IPs And Destination Hosts","text":"

The network partition experiment interrupts traffic for all IPs/hosts by default. The interruption of specific IPs/hosts can be tuned via the DESTINATION_IPS and DESTINATION_HOSTS ENVs.

• DESTINATION_IPS: It contains the IP addresses of the services or pods, or the CIDR blocks (ranges of IPs), whose accessibility is impacted.
• DESTINATION_HOSTS: It contains the DNS/FQDN names of the services whose accessibility is impacted.

    Use the following example to tune this:

# it injects the chaos for specific ips/hosts\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-partition-sa\n  experiments:\n  - name: pod-network-partition\n    spec:\n      components:\n        env:\n        # supports comma separated destination ips\n        - name: DESTINATION_IPS\n          value: '8.8.8.8,192.168.5.6'\n        # supports comma separated destination hosts\n        - name: DESTINATION_HOSTS\n          value: 'nginx.default.svc.cluster.local,google.com'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-partition/#target-specific-namespaces","title":"Target Specific Namespace(s)","text":"

The network partition experiment interrupts traffic for all namespaces by default. Access to/from pods in specific namespaces can be allowed by providing namespace labels inside the NAMESPACE_SELECTOR ENV.

    Use the following example to tune this:

# it injects the chaos for specified namespaces, matched by labels\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-partition-sa\n  experiments:\n  - name: pod-network-partition\n    spec:\n      components:\n        env:\n        # labels of the destination namespace\n        - name: NAMESPACE_SELECTOR\n          value: 'key=value'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-partition/#target-specific-pods","title":"Target Specific Pod(s)","text":"

The network partition experiment interrupts traffic for all external pods by default. Access to/from specific pod(s) can be allowed by providing pod labels inside the POD_SELECTOR ENV.

    Use the following example to tune this:

# it injects the chaos for specified pods, matched by labels\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-partition-sa\n  experiments:\n  - name: pod-network-partition\n    spec:\n      components:\n        env:\n        # labels of the destination pods\n        - name: POD_SELECTOR\n          value: 'key=value'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-partition/#policy-type","title":"Policy Type","text":"

The network partition experiment interrupts both ingress and egress traffic by default. The interruption of either ingress or egress traffic can be tuned via the POLICY_TYPES ENV.

    Use the following example to tune this:

# inject network loss for only ingress, only egress, or all traffic\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-partition-sa\n  experiments:\n  - name: pod-network-partition\n    spec:\n      components:\n        env:\n        # provide the network policy type\n        # it supports `ingress`, `egress`, and `all` values\n        # default value is `all`\n        - name: POLICY_TYPES\n          value: 'all'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/categories/pods/pod-network-partition/#destination-ports","title":"Destination Ports","text":"

The network partition experiment interrupts traffic on all external ports by default. Access to specific port(s) can be allowed by providing a comma-separated list of ports inside the PORTS ENV.

    Note:

• If PORTS is not set and none of the pod selector, namespace selector, or destination IPs are provided, then it will block traffic on all ports for all pods/IPs
• If PORTS is not set but any of the pod selector, namespace selector, or destination IPs are provided, then it will allow all ports for the pods/IPs filtered by the specified selectors

    Use the following example to tune this:

# it injects the chaos for specified ports\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-partition-sa\n  experiments:\n  - name: pod-network-partition\n    spec:\n      components:\n        env:\n        # comma separated list of ports\n        - name: PORTS\n          value: 'tcp: [8080,80], udp: [9000,90]'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
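
The common tunable RAMP_TIME, listed in the experiment tunables above but not exemplified on this page, adds a wait period (in seconds) before and after the chaos injection. A minimal sketch combining it with the default partition behavior:

# wait for 10s before and after injecting the network partition\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-network-partition-sa\n  experiments:\n  - name: pod-network-partition\n    spec:\n      components:\n        env:\n        # period (in sec) to wait before and after chaos injection\n        - name: RAMP_TIME\n          value: '10'\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n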
    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/","title":"Spring Boot App Kill","text":""},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#introduction","title":"Introduction","text":"
• It can target random pods with a Spring Boot application and allows configuring the assaults to inject app-kill. When one of the configured (watched) methods is called in the application, the application will shut down.

    Scenario: Kill Spring Boot Application

    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites

    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the spring-boot-app-kill experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Chaos Monkey Spring Boot dependency should be present in the application. It can be enabled in two ways:
      1. Add internal dependency inside the spring boot application
    1. Add Chaos Monkey for Spring Boot as a dependency for your project
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot App with the chaos-monkey spring profile enabled
          java -jar your-app.jar --spring.profiles.active=chaos-monkey --chaos.monkey.enabled=true\n
      2. Add as external dependency
    1. You can extend your existing application with the chaos-monkey and add it as an external dependency at startup; for this, it is necessary to use Spring Boot's PropertiesLauncher
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <classifier>jar-with-dependencies</classifier>\n    <version>2.6.1</version>\n</dependency>\n
    2. Start your Spring Boot application, add the Chaos Monkey for Spring Boot JAR and properties (a minimal properties sketch follows this list)
          java -cp your-app.jar -Dloader.path=chaos-monkey-spring-boot-2.6.1-jar-with-dependencies.jar org.springframework.boot.loader.PropertiesLauncher --spring.profiles.active=chaos-monkey --spring.config.location=file:./chaos-monkey.properties\n
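
The chaos-monkey.properties file referenced above carries the standard Chaos Monkey for Spring Boot configuration keys. A minimal sketch with illustrative values:

# minimal chaos-monkey.properties (illustrative values)\n# arm chaos monkey at startup\nchaos.monkey.enabled=true\n# watch REST controllers for assaults\nchaos.monkey.watcher.restController=true\n# attack every request (level 1 means every 1st request)\nchaos.monkey.assaults.level=1\n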

    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#default-validations","title":"Default Validations","text":"View the default validations
    • Spring boot pods are healthy before and after chaos injection
    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed, and executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: spring-boot-app-kill-sa\n  namespace: default\n  labels:\n    name: spring-boot-app-kill-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: spring-boot-app-kill-sa\n  namespace: default\n  labels:\n    name: spring-boot-app-kill-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: spring-boot-app-kill-sa\n  namespace: default\n  labels:\n    name: spring-boot-app-kill-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: spring-boot-app-kill-sa\nsubjects:\n  - kind: ServiceAccount\n    name: spring-boot-app-kill-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes CM_PORT It contains the port of the spring boot application

Variables Description Notes CM_LEVEL It contains the number of requests to be attacked; a value of n means every nth request will be affected Default value is 1, it lies in the [1,10000] range CM_WATCHED_CUSTOM_SERVICES It limits the watched packages/classes/methods by providing a comma-separated list of fully qualified packages (class and/or method names) Default is an empty list, which means it will target all services CM_WATCHERS It contains a comma-separated list of watchers from the following list: [controller, restController, service, repository, component, webClient] Defaults to restController SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0% (corresponds to 1 replica) LIB The chaos lib used to inject the chaos Defaults to litmus. Supported: litmus only RAMP_TIME Period to wait before and after injection of chaos (in sec)

    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes and Spring Boot-specific tunables to tune the common tunables for all experiments and the spring-boot-specific tunables.

    "},{"location":"experiments/categories/spring-boot/spring-boot-app-kill/#spring-boot-application-port","title":"Spring Boot Application Port","text":"

It tunes the spring-boot application port via the CM_PORT ENV

    Use the following example to tune this:

    # kill spring-boot target application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-app-kill-sa\n  experiments:\n    - name: spring-boot-app-kill\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/","title":"Spring Boot CPU Stress","text":""},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#introduction","title":"Introduction","text":"
• It can target random pods with a Spring Boot application and allows configuring the assaults to inject cpu-stress, which attacks the CPU of the Java Virtual Machine. It tests the resiliency of the system when some applications exhibit unexpected faulty behavior.

    Scenario: Stress CPU of Spring Boot Application

    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites

    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the spring-boot-cpu-stress experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
    • Chaos Monkey Spring Boot dependency should be present in the application. It can be enabled in two ways:
      1. Add internal dependency inside the spring boot application
    1. Add Chaos Monkey for Spring Boot as a dependency for your project
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot App with the chaos-monkey spring profile enabled
          java -jar your-app.jar --spring.profiles.active=chaos-monkey --chaos.monkey.enabled=true\n
      2. Add as external dependency
    1. You can extend your existing application with the chaos-monkey and add it as an external dependency at startup; for this, it is necessary to use Spring Boot's PropertiesLauncher
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <classifier>jar-with-dependencies</classifier>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot application, add Chaos Monkey for Spring Boot JAR and properties
          java -cp your-app.jar -Dloader.path=chaos-monkey-spring-boot-2.6.1-jar-with-dependencies.jar org.springframework.boot.loader.PropertiesLauncher --spring.profiles.active=chaos-monkey --spring.config.location=file:./chaos-monkey.properties\n

    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#default-validations","title":"Default Validations","text":"View the default validations
    • Spring boot pods are healthy before and after chaos injection
    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed, and executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: spring-boot-cpu-stress-sa\n  namespace: default\n  labels:\n    name: spring-boot-cpu-stress-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: spring-boot-cpu-stress-sa\n  namespace: default\n  labels:\n    name: spring-boot-cpu-stress-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: spring-boot-cpu-stress-sa\n  namespace: default\n  labels:\n    name: spring-boot-cpu-stress-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: spring-boot-cpu-stress-sa\nsubjects:\n  - kind: ServiceAccount\n    name: spring-boot-cpu-stress-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes CM_PORT It contains the port of the spring boot application CPU_LOAD_FRACTION It contains the fraction of CPU to be stressed, e.g. 0.95 equals 95% Default value is 0.9. It supports a value in the range [0.1,1.0]

Variables Description Notes CM_LEVEL It contains the number of requests to be attacked; a value of n means every nth request will be affected Default value: 1, it lies in the [1,10000] range CM_WATCHED_CUSTOM_SERVICES It limits the watched packages/classes/methods; it contains a comma-separated list of fully qualified packages (class and/or method names) Default is an empty list, which means it will target all services CM_WATCHERS It contains a comma-separated list of watchers from the following list: [controller, restController, service, repository, component, webClient] Defaults to restController TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Defaults to 30 SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0% (corresponds to 1 replica) LIB The chaos lib used to inject the chaos Defaults to litmus. Supported: litmus only RAMP_TIME Period to wait before and after injection of chaos (in sec)

    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes and Spring Boot-specific tunables to tune the common tunables for all experiments and the spring-boot-specific tunables.

    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#spring-boot-application-port","title":"Spring Boot Application Port","text":"

It tunes the spring-boot application port via the CM_PORT ENV

    Use the following example to tune this:

    # stress cpu of spring-boot application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-cpu-stress-sa\n  experiments:\n    - name: spring-boot-cpu-stress\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-cpu-stress/#cpu-load-fraction","title":"CPU Load Fraction","text":"

It contains the fraction of CPU to be stressed; 0.95 equals 95%. It can be tuned via the CPU_LOAD_FRACTION ENV

    Use the following example to tune this:

    # provide the cpu load fraction to be stressed\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-cpu-stress-sa\n  experiments:\n    - name: spring-boot-cpu-stress\n      spec:\n        components:\n          env:\n            # it contains the fraction of the used CPU. Eg: 0.95 equals 95%.\n            # it supports value in range [0.1,1.0]\n            - name: CPU_LOAD_FRACTION\n              value: '0.9'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/","title":"Spring Boot Exceptions","text":""},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#introduction","title":"Introduction","text":"
• It can target random pods with a Spring Boot application and allows configuring the assaults to inject exceptions at runtime when a watched method is called. It tests the resiliency of the system when some applications exhibit unexpected faulty behavior.

    Scenario: Inject exceptions to Spring Boot Application

    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites

    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the spring-boot-exceptions experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
• Chaos Monkey Spring Boot dependency should be present in the application. It can be enabled in two ways:
      1. Add internal dependency inside the spring boot application
    1. Add Chaos Monkey for Spring Boot as a dependency for your project
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot App with the chaos-monkey spring profile enabled
          java -jar your-app.jar --spring.profiles.active=chaos-monkey --chaos.monkey.enabled=true\n
      2. Add as external dependency
    1. You can extend your existing application with the chaos-monkey and add it as an external dependency at startup; for this, it is necessary to use Spring Boot's PropertiesLauncher
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <classifier>jar-with-dependencies</classifier>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot application, add Chaos Monkey for Spring Boot JAR and properties
          java -cp your-app.jar -Dloader.path=chaos-monkey-spring-boot-2.6.1-jar-with-dependencies.jar org.springframework.boot.loader.PropertiesLauncher --spring.profiles.active=chaos-monkey --spring.config.location=file:./chaos-monkey.properties\n

    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#default-validations","title":"Default Validations","text":"View the default validations
    • Spring boot pods are healthy before and after chaos injection
    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed, and executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: spring-boot-exceptions-sa\n  namespace: default\n  labels:\n    name: spring-boot-exceptions-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: spring-boot-exceptions-sa\n  namespace: default\n  labels:\n    name: spring-boot-exceptions-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: spring-boot-exceptions-sa\n  namespace: default\n  labels:\n    name: spring-boot-exceptions-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: spring-boot-exceptions-sa\nsubjects:\n  - kind: ServiceAccount\n    name: spring-boot-exceptions-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes CM_PORT It contains the port of the spring boot application

Variables Description Notes CM_EXCEPTIONS_TYPE It contains the type of the raised exception Default value: java.lang.IllegalArgumentException CM_EXCEPTIONS_ARGUMENTS It contains the argument of the raised exception Default value: java.lang.String:custom illegal argument exception CM_LEVEL It contains the number of requests to be attacked; a value of n means every nth request will be affected Default value: 1, it lies in the [1,10000] range CM_WATCHED_CUSTOM_SERVICES It limits the watched packages/classes/methods; it contains a comma-separated list of fully qualified packages (class and/or method names) By default it is an empty list, which means it targets all services CM_WATCHERS It contains a comma-separated list of watchers from the following list: [controller, restController, service, repository, component, webClient] By default it is restController TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Defaults to 30 SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0% (corresponds to 1 replica) LIB The chaos lib used to inject the chaos Defaults to litmus. Supported: litmus only RAMP_TIME Period to wait before and after injection of chaos (in sec)

    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes and Spring Boot-specific tunables to tune the common tunables for all experiments and the spring-boot-specific tunables.

    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#spring-boot-application-port","title":"Spring Boot Application Port","text":"

It tunes the spring-boot application port via the CM_PORT ENV

    Use the following example to tune this:

# inject exceptions into the spring-boot target application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-exceptions-sa\n  experiments:\n    - name: spring-boot-exceptions\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-exceptions/#exception-type-and-arguments","title":"Exception Type and Arguments","text":"

The Spring Boot exception type and arguments can be tuned via the CM_EXCEPTIONS_TYPE and CM_EXCEPTIONS_ARGUMENTS ENVs

    Use the following example to tune this:

    # provide the exception type and args\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-exceptions-sa\n  experiments:\n    - name: spring-boot-exceptions\n      spec:\n        components:\n          env:\n            # Type of raised exception\n            - name: CM_EXCEPTIONS_TYPE\n              value: 'java.lang.IllegalArgumentException'\n\n             # Argument of the raised exception\n            - name: CM_EXCEPTIONS_ARGUMENTS\n              value: 'java.lang.String:custom illegal argument exception'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-experiments-tunables/","title":"Spring boot experiments tunables","text":"

    It contains the Spring Boot specific experiment tunables.

    "},{"location":"experiments/categories/spring-boot/spring-boot-experiments-tunables/#spring-boot-request-level","title":"Spring Boot request Level","text":"

It contains the number of requests to be attacked; a value of n means every nth request will be affected. It can be tuned via the CM_LEVEL ENV.

    Use the following example to tune this:

    # limits the number of requests to be attacked\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-app-kill-sa\n  experiments:\n    - name: spring-boot-app-kill\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n\n            # it contains the number of requests that are to be attacked.\n            # n value means nth request will be affected\n            - name: CM_LEVEL\n              value: '1'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-experiments-tunables/#watch-custom-services","title":"Watch Custom Services","text":"

It contains a comma-separated list of fully qualified packages (class and/or method names), which limits the watched packages/classes/methods. It can be tuned via the CM_WATCHED_CUSTOM_SERVICES ENV.

    Use the following example to tune this:

    # it contains comma separated list of custom services\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-app-kill-sa\n  experiments:\n    - name: spring-boot-app-kill\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n\n            # it limits watched packages/classes/methods\n            - name: CM_WATCHED_CUSTOM_SERVICES\n              value: 'com.example.chaosdemo.controller.HelloController.sayHello,com.example.chaosdemo.service.HelloService'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-experiments-tunables/#watchers","title":"Watchers","text":"

It contains a comma-separated list of watchers from the following list: [controller, restController, service, repository, component, webClient]. It can be tuned via the CM_WATCHERS ENV.

    Use the following example to tune this:

    # it contains comma separated list of watchers\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-app-kill-sa\n  experiments:\n    - name: spring-boot-app-kill\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n\n            # provide name of watcher\n            # it supports controller, restController, service, repository, component, webClient\n            - name: CM_WATCHERS\n              value: 'restController'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/","title":"Spring Boot Faults","text":""},{"location":"experiments/categories/spring-boot/spring-boot-faults/#introduction","title":"Introduction","text":"
    • It can target random pods with a Spring Boot application and allows configuring the assaults to inject multiple spring boot faults simultaneously on the target pod.
    • It supports app-kill, cpu-stress, memory-stress, latency, and exceptions faults

    Scenario: Inject Spring Boot Faults

    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites

    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the spring-boot-faults experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
• Chaos Monkey Spring Boot dependency should be present in the application. It can be enabled in two ways:
      1. Add internal dependency inside the spring boot application
    1. Add Chaos Monkey for Spring Boot as a dependency for your project
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot App with the chaos-monkey spring profile enabled
          java -jar your-app.jar --spring.profiles.active=chaos-monkey --chaos.monkey.enabled=true\n
      2. Add as external dependency
    1. You can extend your existing application with the chaos-monkey and add it as an external dependency at startup; for this, it is necessary to use Spring Boot's PropertiesLauncher
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <classifier>jar-with-dependencies</classifier>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot application, add Chaos Monkey for Spring Boot JAR and properties
          java -cp your-app.jar -Dloader.path=chaos-monkey-spring-boot-2.6.1-jar-with-dependencies.jar org.springframework.boot.loader.PropertiesLauncher --spring.profiles.active=chaos-monkey --spring.config.location=file:./chaos-monkey.properties\n

    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#default-validations","title":"Default Validations","text":"View the default validations
    • Spring boot pods are healthy before and after chaos injection
    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed, and executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: spring-boot-faults-sa\n  namespace: default\n  labels:\n    name: spring-boot-faults-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: spring-boot-faults-sa\n  namespace: default\n  labels:\n    name: spring-boot-faults-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: spring-boot-faults-sa\n  namespace: default\n  labels:\n    name: spring-boot-faults-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: spring-boot-faults-sa\nsubjects:\n  - kind: ServiceAccount\n    name: spring-boot-faults-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes CM_PORT It contains the port of the spring boot application CM_KILL_APPLICATION_ACTIVE It enables the app-kill fault It supports boolean values. Default is false CM_LATENCY_ACTIVE It enables the latency fault It supports boolean values. Default is false CM_MEMORY_ACTIVE It enables the memory-stress fault It supports boolean values. Default is false CM_CPU_ACTIVE It enables the cpu-stress fault It supports boolean values. Default is false CM_EXCEPTIONS_ACTIVE It enables the exceptions fault It supports boolean values. Default is false CPU_LOAD_FRACTION It contains the fraction of CPU to be stressed; 0.95 equals 95% Default value is 0.9. It supports a value in the range [0.1,1.0] CM_EXCEPTIONS_TYPE It contains the type of the raised exception Default value: java.lang.IllegalArgumentException CM_EXCEPTIONS_ARGUMENTS It contains the argument of the raised exception Default value: java.lang.String:custom illegal argument exception LATENCY It contains the network latency to be injected (in ms) Default value is 2000 MEMORY_FILL_FRACTION It contains the fraction of memory to be stressed; 0.7 equals 70% Default value is 0.70. It supports a value in the range [0.01,0.95]

Variables Description Notes CM_LEVEL It contains the number of requests to be attacked; a value of n means every nth request will be affected Default value: 1, it lies in the [1,10000] range CM_WATCHED_CUSTOM_SERVICES It limits the watched packages/classes/methods; it contains a comma-separated list of fully qualified packages (class and/or method names) By default it is an empty list, which means it targets all services CM_WATCHERS It contains a comma-separated list of watchers from the following list: [controller, restController, service, repository, component, webClient] By default it is restController TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Defaults to 30 SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0% (corresponds to 1 replica) LIB The chaos lib used to inject the chaos Defaults to litmus. Supported: litmus only RAMP_TIME Period to wait before and after injection of chaos (in sec)

    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/spring-boot/spring-boot-faults/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes and Spring Boot-specific tunables to tune the common tunables for all experiments and the spring-boot-specific tunables.

    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#inject-multiple-faults-simultaneously-cpu-latency-and-exceptions","title":"Inject Multiple Faults Simultaneously (CPU, Latency and Exceptions)","text":"

    It injects cpu, latency, and exceptions faults simultaneously on the target pods

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-faults-sa\n  experiments:\n    - name: spring-boot-faults\n      spec:\n        components:\n          env:\n            # set chaos duration (in sec) as desired\n            - name: TOTAL_CHAOS_DURATION\n              value: '30'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n\n            # it enables spring-boot latency fault\n            - name: CM_LATENCY_ACTIVE\n              value: 'true'\n\n            # provide the latency (ms)\n            # it is applicable when latency is active\n            - name: LATENCY\n              value: '2000'\n\n            # it enables spring-boot cpu stress fault\n            - name: CM_CPU_ACTIVE\n              value: 'true'\n\n            # it contains fraction of cpu to be stressed(0.95 equals 95%)\n            # it supports value in range [0.1,1.0]\n            # it is applicable when cpu is active\n            - name: CPU_LOAD_FRACTION\n              value: '0.9'\n\n            # it enables spring-boot exceptions fault\n            - name: CM_EXCEPTIONS_ACTIVE\n              value: 'true'\n\n            # Type of raised exception\n            # it is applicable when exceptions is active\n            - name: CM_EXCEPTIONS_TYPE\n              value: 'java.lang.IllegalArgumentException'\n\n              # Argument of raised exception\n              # it is applicable when exceptions is active\n            - name: CM_EXCEPTIONS_ARGUMENTS\n              value: 'java.lang.String:custom illegal argument exception'\n\n            ## percentage of total pods to target\n            - name: PODS_AFFECTED_PERC\n              value: ''\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-faults/#inject-multiple-faults-simultaneously-appkill-and-memory","title":"Inject Multiple Faults Simultaneously (Appkill and Memory)","text":"

It injects app-kill and memory-stress faults simultaneously on the target pods

    Use the following example to tune this:

apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-faults-sa\n  experiments:\n    - name: spring-boot-faults\n      spec:\n        components:\n          env:\n            # set chaos duration (in sec) as desired\n            - name: TOTAL_CHAOS_DURATION\n              value: '30'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n\n            # it enables spring app-kill fault\n            - name: CM_KILL_APPLICATION_ACTIVE\n              value: 'true'\n\n            # it enables spring-boot memory stress fault\n            - name: CM_MEMORY_ACTIVE\n              value: 'true'\n\n            # it contains fraction of memory to be stressed(0.70 equals 70%)\n            # it supports value in range [0.01,0.95]\n            # it is applicable when memory is active\n            - name: MEMORY_FILL_FRACTION\n              value: '0.70'\n\n            ## percentage of total pods to target\n            - name: PODS_AFFECTED_PERC\n              value: ''\n
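
The optional fields SEQUENCE and PODS_AFFECTED_PERC (see the table above) control how many replicas are targeted and in what order; neither is exemplified elsewhere on this page. A minimal sketch that targets all replicas one at a time with the app-kill fault (values are illustrative):

# target all spring-boot replicas serially with the app-kill fault\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-faults-sa\n  experiments:\n    - name: spring-boot-faults\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n\n            # it enables spring app-kill fault\n            - name: CM_KILL_APPLICATION_ACTIVE\n              value: 'true'\n\n            ## percentage of total pods to target\n            - name: PODS_AFFECTED_PERC\n              value: '100'\n\n            # chaos sequence for multiple targets; supports serial, parallel\n            - name: SEQUENCE\n              value: 'serial'\n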
    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/","title":"Spring Boot Latency","text":""},{"location":"experiments/categories/spring-boot/spring-boot-latency/#introduction","title":"Introduction","text":"
• It can target random pods with a Spring Boot application and allows configuring the assaults to inject network latency into every nth request. This can be tuned via the CM_LEVEL ENV.

    Scenario: Inject network latency to Spring Boot Application

    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites

    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the spring-boot-latency experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
• Chaos Monkey Spring Boot dependency should be present in the application. It can be enabled in two ways:
      1. Add internal dependency inside the spring boot application
    1. Add Chaos Monkey for Spring Boot as a dependency for your project
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot App with the chaos-monkey spring profile enabled
          java -jar your-app.jar --spring.profiles.active=chaos-monkey --chaos.monkey.enabled=true\n
      2. Add as external dependency
    1. You can extend your existing application with the chaos-monkey and add it as an external dependency at startup; for this, it is necessary to use Spring Boot's PropertiesLauncher
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <classifier>jar-with-dependencies</classifier>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot application, add Chaos Monkey for Spring Boot JAR and properties
          java -cp your-app.jar -Dloader.path=chaos-monkey-spring-boot-2.6.1-jar-with-dependencies.jar org.springframework.boot.loader.PropertiesLauncher --spring.profiles.active=chaos-monkey --spring.config.location=file:./chaos-monkey.properties\n

    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#default-validations","title":"Default Validations","text":"View the default validations
    • Spring boot pods are healthy before and after chaos injection
    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow scheduled, constructed, and executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: spring-boot-latency-sa\n  namespace: default\n  labels:\n    name: spring-boot-latency-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: spring-boot-latency-sa\n  namespace: default\n  labels:\n    name: spring-boot-latency-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\",\"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: spring-boot-latency-sa\n  namespace: default\n  labels:\n    name: spring-boot-latency-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: spring-boot-latency-sa\nsubjects:\n  - kind: ServiceAccount\n    name: spring-boot-latency-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

Variables Description Notes CM_PORT It contains the port of the spring boot application LATENCY It contains the network latency to be injected (in ms) Default value is 2000

Variables Description Notes CM_LEVEL It contains the number of requests to be attacked; a value of n means every nth request will be affected Default value: 1, it lies in the [1,10000] range CM_WATCHED_CUSTOM_SERVICES It limits the watched packages/classes/methods; it contains a comma-separated list of fully qualified packages (class and/or method names) By default it is an empty list, which means it targets all services CM_WATCHERS It contains a comma-separated list of watchers from the following list: [controller, restController, service, repository, component, webClient] By default it is restController TOTAL_CHAOS_DURATION The time duration for chaos injection (seconds) Defaults to 30 SEQUENCE It defines the sequence of chaos execution for multiple target pods Default value: parallel. Supported: serial, parallel PODS_AFFECTED_PERC The percentage of total pods to target Defaults to 0% (corresponds to 1 replica) LIB The chaos lib used to inject the chaos Defaults to litmus. Supported: litmus only RAMP_TIME Period to wait before and after injection of chaos (in sec)

    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/spring-boot/spring-boot-latency/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes and the Spring Boot-specific tunables to tune the tunables common to all experiments as well as those specific to spring-boot.

    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#spring-boot-application-port","title":"Spring Boot Application Port","text":"

It tunes the spring-boot application port via the CM_PORT ENV.

    Use the following example to tune this:

# inject latency into the spring-boot target application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-latency-sa\n  experiments:\n    - name: spring-boot-latency\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-latency/#network-latency","title":"Network Latency","text":"

It contains the network latency value in ms. It can be tuned via the LATENCY ENV.

    Use the following example to tune this:

    # provide the network latency\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-latency-sa\n  experiments:\n    - name: spring-boot-latency\n      spec:\n        components:\n          env:\n            # provide the latency (ms)\n            - name: LATENCY\n              value: '2000'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
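The optional Chaos Monkey tunables from the table above can be combined in the same manifest. Below is a minimal sketch (the values shown are illustrative, not defaults) tuning CM_LEVEL and CM_WATCHERS alongside CM_PORT:

# tune the optional chaos monkey tunables\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-latency-sa\n  experiments:\n    - name: spring-boot-latency\n      spec:\n        components:\n          env:\n            # attack every 2nd request (illustrative value)\n            - name: CM_LEVEL\n              value: '2'\n\n            # watch only the restController and service watchers (illustrative value)\n            - name: CM_WATCHERS\n              value: 'restController,service'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n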
    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/","title":"Spring Boot Memory Stress","text":""},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#introduction","title":"Introduction","text":"
• It can target random pods with a Spring Boot application and allows configuring the assaults to inject memory stress, which attacks the memory of the Java Virtual Machine. It tests the resiliency of the system when some applications exhibit unexpected faulty behavior.

    Scenario: Stress Memory of Spring Boot Application

    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites

    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the spring-boot-memory-stress experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
• The Chaos Monkey for Spring Boot dependency should be present in the application. It can be enabled in one of two ways:
  1. Add an internal dependency inside the spring boot application
    1. Add Chaos Monkey for Spring Boot as a dependency for your project
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot App with the chaos-monkey spring profile enabled
          java -jar your-app.jar --spring.profiles.active=chaos-monkey --chaos.monkey.enabled=true\n
  2. Add as an external dependency
    1. You can extend your existing application with the chaos-monkey and add it as an external dependency at startup; for this, it is necessary to use Spring Boot's PropertiesLauncher
          <dependency>\n    <groupId>de.codecentric</groupId>\n    <artifactId>chaos-monkey-spring-boot</artifactId>\n    <classifier>jar-with-dependencies</classifier>\n    <version>2.6.1</version>\n</dependency>\n
        2. Start your Spring Boot application, add Chaos Monkey for Spring Boot JAR and properties
          java -cp your-app.jar -Dloader.path=chaos-monkey-spring-boot-2.6.1-jar-with-dependencies.jar org.springframework.boot.loader.PropertiesLauncher --spring.profiles.active=chaos-monkey --spring.config.location=file:./chaos-monkey.properties\n

    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#default-validations","title":"Default Validations","text":"View the default validations
    • Spring boot pods are healthy before and after chaos injection
    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: spring-boot-memory-stress-sa\n  namespace: default\n  labels:\n    name: spring-boot-memory-stress-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: spring-boot-memory-stress-sa\n  namespace: default\n  labels:\n    name: spring-boot-memory-stress-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: spring-boot-memory-stress-sa\n  namespace: default\n  labels:\n    name: spring-boot-memory-stress-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: spring-boot-memory-stress-sa\nsubjects:\n  - kind: ServiceAccount\n    name: spring-boot-memory-stress-sa\n    namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

| Variables | Description | Notes |
| --- | --- | --- |
| CM_PORT | It contains the port of the spring boot application | |
| MEMORY_FILL_FRACTION | It contains the fraction of memory to be stressed; 0.7 equals 70% | Default value is 0.70. It supports values in the [0.01,0.95] range |

| Variables | Description | Notes |
| --- | --- | --- |
| CM_LEVEL | It contains the number of requests to be attacked; a value of n means every nth request will be affected | Default value: 1; it lies in the [1,10000] range |
| CM_WATCHED_CUSTOM_SERVICES | It limits the watched packages/classes/methods; it contains a comma-separated list of fully qualified package (class and/or method) names | By default it is an empty list, which means it targets all services |
| CM_WATCHERS | It contains a comma-separated list of watchers from the following list: [controller, restController, service, repository, component, webClient] | By default it is restController |
| TOTAL_CHAOS_DURATION | The time duration for chaos injection (seconds) | Defaults to 30 |
| SEQUENCE | It defines the sequence of chaos execution for multiple target pods | Default value: parallel. Supported: serial, parallel |
| PODS_AFFECTED_PERC | The percentage of total pods to target | Defaults to 0% (corresponds to 1 replica) |
| LIB | The chaos lib used to inject the chaos | Defaults to litmus. Supported: litmus only |
| RAMP_TIME | Period to wait before and after injection of chaos (in sec) | |

    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes and the Spring Boot-specific tunables to tune the tunables common to all experiments as well as those specific to spring-boot.

    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#spring-boot-application-port","title":"Spring Boot Application Port","text":"

It tunes the spring-boot application port via the CM_PORT ENV.

    Use the following example to tune this:

    # stress memory of spring-boot application\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-memory-stress-sa\n  experiments:\n    - name: spring-boot-memory-stress\n      spec:\n        components:\n          env:\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/spring-boot/spring-boot-memory-stress/#memory-fill-fraction","title":"Memory Fill Fraction","text":"

It contains the fraction of memory to be stressed; 0.70 equals 70%. It can be tuned via the MEMORY_FILL_FRACTION ENV.

    Use the following example to tune this:

# provide the memory fraction to be filled\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: spring-boot-chaos\n  namespace: default\nspec:\n  appinfo:\n    appns: 'default'\n    applabel: 'app=spring-boot'\n    appkind: 'deployment'\n  # It can be active/stop\n  engineState: 'active'\n  chaosServiceAccount: spring-boot-memory-stress-sa\n  experiments:\n    - name: spring-boot-memory-stress\n      spec:\n        components:\n          env:\n            # it contains the fraction of memory to be filled. Eg: 0.70 equals 70%.\n            # it supports value in range [0.01,0.95]\n            - name: MEMORY_FILL_FRACTION\n              value: '0.70'\n\n            # port of the spring boot application\n            - name: CM_PORT\n              value: '8080'\n
    "},{"location":"experiments/categories/vmware/vm-poweroff/","title":"VM Poweroff","text":""},{"location":"experiments/categories/vmware/vm-poweroff/#introduction","title":"Introduction","text":"
• It causes VMware VMs to stop/power off, bringing them back to the powered-on state after a specified chaos duration, using the VMware APIs to start/stop the target VM.
• It helps to check the performance of the application/process running on the VMware VMs.

Scenario: Power off the VM

    "},{"location":"experiments/categories/vmware/vm-poweroff/#uses","title":"Uses","text":"View the uses of the experiment

    coming soon

    "},{"location":"experiments/categories/vmware/vm-poweroff/#prerequisites","title":"Prerequisites","text":"Verify the prerequisites
    • Ensure that Kubernetes Version > 1.16
• Ensure that the Litmus Chaos Operator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
    • Ensure that the vm-poweroff experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
• Ensure that you have sufficient vCenter access to stop and start the VM.
• (Optional) Ensure that a Kubernetes secret containing the vCenter credentials is created in the CHAOS_NAMESPACE. A sample secret file looks like:

      apiVersion: v1\nkind: Secret\nmetadata:\n  name: vcenter-secret\n  namespace: litmus\ntype: Opaque\nstringData:\n    VCENTERSERVER: XXXXXXXXXXX\n    VCENTERUSER: XXXXXXXXXXXXX\n    VCENTERPASS: XXXXXXXXXXXXX\n

Note: You can pass the VM credentials as secrets or as chaosengine ENV variables.
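For the ENV-based route, a minimal sketch, assuming the experiment reads the same keys (VCENTERSERVER, VCENTERUSER, VCENTERPASS) from the ChaosEngine env instead of from the secret:

# pass the vcenter credentials as chaosengine ENVs\n# (sketch; assumes the experiment reads these keys from env)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: vm-poweroff-sa\n  experiments:\n  - name: vm-poweroff\n    spec:\n      components:\n        env:\n        - name: VCENTERSERVER\n          value: 'XXXXXXXXXXX'\n        - name: VCENTERUSER\n          value: 'XXXXXXXXXXXXX'\n        - name: VCENTERPASS\n          value: 'XXXXXXXXXXXXX'\n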

    "},{"location":"experiments/categories/vmware/vm-poweroff/#default-validations","title":"Default Validations","text":"View the default validations
• The VM should be in a healthy state.
    "},{"location":"experiments/categories/vmware/vm-poweroff/#minimal-rbac-configuration-example-optional","title":"Minimal RBAC configuration example (optional)","text":"

    NOTE

If you are using this experiment as part of a litmus workflow constructed & executed from chaos-center, then you may be making use of the litmus-admin RBAC, which is pre-installed in the cluster as part of the agent setup.

    View the Minimal RBAC permissions

---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: vm-poweroff-sa\n  namespace: default\n  labels:\n    name: vm-poweroff-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRole\nmetadata:\n  name: vm-poweroff-sa\n  labels:\n    name: vm-poweroff-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n  # Create and monitor the experiment & helper pods\n  - apiGroups: [\"\"]\n    resources: [\"pods\"]\n    verbs: [\"create\",\"delete\",\"get\",\"list\",\"patch\",\"update\", \"deletecollection\"]\n  # Performs CRUD operations on the events inside chaosengine and chaosresult\n  - apiGroups: [\"\"]\n    resources: [\"events\"]\n    verbs: [\"create\",\"get\",\"list\",\"patch\",\"update\"]\n  # Fetch configmaps & secrets details and mount it to the experiment pod (if specified)\n  - apiGroups: [\"\"]\n    resources: [\"secrets\",\"configmaps\"]\n    verbs: [\"get\",\"list\"]\n  # Track and get the runner, experiment, and helper pod logs\n  - apiGroups: [\"\"]\n    resources: [\"pods/log\"]\n    verbs: [\"get\",\"list\",\"watch\"]\n  # for creating and executing commands inside the target container\n  - apiGroups: [\"\"]\n    resources: [\"pods/exec\"]\n    verbs: [\"get\",\"list\",\"create\"]\n  # for configuring and monitoring the experiment job by the chaos-runner pod\n  - apiGroups: [\"batch\"]\n    resources: [\"jobs\"]\n    verbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n  # for creation, status polling and deletion of litmus chaos resources used within a chaos workflow\n  - apiGroups: [\"litmuschaos.io\"]\n    resources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\n    verbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: ClusterRoleBinding\nmetadata:\n  name: vm-poweroff-sa\n  labels:\n    name: vm-poweroff-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: ClusterRole\n  name: vm-poweroff-sa\nsubjects:\n- kind: ServiceAccount\n  name: vm-poweroff-sa\n  namespace: default\n
    Use this sample RBAC manifest to create a chaosServiceAccount in the desired (app) namespace. This example consists of the minimum necessary role permissions to execute the experiment.

    "},{"location":"experiments/categories/vmware/vm-poweroff/#experiment-tunables","title":"Experiment tunablesMandatory FieldsOptional Fields","text":"check the experiment tunables

| Variables | Description | Notes |
| --- | --- | --- |
| APP_VM_MOIDS | MOIDs of the VMware instance | Once you open the VM in the vCenter WebClient, you can find the MOID in the address field (VirtualMachine:vm-5365). Alternatively, you can use the CLI to fetch the MOID. Eg: vm-5365 |

| Variables | Description | Notes |
| --- | --- | --- |
| TOTAL_CHAOS_DURATION | The total time duration for chaos insertion (sec) | Defaults to 30s |
| CHAOS_INTERVAL | The interval (in sec) between successive instance terminations | Defaults to 30s |
| SEQUENCE | It defines the sequence of chaos execution for multiple instances | Default value: parallel. Supported: serial, parallel |
| RAMP_TIME | Period to wait before and after injection of chaos (in sec) | |

    "},{"location":"experiments/categories/vmware/vm-poweroff/#experiment-examples","title":"Experiment Examples","text":""},{"location":"experiments/categories/vmware/vm-poweroff/#common-experiment-tunables","title":"Common Experiment Tunables","text":"

Refer to the common attributes to tune the common tunables for all the experiments.

    "},{"location":"experiments/categories/vmware/vm-poweroff/#stoppoweroff-vm-by-moid","title":"Stop/Poweroff VM By MOID","text":"

It contains the MOID(s) of the VM instance(s). It can be tuned via the APP_VM_MOIDS ENV.

    Use the following example to tune this:

    # power-off the VMWare VM\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  chaosServiceAccount: vm-poweroff-sa\n  experiments:\n  - name: vm-poweroff\n    spec:\n      components:\n        env:\n        # MOID of the VM\n        - name: APP_VM_MOIDS\n          value: 'vm-53,vm-65'\n\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/concepts/IAM/awsIamIntegration/","title":"IAM integration for Litmus service accounts","text":"

You can execute Litmus AWS experiments to target different AWS services from the EKS cluster itself. For this, we need to authenticate Litmus with the AWS platform, which can be done in two different ways:

• Using secrets: It is one of the common ways to authenticate litmus with AWS irrespective of the Kubernetes cluster used for the deployment. In other words, it is the Kubernetes-native way of authenticating litmus with the AWS platform.
• IAM Integration: It can be used when we’ve deployed Litmus on an EKS cluster; we can associate an IAM role with a Kubernetes service account. This service account can then provide AWS permissions to the experiment pod that uses it. We’ll discuss this method further in the sections below.
    "},{"location":"experiments/concepts/IAM/awsIamIntegration/#why-should-we-use-iam-integration-for-aws-authentication","title":"Why should we use IAM integration for AWS authentication?","text":"

    The IAM roles for service accounts feature provides the following benefits:

    • Least privilege: By using the IAM roles for service accounts feature, you no longer need to provide extended permissions to the node IAM role so that pods on that node can call AWS APIs. You can scope IAM permissions to a service account, and only pods that use that service account have access to those permissions.
    • Credential isolation: The experiment can only retrieve credentials for the IAM role that is associated with the service account to which it belongs. The experiment never has access to credentials that are intended for another experiment that belongs to another pod.
    "},{"location":"experiments/concepts/IAM/awsIamIntegration/#enable-service-accounts-to-access-aws-resources","title":"Enable service accounts to access AWS resources:","text":""},{"location":"experiments/concepts/IAM/awsIamIntegration/#step-1-create-an-iam-oidc-provider-for-your-cluster","title":"Step 1: Create an IAM OIDC provider for your cluster","text":"

We need to perform this only once per cluster. We’re going to follow the AWS documentation to set up an OIDC provider with eksctl.

Check whether you have an existing IAM OIDC provider for your cluster. To check this, you can follow the instructions given below.

Note: For demonstration, we’ll be using the cluster name litmus-demo and the region us-west-1; you can replace these values according to your environment.

    aws eks describe-cluster --name <litmus-demo> --query \"cluster.identity.oidc.issuer\" --output text\n
    Output:

    https://oidc.eks.us-west-1.amazonaws.com/id/D054E55B6947B1A7B3F200297789662C\n

    Now list the IAM OIDC providers in your account.

    Command:

aws iam list-open-id-connect-providers | grep <D054E55B6947B1A7B3F200297789662C>\n

    Replace <D054E55B6947B1A7B3F200297789662C> (including <>) with the value returned from the previous command.

If you don’t have an IAM OIDC identity provider yet, create one for your cluster with the following command. Replace <litmus-demo> (including <>) with your own value.

eksctl utils associate-iam-oidc-provider --cluster litmus-demo --approve\n2021-09-07 14:54:01 [\u2139]  eksctl version 0.52.0\n2021-09-07 14:54:01 [\u2139]  using region us-west-1\n2021-09-07 14:54:04 [\u2139]  will create IAM Open ID Connect provider for cluster \"litmus-demo\" in \"us-west-1\"\n2021-09-07 14:54:05 [\u2714]  created IAM Open ID Connect provider for cluster \"litmus-demo\" in \"us-west-1\"\n
    "},{"location":"experiments/concepts/IAM/awsIamIntegration/#step-2-creating-an-iam-role-and-policy-for-your-service-account","title":"Step 2: Creating an IAM role and policy for your service account","text":"

You must create an IAM policy that specifies the permissions that you would like the experiment to have. You have several ways to create a new IAM permission policy. Check out the AWS docs for creating the IAM policy. We will make use of the eksctl command to set up the same.

    eksctl create iamserviceaccount \\\n--name <service_account_name> \\\n--namespace <service_account_namespace> \\\n--cluster <cluster_name> \\\n--attach-policy-arn <IAM_policy_ARN> \\\n--approve \\\n--override-existing-serviceaccounts\n
    "},{"location":"experiments/concepts/IAM/awsIamIntegration/#step-3-associate-an-iam-role-with-a-service-account","title":"Step 3: Associate an IAM role with a service account","text":"

Complete this task for each Kubernetes service account that needs access to AWS resources. We can do this by adding the following annotation to the service account, which defines the IAM role to associate with it.

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  annotations:\n    eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/<IAM_ROLE_NAME>\n

You can also annotate the experiment service account by running the following command.

    Notes: 1. Ideally, annotating the litmus-admin service account in litmus namespace should work for most of the experiments. 2. For the cluster autoscaler experiment, annotate the service account in the kube-system namespace.

    kubectl annotate serviceaccount -n <SERVICE_ACCOUNT_NAMESPACE> <SERVICE_ACCOUNT_NAME> \\\neks.amazonaws.com/role-arn=arn:aws:iam::<ACCOUNT_ID>:role/<IAM_ROLE_NAME>\n

Verify that the experiment service account is now associated with the IAM role.

    If you run an experiment and describe one of the pods, you can verify that the AWS_WEB_IDENTITY_TOKEN_FILE and AWS_ROLE_ARN environment variables exist.

    kubectl exec -n litmus <ec2-terminate-by-id-z4zdf> env | grep AWS\n
    Output:
    AWS_VPC_K8S_CNI_LOGLEVEL=DEBUG\nAWS_ROLE_ARN=arn:aws:iam::<ACCOUNT_ID>:role/<IAM_ROLE_NAME>\nAWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token\n

    Now we have successfully enabled the experiment service accounts to access AWS resources.

    "},{"location":"experiments/concepts/IAM/awsIamIntegration/#configure-the-experiment-cr","title":"Configure the Experiment CR.","text":"

Since we have already configured IAM for the experiment service account, we don’t need to create a secret and mount it in the experiment CR (which is enabled by default). To remove the secret mount, remove the following lines from the experiment YAML.

    secrets:\n- name: cloud-secret\n    mountPath: /tmp/\n
    We can now run the experiment with the direct IAM integration.
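For instance, a minimal ChaosEngine sketch for the ec2-terminate-by-id experiment referenced above; the instance ID and region are placeholders, and the IAM-annotated litmus-admin service account supplies the AWS credentials:

# run an AWS experiment via the IAM-annotated service account\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: ec2-chaos\n  namespace: litmus\nspec:\n  engineState: \"active\"\n  chaosServiceAccount: litmus-admin\n  experiments:\n  - name: ec2-terminate-by-id\n    spec:\n      components:\n        env:\n        # ID of the target ec2 instance (placeholder)\n        - name: EC2_INSTANCE_ID\n          value: 'i-0123456789abcdef0'\n        # region of the target instance (placeholder)\n        - name: REGION\n          value: 'us-west-1'\n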

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/","title":"IAM integration for Litmus service accounts","text":"

To execute LitmusChaos GCP experiments, one needs to authenticate with GCP by means of a service account before trying to access the target resources. Usually, you have only one way of providing the service account credentials to the experiment, namely a service account key; but if you're using a GKE cluster, you have a keyless means of authentication as well.

    Therefore you have two ways of providing the service account credentials to your GKE cluster:

    • Using Secrets: As you would normally do, you can create a secret containing the GCP service account in your GKE cluster, which gets utilized by the experiment for authentication to access your GCP resources.

    • IAM Integration: When you're using a GKE cluster, you can bind a GCP service account to a Kubernetes service account as an IAM policy, which can be then used by the experiment for keyless authentication using GCP Workload Identity. We\u2019ll discuss more on this method in the following sections.

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#why-use-iam-integration-for-gcp-authentication","title":"Why use IAM integration for GCP authentication?","text":"

    A Google API request can be made using a GCP IAM service account, which is an identity that an application uses to make calls to Google APIs. You might create individual IAM service accounts for each application as an application developer, then download and save the keys as a Kubernetes secret that you manually rotate. Not only is this a time-consuming process, but service account keys only last ten years (or until you manually rotate them). An unaccounted-for key could give an attacker extended access in the event of a breach or compromise. Using service account keys as secrets is not an optimal way of authenticating GKE workloads due to this potential blind spot and the management cost of key inventory and rotation.

    Workload Identity allows you to restrict the possible \"blast radius\" of a breach or compromise while enforcing the principle of least privilege across your environment. It accomplishes this by automating workload authentication best practices, eliminating the need for workarounds, and making it simple to implement recommended security best practices.

    • Your tasks will only have the permissions they require to fulfil their role with the principle of least privilege. It minimizes the breadth of a potential compromise by not granting broad permissions.

    • Unlike the 10-year lifetime service account keys, credentials supplied to the Workload Identity are only valid for a short time, decreasing the blast radius in the case of a compromise.

    • The risk of unintentional disclosure of credentials due to a human mistake is greatly reduced because Google controls the namespace service account credentials for you. It also eliminates the need for you to manually rotate these credentials.

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#how-to-enable-service-accounts-to-access-gcp-resources","title":"How to enable service accounts to access GCP resources?","text":"

    We will be following the steps from the GCP Documentation for Workload Identity

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#step-1-enable-workload-identity","title":"STEP 1: Enable Workload Identity","text":"

    You can enable Workload Identity on clusters and node pools using the Google Cloud CLI or the Google Cloud Console. Workload Identity must be enabled at the cluster level before you can enable Workload Identity on node pools.

    Workload Identity can be enabled for an existing cluster as well as a new cluster. To enable Workload Identity on a new cluster, run the following command:

    gcloud container clusters create CLUSTER_NAME \\\n    --region=COMPUTE_REGION \\\n    --workload-pool=PROJECT_ID.svc.id.goog\n
Replace the following:
• CLUSTER_NAME: the name of your new cluster.
• COMPUTE_REGION: the Compute Engine region of your cluster. For zonal clusters, use --zone=COMPUTE_ZONE.
• PROJECT_ID: your Google Cloud project ID.

    You can enable Workload Identity on an existing Standard cluster by using the gcloud CLI or the Cloud Console. Existing node pools are unaffected, but any new node pools in the cluster use Workload Identity. To enable Workload Identity on an existing cluster, run the following command:

    gcloud container clusters update CLUSTER_NAME \\\n    --region=COMPUTE_REGION \\\n    --workload-pool=PROJECT_ID.svc.id.goog\n
Replace the following:
• CLUSTER_NAME: the name of your existing cluster.
• COMPUTE_REGION: the Compute Engine region of your cluster. For zonal clusters, use --zone=COMPUTE_ZONE.
• PROJECT_ID: your Google Cloud project ID.

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#step-2-configure-litmuschaos-to-use-workload-identity","title":"STEP 2: Configure LitmusChaos to use Workload Identity","text":"

    Assuming that you already have LitmusChaos installed in your GKE cluster as well as the Kubernetes service account you want to use for your GCP experiments, execute the following steps.

    1. Get Credentials for your cluster.

      gcloud container clusters get-credentials CLUSTER_NAME\n
      Replace CLUSTER_NAME with the name of your cluster that has Workload Identity enabled.

    2. Create an IAM service account for your application or use an existing IAM service account instead. You can use any IAM service account in any project in your organization. For Config Connector, apply the IAMServiceAccount object for your selected service account. To create a new IAM service account using the gcloud CLI, run the following command:

      gcloud iam service-accounts create GSA_NAME \\\n    --project=GSA_PROJECT\n
      Replace the following:

• GSA_NAME: the name of the new IAM service account.
• GSA_PROJECT: the project ID of the Google Cloud project for your IAM service account.

3. Please ensure that this service account has all the requisite roles for interacting with the Compute Engine resources, including VM Instances and Persistent Disks, according to the GCP experiments that you're willing to run. You can grant additional roles using the following command:

      gcloud projects add-iam-policy-binding PROJECT_ID \\\n    --member \"serviceAccount:GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com\" \\\n    --role \"ROLE_NAME\"\n
      Replace the following:

• PROJECT_ID: your Google Cloud project ID.
• GSA_NAME: the name of your IAM service account.
• GSA_PROJECT: the project ID of the Google Cloud project of your IAM service account.
• ROLE_NAME: the IAM role to assign to your service account, like roles/spanner.viewer.

4. Allow the Kubernetes service account to be used for the GCP experiments to impersonate the GCP IAM service account by adding an IAM policy binding between the two service accounts. This binding allows the Kubernetes service account to act as the IAM service account.

      gcloud iam service-accounts add-iam-policy-binding GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com \\\n    --role roles/iam.workloadIdentityUser \\\n    --member \"serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]\"\n
      Replace the following:

• GSA_NAME: the name of your IAM service account.
• GSA_PROJECT: the project ID of the Google Cloud project of your IAM service account.
• KSA_NAME: the name of the service account to be used for LitmusChaos GCP experiments.
• NAMESPACE: the namespace in which the Kubernetes service account to be used for LitmusChaos GCP experiments is present.

5. Annotate the Kubernetes service account to be used for LitmusChaos GCP experiments with the email address of the GCP IAM service account.

      kubectl annotate serviceaccount KSA_NAME \\\n    --namespace NAMESPACE \\\n    iam.gke.io/gcp-service-account=GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com\n
      Replace the following:

• KSA_NAME: the name of the service account to be used for LitmusChaos GCP experiments.
• NAMESPACE: the namespace in which the Kubernetes service account to be used for LitmusChaos GCP experiments is present.
• GSA_NAME: the name of your IAM service account.
• GSA_PROJECT: the project ID of the Google Cloud project of your IAM service account.
    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#step-3-update-chaosengine-manifest","title":"STEP 3: Update ChaosEngine Manifest","text":"

    Add the following value to the ChaosEngine manifest field .spec.experiments[].spec.components.nodeSelector to schedule the experiment pod on nodes that use Workload Identity.

    iam.gke.io/gke-metadata-server-enabled: \"true\"\n
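For reference, a sketch of where this lands in the ChaosEngine (the experiment name here is illustrative):

# schedule the experiment pod on Workload Identity enabled nodes\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: gcp-chaos\nspec:\n  engineState: \"active\"\n  chaosServiceAccount: KSA_NAME\n  experiments:\n  - name: gcp-vm-instance-stop   # illustrative experiment name\n    spec:\n      components:\n        # schedule on nodes with the GKE metadata server enabled\n        nodeSelector:\n          iam.gke.io/gke-metadata-server-enabled: \"true\"\n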

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#step-4-update-chaosexperiment-manifest","title":"STEP 4: Update ChaosExperiment Manifest","text":"

    Remove cloud-secret at .spec.definition.secrets in the ChaosExperiment manifest as we are not using a secret to provide our GCP Service Account credentials.
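The lines to delete look like the following (mirroring the snippet shown in the AWS section):

secrets:\n  - name: cloud-secret\n    mountPath: /tmp/\n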

Now you can run your GCP experiments with keyless authentication provided by GCP using Workload Identity.

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#how-to-disable-iam-service-accounts-from-accessing-gcp-resources","title":"How to disable IAM service accounts from accessing GCP resources?","text":"

    To stop using Workload Identity, revoke access to the GCP IAM service account and disable Workload Identity on the cluster.

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#step-1-revoke-access-to-the-iam-service-account","title":"STEP 1: Revoke access to the IAM service account","text":"
    1. To revoke access to the GCP IAM service account, use the following command:
      gcloud iam service-accounts remove-iam-policy-binding GSA_NAME@GSA_PROJECT.iam.gserviceaccount.com \\\n    --role roles/iam.workloadIdentityUser \\\n    --member \"serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]\"\n
      Replace the following:
• PROJECT_ID: the project ID of the GKE cluster.
• NAMESPACE: the namespace in which the Kubernetes service account to be used for LitmusChaos GCP experiments is present.
• KSA_NAME: the name of the service account to be used for LitmusChaos GCP experiments.
• GSA_NAME: the name of the IAM service account.
• GSA_PROJECT: the project ID of the IAM service account.

    It can take up to 30 minutes for cached tokens to expire.

2. Remove the annotation from the service account being used for LitmusChaos GCP experiments:
      kubectl annotate serviceaccount KSA_NAME \\\n    --namespace NAMESPACE iam.gke.io/gcp-service-account-\n
    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#step-2-disable-workload-identity","title":"STEP 2: Disable Workload Identity","text":"
    1. Disable Workload Identity on each node pool:

      gcloud container node-pools update NODEPOOL_NAME \\\n    --cluster=CLUSTER_NAME \\\n    --workload-metadata=GCE_METADATA\n
      Repeat this command for every node pool in the cluster.

    2. Disable Workload Identity in the cluster:

      gcloud container clusters update CLUSTER_NAME \\\n    --disable-workload-identity\n

    "},{"location":"experiments/concepts/IAM/gcpIamIntegration/#troubleshooting-guide","title":"Troubleshooting Guide","text":"

    Refer to the GCP documentation on troubleshooting Workload Identity here.

    "},{"location":"experiments/concepts/chaos-resources/contents/","title":"Chaos Resources","text":"

    At the heart of the Litmus Platform are the chaos custom resources. This section consists of the specification (details of each field within the .spec & .status of the resources) as well as standard examples for tuning the supported parameters.

| Chaos Resource Name | Description | User Guide |
| --- | --- | --- |
| ChaosEngine | Contains the ChaosEngine specifications | ChaosEngine user-guide |
| ChaosExperiment | Contains the ChaosExperiment specifications | ChaosExperiment user-guide |
| ChaosResult | Contains the ChaosResult specifications | ChaosResult user-guide |
| ChaosScheduler | Contains the ChaosScheduler specifications | ChaosScheduler user-guide |
| Probes | Contains the Probes specifications | Probes user-guide |
"},{"location":"experiments/concepts/chaos-resources/chaos-engine/application-details/","title":"Application Specifications","text":"

It contains the AUT and auxiliary application details, provided at spec.appinfo and spec.auxiliaryAppInfo respectively inside the chaosengine.

    View the application specification schema

    Field .spec.appinfo.appns Description Flag to specify namespace of application under test Type Optional Range user-defined (type: string) Default n/a Notes The appns in the spec specifies the namespace of the AUT. Usually provided as a quoted string. It is optional for the infra chaos.

    Field .spec.appinfo.applabel Description Flag to specify unique label of application under test Type Optional Range user-defined (type: string)(pattern: \"label_key=label_value\") Default n/a Notes The applabel in the spec specifies a unique label of the AUT. Usually provided as a quoted string of pattern key=value. Note that if multiple applications share the same label within a given namespace, the AUT is filtered based on the presence of the chaos annotation litmuschaos.io/chaos: \"true\". If, however, the annotationCheck is disabled, then a random application (pod) sharing the specified label is selected for chaos. It is optional for the infra chaos.

    Field .spec.appinfo.appkind Description Flag to specify resource kind of application under test Type Optional Range deployment, statefulset, daemonset, deploymentconfig, rollout Default n/a (depends on app type) Notes The appkind in the spec specifies the Kubernetes resource type of the app deployment. The Litmus ChaosOperator supports chaos on deployments, statefulsets and daemonsets. Application health check routines are dependent on the resource types, in case of some experiments. It is optional for the infra chaos

    Field .spec.auxiliaryAppInfo Description Flag to specify one or more app namespace-label pairs whose health is also monitored as part of the chaos experiment, in addition to a primary application specified in the .spec.appInfo. NOTE: If the auxiliary applications are deployed in namespaces other than the AUT, ensure that the chaosServiceAccount is bound to a cluster role and has adequate permissions to list pods on other namespaces. Type Optional Range user-defined (type: string)(pattern: \"namespace:label_key=label_value\"). Default n/a Notes The auxiliaryAppInfo in the spec specifies a (comma-separated) list of namespace-label pairs for downstream (dependent) apps of the primary app specified in .spec.appInfo in case of pod-level chaos experiments. In case of infra-level chaos experiments, this flag specifies those apps that may be directly impacted by chaos and upon which health checks are necessary.

    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/application-details/#application-under-test","title":"Application Under Test","text":"

    It defines the appns, applabel, and appkind to set the namespace, labels, and kind of the application under test.

    • appkind: It supports deployment, statefulset, daemonset, deploymentconfig, and rollout. It is mandatory for the pod-level experiments and optional for the rest of the experiments.

    Use the following example to tune this:

# contains details of the AUT(application under test)\n# appns: namespace of the application\n# applabel: label of the application\n# appkind: kind of the application. supports: deployment, statefulset, daemonset, rollout, deploymentconfig\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  # AUT details\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/application-details/#auxiliary-application-info","title":"Auxiliary Application Info","text":"

It contains a (comma-separated) list of namespace-label pairs for downstream (dependent) apps of the primary app specified in .spec.appInfo in case of pod-level chaos experiments. In the case of infra-level chaos experiments, this flag specifies those apps that may be directly impacted by chaos and upon which health checks are necessary. It can be tuned via the auxiliaryAppInfo field. It supports input in the below format:

    • auxiliaryAppInfo: <namespace1>:<key1=value1>,<namespace2>:<key2=value2>

    Note: Auxiliary application check is only supported for node-level experiments.

    Use the following example to tune this:

# contains the comma separated list of auxiliary application details\n# it is provided in `<namespace1>:<key1=value1>,<namespace2>:<key2=value2>` format\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  # provide the comma separated auxiliary applications details\n  auxiliaryAppInfo: \"nginx:app=nginx,default:app=busybox\"\n  chaosServiceAccount: node-drain-sa\n  experiments:\n  - name: node-drain\n    spec:\n      components:\n        env:\n        # name of the target node\n        - name: TARGET_NODE\n          value: 'node01'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/contents/","title":"Chaos Engine Specifications","text":"

Binds an instance of a given app with one or more chaos experiments, defines the run characteristics, overrides chaos defaults, and defines the steady-state hypothesis; it is reconciled by the Litmus Chaos Operator.

    This section describes the fields in the ChaosEngine spec and the possible values that can be set against the same.

| Field Name | Description | User Guide |
| --- | --- | --- |
| State Specification | It defines the state of the chaosengine | State Specifications |
| Application Specification | It defines the details of AUT and auxiliary applications | Application Specifications |
| RBAC Specification | It defines the chaos-service-account name | RBAC Specifications |
| Runtime Specification | It defines the runtime details of the chaosengine | Runtime Specifications |
| Runner Specification | It defines the runner pod specifications | Runner Specifications |
| Experiment Specification | It defines the experiment pod specifications | Experiment Specifications |
"},{"location":"experiments/concepts/chaos-resources/chaos-engine/engine-state/","title":"State Specifications","text":"

    It is a user-defined flag to trigger chaos. Setting it to active ensures the successful execution of chaos. Patching it with stop aborts ongoing experiments. It has a corresponding flag in the chaosengine status field, called engineStatus which is updated by the controller based on the actual state of the ChaosEngine. It can be tuned via engineState field. It supports active and stop values.

    View the state specification schema

    Field .spec.engineState Description Flag to control the state of the chaosengine Type Mandatory Range active, stop Default active Notes The engineState in the spec is a user defined flag to trigger chaos. Setting it to active ensures successful execution of chaos. Patching it with stop aborts ongoing experiments. It has a corresponding flag in the chaosengine status field, called engineStatus which is updated by the controller based on actual state of the ChaosEngine.

    Use the following example to tune this:

    # contains the chaosengine state\n# supports: active and stop states\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  # contains the state of engine\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/","title":"Experiment Specifications","text":"

    It contains all the experiment tunables provided at .spec.experiments[].spec.components inside chaosengine.

    View the experiment specification schema

    Field .spec.experiments[].spec.components.configMaps Description Configmaps passed to the chaos experiment Type Optional Range user-defined (type: {name: string, mountPath: string}) Default n/a Notes The experiment[].spec.components.configMaps provides for a means to insert config information into the experiment. The configmaps definition is validated for correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.

    Field .spec.experiments[].spec.components.secrets Description Kubernetes secrets passed to the chaos experiment Type Optional Range user-defined (type: {name: string, mountPath: string}) Default n/a Notes The experiment[].spec.components.secrets provides for a means to push secrets (typically project ids, access credentials etc.,) into the experiment pods. These are especially useful in case of platform-level/infra-level chaos experiments. The secrets definition is validated for correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.

Field .spec.experiments[].spec.components.experimentImage Description Override the image of the chaos experiment Type Optional Range string Default n/a Notes The experiment[].spec.components.experimentImage overrides the experiment image for the chaosexperiment.

Field .spec.experiments[].spec.components.experimentImagePullSecrets Description Flag to specify imagePullSecrets for the ChaosExperiment Type Optional Range user-defined (type: []corev1.LocalObjectReference) Default n/a Notes The .components.experimentImagePullSecrets allows developers to specify the imagePullSecret name for ChaosExperiment.

Field .spec.experiments[].spec.components.nodeSelector Description Provide the node selector for the experiment pod Type Optional Range Labels in the form of label key=value Default n/a Notes The experiment[].spec.components.nodeSelector contains labels of the node on which the experiment pod should be scheduled. Typically used in case of infra/node level chaos.

Field .spec.experiments[].spec.components.statusCheckTimeouts Description Provides the timeout and retry values for the status checks. Defaults to 180s & 90 retries (2s per retry) Type Optional Range It contains values in the form {delay: int, timeout: int} Default delay: 2s and timeout: 180s Notes The experiment[].spec.components.statusCheckTimeouts overrides the status timeouts inside chaosexperiments. It contains timeout & delay in seconds.

    Field .spec.experiments[].spec.components.resources Description Specify the resource requirements for the ChaosExperiment pod Type Optional Range user-defined (type: corev1.ResourceRequirements) Default n/a Notes The experiment[].spec.components.resources contains the resource requirements for the ChaosExperiment Pod, where we can provide resource requests and limits for the pod.

Field .spec.experiments[].spec.components.experimentAnnotations Description Annotations that need to be provided in the pod which will be created (experiment-pod) Type Optional Range user-defined (type: label key=value) Default n/a Notes The .spec.components.experimentAnnotation allows developers to specify the custom annotations for the experiment pod.

Field .spec.experiments[].spec.components.tolerations Description Toleration for the experiment pod Type Optional Range user-defined (type: []corev1.Toleration) Default n/a Notes The .spec.components.tolerations provides tolerations for the experiment pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node level chaos.

    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-annotations","title":"Experiment Annotations","text":"

    It allows developers to specify the custom annotations for the experiment pod. It can be tuned via experimentAnnotations field.

    Use the following example to tune this:

# contains annotations for the experiment pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # annotations for the experiment pod\n        experimentAnnotations:\n          name: chaos-experiment\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-configmaps-and-secrets","title":"Experiment Configmaps And Secrets","text":"

    It defines the configMaps and secrets to set the configmaps and secrets mounted to the experiment pod respectively.

    • configMaps: It provides for a means to insert config information into the experiment. The configmaps definition is validated for the correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.
    • secrets: It provides for a means to push secrets (typically project ids, access credentials, etc.,) into the experiment pods. These are especially useful in the case of platform-level/infra-level chaos experiments. The secrets definition is validated for the correctness and those specified are checked for availability (in the cluster/namespace) before being mounted into the experiment pods.

    Use the following example to tune this:

    # contains configmaps and secrets for the experiment pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # configmaps details mounted to the experiment pod\n        configMaps:\n        - name: \"configmap-01\"\n          mountPath: \"/mnt\"\n        # secrets details mounted to the experiment pod\n        secrets:\n        - name: \"secret-01\"\n          mountPath: \"/tmp\"\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-image","title":"Experiment Image","text":"

It overrides the experiment image for the chaosexperiment, allowing developers to specify a custom experiment image. It can be tuned via the experimentImage field.

    Use the following example to tune this:

    # contains the custom image for the experiment pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # override the image of the experiment pod\n        experimentImage: \"litmuschaos/go-runner:ci\"\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-imagepullsecrets","title":"Experiment ImagePullSecrets","text":"

    It allows developers to specify the imagePullSecret name for ChaosExperiment. It can be tuned via experimentImagePullSecrets field.

    Use the following example to tune this:

    # contains the imagePullSecrets for the experiment pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # secret name for the experiment image, if using private registry\n        experimentImagePullSecrets:\n        - name: regcred\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-nodeselectors","title":"Experiment NodeSelectors","text":"

The nodeselector contains labels of the node on which the experiment pod should be scheduled. Typically used in case of infra/node-level chaos. It can be tuned via the nodeSelector field.

    Use the following example to tune this:

# contains the node-selector for the experiment pod\n# it will schedule the experiment pod on the corresponding node with matching labels\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # nodeselector for the experiment pod\n        nodeSelector:\n          context: chaos\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-resource-requirements","title":"Experiment Resource Requirements","text":"

    It contains the resource requirements for the ChaosExperiment Pod, where we can provide resource requests and limits for the pod. It can be tuned via resources field.

    Use the following example to tune this:

# contains the resource requirements for the experiment pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # resource requirements for the experiment pod\n        resources:\n          requests:\n            cpu: \"250m\"\n            memory: \"64Mi\"\n          limits:\n            cpu: \"500m\"\n            memory: \"128Mi\"\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-tolerations","title":"Experiment Tolerations","text":"

    It provides tolerations for the experiment pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node level chaos. It can be tuned via tolerations field.

    Use the following example to tune this:

# contains the tolerations for the experiment pod\n# it will schedule the experiment pod on the tainted node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # tolerations for the experiment pod\n        tolerations:\n        - key: \"key1\"\n          operator: \"Equal\"\n          value: \"value1\"\n          effect: \"NoSchedule\"\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/experiment-components/#experiment-status-check-timeout","title":"Experiment Status Check Timeout","text":"

    It overrides the status timeouts inside chaosexperiments. It contains timeout & delay in seconds. It can be tuned via statusCheckTimeouts field.

    Use the following example to tune this:

    # contains status check timeout for the experiment pod\n# it will set this timeout as upper bound while checking application status, node status in experiments\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        # status check timeout for the experiment pod\n        statusCheckTimeouts:\n          delay: 2\n          timeout: 180\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/rbac-details/","title":"RBAC Specifications","text":"

    It specifies the name of the serviceaccount mapped to a role/clusterRole with enough permissions to execute the desired chaos experiment. The minimum permissions needed for any given experiment are provided in the .spec.definition.permissions field of the respective chaosexperiment CR. It can be tuned via chaosServiceAccount field.

    View the RBAC specification schema

    Field .spec.chaosServiceAccount Description Flag to specify serviceaccount used for chaos experiment Type Mandatory Range user-defined (type: string) Default n/a Notes The chaosServiceAccount in the spec specifies the name of the serviceaccount mapped to a role/clusterRole with enough permissions to execute the desired chaos experiment. The minimum permissions needed for any given experiment are provided in the .spec.definition.permissions field of the respective chaosexperiment CR.

    Use the following example to tune this:

    # contains name of the serviceAccount which contains all the RBAC permissions required for the experiment\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  # name of the service account w/ sufficient permissions\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
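    For reference, the pod-delete-sa serviceaccount above is expected to be bound to a role carrying the experiment's minimum permissions. A minimal sketch, assuming the namespaced pod-delete experiment (the rules below are illustrative; derive the authoritative list from the .spec.definition.permissions of the respective chaosexperiment CR):

    apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: pod-delete-sa\n  namespace: default\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\n  name: pod-delete-sa\n  namespace: default\nrules:\n- apiGroups: [\"\", \"apps\", \"batch\", \"litmuschaos.io\"]\n  resources: [\"pods\", \"pods/log\", \"pods/exec\", \"events\", \"jobs\", \"deployments\", \"chaosengines\", \"chaosexperiments\", \"chaosresults\"]\n  verbs: [\"create\", \"list\", \"get\", \"patch\", \"update\", \"delete\", \"deletecollection\"]\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\n  name: pod-delete-sa\n  namespace: default\nroleRef:\n  apiGroup: rbac.authorization.k8s.io\n  kind: Role\n  name: pod-delete-sa\nsubjects:\n- kind: ServiceAccount\n  name: pod-delete-sa\n  namespace: default\n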
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/","title":"Runner Specifications","text":"

    It contains all the chaos-runner tunables provided at .spec.components.runner inside chaosengine.

    View the runner specification schema

    Field .spec.components.runner.image Description Flag to specify image of ChaosRunner pod Type Optional Range user-defined (type: string) Default n/a (refer Notes) Notes The .components.runner.image allows developers to specify their own debug runner images. Defaults for the runner image can be enforced via the operator env CHAOS_RUNNER_IMAGE

    Field .spec.components.runner.imagePullPolicy Description Flag to specify imagePullPolicy for the ChaosRunner Type Optional Range Always, IfNotPresent Default IfNotPresent Notes The .components.runner.imagePullPolicy allows developers to specify the pull policy for chaos-runner. Set to Always during debug/test.

    Field .spec.components.runner.imagePullSecrets Description Flag to specify imagePullSecrets for the ChaosRunner Type Optional Range user-defined (type: []corev1.LocalObjectReference) Default n/a Notes The .components.runner.imagePullSecrets allows developers to specify the imagePullSecret name for ChaosRunner.

    Field .spec.components.runner.runnerAnnotations Description Annotations that need to be provided to the pod which will be created (runner pod) Type Optional Range user-defined (type: map[string]string) Default n/a Notes The .components.runner.runnerAnnotations allows developers to specify custom annotations for the runner pod.

    Field .spec.components.runner.args Description Specify the args for the ChaosRunner Pod Type Optional Range user-defined (type: []string) Default n/a Notes The .components.runner.args allows developers to specify their own debug runner args.

    Field .spec.components.runner.command Description Specify the commands for the ChaosRunner Pod Type Optional Range user-defined (type: []string) Default n/a Notes The .components.runner.command allows developers to specify their own debug runner commands.

    Field .spec.components.runner.configMaps Description Configmaps passed to the chaos runner pod Type Optional Range user-defined (type: {name: string, mountPath: string}) Default n/a Notes The .spec.components.runner.configMaps provides a means to insert config information into the runner pod.

    Field .spec.components.runner.secrets Description Kubernetes secrets passed to the chaos runner pod. Type Optional Range user-defined (type: {name: string, mountPath: string}) Default n/a Notes The .spec.components.runner.secrets provides a means to push secrets (typically project IDs, access credentials, etc.) into the chaos runner pod. These are especially useful in case of platform-level/infra-level chaos experiments.

    Field .spec.components.runner.nodeSelector Description Node selectors for the runner pod Type Optional Range Labels in the form of key=value Default n/a Notes The .spec.components.runner.nodeSelector contains labels of the node on which the runner pod should be scheduled. Typically used in case of infra/node level chaos.

    Field .spec.components.runner.resources Description Specify the resource requirements for the ChaosRunner pod Type Optional Range user-defined (type: corev1.ResourceRequirements) Default n/a Notes The .spec.components.runner.resources contains the resource requirements for the ChaosRunner Pod, where we can provide resource requests and limits for the pod.

    Field .spec.components.runner.tolerations Description Tolerations for the runner pod Type Optional Range user-defined (type: []corev1.Toleration) Default n/a Notes The .spec.components.runner.tolerations provides tolerations for the runner pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node level chaos.

    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-annotations","title":"ChaosRunner Annotations","text":"

    It allows developers to specify the custom annotations for the runner pod. It can be tuned via runnerAnnotations field.

    Use the following example to tune this:

    # contains annotations for the chaos runner pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n     # annotations for the chaos-runner\n     runnerAnnotations:\n       name: chaos-runner\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-args-and-command","title":"ChaosRunner Args And Command","text":"

    It defines the args and command fields, which set the args and command of the chaos-runner, respectively.

    • args: It allows developers to specify their own debug runner args.
    • command: It allows developers to specify their own debug runner commands.

    Use the following example to tune this:

    # contains args and command for the chaos runner\n# it is useful when a custom chaos-runner image, which supports these args and commands, is used\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    # override the args and command for the chaos-runner\n    runner:\n      # name of the custom image\n      image: \"<your repo>/chaos-runner:ci\"\n      # command for the image\n      command:\n      - \"/bin/sh\"\n      # args for the image\n      args:\n      - \"-c\"\n      - \"<custom-command>\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-configmaps-and-secrets","title":"ChaosRunner Configmaps And Secrets","text":"

    It defines the configMaps and secrets fields, which set the configmaps and secrets mounted into the chaos-runner, respectively.

    • configMaps: It provides a means to insert config information into the runner pod.
    • secrets: It provides a means to push secrets (typically project IDs, access credentials, etc.) into the chaos runner pod. These are especially useful in the case of platform-level/infra-level chaos experiments.

    Use the following example to tune this:

    # contains configmaps and secrets for the chaos-runner\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n     # configmaps details mounted to the runner pod\n     configMaps:\n     - name: \"configmap-01\"\n       mountPath: \"/mnt\"\n     # secrets details mounted to the runner pod\n     secrets:\n     - name: \"secret-01\"\n       mountPath: \"/tmp\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
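    The configmap and secret referenced above must already exist in the namespace where the chaos-runner is launched. A sketch with placeholder contents:

    kubectl create configmap configmap-01 --from-literal=endpoint=demo\nkubectl create secret generic secret-01 --from-literal=token=demo\n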
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-image-and-imagepullpoicy","title":"ChaosRunner Image and ImagePullPoicy","text":"

    It defines the image and imagePullPolicy fields, which set the image and the image pull policy of the chaos-runner, respectively.

    • image: It allows developers to specify their own debug runner images. Defaults for the runner image can be enforced via the operator env CHAOS_RUNNER_IMAGE.
    • imagePullPolicy: It allows developers to specify the pull policy for chaos-runner. Set to Always during debug/test.

    Use the following example to tune this:

    # contains the image and imagePullPolicy of the chaos-runner\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n      # override the image of the chaos-runner\n      # by default, the image corresponding to the litmus version is used\n      image: \"litmuschaos/chaos-runner:latest\"\n      # imagePullPolicy for the runner image\n      # supports: Always, IfNotPresent. default: IfNotPresent\n      imagePullPolicy: \"Always\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-imagepullsecrets","title":"ChaosRunner ImagePullSecrets","text":"

    It allows developers to specify the imagePullSecret name for the ChaosRunner. It can be tuned via imagePullSecrets field.

    Use the following example to tune this:

    # contains the imagePullSecrets for the chaos-runner\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n      # secret name for the runner image, if using private registry\n      imagePullSecrets:\n      - name: regcred\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
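    The referenced secret is expected to be a docker-registry type secret in the namespace where the chaos-runner is launched. A sketch with placeholder credentials:

    kubectl create secret docker-registry regcred --docker-server=<registry-server> --docker-username=<user> --docker-password=<password>\n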
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-nodeselectors","title":"ChaosRunner NodeSelectors","text":"

    The nodeselector contains labels of the node on which the runner pod should be scheduled. Typically used in case of infra/node level chaos. It can be tuned via the nodeSelector field.

    Use the following example to tune this:

    # contains the node-selector for the chaos-runner\n# it will schedule the chaos-runner on the corresponding node with matching labels\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n      # nodeselector for the runner pod\n      nodeSelector:\n        context: chaos\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-resource-requirements","title":"ChaosRunner Resource Requirements","text":"

    It contains the resource requirements for the ChaosRunner Pod, where we can provide resource requests and limits for the pod. It can be tuned via resources field.

    Use the following example to tune this:

    # contains the resource requirements for the runner pod\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n      # resource requirements for the runner pod\n      resources:\n        requests:\n          cpu: \"250m\"\n          memory: \"64Mi\"\n        limits:\n          cpu: \"500m\"\n          memory: \"128Mi\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runner-components/#chaosrunner-tolerations","title":"ChaosRunner Tolerations","text":"

    It provides tolerations for the runner pod so that it can be scheduled on the respective tainted node. Typically used in case of infra/node level chaos. It can be tuned via tolerations field.

    Use the following example to tune this:

    # contains the tolerations for the chaos-runner\n# it will schedule the chaos-runner on the tainted node\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  components:\n    runner:\n      # tolerations for the runner pod\n      tolerations:\n      - key: \"key1\"\n        operator: \"Equal\"\n        value: \"value1\"\n        effect: \"NoSchedule\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runtime-details/","title":"Runtime Specifications","text":"

    It contains runtime details of the chaos experiments provided at .spec inside chaosengine.

    View the runtime specification schema

    Field .spec.annotationCheck Description Flag to control annotationChecks on applications as prerequisites for chaos Type Optional Range true, false Default true Notes The annotationCheck in the spec controls whether or not the operator checks for the annotation \"litmuschaos.io/chaos\" to be set against the application under test (AUT). Setting it to true ensures the check is performed, with chaos being skipped if the app is not annotated, while setting it to false suppresses this check and proceeds with chaos injection.

    Field .spec.terminationGracePeriodSeconds Description Flag to control terminationGracePeriodSeconds for the chaos pods (abort case) Type Optional Range integer value Default 30 Notes The terminationGracePeriodSeconds in the spec controls the terminationGracePeriodSeconds for the chaos resources in the abort case. Chaos pods contain revert-chaos steps that run upon abort and continuously watch for termination signals. The terminationGracePeriodSeconds should be set so that the chaos pods get enough time to revert the chaos before being completely terminated.

    Field .spec.jobCleanUpPolicy Description Flag to control cleanup of chaos experiment job post execution of chaos Type Optional Range delete, retain Default delete Notes The jobCleanUpPolicy controls whether or not the experiment pods are removed once execution completes. Set to retain for debug purposes (in the absence of standard logging mechanisms).

    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runtime-details/#annotation-check","title":"Annotation Check","text":"

    It controls whether or not the operator checks for the annotation litmuschaos.io/chaos to be set against the application under test (AUT). Setting it to true ensures the check is performed, with chaos being skipped if the app is not annotated, while setting it to false suppresses this check and proceeds with chaos injection. It can be tuned via the annotationCheck field. It supports boolean values; the default value is false.

    Use the following example to tune this:

    # checks the AUT for the annotations. The AUT should be annotated with `litmuschaos.io/chaos: true` if provided as true\n# supports: true, false. default: false\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  # annotationCheck details\n  annotationCheck: \"true\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
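    For the check to pass, the application under test itself must carry the annotation. Assuming the sample nginx deployment in the default namespace, it can be annotated with:

    kubectl annotate deploy/nginx litmuschaos.io/chaos=true -n default\n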
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runtime-details/#jobcleanup-policy","title":"Jobcleanup Policy","text":"

    It controls whether or not the experiment pods are removed once execution completes. Set to retain for debug purposes (in the absence of standard logging mechanisms). It can be tuned via the jobCleanUpPolicy field. It supports retain and delete. The default value is retain.

    Use the following example to tune this:

    # flag to delete or retain the chaos resources after completions of chaosengine\n# supports: delete, retain. default: retain\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  jobCleanUpPolicy: \"delete\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-engine/runtime-details/#termination-grace-period-seconds","title":"Termination Grace Period Seconds","text":"

    It controls the terminationGracePeriodSeconds for the chaos resources in the abort case. Chaos pods contain revert-chaos steps that run upon abort and continuously watch for termination signals. The terminationGracePeriodSeconds should be set so that the chaos pods get enough time to revert the chaos before being completely terminated. It can be tuned via the terminationGracePeriodSeconds field.

    Use the following example to tune this:

    # contains flag to control the terminationGracePeriodSeconds for the chaos pod(abort case)\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  # contains terminationGracePeriodSeconds for the chaos pods\n  terminationGracePeriodSeconds: 100\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n
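    As a related note, an abort is typically triggered by patching the engine state to stop, at which point the grace period above bounds the revert steps. A sketch, assuming the engine created earlier:

    kubectl patch chaosengine engine-nginx -n default --type merge --patch '{\"spec\":{\"engineState\":\"stop\"}}'\n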
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/component-specification/","title":"Component Specification","text":"

    It contains component details provided at spec.definition inside chaosexperiment

    View the component specification schema

    Field .spec.definition.image Description Flag to specify the image to run the ChaosExperiment Type Mandatory Range user-defined (type: string) Default n/a (refer Notes) Notes The .spec.definition.image allows the developers to specify their experiment images. Typically set to the Litmus go-runner or the ansible-runner. This feature of the experiment enables BYOC (BringYourOwnChaos), where developers can implement their own variants of a standard chaos experiment

    Field .spec.definition.imagePullPolicy Description Flag that helps the developers to specify imagePullPolicy for the ChaosExperiment Type Mandatory Range IfNotPresent, Always (type: string) Default Always Notes The .spec.definition.imagePullPolicy allows developers to specify the pull policy for ChaosExperiment image. Set to Always during debug/test

    Field .spec.definition.args Description Flag to specify the entrypoint for the ChaosExperiment Type Mandatory Range user-defined (type:list of string) Default n/a Notes The .spec.definition.args specifies the entrypoint for the ChaosExperiment. It depends on the language used in the experiment. For litmus-go, the .spec.definition.args points to a single binary containing all experiments, with the -name flag indicating the experiment to run (-name (exp-name)).

    Field .spec.definition.command Description Flag to specify the shell on which the ChaosExperiment will execute Type Mandatory Range user-defined (type: list of string). Default /bin/bash Notes The .spec.definition.command specifies the shell used to run the experiment; /bin/bash is the most common shell used.

    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/component-specification/#image","title":"Image","text":"

    It allows the developers to specify their experiment images. Typically set to the Litmus go-runner or the ansible-runner. This feature of the experiment enables BYOC (BringYourOwnChaos), where developers can implement their own variants of a standard chaos experiment. It can be tuned via image field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    # image of the chaosexperiment\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/component-specification/#imagepullpolicy","title":"ImagePullPolicy","text":"

    It allows developers to specify the pull policy for ChaosExperiment image. Set to Always during debug/test. It can be tuned via imagePullPolicy field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    # imagePullPolicy of the chaosexperiment\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/component-specification/#args","title":"Args","text":"

    It specifies the entrypoint for the ChaosExperiment. It depends on the language used in the experiment. For litmus-go, the .spec.definition.args points to a single binary containing all experiments, with the -name flag indicating the experiment to run (-name (exp-name)). It can be tuned via the args field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    # it contains args of the experiment\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/component-specification/#command","title":"Command","text":"

    It specifies the shell used to run the experiment; /bin/bash is the most common shell used. It can be tuned via the command field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    # it contains command of the experiment\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/","title":"Configuration Specification","text":"

    It contains configuration details provided at spec.definition inside chaosexperiment

    View the configuration specification schema

    Field .spec.definition.labels Description Flag to specify the label for the ChaosPod Type Optional Range user-defined (type:map[string]string) Default n/a Notes The .spec.definition.labels allow developers to specify the ChaosPod label for an experiment.

    Field .spec.definition.securityContext.podSecurityContext Description Flag to specify security context for ChaosPod Type Optional Range user-defined (type:corev1.PodSecurityContext) Default n/a Notes The .spec.definition.securityContext.podSecurityContext allows the developers to specify the security context for the ChaosPod which applies to all containers inside the Pod.

    Field .spec.definition.securityContext.containerSecurityContext.privileged Description Flag to specify the privileged security context for the ChaosExperiment container Type Optional Range true, false (type:bool) Default n/a Notes The .spec.definition.securityContext.containerSecurityContext.privileged specifies the privileged securityContext param for the experiment container.

    Field .spec.definition.configMaps Description Flag to specify the configmap for ChaosPod Type Optional Range user-defined Default n/a Notes The .spec.definition.configMaps allows the developers to mount the ConfigMap volume into the experiment pod.

    Field .spec.definition.secrets Description Flag to specify the secrets for ChaosPod Type Optional Range user-defined Default n/a Notes The .spec.definition.secrets specifies the secret data to be passed to the ChaosPod. The secrets typically contain confidential information like credentials.

    Field .spec.definition.experimentAnnotations Description Flag to specify the custom annotation to the ChaosPod Type Optional Range user-defined (type:map[string]string) Default n/a Notes The .spec.definition.experimentAnnotations allows the developer to specify custom annotations for the chaos pod.

    Field .spec.definition.hostFileVolumes Description Flag to specify the host file volumes to the ChaosPod Type Optional Range user-defined (type:map[string]string) Default n/a Notes The .spec.definition.hostFileVolumes allows the developer to specify the host file volumes to the ChaosPod.

    Field .spec.definition.hostPID Description Flag to specify the host PID for the ChaosPod Type Optional Range true, false (type:bool) Default n/a Notes The .spec.definition.hostPID allows the developer to specify the host PID for the ChaosPod.

    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#labels","title":"Labels","text":"

    It allows developers to specify the ChaosPod label for an experiment. It can be tuned via labels field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains experiment labels\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#podsecuritycontext","title":"PodSecurityContext","text":"

    It allows the developers to specify the security context for the ChaosPod which applies to all containers inside the Pod. It can be tuned via podSecurityContext field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains pod security context\n    securityContext:\n      podSecurityContext:\n        runAsUser: 1000\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#container-security-context","title":"Container Security Context","text":"

    It allows the developers to specify the security context for the container inside ChaosPod. It can be tuned via containerSecurityContext field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains container security context\n    securityContext:\n      containerSecurityContext:\n        privileged: true\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#configmaps","title":"ConfigMaps","text":"

    It allows the developers to mount the ConfigMap volume into the experiment pod. It can be tuned via the configMaps field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains configmaps details\n    configMaps:\n      - name: experiment-data\n        mountPath: \"/mnt\"\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#secrets","title":"Secrets","text":"

    It specifies the secret data to be passed to the ChaosPod. The secrets typically contain confidential information like credentials. It can be tuned via the secrets field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains secrets details\n    secrets:\n      - name: auth-credentials\n        mountPath: \"/tmp\"\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#experiment-annotations","title":"Experiment Annotations","text":"

    It allows the developer to specify the Custom annotation for the chaos pod. It can be tuned via experimentAnnotations field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains experiment annotations\n    experimentAnnotations:\n      context: chaos\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#host-file-volumes","title":"Host File Volumes","text":"

    It allows the developer to specify the host file volumes to the ChaosPod. It can be tuned via hostFileVolumes field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it contains host file volumes\n    hostFileVolumes:\n      - name: socket-file\n        mountPath: \"/run/containerd/containerd.sock\"\n        nodePath: \"/run/containerd/containerd.sock\"\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/configuration-specification/#host-pid","title":"Host PID","text":"

    It allows the developer to specify the host PID for the ChaosPod. It can be tuned via hostPID field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n    # it allows hostPID\n    hostPID: true\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/contents/","title":"Chaos Experiment Specifications","text":"

    Granular definition of chaos intent, specified via image, library, necessary permissions, and low-level chaos parameters (default values).

    This section describes the fields in the ChaosExperiment and the possible values that can be set against the same.

    Field Name Description User Guide Scope Specification It defines scope of the chaosexperiment Scope Specifications Component Specification It defines component details of the chaosexperiment Component Specifications Experiment Tunables Specification It defines tunables of the chaosexperiment Experiment Tunables Specification Configuration Specification It defines configuration details of the chaosexperiment Configuration Specification"},{"location":"experiments/concepts/chaos-resources/chaos-experiment/experiment-tunable-specification/","title":"Experiment Tunables Specification","text":"

    It contains the array of tunables passed to the experiment pods as environment variables. It is used to manage the experiment execution. We can set the default values for all the variables (tunables) here, which can be overridden from the ChaosEngine via .spec.experiments[].spec.components.env if required. To know about the variables that need to be overridden, check the list of \"mandatory\" & \"optional\" env for an experiment as provided within the respective experiment documentation. It can be provided at spec.definition.env inside the chaosexperiment.

    View the experiment tunables specification

    Field .spec.definition.env Description Flag to specify env used for ChaosExperiment Type Mandatory Range user-defined (type: {name: string, value: string}) Default n/a Notes The .spec.definition.env specifies the array of tunables passed to the experiment pods as environment variables. It is used to manage the experiment execution. We can set the default values for all the variables (tunables) here, which can be overridden from the ChaosEngine via .spec.experiments[].spec.components.env if required. To know about the variables that need to be overridden, check the list of \"mandatory\" & \"optional\" env for an experiment as provided within the respective experiment documentation.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    # permissions for the chaosexperiment\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    # it contains experiment tunables\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
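    As a sketch of the override path mentioned above, the same tunables can be adjusted per-run from the ChaosEngine without editing the experiment CR (the values below are illustrative):

    # chaosengine snippet overriding the defaults set at .spec.definition.env of the chaosexperiment\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      components:\n        env:\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n        - name: CHAOS_INTERVAL\n          value: '10'\n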
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/scope-specification/","title":"Scope Specification","text":"

    It contains the scope and permissions details provided at spec.definition.scope and spec.definition.permissions, respectively, inside the chaosexperiment.

    View the scope specification schema

    Field .spec.definition.scope Description Flag to specify the scope of the ChaosExperiment Type Optional Range Namespaced, Cluster Default n/a (depends on experiment type) Notes The .spec.definition.scope specifies the scope of the experiment. It can be Namespaced for pod-level experiments and Cluster for experiments having a cluster-wide impact.

    Field .spec.definition.permissions Description Flag to specify the minimum permissions required to run the ChaosExperiment Type Optional Range user-defined (type: list) Default n/a Notes The .spec.definition.permissions specifies the minimum permissions that are required to run the ChaosExperiment. It also helps to estimate the blast radius for the ChaosExperiment.

    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/scope-specification/#experiment-scope","title":"Experiment Scope","text":"

    It specifies the scope of the experiment. It can be Namespaced for pod-level experiments and Cluster for experiments having a cluster-wide impact. It can be tuned via the scope field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    # scope of the chaosexperiment\n    scope: Namespaced\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
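
    Conversely, experiments with cluster-wide impact (node-level experiments, for instance) would declare the Cluster scope instead. A minimal illustrative fragment:

    spec:\n  definition:\n    # cluster-wide blast radius, e.g. for node-level experiments\n    scope: Cluster\n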
    "},{"location":"experiments/concepts/chaos-resources/chaos-experiment/scope-specification/#experiment-permissions","title":"Experiment Permissions","text":"

    It specifies the minimum permissions required to run the ChaosExperiment and also helps to estimate the blast radius for the ChaosExperiment. It can be tuned via the permissions field.

    Use the following example to tune this:

    apiVersion: litmuschaos.io/v1alpha1\ndescription:\n  message: |\n    Deletes a pod belonging to a deployment/statefulset/daemonset\nkind: ChaosExperiment\nmetadata:\n  name: pod-delete\n  labels:\n    name: pod-delete\n    app.kubernetes.io/part-of: litmus\n    app.kubernetes.io/component: chaosexperiment\n    app.kubernetes.io/version: latest\nspec:\n  definition:\n    scope: Namespaced\n    # permissions for the chaosexperiment\n    permissions:\n      - apiGroups:\n          - \"\"\n          - \"apps\"\n          - \"apps.openshift.io\"\n          - \"argoproj.io\"\n          - \"batch\"\n          - \"litmuschaos.io\"\n        resources:\n          - \"deployments\"\n          - \"jobs\"\n          - \"pods\"\n          - \"pods/log\"\n          - \"replicationcontrollers\"\n          - \"deployments\"\n          - \"statefulsets\"\n          - \"daemonsets\"\n          - \"replicasets\"\n          - \"deploymentconfigs\"\n          - \"rollouts\"\n          - \"pods/exec\"\n          - \"events\"\n          - \"chaosengines\"\n          - \"chaosexperiments\"\n          - \"chaosresults\"\n        verbs:\n          - \"create\"\n          - \"list\"\n          - \"get\"\n          - \"patch\"\n          - \"update\"\n          - \"delete\"\n          - \"deletecollection\"\n    image: \"litmuschaos/go-runner:latest\"\n    imagePullPolicy: Always\n    args:\n    - -c\n    - ./experiments -name pod-delete\n    command:\n    - /bin/bash\n    env:\n\n    - name: TOTAL_CHAOS_DURATION\n      value: '15'\n\n    - name: RAMP_TIME\n      value: ''\n\n    - name: FORCE\n      value: 'true'\n\n    - name: CHAOS_INTERVAL\n      value: '5'\n\n    - name: PODS_AFFECTED_PERC\n      value: ''\n\n    - name: LIB\n      value: 'litmus'    \n\n    - name: TARGET_PODS\n      value: ''\n\n    ## it defines the sequence of chaos execution for multiple target pods\n    ## supported values: serial, parallel\n    - name: SEQUENCE\n      value: 'parallel'\n\n    labels:\n      name: pod-delete\n      app.kubernetes.io/part-of: litmus\n      app.kubernetes.io/component: experiment-job\n      app.kubernetes.io/version: latest\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-result/contents/","title":"Chaos Result Specifications","text":"

    Holds the engine reference, experiment state, verdict (on completion), salient application/result attributes, and sources for metrics collection.

    This section describes the fields in the ChaosResult and the possible values that can be set against the same.

    Field Name Description User Guide Spec Specification It defines the spec details of the chaosresult Spec Specification Status Specification It defines the status details of the chaosresult Status Specification Probe Specification It defines the probe details of the chaosresult Probe Specification"},{"location":"experiments/concepts/chaos-resources/chaos-result/probe-specification/","title":"Probe Status","text":"

    It contains the probe details provided at status.probeStatus inside the chaosresult. It contains the following fields:

    • name: Flag to show the name of the probe used in the experiment
    • type: Flag to show the type of the probe used
    • status.continuous: Flag to show the result of the probe in Continuous mode
    • status.prechaos: Flag to show the result of the probe in the pre-chaos check
    • status.postchaos: Flag to show the result of the probe in the post-chaos check
    View the probe schema

    Field .status.probestatus.name Description Flag to show the name of the probe used in the experiment Range n/a (type: string) Notes The .status.probestatus.name shows the name of the probe used in the experiment.

    Field .status.probestatus.type Description Flag to show the type of probe used Range HTTPProbe,K8sProbe,CmdProbe(type:string) Notes The .status.probestatus.type shows the type of probe used.

    Field .status.probestatus.status.continuous Description Flag to show the result of probe in continuous mode Range Awaited,Passed,Better Luck Next Time (type: string) Notes The .status.probestatus.status.continuous helps to get the result of the probe in the continuous mode. The httpProbe is better used in the Continuous mode.

    Field .status.probestatus.status.postchaos Description Flag to show the probe result post chaos Range Awaited,Passed,Better Luck Next Time (type:map[string]string) Notes The .status.probestatus.status.postchaos shows the result of probe setup in EOT mode executed at the End of Test as a post-chaos check.

    Field .status.probestatus.status.prechaos Description Flag to show the probe result pre chaos Range Awaited,Passed,Better Luck Next Time (type:string) Notes The .status.probestatus.status.prechaos shows the result of probe setup in SOT mode executed at the Start of Test as a pre-chaos check.
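
    The probe statuses shown below are part of the ChaosResult resource and can be fetched with standard kubectl commands, for instance (resource name and namespace taken from the sample output that follows):

    kubectl describe chaosresult engine-nginx-pod-delete -n default\n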

    View the sample example:

    Name:         engine-nginx-pod-delete\nNamespace:    default\nLabels:       app.kubernetes.io/component=experiment-job\n              app.kubernetes.io/part-of=litmus\n              app.kubernetes.io/version=1.13.8\n              chaosUID=aa0a0084-f20f-4294-a879-d6df9aba6f9b\n              controller-uid=6943c955-0154-4542-8745-de991eb47c61\n              job-name=pod-delete-w4p5op\n              name=engine-nginx-pod-delete\nAnnotations:  <none>\nAPI Version:  litmuschaos.io/v1alpha1\nKind:         ChaosResult\nMetadata:\n  Creation Timestamp:  2021-09-29T13:28:59Z\n  Generation:          6\n  Resource Version:    66788\n  Self Link:           /apis/litmuschaos.io/v1alpha1/namespaces/default/chaosresults/engine-nginx-pod-delete\n  UID:                 fe7f01c8-8118-4761-8ff9-0a87824d863f\nSpec:\n  Engine:      engine-nginx\n  Experiment:  pod-delete\nStatus:\n  Experiment Status:\n    Fail Step:                 N/A\n    Phase:                     Completed\n    Probe Success Percentage:  100\n    Verdict:                   Pass\n  History:\n    Failed Runs:   1\n    Passed Runs:   1\n    Stopped Runs:  0\n    Targets:\n      Chaos Status:  targeted\n      Kind:          deployment\n      Name:          hello\n  Probe Status:\n    # name of probe\n    Name:  check-frontend-access-url\n    # status of probe\n    Status:\n      Continuous:  Passed \ud83d\udc4d #Continuous\n    # type of probe\n    Type:          HTTPProbe\n    # name of probe\n    Name:          check-app-cluster-cr-status\n    # status of probe\n    Status:\n      Post Chaos:  Passed \ud83d\udc4d #EoT\n    # type of probe\n    Type:          K8sProbe\n    # name of probe\n    Name:          check-database-integrity\n    # status of probe\n    Status:\n      Post Chaos:  Passed \ud83d\udc4d #Edge\n      Pre Chaos:   Passed \ud83d\udc4d \n    # type of probe\n    Type:          CmdProbe\nEvents:              <none>\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-result/spec-specification/","title":"Spec Specification","text":"

    It contains the spec details provided at spec inside the chaosresult. The names of the chaosengine and chaosexperiment are present at spec.engine and spec.experiment, respectively.

    View the spec details schema

    Field .spec.engine Description Flag to hold the ChaosEngine name for the experiment Range n/a (type: string) Notes The .spec.engine holds the engine name for the current course of the experiment.

    Field .spec.experiment Description Flag to hold the ChaosExperiment name which induces chaos. Range n/a (type: string) Notes The .spec.experiment holds the ChaosExperiment name for the current course of the experiment.

    View the sample chaosresult:

    Name:         engine-nginx-pod-delete\nNamespace:    default\nLabels:       app.kubernetes.io/component=experiment-job\n              app.kubernetes.io/part-of=litmus\n              app.kubernetes.io/version=1.13.8\n              chaosUID=aa0a0084-f20f-4294-a879-d6df9aba6f9b\n              controller-uid=6943c955-0154-4542-8745-de991eb47c61\n              job-name=pod-delete-w4p5op\n              name=engine-nginx-pod-delete\nAnnotations:  <none>\nAPI Version:  litmuschaos.io/v1alpha1\nKind:         ChaosResult\nMetadata:\n  Creation Timestamp:  2021-09-29T13:28:59Z\n  Generation:          6\n  Resource Version:    66788\n  Self Link:           /apis/litmuschaos.io/v1alpha1/namespaces/default/chaosresults/engine-nginx-pod-delete\n  UID:                 fe7f01c8-8118-4761-8ff9-0a87824d863f\nSpec:\n  # name of the chaosengine\n  Engine:      engine-nginx\n  # name of the chaosexperiment\n  Experiment:  pod-delete\nStatus:\n  Experiment Status:\n    Fail Step:                 N/A\n    Phase:                     Completed\n    Probe Success Percentage:  100\n    Verdict:                   Pass\n  History:\n    Failed Runs:   1\n    Passed Runs:   1\n    Stopped Runs:  0\n    Targets:\n      Chaos Status:  targeted\n      Kind:          deployment\n      Name:          hello\nEvents:              <none>\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-result/status-specification/","title":"Status Specification","text":"

    It contains status details provided at status inside chaosresult.

    "},{"location":"experiments/concepts/chaos-resources/chaos-result/status-specification/#experiment-status","title":"Experiment Status","text":"

    It contains the experiment status provided at status.experimentStatus inside the chaosresult. It contains the following fields:

    • failStep: Flag to show the failure step of the ChaosExperiment
    • phase: Flag to show the current phase of the experiment
    • probesuccesspercentage: Flag to show the probe success percentage
    • verdict: Flag to show the verdict of the experiment
    View the experiment status

    Field .status.experimentStatus.failstep Description Flag to show the failure step of the ChaosExperiment Range n/a (type: string) Notes The .status.experimentStatus.failstep shows the step at which the experiment failed. It helps in faster debugging of failures in the experiment execution.

    Field .status.experimentStatus.phase Description Flag to show the current phase of the experiment Range Awaited,Running,Completed,Aborted (type: string) Notes The .status.experimentStatus.phase shows the current phase the experiment is in. It gets updated as the experiment proceeds. If the experiment is aborted then the status will be Aborted.

    Field .status.experimentStatus.probesuccesspercentage Description Flag to show the probe success percentage Range 1 to 100 (type: int) Notes The .status.experimentStatus.probesuccesspercentage shows the probe success percentage, which is the ratio of successful checks to total probes.

    Field .status.experimentStatus.verdict Description Flag to show the verdict of the experiment. Range Awaited,Pass,Fail,Stopped (type: string) Notes The .status.experimentStatus.verdict shows the verdict of the experiment. It is Awaited when the experiment is in progress and ends up with Pass or Fail according to the experiment result.

    View the sample example:

    Name:         engine-nginx-pod-delete\nNamespace:    default\nLabels:       app.kubernetes.io/component=experiment-job\n              app.kubernetes.io/part-of=litmus\n              app.kubernetes.io/version=1.13.8\n              chaosUID=aa0a0084-f20f-4294-a879-d6df9aba6f9b\n              controller-uid=6943c955-0154-4542-8745-de991eb47c61\n              job-name=pod-delete-w4p5op\n              name=engine-nginx-pod-delete\nAnnotations:  <none>\nAPI Version:  litmuschaos.io/v1alpha1\nKind:         ChaosResult\nMetadata:\n  Creation Timestamp:  2021-09-29T13:28:59Z\n  Generation:          6\n  Resource Version:    66788\n  Self Link:           /apis/litmuschaos.io/v1alpha1/namespaces/default/chaosresults/engine-nginx-pod-delete\n  UID:                 fe7f01c8-8118-4761-8ff9-0a87824d863f\nSpec:\n  Engine:      engine-nginx\n  Experiment:  pod-delete\nStatus:\n  Experiment Status:\n    # step on which experiment fails\n    Fail Step:                 N/A\n    # phase of the chaos result\n    Phase:                     Completed\n    # Success Percentage of the litmus probes\n    Probe Success Percentage:  100\n    # Verdict of the chaos result\n    Verdict:                   Pass\n  History:\n    Failed Runs:   1\n    Passed Runs:   1\n    Stopped Runs:  0\n    Targets:\n      Chaos Status:  targeted\n      Kind:          deployment\n      Name:          hello\nEvents:              <none>\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-result/status-specification/#result-history","title":"Result History","text":"

    It contains the history of experiment runs, present at status.history. It contains the following fields:

    • passedRuns: It contains the cumulative passed run count
    • failedRuns: It contains the cumulative failed run count
    • stoppedRuns: It contains the cumulative stopped run count
    • targets.name: It contains the name of the target application
    • targets.kind: It contains the kind of the target application
    • targets.chaosStatus: It contains the chaos status
    View the history details

    Field .status.history.passedRuns Description It contains cumulative passed run count Range ANY NON NEGATIVE INTEGER Notes The .status.history.passedRuns contains cumulative passed run counts for a specific ChaosResult.

    Field .status.history.failedRuns Description It contains cumulative failed run count Range ANY NON NEGATIVE INTEGER Notes The .status.history.failedRuns contains cumulative failed run counts for a specific ChaosResult.

    Field .status.history.stoppedRuns Description It contains cumulative stopped run count Range ANY NON NEGATIVE INTEGER Notes The .status.history.stoppedRuns contains cumulative stopped run counts for a specific ChaosResult.

    Field .status.history.targets.name Description It contains the name of the target application Range string Notes The .status.history.targets.name contains the name of the target application.

    Field .status.history.targets.kind Description It contains the kind of the target application Range string Notes The .status.history.targets.kind contains the kind of the target application.

    Field .status.history.targets.chaosStatus Description It contains the status of the chaos Range targeted, injected, reverted Notes The .status.history.targets.chaosStatus contains the status of the chaos.

    View the sample example:

    Name:         engine-nginx-pod-delete\nNamespace:    default\nLabels:       app.kubernetes.io/component=experiment-job\n              app.kubernetes.io/part-of=litmus\n              app.kubernetes.io/version=1.13.8\n              chaosUID=aa0a0084-f20f-4294-a879-d6df9aba6f9b\n              controller-uid=6943c955-0154-4542-8745-de991eb47c61\n              job-name=pod-delete-w4p5op\n              name=engine-nginx-pod-delete\nAnnotations:  <none>\nAPI Version:  litmuschaos.io/v1alpha1\nKind:         ChaosResult\nMetadata:\n  Creation Timestamp:  2021-09-29T13:28:59Z\n  Generation:          6\n  Resource Version:    66788\n  Self Link:           /apis/litmuschaos.io/v1alpha1/namespaces/default/chaosresults/engine-nginx-pod-delete\n  UID:                 fe7f01c8-8118-4761-8ff9-0a87824d863f\nSpec:\n  Engine:      engine-nginx\n  Experiment:  pod-delete\nStatus:\n  Experiment Status:\n    Fail Step:                 N/A\n    Phase:                     Completed\n    Probe Success Percentage:  100\n    Verdict:                   Pass\n  History:\n    # fail experiment run count\n    Failed Runs:   1\n    # passed experiment run count\n    Passed Runs:   1\n    # stopped experiment run count\n    Stopped Runs:  0\n    Targets:\n      # status of the chaos\n      Chaos Status:  targeted\n      # kind of the application\n      Kind:          deployment\n      # name of the application\n      Name:          hello\nEvents:              <none>\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/contents/","title":"Chaos Scheduler Specifications","text":"

    Holds attributes for repeated execution (run now, once at a specified timestamp, or between start and end timestamps at a given interval). Embeds the ChaosEngine as a template.

    This section describes the fields in the ChaosScheduler and the possible values that can be set against the same.

    Parameter Description User Guide Schedule Once Schedule chaos once, at a specified time or immediately Schedule Once Repeat Schedule Schedule chaos in repeat mode Repeat Schedule Schedule State Defines the state of the schedule Schedule State Engine Specifications Defines the chaosengine specifications Engine Specifications"},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/engine-specification/","title":"Engine Specification","text":"

    It embeds the ChaosEngine as a template inside the schedule CR, which contains the chaosexperiment and target application details.

    View the engine details

    Field .spec.engineTemplateSpec Description Flag to control the chaosengine to be formed Type Mandatory Range n/a Default n/a Notes The engineTemplateSpec is the ChaosEngineSpec of the ChaosEngine that is to be formed.

    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/engine-specification/#engine-specification_1","title":"Engine Specification","text":"

    Specify the chaosengine details at spec.engineTemplateSpec inside the schedule CR.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      properties:\n         #format should be like \"10m\" or \"2h\" accordingly for minutes or hours\n        minChaosInterval: \"2m\"  \n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-once/","title":"Schedule Once","text":"

    It schedules the chaos once, either at the specified time or immediately after creation of the schedule CR.

    View the schedule once schema"},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-once/#schedule-now","title":"Schedule NOW","text":"

    Field .spec.schedule.now Description Flag to control the type of scheduling Type Mandatory Range true, false Default n/a Notes The now in the spec.schedule ensures immediate creation of chaosengine, i.e., injection of chaos."},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-once/#schedule-once_1","title":"Schedule Once","text":"

    Field .spec.schedule.once.executionTime Description Flag to specify execution timestamp at which chaos is injected, when the policy is once. The chaosengine is created exactly at this timestamp. Type Mandatory Range user-defined (type: UTC Timeformat) Default n/a Notes .spec.schedule.once refers to a single-instance execution of chaos at a particular timestamp specified by .spec.schedule.once.executionTime

    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-once/#immediate-chaos","title":"Immediate Chaos","text":"

    It schedules the chaos immediately after creation of the chaos-schedule CR. It can be tuned by setting spec.schedule.now to true.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    now: true\n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-once/#chaos-at-a-specified-timestamp","title":"Chaos at a Specified TimeStamp","text":"

    It schedules the chaos once at the specified time. It can be tuned by setting spec.schedule.once.executionTime. The execution time should be in the UTC timezone.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    once:\n      #should be modified according to current UTC Time\n      executionTime: \"2020-05-12T05:47:00Z\"   \n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/","title":"Repeat Schedule","text":"

    It schedules the chaos in repeat mode. There are various ways we can set up this type of schedule by varying the fields inside spec.schedule.repeat.

    Note - Only one field, i.e. minChaosInterval, is mandatory. All other fields are optional and depend entirely on the desired behaviour.

    View the schedule repeat schema

    Field .spec.schedule.repeat.timeRange.startTime Description Flag to specify the start timestamp of the range within which chaos is injected, when the policy is repeat. The chaosengine is not created before this timestamp. Type Optional Range user-defined (type: UTC Timeformat) Default n/a Notes When startTime is specified against the policy repeat, ChaosEngine will not be formed before this time, no matter when it was created.

    Field .spec.schedule.repeat.timeRange.endTime Description Flag to specify the end timestamp of the range within which chaos is injected, when the policy is repeat. The chaosengine is not created after this timestamp. Type Optional Range user-defined (type: UTC Timeformat) Default n/a Notes When endTime is specified against the policy repeat, ChaosEngine will not be formed after this time.

    Field .spec.schedule.repeat.properties.minChaosInterval.hour.everyNthHour Description Flag to specify the hours between each successive schedule Type Mandatory Range integer Default n/a Notes The minChaosInterval.hour.everyNthHour in the spec specifies the time interval in hours between each schedule

    Field .spec.schedule.repeat.properties.minChaosInterval.hour.minuteOfTheHour Description Flag to specify the minute of the hour for each successive schedule Type Optional Range integer Default 0 Notes The minChaosInterval.hour.minuteOfTheHour in the spec specifies the minute of the hour at which each schedule runs

    Field .spec.schedule.repeat.properties.minChaosInterval.minute.everyNthMinute Description Flag to specify the minutes between each successive schedule Type Mandatory Range integer Default n/a Notes The minChaosInterval.minute.everyNthMinute in the spec specifies the time interval in minutes between each schedule

    Field .spec.schedule.repeat.workDays.includedDays Description Flag to specify the days on which chaos is allowed to take place Type Optional Range user-defined (type: string)(pattern: [{day_name},{day_name}...]). Default n/a Notes The includedDays in the spec specifies a (comma-separated) list of days of the week on which chaos is allowed to take place. {day_name} is to be specified with the first 3 letters of the name of the day, such as Mon, Tue etc.

    Field .spec.schedule.repeat.workHours.includedHours Description Flag to specify the hours during which chaos is allowed to take place Type Optional Range {hour_number} will range from 0 to 23 (type: string)(pattern: {hour_number}-{hour_number}). Default n/a Notes The includedHours in the spec specifies a range of hours of the day during which chaos is allowed to take place. The 24-hour format is followed"},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#basic-schema-to-execute-repeat-strategy","title":"Basic Schema to Execute Repeat Strategy","text":"

    This will keep executing the schedule and creating engines for an indefinite amount of time.

    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#schedule-chaosengine-at-every-nth-minute","title":"Schedule ChaosEngine at every nth minute","text":"
    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      properties:\n        minChaosInterval:\n          # schedule the chaos at every 5 minutes\n          minute:\n            everyNthMinute: 5  \n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#schedule-chaosengine-at-every-nth-hour","title":"Schedule ChaosEngine at every nth hour","text":"
    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      properties:\n        minChaosInterval:\n          # schedule the chaos every hour at 0th minute\n          hour:\n            everyNthHour: 1\n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#schedule-chaosengine-at-nth-minute-of-every-nth-hour","title":"Schedule ChaosEngine at nth minute of every nth hour","text":"
    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      properties:\n        minChaosInterval:\n          # schedule the chaos every hour at 30th minute\n          hour:\n            everyNthHour: 1\n            minuteOfTheHour: 30\n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#specifying-time-range-for-the-chaos-schedule","title":"Specifying Time Range for the Chaos Schedule","text":"

    This constrains the schedule to start and end according to the specified time range.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      timeRange:\n        #should be modified according to current UTC Time\n        startTime: \"2020-05-12T05:47:00Z\"   \n        endTime: \"2020-09-13T02:58:00Z\"   \n      properties:\n        minChaosInterval:\n          minute:\n            everyNthMinute: 5  \n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#specifying-just-the-end-time","title":"Specifying Just the End Time","text":"

    Assumes the custom resource creation timestamp as the StartTime

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      timeRange:\n        #should be modified according to current UTC Time\n        endTime: \"2020-09-13T02:58:00Z\"   \n      properties:\n        minChaosInterval:\n          minute:\n            everyNthMinute: 5  \n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#specifying-just-the-starttime","title":"Specifying Just the StartTime","text":"

    Executes chaos indefinitely (until the ChaosSchedule CR is removed) starting from the specified timestamp

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      timeRange:\n        #should be modified according to current UTC Time\n        startTime: \"2020-05-12T05:47:00Z\"   \n      properties:\n        minChaosInterval:\n          minute:\n            everyNthMinute: 5  \n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    auxiliaryAppInfo: ''\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#specifying-work-hours","title":"Specifying Work Hours","text":"

    This ensures chaos execution within the specified hours of the day, every day.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      properties:\n        minChaosInterval:\n          minute:\n            everyNthMinute: 5   \n      workHours:\n        # format should be <starting-hour-number>-<ending-hour-number>(inclusive)\n        includedHours: 0-12\n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    # It can be true/false\n    annotationCheck: 'true'\n    #ex. values: ns1:name=percona,ns2:run=nginx\n    auxiliaryAppInfo: ''\n    chaosServiceAccount: pod-delete-sa\n    # It can be delete/retain\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/schedule-repeat/#specifying-work-days","title":"Specifying work days","text":"

    This executes chaos on specified days of the week, with the specified minimum interval.

    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    repeat:\n      properties:\n        minChaosInterval:\n          minute:\n            everyNthMinute: 5  \n      workDays:\n        includedDays: \"Mon,Tue,Wed,Sat,Sun\"\n  engineTemplateSpec:\n    engineState: 'active'\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    annotationCheck: 'true'\n    auxiliaryAppInfo: ''\n    chaosServiceAccount: pod-delete-sa\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/state/","title":"Halt/Resume ChaosSchedule","text":"

    Chaos Schedules can be halted or resumed as per need. This can be tuned by setting spec.scheduleState to halt or active, respectively.

    View the state schema

    Field .spec.scheduleState Description Flag to control the chaosschedule state Type Optional Range active, halt, complete Default active Notes The scheduleState is the current state of the ChaosSchedule. If the schedule is running its state will be active, if the schedule is halted its state will be halt, and if the schedule is completed its state will be complete.

    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/state/#halt-the-schedule","title":"Halt The Schedule","text":"

    Follow the steps below to halt an active schedule:

    • Edit the ChaosSchedule CR in your favourite editor
      kubectl edit chaosschedule schedule-nginx\n
    • Change the spec.scheduleState to halt
      spec:\n  scheduleState: halt\n
    "},{"location":"experiments/concepts/chaos-resources/chaos-scheduler/state/#resume-the-schedule","title":"Resume The Schedule","text":"

    Follow the steps below to resume a halted schedule (a non-interactive kubectl patch alternative is shown after these steps):

    • Edit the chaosschedule
      kubectl edit chaosschedule schedule-nginx\n
    • Change the spec.scheduleState to active
      spec:\n  scheduleState: active\n
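
    Alternatively, as a sketch of a non-interactive approach, both operations can be performed with kubectl patch (schedule name as per the examples above):

    # halt the schedule\nkubectl patch chaosschedule schedule-nginx --type merge -p '{\"spec\":{\"scheduleState\":\"halt\"}}'\n# resume the schedule\nkubectl patch chaosschedule schedule-nginx --type merge -p '{\"spec\":{\"scheduleState\":\"active\"}}'\n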
    "},{"location":"experiments/concepts/chaos-resources/probes/cmdProbe/","title":"Command Probe","text":"

    The command probe allows developers to run shell commands and match the resulting output as part of the entry/exit criteria. The intent behind this probe was to allow users to implement a non-standard & imperative way of expressing their hypothesis. For example, the cmdProbe enables you to check for specific data within a database, parse the value out of a JSON blob being dumped into a certain path, or check for the existence of a particular string in the service logs. It can be executed by setting type as cmdProbe inside .spec.experiments[].spec.probe.

    View the command probe schema

    Field .name Description Flag to hold the name of the probe Type Mandatory Range n/a (type: string) Notes The .name holds the name of the probe. It can be set based on the use case

    Field .type Description Flag to hold the type of the probe Type Mandatory Range httpProbe, k8sProbe, cmdProbe, promProbe Notes The .type supports four types of probes. It can be one of httpProbe, k8sProbe, cmdProbe, promProbe

    Field .mode Description Flag to hold the mode of the probe Type Mandatory Range SOT, EOT, Edge, Continuous, OnChaos Notes The .mode supports five modes of probes. It can be one of SOT, EOT, Edge, Continuous, OnChaos

    Field .cmdProbe/inputs.command Description Flag to hold the command for the cmdProbe Type Mandatory Range n/a {type: string} Notes The .cmdProbe/inputs.command contains the shell command, which should be run as part of cmdProbe

    Field .cmdProbe/inputs.source Description Flag to hold the source for the cmdProbe Type Optional Range It contains the source attributes, i.e., image, imagePullPolicy Notes The .cmdProbe/inputs.source supports an inline mode, where the command is run within the experiment pod; this is selected by omitting the source field. Otherwise, provide the source details (i.e., image), which are used to launch an external pod where the command execution is carried out.

    Field .cmdProbe/inputs.comparator.type Description Flag to hold the type of the data used for comparison Type Mandatory Range string, int, float Notes The .cmdProbe/inputs.comparator.type contains the type of data, which should be compared as part of the comparison operation

    Field .cmdProbe/inputs.comparator.criteria Description Flag to hold the criteria for the comparison Type Mandatory Range It supports {>=, <=, ==, >, <, !=, oneOf, between} for int & float types, and {equal, notEqual, contains, matches, notMatches, oneOf} for the string type. Notes The .cmdProbe/inputs.comparator.criteria contains the criteria of the comparison, which should be fulfilled as part of the comparison operation.

    Field .cmdProbe/inputs.comparator.value Description Flag to hold the value for the comparison Type Mandatory Range n/a {type: string} Notes The .cmdProbe/inputs.comparator.value contains the value for the comparison, which should follow the given criteria as part of the comparison operation.

    Field .runProperties.probeTimeout Description Flag to hold the timeout for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.probeTimeout represents the time limit for the probe to execute the specified check and return the expected data

    Field .runProperties.retry Description Flag to hold the retry count for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.retry contains the number of times a check is re-run upon failure in the first attempt before declaring the probe status as failed.

    Field .runProperties.interval Description Flag to hold the interval for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.interval contains the interval for which the probe waits between subsequent retries

    Field .runProperties.probePollingInterval Description Flag to hold the polling interval for the probes (applicable for Continuous mode only) Type Optional Range n/a {type: integer} Notes The .runProperties.probePollingInterval contains the time interval for which a continuous probe should sleep after each iteration

    Field .runProperties.initialDelaySeconds Description Flag to hold the initial delay interval for the probes Type Optional Range n/a {type: integer} Notes The .runProperties.initialDelaySeconds represents the initial waiting time interval for the probes.

    Field .runProperties.stopOnFailure Description Flag to stop or continue the experiment on probe failure Type Optional Range true, false {type: boolean} Notes The .runProperties.stopOnFailure can be set to true/false to stop or continue the experiment execution after a probe fails. The default value is false
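
    As an illustration, a Continuous-mode probe could combine these optional properties as follows (a sketch; the values are indicative only):

    runProperties:\n  probeTimeout: 5\n  interval: 2\n  retry: 1\n  # Continuous mode only: sleep between successive probe iterations\n  probePollingInterval: 2\n  # abort the experiment run as soon as this probe fails\n  stopOnFailure: true\n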

    "},{"location":"experiments/concepts/chaos-resources/probes/cmdProbe/#common-probe-tunables","title":"Common Probe Tunables","text":"

    Refer to the common attributes to tune the common tunables for all the probes.

    "},{"location":"experiments/concepts/chaos-resources/probes/cmdProbe/#inline-mode","title":"Inline Mode","text":"

    In inline mode, the command probe is executed from within the experiment pod. It is preferred for simple shell commands. It is the default mode and is selected by omitting the source field.

    Use the following example to tune this:

    # execute the command inside the experiment pod itself\n# cases where the command doesn't need any extra binaries that are not available in the litmuschaos/go-runner image\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-database-integrity\"\n        type: \"cmdProbe\"\n        cmdProbe/inputs:\n          # command which needs to run in cmdProbe\n          command: \"<command>\"\n          comparator:\n            # output type for the above command\n            # supports: string, int, float\n            type: \"string\"\n            # criteria which should be followed by the actual output and the expected output\n            # supports [>=, <=, >, <, ==, !=] for int and float\n            # supports [contains, equal, notEqual, matches, notMatches] for string values\n            criteria: \"contains\"\n            # expected value, which should follow the specified criteria\n            value: \"<value-for-criteria-match>\"\n        mode: \"Edge\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n          initialDelaySeconds: 5\n
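
    As a concrete (purely illustrative) instance of the placeholders above, a probe that checks a status file for a marker string might use the following fragment (the file path and marker string are hypothetical):

    cmdProbe/inputs:\n  # hypothetical command; assumes this file exists in the experiment pod\n  command: \"cat /tmp/status.json\"\n  comparator:\n    type: \"string\"\n    criteria: \"contains\"\n    value: \"healthy\"\n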
    "},{"location":"experiments/concepts/chaos-resources/probes/cmdProbe/#source-mode","title":"Source Mode","text":"

    In source mode, the command execution is carried out from within a new pod whose image can be specified. It can be used when application-specific binaries are required.

    View the source probe schema

    Field .image Description Flag to hold the image of the source pod Type Mandatory Range n/a (type: string) Notes The .image holds the image of the source pod

    Field .hostNetwork Description Flag to enable the hostNetwork for the source pod Type Optional Range (type: boolean) Notes The .hostNetwork flag enables the hostNetwork. It supports boolean values and the default value is false

    Field .args Description Flag to hold the args for the source pod Type Optional Range (type: []string) Notes The .args flag holds the args for the source pod

    Field .env Description Flag to hold the envs for the source pod Type Optional Range (type: []corev1.EnvVar) Notes The .env flag holds the envs for the source pod

    Field .labels Description Flag to hold the labels for the source pod Type Optional Range (type: map[string]string) Notes The .labels flag holds the labels for the source pod

    Field .annotations Description Flag to hold the annotations for the source pod Type Optional Range (type: map[string]string) Notes The .annotations flag holds the annotations for the source pod

    Field .command Description Flag to hold the command for the source pod Type Optional Range (type: []string) Notes The .command flag holds the command for the source pod

    Field .imagePullPolicy Description Flag to set the imagePullPolicy for the source pod Type Optional Range (type: corev1.PullPolicy) Notes The .imagePullPolicy flag sets the imagePullPolicy for the source pod

    Field .privileged Description Flag to set privileged mode for the source pod Type Optional Range (type: boolean) Notes The .privileged flag sets privileged mode for the source pod. The default value is false

    Field .nodeSelector Description Flag to hold the node selectors for the probe pod Type Optional Range (type: map[string]string) Notes The .nodeSelector flag holds the node selectors for the probe pod

    Field .tolerations Description Flag to hold the tolerations for the probe pod Type Optional Range (type: []corev1.Toleration) Notes The .tolerations flag holds the tolerations for the probe pod

    Field .volumes Description Flag to hold the volumes for the source pod Type Optional Range (type: []corev1.Volume) Notes The .volumes flag holds the volumes for the source pod

    Field .volumeMount Description Flag to hold the volume mounts for the source pod Type Optional Range (type: []corev1.VolumeMount) Notes The .volumeMount flag holds the volume mounts for the source pod

    Field .imagePullSecrets Description Flag to set the imagePullSecrets for the source pod Type Optional Range (type: []corev1.LocalObjectReference) Notes The .imagePullSecrets flag sets the imagePullSecrets for the source pod

    Use the following example to tune this:

    # it launches the external pod with the source image and runs the command inside the same pod\n# cases where the command needs extra binaries that are not available in the litmuschaos/go-runner image\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-database-integrity\"\n        type: \"cmdProbe\"\n        cmdProbe/inputs:\n          # command which needs to run in cmdProbe\n          command: \"<command>\"\n          comparator:\n            # output type for the above command\n            # supports: string, int, float\n            type: \"string\"\n            # criteria which should be followed by the actual output and the expected output\n            # supports [>=, <=, >, <, ==, !=, oneOf, between] for int and float\n            # supports [contains, equal, notEqual, matches, notMatches, oneOf] for string values\n            criteria: \"contains\"\n            # expected value, which should follow the specified criteria\n            value: \"<value-for-criteria-match>\"\n          # source for the cmdProbe\n          source:\n            image: \"<source-image>\"\n            imagePullPolicy: Always\n            privileged: true\n            hostNetwork: false\n        mode: \"Edge\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n          initialDelaySeconds: 5\n
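
    A fuller source block exercising more of the optional fields listed above might look like the following sketch (all values are illustrative placeholders):

    source:\n  image: \"<source-image>\"\n  imagePullPolicy: IfNotPresent\n  command:\n    - \"/bin/sh\"\n  args:\n    - \"-c\"\n    - \"<command>\"\n  env:\n    - name: DB_HOST\n      value: \"<db-host>\"\n  labels:\n    name: cmd-probe\n  nodeSelector:\n    kubernetes.io/os: linux\n  tolerations:\n    - key: \"dedicated\"\n      operator: \"Exists\"\n      effect: \"NoSchedule\"\n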
    "},{"location":"experiments/concepts/chaos-resources/probes/contents/","title":"Probes Specifications","text":"

    Litmus probes are pluggable checks that can be defined within the ChaosEngine for any chaos experiment. The experiment pods execute these checks based on the mode they are defined in & factor their success as necessary conditions in determining the verdict of the experiment (along with the standard \u201cin-built\u201d checks).

    Probe Name Description User Guide Command Probe It defines the command probes Command Probe HTTP Probe It defines the http probes HTTP Probe K8S Probe It defines the k8s probes K8S Probe Prometheus Probe It defines the prometheus probes Prometheus Probe Probe Chaining It chains the litmus probes Probe Chaining"},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/","title":"HTTP Probe","text":"

    The http probe allows developers to specify a URL which the experiment uses to gauge health/service availability (or other custom conditions) as part of the entry/exit criteria. The received status code is mapped against an expected status. It supports http Get and Post methods. It can be executed by setting type as httpProbe inside .spec.experiments[].spec.probe.

    View the http probe schema

    Field .name Description Flag to hold the name of the probe Type Mandatory Range n/a (type: string) Notes The .name holds the name of the probe. It can be set based on the use case

    Field .type Description Flag to hold the type of the probe Type Mandatory Range httpProbe, k8sProbe, cmdProbe, promProbe Notes The .type supports four types of probes. It can be one of httpProbe, k8sProbe, cmdProbe, promProbe

    Field .mode Description Flag to hold the mode of the probe Type Mandatory Range SOT, EOT, Edge, Continuous, OnChaos Notes The .mode supports five modes of probes. It can be one of SOT, EOT, Edge, Continuous, OnChaos

    Field .httpProbe/inputs.url Description Flag to hold the URL for the httpProbe Type Mandatory Range n/a {type: string} Notes The .httpProbe/inputs.url contains the URL which the experiment uses to gauge health/service availability (or other custom conditions) as part of the entry/exit criteria.

    Field .httpProbe/inputs.insecureSkipVerify Description Flag to skip certificate checks for the httpProbe Type Optional Range true, false Notes The .httpProbe/inputs.insecureSkipVerify contains the flag to skip certificate checks.

    Field .httpProbe/inputs.responseTimeout Description Flag to hold the response timeout for the httpProbe Type Optional Range n/a {type: integer} Notes The .httpProbe/inputs.responseTimeout contains the response timeout for the http Get/Post request.

    Field .httpProbe/inputs.method.get.criteria Description Flag to hold the criteria for the http get request Type Mandatory Range ==, !=, oneOf Notes The .httpProbe/inputs.method.get.criteria contains the criteria to match the http get request's response code with the expected responseCode, which needs to be fulfilled as part of the httpProbe run

    Field .httpProbe/inputs.method.get.responseCode Description Flag to hold the expected response code for the get request Type Mandatory Range HTTP_RESPONSE_CODE Notes The .httpProbe/inputs.method.get.responseCode contains the expected response code for the http get request as part of httpProbe run

    Field .httpProbe/inputs.method.post.contentType Description Flag to hold the content type of the post request Type Mandatory Range n/a {type: string} Notes The .httpProbe/inputs.method.post.contentType contains the content type of the http body data, which needs to be passed for the http post request

    Field .httpProbe/inputs.method.post.body Description Flag to hold the body of the http post request Type Mandatory Range n/a {type: string} Notes The .httpProbe/inputs.method.post.body contains the http body, which is required for the http post request. It is used for the simple http body. If the http body is complex then use .httpProbe/inputs.method.post.bodyPath field.

    Field .httpProbe/inputs.method.post.bodyPath Description Flag to hold the path of the http body, required for the http post request Type Optional Range n/a {type: string} Notes The .httpProbe/inputs.method.post.bodyPath field is used in case of a complex POST request in which the body spans multiple lines; the bodyPath attribute can be used to provide the path to a file consisting of the same. This file can be made available to the experiment pod via a ConfigMap resource, with the ConfigMap name being defined in the ChaosEngine OR the ChaosExperiment CR.

Field .httpProbe/inputs.method.post.criteria Description Flag to hold the criteria for the http post request Type Mandatory Range ==, !=, oneOf Notes The .httpProbe/inputs.method.post.criteria contains the criteria to match the http post request's response code against the expected responseCode, which needs to be fulfilled as part of the httpProbe run

Field .httpProbe/inputs.method.post.responseCode Description Flag to hold the expected response code for the post request Type Mandatory Range HTTP_RESPONSE_CODE Notes The .httpProbe/inputs.method.post.responseCode contains the expected response code for the http post request as part of the httpProbe run

    Field .runProperties.probeTimeout Description Flag to hold the timeout for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.probeTimeout represents the time limit for the probe to execute the specified check and return the expected data

    Field .runProperties.retry Description Flag to hold the retry count for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.retry contains the number of times a check is re-run upon failure in the first attempt before declaring the probe status as failed.

Field .runProperties.interval Description Flag to hold the interval for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.interval contains the interval for which the probe waits between subsequent retries

Field .runProperties.probePollingInterval Description Flag to hold the polling interval for the probes (applicable for Continuous mode only) Type Optional Range n/a {type: integer} Notes The .runProperties.probePollingInterval contains the time interval for which a continuous probe should sleep after each iteration

    Field .runProperties.initialDelaySeconds Description Flag to hold the initial delay interval for the probes Type Optional Range n/a {type: integer} Notes The .runProperties.initialDelaySeconds represents the initial waiting time interval for the probes.

Field .runProperties.stopOnFailure Description Flag to stop or continue the experiment on probe failure Type Optional Range false {type: boolean} Notes The .runProperties.stopOnFailure can be set to true/false to stop or continue the experiment execution after a probe failure

    "},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/#common-probe-tunables","title":"Common Probe Tunables","text":"

Refer to the common attributes to tune the common tunables for all the probes.

    "},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/#http-get-request","title":"HTTP Get Request","text":"

The HTTP Get method sends an http GET request to the provided URL and matches the response code based on the given criteria (==, !=, oneOf). It can be executed by setting the httpProbe/inputs.method.get field.

    Use the following example to tune this:

# contains the http probe with the get method to verify the response code\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          method:\n            # call http get method and verify the response code\n            get: \n              # criteria which should be matched\n              criteria: == # ==, !=, oneOf\n              # expected response code for the http request, which should follow the specified criteria\n              responseCode: \"<response code>\"\n        mode: \"Continuous\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n
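If the endpoint may legitimately return any one of several status codes, the oneOf criteria can be used in place of ==. A minimal sketch, assuming the bracketed list syntax shown later for the generic comparator's oneOf criteria also applies to responseCode; the codes listed are placeholders:

httpProbe/inputs:\n  url: \"<url>\"\n  method:\n    get: \n      # accept any one of the listed response codes (list syntax assumed from the generic comparator)\n      criteria: oneOf\n      responseCode: \"[200,201,204]\"\n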
    "},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/#http-post-requesthttp-body-is-a-simple","title":"HTTP Post Request(http body is a simple)","text":"

It contains the http body, which is required for the http post request. It is used for a simple http body. The http body can be provided in the body field. It can be executed by setting the httpProbe/inputs.method.post.body field.

    Use the following example to tune this:

# contains the http probe with the post method to verify the response code\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          method:\n            # call http post method and verify the response code\n            post: \n              # value of the http body, used for the post request\n              body: \"<http-body>\"\n              # http body content type\n              contentType: \"application/json; charset=UTF-8\"\n              # criteria which should be matched\n              criteria: \"==\" # ==, !=, oneOf\n              # expected response code for the http request, which should follow the specified criteria\n              responseCode: \"200\"\n        mode: \"Continuous\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n
    "},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/#http-post-requesthttp-body-is-a-complex","title":"HTTP Post Request(http body is a complex)","text":"

In the case of a complex POST request in which the body spans multiple lines, the bodyPath attribute can be used to provide the path to a file containing the body. This file can be made available to the experiment pod via a ConfigMap resource, with the ConfigMap name being defined in the ChaosEngine OR the ChaosExperiment CR. It can be executed by setting the httpProbe/inputs.method.post.bodyPath field.

NOTE: It is mutually exclusive with the body field. If body is set, the body field is used for the post request; otherwise, the bodyPath field is used.

    Use the following example to tune this:

# contains the http probe with the post method to verify the response code\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          method:\n            # call http post method and verify the response code\n            post: \n              # the configMap should be mounted to the experiment which contains http body\n              # use the mounted path here\n              bodyPath: \"/mnt/body.yml\"\n              # http body content type\n              contentType: \"application/json; charset=UTF-8\"\n              # criteria which should be matched\n              criteria: \"==\" # ==, !=, oneOf\n              # expected response code for the http request, which should follow the specified criteria\n              responseCode: \"200\"\n        mode: \"Continuous\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n
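For reference, a minimal sketch of providing the file behind bodyPath, assuming a hypothetical ConfigMap named http-body-cm and the ChaosEngine experiment components support for ConfigMap mounts:

# hypothetical ConfigMap carrying the http body\napiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: http-body-cm\n  namespace: default\ndata:\n  body.yml: |\n    {\"key\": \"value\"}\n---\n# fragment of the ChaosEngine spec: mount the ConfigMap into the experiment pod\nexperiments:\n- name: pod-delete\n  spec:\n    components:\n      configMaps:\n      - name: http-body-cm\n        mountPath: /mnt\n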
    "},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/#response-timout","title":"Response Timout","text":"

It provides the response timeout for the http Get/Post request. It can be tuned via the .httpProbe/inputs.responseTimeout field. It is an optional field and its unit is milliseconds.

    Use the following example to tune this:

# defines the response timeout for the http probe\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          # timeout for the http requests\n          responseTimeout: 100 #in ms\n          method:\n            get: \n              criteria: == # ==, !=, oneOf\n              responseCode: \"<response code>\"\n        mode: \"Continuous\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n
    "},{"location":"experiments/concepts/chaos-resources/probes/httpProbe/#skip-certification-check","title":"Skip Certification Check","text":"

It contains the flag to skip certificate checks. It can be tuned via the .httpProbe/inputs.insecureSkipVerify field. It supports boolean values; set it to true to skip the certificate checks. Its default value is false.

    Use the following example to tune this:

    # skip the certificate checks for the httpProbe\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          # skip certificate checks for the httpProbe\n          # supports: true, false. default: false\n          insecureSkipVerify: \"true\"\n          method:\n            get: \n              criteria: == \n              responseCode: \"<response code>\"\n        mode: \"Continuous\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n
    "},{"location":"experiments/concepts/chaos-resources/probes/k8sProbe/","title":"K8S Probe","text":"

    With the proliferation of custom resources & operators, especially in the case of stateful applications, the steady-state is manifested as status parameters/flags within Kubernetes resources. k8sProbe addresses verification of the desired resource state by allowing users to define the Kubernetes GVR (group-version-resource) with appropriate filters (field selectors/label selectors). The experiment makes use of the Kubernetes Dynamic Client to achieve this. It supports CRUD operations which can be defined at probe.k8sProbe/inputs.operation. It can be executed by setting type as k8sProbe inside .spec.experiments[].spec.probe.

    View the k8s probe schema

Field .name Description Flag to hold the name of the probe Type Mandatory Range n/a (type: string) Notes The .name holds the name of the probe. It can be set based on the use case

Field .type Description Flag to hold the type of the probe Type Mandatory Range httpProbe, k8sProbe, cmdProbe, promProbe Notes The .type supports four types of probes. It can be one of httpProbe, k8sProbe, cmdProbe, promProbe

Field .mode Description Flag to hold the mode of the probe Type Mandatory Range SOT, EOT, Edge, Continuous, OnChaos Notes The .mode supports five modes of probes. It can be one of SOT, EOT, Edge, Continuous, OnChaos

Field .k8sProbe/inputs.group Description Flag to hold the group of the kubernetes resource for the k8sProbe Type Mandatory Range n/a {type: string} Notes The .k8sProbe/inputs.group contains the group of the kubernetes resource on which the k8sProbe performs the specified operation

Field .k8sProbe/inputs.version Description Flag to hold the apiVersion of the kubernetes resource for the k8sProbe Type Mandatory Range n/a {type: string} Notes The .k8sProbe/inputs.version contains the apiVersion of the kubernetes resource on which the k8sProbe performs the specified operation

    Field .k8sProbe/inputs.resource Description Flag to hold the kubernetes resource name for the k8sProbe Type Mandatory Range n/a {type: string} Notes The .k8sProbe/inputs.resource contains the kubernetes resource name on which k8sProbe performs the specified operation

Field .k8sProbe/inputs.namespace Description Flag to hold the namespace of the kubernetes resource for the k8sProbe Type Mandatory Range n/a {type: string} Notes The .k8sProbe/inputs.namespace contains the namespace of the kubernetes resource on which the k8sProbe performs the specified operation

Field .k8sProbe/inputs.fieldSelector Description Flag to hold the fieldSelectors of the kubernetes resource for the k8sProbe Type Optional Range n/a {type: string} Notes The .k8sProbe/inputs.fieldSelector contains the fieldSelector to derive the kubernetes resource on which the k8sProbe performs the specified operation

Field .k8sProbe/inputs.labelSelector Description Flag to hold the labelSelectors of the kubernetes resource for the k8sProbe Type Optional Range n/a {type: string} Notes The .k8sProbe/inputs.labelSelector contains the labelSelector to derive the kubernetes resource on which the k8sProbe performs the specified operation

Field .k8sProbe/inputs.operation Description Flag to hold the operation type for the k8sProbe Type Mandatory Range create, delete, present, absent Notes The .k8sProbe/inputs.operation contains the operation which should be applied on the kubernetes resource as part of the k8sProbe. It supports four types of operations. It can be one of create, delete, present, absent.

    Field .runProperties.probeTimeout Description Flag to hold the timeout for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.probeTimeout represents the time limit for the probe to execute the specified check and return the expected data

    Field .runProperties.retry Description Flag to hold the retry count for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.retry contains the number of times a check is re-run upon failure in the first attempt before declaring the probe status as failed.

Field .runProperties.interval Description Flag to hold the interval for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.interval contains the interval for which the probe waits between subsequent retries

Field .runProperties.probePollingInterval Description Flag to hold the polling interval for the probes (applicable for Continuous mode only) Type Optional Range n/a {type: integer} Notes The .runProperties.probePollingInterval contains the time interval for which a continuous probe should sleep after each iteration

    Field .runProperties.initialDelaySeconds Description Flag to hold the initial delay interval for the probes Type Optional Range n/a {type: integer} Notes The .runProperties.initialDelaySeconds represents the initial waiting time interval for the probes.

Field .runProperties.stopOnFailure Description Flag to stop or continue the experiment on probe failure Type Optional Range false {type: boolean} Notes The .runProperties.stopOnFailure can be set to true/false to stop or continue the experiment execution after a probe failure

    "},{"location":"experiments/concepts/chaos-resources/probes/k8sProbe/#common-probe-tunables","title":"Common Probe Tunables","text":"

Refer to the common attributes to tune the common tunables for all the probes.

    "},{"location":"experiments/concepts/chaos-resources/probes/k8sProbe/#create-operation","title":"Create Operation","text":"

It creates a kubernetes resource based on the data provided inside the probe.data field. It can be defined by setting the operation to create.

    Use the following example to tune this:

    # create the given resource provided inside data field\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"create-percona-pvc\"\n        type: \"k8sProbe\"\n        k8sProbe/inputs:\n          # group of the resource\n          group: \"\"\n          # version of the resource\n          version: \"v1\"\n          # name of the resource\n          resource: \"persistentvolumeclaims\"\n          # namespace where the instance of resource should be created\n          namespace: \"default\"\n          # type of operation\n          # supports: create, delete, present, absent\n          operation: \"create\"\n        mode: \"SOT\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n        # contains manifest, which can be used to create the resource\n        data: |\n          kind: PersistentVolumeClaim\n          apiVersion: v1\n          metadata:\n            name: percona-mysql-claim\n            labels:\n              openebs.io/target-affinity: percona\n          spec:\n            storageClassName: standard\n            accessModes:\n            - ReadWriteOnce\n            resources:\n              requests:\n                storage: 100Mi\n
    "},{"location":"experiments/concepts/chaos-resources/probes/k8sProbe/#delete-operation","title":"Delete Operation","text":"

It deletes matching kubernetes resources via the GVR and filters (field selectors/label selectors) provided at probe.k8sProbe/inputs. It can be defined by setting the operation to delete.

    Use the following example to tune this:

# delete the resource matched with the given inputs\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"delete-percona-pvc\"\n        type: \"k8sProbe\"\n        k8sProbe/inputs:\n          # group of the resource\n          group: \"\"\n          # version of the resource\n          version: \"v1\"\n          # name of the resource\n          resource: \"persistentvolumeclaims\"\n          # namespace of the instance, which needs to be deleted\n          namespace: \"default\"\n          # label selectors for the k8s resource, which needs to be deleted\n          labelSelector: \"openebs.io/target-affinity=percona\"\n          # field selector for the k8s resource, which needs to be deleted\n          fieldSelector: \"\"\n          # type of operation\n          # supports: create, delete, present, absent\n          operation: \"delete\"\n        mode: \"EOT\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n
    "},{"location":"experiments/concepts/chaos-resources/probes/k8sProbe/#present-operation","title":"Present Operation","text":"

It checks for the presence of a kubernetes resource based on the GVR and filters (field selectors/label selectors) provided at probe.k8sProbe/inputs. It can be defined by setting the operation to present.

    Use the following example to tune this:

# verify the existence of the resource matched with the given inputs inside the cluster\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-percona-pvc-presence\"\n        type: \"k8sProbe\"\n        k8sProbe/inputs:\n          # group of the resource\n          group: \"\"\n          # version of the resource\n          version: \"v1\"\n          # name of the resource\n          resource: \"persistentvolumeclaims\"\n          # namespace of the resource instance\n          namespace: \"default\"\n          # label selectors for the k8s resource\n          labelSelector: \"openebs.io/target-affinity=percona\"\n          # field selector for the k8s resource\n          fieldSelector: \"\"\n          # type of operation\n          # supports: create, delete, present, absent\n          operation: \"present\"\n        mode: \"SOT\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n
    "},{"location":"experiments/concepts/chaos-resources/probes/k8sProbe/#absent-operation","title":"Absent Operation","text":"

It checks for the absence of a kubernetes resource based on the GVR and filters (field selectors/label selectors) provided at probe.k8sProbe/inputs. It can be defined by setting the operation to absent.

    Use the following example to tune this:

# verify that no resource matching the given inputs is present in the cluster\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-percona-pvc-absence\"\n        type: \"k8sProbe\"\n        k8sProbe/inputs:\n          # group of the resource\n          group: \"\"\n          # version of the resource\n          version: \"v1\"\n          # name of the resource\n          resource: \"persistentvolumeclaims\"\n          # namespace of the resource instance\n          namespace: \"default\"\n          # label selectors for the k8s resource\n          labelSelector: \"openebs.io/target-affinity=percona\"\n          # field selector for the k8s resource\n          fieldSelector: \"\"\n          # type of operation\n          # supports: create, delete, present, absent\n          operation: \"absent\"\n        mode: \"EOT\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/","title":"Introduction","text":"

    Litmus probes are pluggable checks that can be defined within the ChaosEngine for any chaos experiment. The experiment pods execute these checks based on the mode they are defined in & factor their success as necessary conditions in determining the verdict of the experiment (along with the standard \u201cin-built\u201d checks). It can be provided at .spec.experiments[].spec.probe inside chaosengine. It supports four types: cmdProbe, k8sProbe, httpProbe, and promProbe.

    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#probe-modes","title":"Probe Modes","text":"

The probes can be set up to run in five different modes, which can be tuned via the mode field.

    • SOT: Executed at the Start of the Test as a pre-chaos check
    • EOT: Executed at the End of the Test as a post-chaos check
    • Edge: Executed both, before and after the chaos
    • Continuous: The probe is executed continuously, with a specified polling interval during the chaos injection.
• OnChaos: The probe is executed continuously, with a specified polling interval, strictly for the duration of chaos

    Use the following example to tune this:

    # contains the common attributes or run properties\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          insecureSkipVerify: false\n          responseTimeout: <value>\n          method:\n            get: \n              criteria: ==\n              responseCode: \"<response code>\"\n        # modes for the probes\n        # supports: [SOT, EOT, Edge, Continuous, OnChaos]\n        mode: \"Continuous\"\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#run-properties","title":"Run Properties","text":"

All probes share some common attributes, which can be tuned via the runProperties field.

    • probeTimeout: Represents the time limit for the probe to execute the check specified and return the expected data.
    • retry: The number of times a check is re-run upon failure in the first attempt before declaring the probe status as failed.
    • interval: The period between subsequent retries
• probePollingInterval: The time interval for which continuous/onchaos probes should sleep after each iteration.

    Use the following example to tune this:

    # contains the common attributes or run properties\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          insecureSkipVerify: false\n          responseTimeout: <value>\n          method:\n            get: \n              criteria: ==\n              responseCode: \"<response code>\"\n        mode: \"Continuous\"\n        # contains runProperties for the probes\n        runProperties:\n          # time limit for the probe to execute the specified check\n          probeTimeout: 5 #in seconds\n          # the time period between subsequent retries\n          interval: 2 #in seconds\n          # number of times a check is re-run upon failure before declaring the probe status as failed\n          retry: 1\n          #time interval for which continuous probe should wait after each iteration\n          # applicable for onChaos and Continuous probes\n          probePollingInterval: 2\n
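As a rough worked example of how these values combine, assuming the retries run sequentially: with probeTimeout: 5, interval: 2 and retry: 1, a consistently failing check would take up to probeTimeout + interval + probeTimeout = 5 + 2 + 5 = 12 seconds before the probe is declared failed.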
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#initial-delay-seconds","title":"Initial Delay Seconds","text":"

It represents the initial waiting time interval for the probes. It can be tuned via the initialDelaySeconds field.

    Use the following example to tune this:

# contains the initial delay seconds for the probes\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          insecureSkipVerify: false\n          responseTimeout: <value>\n          method:\n            get: \n              criteria: ==\n              responseCode: \"<response code>\"\n        mode: \"Continuous\"\n        # contains runProperties for the probes\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n          # initial waiting time interval for the probes\n          initialDelaySeconds: 30 #in seconds\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#stopcontinue-experiment-on-probe-failure","title":"Stop/Continue Experiment On Probe Failure","text":"

It can be set to true/false to stop or continue the experiment execution after the probe fails. It can be tuned via the stopOnFailure field. It supports boolean values. The default value is false.

    Use the following example to tune this:

    # contains the flag to stop/continue experiment based on the specified flag\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-frontend-access-url\"\n        type: \"httpProbe\"\n        httpProbe/inputs:\n          url: \"<url>\"\n          insecureSkipVerify: false\n          responseTimeout: <value>\n          method:\n            get: \n              criteria: ==\n              responseCode: \"<response code>\"\n        mode: \"Continuous\"\n        # contains runProperties for the probes\n        runProperties:\n          probeTimeout: 5 \n          interval: 2 \n          retry: 1\n          probePollingInterval: 2\n          #it can be set to true/false to stop or continue the experiment execution after probe fails\n          # supports: true, false. default: false\n          stopOnFailure: true\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#comparator","title":"Comparator","text":"

The comparator is used to validate the SLO based on the probe's actual and expected values for the specified criteria.

    View the comparator's supported fields

Field .type Description Flag to hold the type of the probe's output Type Mandatory Range {int, float, string} (type: string) Notes The .type holds the type of the probe's output

Field .criteria Description Flag to hold the criteria, which should be followed by the actual and expected probe outputs Type Mandatory Range Float & Int type: {>, <, >=, <=, ==, !=, oneOf, between}, String type: {equal, notEqual, contains, matches, notMatches, oneOf} Notes The .criteria holds the criteria, which should be followed by the actual and expected probe outputs

Field .value Description Flag to hold the probe's expected value, which should follow the specified criteria Type Mandatory Range value can be of int, float, string, slice type Notes The .value holds the probe's expected value, which should follow the specified criteria

    Use the following example to tune this:

apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-database-integrity\"\n        type: \"cmdProbe\"\n        cmdProbe/inputs:\n          command: \"<command>\"\n          comparator:\n            # output type for the above command\n            # supports: string, int, float\n            type: \"string\"\n            # criteria which should be followed by the actual output and the expected output\n            # supports [>=, <=, >, <, ==, !=, oneOf, between] for int and float\n            # supports [contains, equal, notEqual, matches, notMatches, oneOf] for string values\n            criteria: \"contains\"\n            # expected value, which should follow the specified criteria\n            value: \"<value-for-criteria-match>\"\n          source:\n            image: \"<source-image>\"\n        mode: \"Edge\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n          initialDelaySeconds: 5\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#arithmetic-criteria","title":"Arithmetic criteria:","text":"

It is used to compare the numeric values (int, float) for arithmetic comparisons. It consists of the >, <, >=, <=, ==, and != criteria

    comparator:\n  type: int\n  criteria: \">\" \n  value: \"20\"\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#oneof-criteria","title":"OneOf criteria:","text":"

It is used to compare numeric or string values, checking whether the actual value lies in the expected slice. The expected values consist of int/float/string values

    comparator:\n  type: int\n  criteria: \"oneOf\"\n  value: \"[400,404,405]\"\n
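A string variant of the same criteria looks similar; a sketch, assuming the bracketed list syntax also applies to string slices (the values listed are placeholders):

comparator:\n  type: string\n  criteria: \"oneOf\"\n  value: \"[Running,Succeeded]\"\n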
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#between-criteria","title":"Between criteria:","text":"

It is used to compare the numeric (int, float) values, checking whether the actual value lies between the given lower and upper bound range [a,b]

    comparator:\n  type: int\n  criteria: \"between\"\n  value: \"[1000,5000]\"\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#equal-and-notequal-criteria","title":"Equal and NotEqual criteria:","text":"

It is used to compare the string values; it checks whether the actual value is equal/notEqual to the expected value or not

    comparator:\n  type: string\n  criteria: \"equal\" #equal or notEqual\n  value: \"<string value>\"\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#contains-criteria","title":"Contains criteria:","text":"

It is used to compare the string values; it checks whether the expected value is a substring of the actual value or not

    comparator:\n  type: string\n  criteria: \"contains\" \n  value: \"<string value>\"\n
    "},{"location":"experiments/concepts/chaos-resources/probes/litmus-probes/#matches-and-notmatches-criteria","title":"Matches and NotMatches criteria:","text":"

It is used to compare the string values; it checks whether the actual value matches/notMatches the regex (provided as the expected value) or not

    comparator:\n  type: string\n  criteria: \"matches\" #matches or notMatches\n  value: \"<regex>\"\n
    "},{"location":"experiments/concepts/chaos-resources/probes/probe-chaining/","title":"Probe Chaining","text":"

Probe chaining enables reuse of a probe's result (represented by the template function {{ .<probeName>.probeArtifact.Register}}) in subsequent "downstream" probes defined in the ChaosEngine. Note: The order of execution of probes in the experiment depends purely on the order in which they are defined in the ChaosEngine.

    Use the following example to tune this:

# chaining enables reuse of a probe's result (represented by the template function {{ <probeName>.probeArtifact.Register}}) \n#-- in subsequent \"downstream\" probes defined in the ChaosEngine.\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"probe1\"\n        type: \"cmdProbe\"\n        cmdProbe/inputs:\n          command: \"<command>\"\n          comparator:\n            type: \"string\"\n            criteria: \"equal\"\n            value: \"<value-for-criteria-match>\"\n          source: \"inline\"\n        mode: \"SOT\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n      - name: \"probe2\"\n        type: \"cmdProbe\"\n        cmdProbe/inputs:\n          ## probe1's result being used as one of the args in probe2\n          command: \"<command> {{ .probe1.ProbeArtifacts.Register }} <arg2>\"\n          comparator:\n            type: \"string\"\n            criteria: \"equal\"\n            value: \"<value-for-criteria-match>\"\n          source: \"inline\"\n        mode: \"SOT\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n
    "},{"location":"experiments/concepts/chaos-resources/probes/promProbe/","title":"Prometheus Probe","text":"

    The prometheus probe allows users to run Prometheus queries and match the resulting output against specific conditions. The intent behind this probe is to allow users to define metrics-based SLOs in a declarative way and determine the experiment verdict based on its success. The probe runs the query on a Prometheus server defined by the endpoint, and checks whether the output satisfies the specified criteria. It can be executed by setting type as promProbe inside .spec.experiments[].spec.probe.

    View the prometheus probe schema

Field .name Description Flag to hold the name of the probe Type Mandatory Range n/a (type: string) Notes The .name holds the name of the probe. It can be set based on the use case

Field .type Description Flag to hold the type of the probe Type Mandatory Range httpProbe, k8sProbe, cmdProbe, promProbe Notes The .type supports four types of probes. It can be one of httpProbe, k8sProbe, cmdProbe, promProbe

Field .mode Description Flag to hold the mode of the probe Type Mandatory Range SOT, EOT, Edge, Continuous, OnChaos Notes The .mode supports five modes of probes. It can be one of SOT, EOT, Edge, Continuous, OnChaos

Field .promProbe/inputs.endpoint Description Flag to hold the prometheus endpoint for the promProbe Type Mandatory Range n/a {type: string} Notes The .promProbe/inputs.endpoint contains the prometheus endpoint

Field .promProbe/inputs.query Description Flag to hold the promql query for the promProbe Type Mandatory Range n/a {type: string} Notes The .promProbe/inputs.query contains the promql query to extract the desired prometheus metrics by running it on the given prometheus endpoint

Field .promProbe/inputs.queryPath Description Flag to hold the path of the promql query for the promProbe Type Optional Range n/a {type: string} Notes The .promProbe/inputs.queryPath is used in the case of complex queries that span multiple lines; the queryPath attribute provides the path to a file containing the query. This file can be made available to the experiment pod via a ConfigMap resource, with the ConfigMap name being defined in the ChaosEngine OR the ChaosExperiment CR.

Field .promProbe/inputs.comparator.criteria Description Flag to hold the criteria for the comparison Type Mandatory Range it supports {>=, <=, ==, >, <, !=, oneOf, between} criteria Notes The .promProbe/inputs.comparator.criteria contains the criteria of the comparison, which should be fulfilled as part of the comparison operation.

Field .promProbe/inputs.comparator.value Description Flag to hold the value for the comparison Type Mandatory Range n/a {type: string} Notes The .promProbe/inputs.comparator.value contains the value of the comparison, which should follow the given criteria as part of the comparison operation.

    Field .runProperties.probeTimeout Description Flag to hold the timeout for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.probeTimeout represents the time limit for the probe to execute the specified check and return the expected data

    Field .runProperties.retry Description Flag to hold the retry count for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.retry contains the number of times a check is re-run upon failure in the first attempt before declaring the probe status as failed.

Field .runProperties.interval Description Flag to hold the interval for the probes Type Mandatory Range n/a {type: integer} Notes The .runProperties.interval contains the interval for which the probe waits between subsequent retries

Field .runProperties.probePollingInterval Description Flag to hold the polling interval for the probes (applicable for Continuous mode only) Type Optional Range n/a {type: integer} Notes The .runProperties.probePollingInterval contains the time interval for which a continuous probe should sleep after each iteration

    Field .runProperties.initialDelaySeconds Description Flag to hold the initial delay interval for the probes Type Optional Range n/a {type: integer} Notes The .runProperties.initialDelaySeconds represents the initial waiting time interval for the probes.

Field .runProperties.stopOnFailure Description Flag to stop or continue the experiment on probe failure Type Optional Range false {type: boolean} Notes The .runProperties.stopOnFailure can be set to true/false to stop or continue the experiment execution after a probe failure

    "},{"location":"experiments/concepts/chaos-resources/probes/promProbe/#common-probe-tunables","title":"Common Probe Tunables","text":"

Refer to the common attributes to tune the common tunables for all the probes.

    "},{"location":"experiments/concepts/chaos-resources/probes/promProbe/#prometheus-queryquery-is-a-simple","title":"Prometheus Query(query is a simple)","text":"

It contains the promql query to extract the desired prometheus metrics by running it on the given prometheus endpoint. The prometheus query can be provided in the query field. It can be executed by setting the .promProbe/inputs.query field.

    Use the following example to tune this:

# contains the prom probe which executes the query and matches the expected criteria\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-probe-success\"\n        type: \"promProbe\"\n        promProbe/inputs:\n          # endpoint for the prometheus service\n          endpoint: \"<prometheus-endpoint>\"\n          # promql query, which should be executed\n          query: \"<promql-query>\"\n          comparator:\n            # criteria which should be followed by the actual output and the expected output\n            # supports >=, <=, >, <, ==, != comparison\n            criteria: \"==\" \n            # expected value, which should follow the specified criteria\n            value: \"<value-for-criteria-match>\"\n        mode: \"Edge\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n
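As an illustration, a hypothetical throughput-style SLO could be expressed as follows; the endpoint, metric name, and threshold below are assumptions, not defaults:

promProbe/inputs:\n  endpoint: \"http://prometheus.monitoring.svc.cluster.local:9090\"\n  query: \"avg(rate(http_requests_total{job='nginx'}[1m]))\"\n  comparator:\n    # assert the average request rate stays at or above the threshold\n    criteria: \">=\"\n    value: \"100\"\n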
    "},{"location":"experiments/concepts/chaos-resources/probes/promProbe/#prometheus-queryquery-is-a-complex","title":"Prometheus Query(query is a complex","text":"

In the case of complex queries that span multiple lines, the queryPath attribute can be used to provide the path to a file containing the query. This file can be made available to the experiment pod via a ConfigMap resource, with the ConfigMap name being defined in the ChaosEngine OR the ChaosExperiment CR. It can be executed by setting the promProbe/inputs.queryPath field.

NOTE: It is mutually exclusive with the query field. If query is set, the query field is used; otherwise, the queryPath field is used.

    Use the following example to tune this:

# contains the prom probe which executes the query and matches the expected criteria\napiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  engineState: \"active\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: pod-delete-sa\n  experiments:\n  - name: pod-delete\n    spec:\n      probe:\n      - name: \"check-probe-success\"\n        type: \"promProbe\"\n        promProbe/inputs:\n          # endpoint for the prometheus service\n          endpoint: \"<prometheus-endpoint>\"\n          # the configMap should be mounted to the experiment which contains promql query\n          # use the mounted path here\n          queryPath: \"<path of the query>\"\n          comparator:\n            # criteria which should be followed by the actual output and the expected output\n            # supports >=, <=, >, <, ==, != comparison\n            criteria: \"==\" \n            # expected value, which should follow the specified criteria\n            value: \"<value-for-criteria-match>\"\n        mode: \"Edge\"\n        runProperties:\n          probeTimeout: 5\n          interval: 5\n          retry: 1\n
    "},{"location":"experiments/concepts/security/kyverno-policies/","title":"Kyverno Policies","text":"

Kyverno policies block configurations that don't match a policy (enforce mode) or generate policy violations for them (audit mode). Kyverno scans existing configurations and reports violations in the cluster. Litmus recommends using the provided policy configuration to enable the execution of all supported (out-of-the-box) experiments listed in the chaoshub. Having said that, this is recommendatory in nature and left to user discretion/choice depending upon the experiments desired.
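To illustrate the enforce/audit distinction, the mode is selected per policy via the validationFailureAction field; a minimal sketch:

spec:\n  # enforce: block non-conforming resources; audit: only report violations\n  validationFailureAction: audit\n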

The details listed here are expected to aid users of Kyverno. If you are using alternate means to enforce runtime security, such as native Kubernetes PSPs (pod security policies), refer to this section.

    "},{"location":"experiments/concepts/security/kyverno-policies/#policies-in-litmus","title":"Policies in Litmus","text":"

    Litmus recommends using the following policies:

1. Add Capabilities: It restricts added capabilities, except NET_ADMIN and SYS_ADMIN, for the pods that use the runtime API
2. Host Namespaces: It validates the following host namespaces for the pods that use the runtime API.
  1. HostPID: It allows hostPID. It should be set to true.
  2. HostIPC: It restricts the host IPC. It should be set to false.
  3. HostNetwork: It restricts the hostNetwork. It should be set to false.
3. Host Paths: It restricts hostPath, except the socket-path & container-path host paths, for the pods that use the runtime API. It allows hostPaths for service-kill experiments.
4. Privilege Escalation: It restricts privilege escalation, except for the pods that use the runtime API
5. Privilege Container: It restricts privileged containers, except for the pods that use the runtime API
6. User Groups: It allows user groups for all the experiment pods
    "},{"location":"experiments/concepts/security/kyverno-policies/#install-policies","title":"Install Policies","text":"

These Kyverno policies are based on the Kubernetes Pod Security Standards definitions. To apply all pod security policies (recommended), install Kyverno and kustomize, then run:

    kustomize build https://github.com/litmuschaos/chaos-charts/security/kyverno-policies | kubectl apply -f -\n
    "},{"location":"experiments/concepts/security/kyverno-policies/#pod-security-policies-in-restricted-setup","title":"Pod Security Policies in restricted setup","text":"

The setup may contain restricted policies which don't allow execution of litmus experiments by default. For example, the deny-privilege-escalation policy doesn't allow privilege escalation; it denies all pods the use of privilege escalation.

To allow the litmus pods to use privilege escalation, add the litmus serviceAccount or ClusterRole/Role inside the exclude block as:

    apiVersion: kyverno.io/v1\nkind: ClusterPolicy\nmetadata:\n  name: deny-privilege-escalation\n  annotations:\n    policies.kyverno.io/category: Pod Security Standards (Restricted)\n    policies.kyverno.io/severity: medium\n    policies.kyverno.io/subject: Pod\n    policies.kyverno.io/description: >-\n      Privilege escalation, such as via set-user-ID or set-group-ID file mode, should not be allowed.\n      This policy ensures the `allowPrivilegeEscalation` fields are either undefined\n      or set to `false`.      \nspec:\n  background: true\n  validationFailureAction: enforce\n  rules:\n  - name: deny-privilege-escalation\n    match:\n      resources:\n        kinds:\n        - Pod\n    exclude:\n      clusterRoles:\n      # add litmus cluster roles here\n      - litmus-admin\n      roles:\n      # add litmus roles here\n      - litmus-roles\n      subjects:\n      # add serviceAccount name here\n      - kind: ServiceAccount\n        name: pod-network-loss-sa\n    validate:\n      message: >-\n        Privilege escalation is disallowed. The fields\n        spec.containers[*].securityContext.allowPrivilegeEscalation, and\n        spec.initContainers[*].securityContext.allowPrivilegeEscalation must\n        be undefined or set to `false`.        \n      pattern:\n        spec:\n          =(initContainers):\n          - =(securityContext):\n              =(allowPrivilegeEscalation): \"false\"\n          containers:\n          - =(securityContext):\n              =(allowPrivilegeEscalation): \"false\"\n
    "},{"location":"experiments/concepts/security/openshift-scc/","title":"OpenShift Security Context Constraint (SCC)","text":"

Security context constraints allow administrators to control permissions for pods in a cluster. A service account provides an identity for processes that run in a Pod. Applications within a project usually run as the default service account. If you run other applications in the same project and don't want to override the privileges used for all of them, create a new service account that can be granted the special rights in the project where the application is to run. For example, install the litmus-admin service account:

    $ oc apply -f https://litmuschaos.github.io/litmus/litmus-admin-rbac.yaml\n\nserviceaccount/litmus-admin created\nclusterrole.rbac.authorization.k8s.io/litmus-admin created\nclusterrolebinding.rbac.authorization.k8s.io/litmus-admin created\n

The next step, which must be run as a cluster administrator, is granting the appropriate rights to the service account. This is done by specifying that the service account should run with a specific security context constraint (SCC).

    As an administrator, you can see the list of SCCs that are defined in the cluster by running the oc get scc command.

    $ oc get scc --as system:admin\n\nNAME               PRIV      CAPS      SELINUX     RUNASUSER          FSGROUP     SUPGROUP    PRIORITY   READONLYROOTFS   VOLUMES\nanyuid             false     []        MustRunAs   RunAsAny           RunAsAny    RunAsAny    10         false            [configMap downwardAPI emptyDir persistentVolumeClaim projected secret]\nhostaccess         false     []        MustRunAs   MustRunAsRange     MustRunAs   RunAsAny    <none>     false            [configMap downwardAPI emptyDir hostPath persistentVolumeClaim projected secret]\nhostmount-anyuid   false     []        MustRunAs   RunAsAny           RunAsAny    RunAsAny    <none>     false            [configMap downwardAPI emptyDir hostPath nfs persistentVolumeClaim projected secret]\nhostnetwork        false     []        MustRunAs   MustRunAsRange     MustRunAs   MustRunAs   <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim projected secret]\nnonroot            false     []        MustRunAs   MustRunAsNonRoot   RunAsAny    RunAsAny    <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim projected secret]\nprivileged         true      [*]       RunAsAny    RunAsAny           RunAsAny    RunAsAny    <none>     false            [*]\nrestricted         false     []        MustRunAs   MustRunAsRange     MustRunAs   RunAsAny    <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim projected secret]\n

By default, applications run under the restricted SCC. We can make use of the default SCC or create our own SCC to allow the litmus experiment service account (here litmus-admin) to run all the experiments. Here is one such SCC that can be used:

    litmus-scc.yaml

apiVersion: security.openshift.io/v1\nkind: SecurityContextConstraints\n# To mount the socket path directory in helper pod\nallowHostDirVolumePlugin: true\nallowHostIPC: false\nallowHostNetwork: false\n# To run fault injection on a target container using pid namespace.\n# It is used in stress, network, dns and http experiments. \nallowHostPID: true\nallowHostPorts: false\nallowPrivilegeEscalation: true\n# To run some privileged modules in dns, stress and network chaos\nallowPrivilegedContainer: true\n# NET_ADMIN & SYS_ADMIN: used in network chaos experiments to perform\n# network operations (running tc command in network ns of target container). \n# SYS_ADMIN: used in stress chaos experiment to perform cgroup operations.\nallowedCapabilities:\n- 'NET_ADMIN'\n- 'SYS_ADMIN'\ndefaultAddCapabilities: null\nfsGroup:\n  type: MustRunAs\ngroups: []\nmetadata:\n  name: litmus-scc\npriority: null\nreadOnlyRootFilesystem: false\nrequiredDropCapabilities: null\nrunAsUser:\n  type: RunAsAny\nseLinuxContext:\n  type: MustRunAs\nsupplementalGroups:\n  type: RunAsAny\nusers:\n- system:serviceaccount:litmus:argo\nvolumes:\n# To allow configMap mounts for uploaded scripts or envs.\n- configMap\n# To derive the experiment pod name in the experiment.\n- downwardAPI\n# used for chaos injection like io chaos.\n- emptyDir\n- hostPath\n- persistentVolumeClaim\n- projected\n# To authenticate with different cloud providers\n- secret\n

    Install the SCC

    $ oc create -f litmus-scc.yaml\nsecuritycontextconstraints.security.openshift.io/litmus-scc created\n

Now, to associate the new service account with the SCC, run the given command:

    $ oc adm policy add-scc-to-user litmus-scc -z litmus-admin --as system:admin -n litmus\nclusterrole.rbac.authorization.k8s.io/system:openshift:scc:litmus-scc added: \"litmus-admin\"\n

The -z option indicates that the command applies to the service account in the current project. The name of the SCC follows add-scc-to-user, and the namespace of the target service account is provided after -n.
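One way to double-check the grant, relying on the cluster role reported in the output above, is to inspect the generated RBAC objects; a sketch, assuming standard oc tooling:

$ oc get clusterrole system:openshift:scc:litmus-scc -o yaml\n$ oc get rolebinding -n litmus | grep litmus-scc\n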

    "},{"location":"experiments/concepts/security/psp/","title":"Using Pod Security Policies with Litmus","text":"

While working in environments (clusters) that have restrictive security policies, the default litmuschaos experiment execution procedure may be inhibited. This is mainly due to the fact that the experiment pods run the chaos injection tasks in privileged mode. This, in turn, is necessitated by the mounting of container runtime-specific socket files from the Kubernetes nodes in order to invoke runtime APIs. While this is not needed for all experiments (a considerable number of them use purely the K8s API), those involving injection of chaos processes into the network/process namespaces of other containers have this requirement (ex: netem, stress).

    The restrictive policies are often enforced via pod security policies (PSP) today, with organizations opting for the default \"restricted\" policy.

    "},{"location":"experiments/concepts/security/psp/#applying-pod-security-policies-to-litmus-chaos-pods","title":"Applying Pod Security Policies to Litmus Chaos Pods","text":"
• To run the litmus pods with the operating characteristics described above, first create a custom PodSecurityPolicy that allows the same:

apiVersion: policy/v1beta1\nkind: PodSecurityPolicy\nmetadata:\n  name: litmus\n  annotations:\n    seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'\nspec:\n  privileged: true\n  # Privilege escalation is required by some experiments.\n  allowPrivilegeEscalation: true\n  # Allow core volume types.\n  volumes:\n    # To mount script files/templates like ssm-docs in experiment\n    - 'configMap'\n    # Used for chaos injection like io chaos\n    - 'emptyDir'\n    - 'projected'\n    # To authenticate with different cloud providers\n    - 'secret'\n    # To derive the experiment pod name in the experiment\n    - 'downwardAPI'\n    # Assume that persistentVolumes set up by the cluster admin are safe to use.\n    - 'persistentVolumeClaim'\n    # To mount the socket path directory used to perform container runtime operations\n    - 'hostPath'\n  allowedHostPaths:\n    # substitute this path with an appropriate socket path\n    # ex: '/run/containerd/containerd.sock', '/run/crio/crio.sock'\n    - pathPrefix: \"/run/containerd/containerd.sock\"\n    # substitute this path with an appropriate container path\n    # ex: '/var/lib/docker/containers', '/var/lib/containerd/io.containerd.runtime.v1.linux/k8s.io', '/var/lib/containers/storage/overlay/'\n    - pathPrefix: \"/var/lib/containerd/io.containerd.runtime.v1.linux/k8s.io\"\n  allowedCapabilities:\n    # NET_ADMIN & SYS_ADMIN: used in network chaos experiments to perform\n    # network operations (running tc command in network ns of target container).\n    - \"NET_ADMIN\"\n    # SYS_ADMIN: used in stress chaos experiment to perform cgroup operations.\n    - \"SYS_ADMIN\"\n  hostNetwork: false\n  hostIPC: false\n  # To run fault injection on a target container using pid namespace.\n  # It is used in stress, network, dns and http experiments.\n  hostPID: true\n  seLinux:\n    # This policy assumes the nodes are using AppArmor rather than SELinux.\n    rule: 'RunAsAny'\n  supplementalGroups:\n    rule: 'MustRunAs'\n    ranges:\n    # Forbid adding the root group.\n    - min: 1\n      max: 65535\n  fsGroup:\n    rule: 'MustRunAs'\n    ranges:\n    # Forbid adding the root group.\n    - min: 1\n      max: 65535\n  readOnlyRootFilesystem: false\n

Note: This PodSecurityPolicy is a sample configuration which works for a majority of use cases. It is left to the user's discretion to modify it based on the environment. For example, if the experiment doesn't need the socket file to be mounted, allowedHostPaths can be excluded from the psp spec. On the other hand, in the case of the CRI-O runtime, network-chaos tests need the chaos pods executed in privileged mode. It is also possible that different PSP configs are used in different namespaces based on the ChaosExperiments installed/executed in them.

• Reference the created PSP in the experiment RBAC (or in the admin-mode RBAC, as applicable). For example, the pod-delete experiment RBAC instrumented with the PSP is shown below:

      ---\napiVersion: v1\nkind: ServiceAccount\nmetadata:\nname: pod-delete-sa\nnamespace: default\nlabels:\n    name: pod-delete-sa\n    app.kubernetes.io/part-of: litmus\n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: Role\nmetadata:\nname: pod-delete-sa\nnamespace: default\nlabels:\n    name: pod-delete-sa\n    app.kubernetes.io/part-of: litmus\nrules:\n- apiGroups: [\"\"]\nresources: [\"pods\",\"events\"]\nverbs: [\"create\",\"list\",\"get\",\"patch\",\"update\",\"delete\",\"deletecollection\"]\n- apiGroups: [\"\"]\nresources: [\"pods/exec\",\"pods/log\",\"replicationcontrollers\"]\nverbs: [\"create\",\"list\",\"get\"]\n- apiGroups: [\"batch\"]\nresources: [\"jobs\"]\nverbs: [\"create\",\"list\",\"get\",\"delete\",\"deletecollection\"]\n- apiGroups: [\"apps\"]\nresources: [\"deployments\",\"statefulsets\",\"daemonsets\",\"replicasets\"]\nverbs: [\"list\",\"get\"]\n- apiGroups: [\"apps.openshift.io\"]\nresources: [\"deploymentconfigs\"]\nverbs: [\"list\",\"get\"]\n- apiGroups: [\"argoproj.io\"]\nresources: [\"rollouts\"]\nverbs: [\"list\",\"get\"]\n- apiGroups: [\"litmuschaos.io\"]\nresources: [\"chaosengines\",\"chaosexperiments\",\"chaosresults\"]\nverbs: [\"create\",\"list\",\"get\",\"patch\",\"update\"]\n- apiGroups: [\"policy\"]\nresources: [\"podsecuritypolicies\"]\nverbs: [\"use\"]\nresourceNames: [\"litmus\"] \n---\napiVersion: rbac.authorization.k8s.io/v1\nkind: RoleBinding\nmetadata:\nname: pod-delete-sa\nnamespace: default\nlabels:\n    name: pod-delete-sa\n    app.kubernetes.io/part-of: litmus\nroleRef:\napiGroup: rbac.authorization.k8s.io\nkind: Role\nname: pod-delete-sa\nsubjects:\n- kind: ServiceAccount\nname: pod-delete-sa\nnamespace: default\n
    • Execute the ChaosEngine and verify that the litmus experiment pods are created successfully.

    "},{"location":"experiments/faq/ci-cd/","title":"CI/CD","text":""},{"location":"experiments/faq/ci-cd/#table-of-contents","title":"Table of Contents","text":"
    1. Is there any use case to integrate Litmus into CI? Which experiment have you integrated as part of the CI? And what would you do if a microservice fails an experiment in the CI?

2. Is there any way to use Litmus within GitHub? When someone submits a k8s deployment for a PR, we want to run a chaos experiment on that to see whether it passes or not

    3. How can users integrate Litmuschaos in their environment with Gitops?

    4. How can we use Litmus in our DevOps pipeline/cycle?

    "},{"location":"experiments/faq/ci-cd/#is-there-any-use-case-to-integrate-litmus-into-ci-which-experiment-have-you-integrated-as-part-of-the-ci-and-what-would-you-do-if-a-microservice-fails-an-experiment-in-the-ci","title":"Is there any use case to integrate Litmus into CI? Which experiment have you integrated as part of the CI? And what would you do if a microservice fails an experiment in the CI?","text":"

We have integrated Litmus with a couple of CI tools, the major ones being:

    • GitHub Actions using litmuschaos actions
    • GitLab using remote templates
    • Keptn
    • Spinnaker templates

In this way, we induce chaos as part of the CI stage, as continuous chaos allows us to automatically identify application failures during the development phase.

Failure of an experiment in CI should invariably fail the pipeline. A pass is more subjective and depends on the nature of the CI pipeline and the tests being carried out. If you are doing a simple pod-delete or cpu-hog on a microservice pod without traffic, or in an environment where it does not need to interact with other services, then the insights are limited.

    "},{"location":"experiments/faq/ci-cd/#is-there-any-way-to-use-litmus-within-github-when-someone-submits-a-k8s-deployment-for-a-pr-we-want-to-run-a-chaos-experiment-on-that-to-see-whether-it-passes-or-not","title":"Is there any way to use Litmus within GitHub? When someone submits a k8s deployment for a PR , We want to run a chaos Experiment on that to see whether it passes or not.","text":"

Yes, with the help of the GitHub chaos action we can automate chaos execution on an application in the same place where the code is stored. We can write individual tasks along with chaos actions and combine them to create a custom GitHub workflow. GitHub Workflows are custom automated processes that we can set up in our repository to build, test, package, or deploy any code project on GitHub. By including the GitHub chaos actions in our workflow YAML, we can test the performance/resiliency of our application in a much simpler and better way (see the sketch below). To know more, visit our GitHub chaos action repository.
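As a minimal, hedged sketch (this is not the official action's interface; the workflow name, manifest path, and namespace are hypothetical, and a kubeconfig for the test cluster is assumed to be configured), a workflow can simply apply a ChaosEngine with kubectl once the PR's deployment is up:

# .github/workflows/chaos.yml (hypothetical sketch)\nname: pr-chaos-check\non: pull_request\njobs:\n  run-chaos:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v4\n      # trigger the pod-delete experiment against the PR's deployment\n      - name: Trigger chaos\n        run: kubectl apply -f manifests/pod-delete-chaosengine.yaml\n      # inspect the verdict after the experiment completes\n      - name: Check the verdict\n        run: kubectl get chaosresult -n default\n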

    "},{"location":"experiments/faq/ci-cd/#how-can-users-integrate-litmuschaos-in-their-environment-with-gitops","title":"How can users integrate Litmuschaos in their environment with Gitops?","text":"

The GitOps feature in Litmus enables users to sync workflows from a configured git repo; any workflow inserts/updates made to the repo will be monitored and picked up by the Litmus portal and will be executed on the target cluster. Litmus portal GitOps also includes an event-driven chaos injection feature where users can annotate an application to be watched for changes, and if and when a change happens, chaos workflows can be triggered automatically. This integrates with other GitOps tools like Flux/Argo CD and enables users to automatically run chaos workflows whenever a new release happens or a particular change occurs in the application. To configure a git repo, the user must provide the Git URL of the repository, the branch name, and the authentication credentials, which are of two types:

    1. Access Token
    2. SSH Key

Once GitOps is enabled, any new workflows created will be stored in the configured repo in the path litmus/<project-id>/<workflow-name>.yaml.

    "},{"location":"experiments/faq/ci-cd/#how-can-we-use-litmus-in-our-devops-pipelinecycle","title":"How can we use Litmus in our DevOps pipeline/cycle?","text":"

    You can add Litmus to the CI/CD pipelines as part of an end-to-end testing approach due to its minimal pre-requisites and simple result mechanisms. It also provides utilities for quick setup of Kubernetes clusters on different platforms as well as installation of storage provider control plane components (operators). Openebs.ci is a reference implementation of how Litmus can be used in the DevOps pipeline.

    "},{"location":"experiments/faq/content/","title":"Litmus FAQ","text":""},{"location":"experiments/faq/content/#faq","title":"FAQ","text":"Category Description References Install Questions related to litmus installation Install Experiments Questions related to litmus experiments Experiments Portal Questions related to litmus portal Portal Scheduler Questions related to litmus scheduler Scheduler Security Questions related to litmus security Security CI/CD Questions related to litmus CI/CD integration CI/CD"},{"location":"experiments/faq/content/#troubleshooting","title":"Troubleshooting","text":"Category Description References Install Troubleshooting related to litmus installation Install Experiments Troubleshooting related to litmus experiments Experiments Portal Troubleshooting related to litmus portal Portal Scheduler Troubleshooting related to litmus scheduler Scheduler"},{"location":"experiments/faq/experiments/","title":"Litmus Experiments","text":""},{"location":"experiments/faq/experiments/#table-of-contents","title":"Table of Contents","text":"
    1. Node memory hog experiment's pod OOM Killed even before the kubelet sees the memory stress?

    2. Pod-network-corruption and pod-network-loss both experiments force network packet loss - is it worthwhile trying out both experiments in a scheduled chaos test?

    3. How is the packet loss achieved in pod-network loss and corruption experiments? What are the internals of it?

    4. What's the difference between pod-memory/cpu-hog vs pod-memory/cpu-hog-exec?

    5. What are the typical probes used for pod-network related experiments?

    6. Litmus provides multiple libs to run some chaos experiments like stress-chaos and network chaos so which library should be preferred to use?

7. How to run chaos experiments programmatically using APIs?

    8. Kubernetes by default has built-in features like replicaset/deployment to prevent service unavailability (continuous curl from the httpProbe on litmus should not fail) in case of container kill, pod delete and OOM due to pod-memory-hog then why do we need CPU, IO and network related chaos experiments?

9. The experiment is not targeting all pods with the given label; it selects only one pod by default

    10. Do we have a way to see what pods are targeted when users use percentages?

    11. What is the function of spec.definition.scope of a ChaosExperiment CR?

12. Pod network latency -- I have pod A talking to Pod B over Service B, and I want to introduce latency between Pod A and Service B. What would go into spec.appInfo section? Pod A namespace, label selector and kind? What will go into DESTINATION_IP and DESTINATION_HOST? Service B details? What are the TARGET_PODS?

    13. How to check the NETWORK_INTERFACE and SOCKET_PATH variable?

    14. What are the different ways to target the pods and nodes for chaos?

15. Does the pod affected percentage select a random set of pods from the total pods under chaos?

    16. How to extract the chaos start time and end time?

    17. How do we check the MTTR (Mean time to recovery) for an application post chaos?

    18. What is the difference between Ramp Time and Chaos Interval?

    19. Can the appkind be a pod?

    20. What type of chaos experiments are supported by Litmus?

    21. What are the permissions required to run Litmus Chaos Experiments?

    22. What is the scope of a Litmus Chaos Experiment?

23. How to get started with running chaos experiments using Litmus?

    24. How to view and interpret the results of a chaos experiment?

    25. Do chaos experiments run as a standard set of pods?

    26. Is it mandatory to annotate application deployments for chaos?

    27. How to add Custom Annotations as chaos filters?

    28. Is it mandatory for the chaosengine and chaos experiment resources to exist in the same namespace?

    29. How to get the chaos logs in Litmus?

    30. Does Litmus support generation of events during chaos?

    31. How to stop or abort a chaos experiment?

    32. Can a chaos experiment be resumed once stopped or aborted?

    33. How to restart chaosengine after graceful completion?

    34. Does Litmus support any chaos metrics for experiments?

    35. Does Litmus track any usage metrics on the test clusters?

    36. What to choose between minChaosInterval and instanceCount?

    "},{"location":"experiments/faq/experiments/#node-memory-hog-experiments-pod-oom-killed-even-before-the-kubelet-sees-the-memory-stress","title":"Node memory hog experiment's pod OOM Killed even before the kubelet sees the memory stress?","text":"

The experiment takes a percentage of the total memory capacity of the node. The helper pod runs on the target node to stress that node's resources, so the experiment will not consume/hog memory resources greater than the total memory available on the node. In other words, there is always an upper limit on the amount of memory to be consumed, which is equal to the total available memory. Please refer to this blog for more details.
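For reference, the extent of the stress is tuned via the experiment ENVs in the ChaosEngine; a minimal sketch (the value is illustrative):

experiments:\n  - name: node-memory-hog\n    spec:\n      components:\n        env:\n          # percentage of total node memory capacity to consume (illustrative)\n          - name: MEMORY_CONSUMPTION_PERCENTAGE\n            value: '30'\n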

    "},{"location":"experiments/faq/experiments/#pod-network-corruption-and-pod-network-loss-both-experiments-force-network-packet-loss-is-it-worthwhile-trying-out-both-experiments-in-a-scheduled-chaos-test","title":"Pod-network-corruption and pod-network-loss both experiments force network packet loss - is it worthwhile trying out both experiments in a scheduled chaos test?","text":"

Yes, ultimately these are different ways to simulate a degraded network. Both cases are expected to typically cause retransmissions (for TCP). The extent of degradation depends on the percentage of loss/corruption.

    "},{"location":"experiments/faq/experiments/#how-is-the-packet-loss-achieved-in-pod-network-loss-and-corruption-experiments-what-are-the-internals-of-it","title":"How is the packet loss achieved in pod-network loss and corruption experiments? What are the internals of it?","text":"

The experiment causes network degradation without the pod being marked unhealthy/unworthy of traffic by kube-proxy (unless you have a liveness probe of sorts that measures latency and restarts/crashes the container). The idea of this experiment is to simulate issues within your pod-network OR microservice communication across services in different availability zones/regions etc. Mitigation (in this case, keeping the timeout, i.e., access latency, low) could be via some middleware that can switch traffic based on some SLOs/perf parameters. If such an arrangement is not available, the next best thing would be to verify whether such a degradation is highlighted via notifications/alerts etc., so the admin/SRE has the opportunity to investigate and fix things. Another utility of the test would be to see the extent of impact caused to the end-user OR the last point in the app stack on account of degradation in access to a downstream/dependent microservice, and whether it is acceptable OR breaks the system to an unacceptable degree.

The args passed to the tc netem command run against the target container change depending on the type of network fault, as illustrated below.
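For illustration, assuming the default eth0 interface inside the target's network namespace (the percentages are illustrative):

# pod-network-loss: drop a percentage of egress packets\ntc qdisc replace dev eth0 root netem loss 100%\n# pod-network-corruption: corrupt a percentage of egress packets\ntc qdisc replace dev eth0 root netem corrupt 100%\n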

    "},{"location":"experiments/faq/experiments/#whats-the-difference-between-pod-memorycpu-hog-vs-pod-memorycpu-hog-exec","title":"What's the difference between pod-memory/cpu-hog vs pod-memory/cpu-hog-exec?","text":"

The pod cpu and memory chaos experiments until now (version 1.13.7) were using an exec mode of execution, which means we were exec-ing into the specified target container and launching processes like md5sum and dd to consume the cpu and memory respectively. This is done by providing CHAOS_INJECT_COMMAND and CHAOS_KILL_COMMAND in the chaosengine CR. But this method has some limitations:

• The chaos inject and kill commands are highly dependent on the base image of the target container; they may work for some, while for others you may have to derive them manually.
• For scratch images that don't expose a shell, we couldn't execute the chaos.

To overcome this, the stress-chaos experiments (cpu, memory and io) are enhanced to use a non-exec mode of chaos execution. It makes use of the target container's cgroup for the resource allocation and the container's pid namespace for showing the stress-ng process in the target container. This stress-ng process consumes the resources on the target container without doing an exec. The new enhanced experiments are available from litmus version 1.13.8, as sketched below.
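A minimal sketch of tuning the enhanced (non-exec) pod-cpu-hog via ChaosEngine ENVs (the values are illustrative):

experiments:\n  - name: pod-cpu-hog\n    spec:\n      components:\n        env:\n          # number of cpu cores to stress via stress-ng, without exec-ing into the target\n          - name: CPU_CORES\n            value: '1'\n          # chaos duration (in sec)\n          - name: TOTAL_CHAOS_DURATION\n            value: '60'\n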

    "},{"location":"experiments/faq/experiments/#what-are-the-typical-probes-used-for-pod-network-related-experiments","title":"What are the typical probes used for pod-network related experiments?","text":"

This is precisely the role of the experiment: cause network degradation without the pod being marked unhealthy/unworthy of traffic by kube-proxy (unless you have a liveness probe of sorts that measures latency and restarts/crashes the container). The idea of this experiment is to simulate issues within your pod-network OR microservice communication across services in different availability zones/regions etc.

Mitigation (in this case, keeping the timeout, i.e., access latency, low) could be via some middleware that can switch traffic based on some SLOs/perf parameters. If such an arrangement is not available, the next best thing would be to verify whether such a degradation is highlighted via notifications/alerts etc., so the admin/SRE has the opportunity to investigate and fix things.

Another utility of the test would be to see the extent of impact caused to the end-user OR the last point in the app stack on account of degradation in access to a downstream/dependent microservice, and whether it is acceptable OR breaks the system to an unacceptable degree.

    "},{"location":"experiments/faq/experiments/#litmus-provides-multiple-libs-to-run-some-chaos-experiments-like-stress-chaos-and-network-chaos-so-which-library-should-be-preferred-to-use","title":"Litmus provides multiple libs to run some chaos experiments like stress-chaos and network chaos so which library should be preferred to use?","text":"

The optional libs (like Pumba) are more of an illustration of how you can use 3rd party tools with litmus, called BYOC (Bring Your Own Chaos). The preferred LIB is litmus.

    "},{"location":"experiments/faq/experiments/#how-to-run-chaos-experiment-programatically-using-apis","title":"How to run chaos experiment programatically using apis?","text":"

To consume/manipulate the chaos resources (i.e., chaosexperiment, chaosengine or chaosresults) via API, you can directly use the kube API. The CRDs by default provide us with an API endpoint. You can use any generic client implementation (go/python are the most used ones) to access them. In case you use go, there is a clientset available as well: go-client

Here are some simple CRUD ops against chaos resources you could construct with curl (kubectl proxy is used here; one could use an auth token instead), just for illustration purposes.

    "},{"location":"experiments/faq/experiments/#create-chaosengine","title":"Create ChaosEngine:","text":"

    For example, assume this is the engine spec

    curl -s http://localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines -XPOST -H 'Content-Type: application/json' -d@pod-delete-chaosengine-trigger.json\n
    "},{"location":"experiments/faq/experiments/#read-chaosengine-status","title":"Read ChaosEngine status:","text":"
    curl -s http://localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines/nginx-chaos | jq '.status.engineStatus, .status.experiments[].verdict'\n
    "},{"location":"experiments/faq/experiments/#update-chaosengine-spec","title":"Update ChaosEngine Spec:","text":"

    (say, this is the patch: https://gist.github.com/ksatchit/be54955a1f4231314797f25361ac488d)

    curl --header \"Content-Type: application/json-patch+json\" --request PATCH --data '[{\"op\": \"replace\", \"path\": \"/spec/engineState\", \"value\": \"stop\"}]' http://localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines/nginx-chaos\n
    "},{"location":"experiments/faq/experiments/#delete-the-chaosengine-resource","title":"Delete the ChaosEngine resource:","text":"
    curl -X DELETE localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines/nginx-chaos \\\n-d '{\"kind\":\"DeleteOptions\",\"apiVersion\":\"v1\",\"propagationPolicy\":\"Foreground\"}' \\\n-H \"Content-Type: application/json\"\n
    "},{"location":"experiments/faq/experiments/#similarly-to-check-the-resultsverdict-of-the-experiment-from-chaosresult-you-could-use","title":"Similarly, to check the results/verdict of the experiment from ChaosResult, you could use:","text":"
    curl -s http://localhost:8001/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosresults/nginx-chaos-pod-delete | jq '.status.experimentStatus.verdict, .status.experimentStatus.probeSuccessPercentage'\n
    "},{"location":"experiments/faq/experiments/#kubernetes-by-default-has-built-in-features-like-replicasetdeployment-to-prevent-service-unavailability-continuous-curl-from-the-httpprobe-on-litmus-should-not-fail-in-case-of-container-kill-pod-delete-and-oom-due-to-pod-memory-hog-then-why-do-we-need-cpu-io-and-network-related-chaos-experiments","title":"Kubernetes by default has built-in features like replicaset/deployment to prevent service unavailability (continuous curl from the httpProbe on litmus should not fail) in case of container kill, pod delete and OOM due to pod-memory-hog then why do we need CPU, IO and network related chaos experiments?","text":"

There are some scenarios that can still occur despite whatever availability aids K8s provides. Take disk usage or CPU hogs, for example -- problems you would generally refer to as \"Noisy Neighbour\" problems. Stressing the disk with continuous and heavy I/O can cause degradation in reads and writes performed by other microservices that use this shared disk (modern storage solutions for Kubernetes use the concept of storage pools out of which virtual volumes/devices are carved out). Another issue is the amount of scratch space eaten up on a node, leading to lack of space for newer containers to get scheduled (Kubernetes too gives up by applying an \"eviction\" taint like \"disk-pressure\"), which causes a wholesale movement of all pods to other nodes. Similarly with CPU chaos: by injecting a rogue process into a target container, we starve the main microservice process (typically pid 1) of the resources allocated to it (where limits are defined), causing slowness in app traffic; in other cases, unrestrained use can cause the node to exhaust resources, leading to eviction of all pods.

    "},{"location":"experiments/faq/experiments/#the-experiment-is-not-targeting-all-pods-with-the-given-label-it-just-selects-only-one-pod-by-default","title":"The experiment is not targeting all pods with the given label, it just selects only one pod by default.","text":"

Yes. You can use either the PODS_AFFECTED_PERC or TARGET_PODS env to select multiple pods, as sketched below. Refer: experiment tunable envs.
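For example, to target half of the matching pods, or specific pods by name (the values are illustrative):

env:\n  # percentage of pods with matching labels to target\n  - name: PODS_AFFECTED_PERC\n    value: '50'\n  # alternatively, explicit comma-separated pod names\n  - name: TARGET_PODS\n    value: 'pod1,pod2'\n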

    "},{"location":"experiments/faq/experiments/#do-we-have-a-way-to-see-what-pods-are-targeted-when-users-use-percentages","title":"Do we have a way to see what pods are targeted when users use percentages?","text":"

    We can view the target pods from the experiment logs or inside chaos results.

    "},{"location":"experiments/faq/experiments/#what-is-the-function-of-specdefinitionscope-of-a-chaosexperiment-cr","title":"What is the function of spec.definition.scope of a ChaosExperiment CR?","text":"

The spec.definition.scope & spec.definition.permissions are mostly for indicative/illustration purposes (for external tools to identify and validate the permissions associated with running the experiment). By themselves, they don't influence how and where an experiment can be used. One could remove these fields if needed (of course, along with the CRD validation) and store these manifests if desired.

    "},{"location":"experiments/faq/experiments/#in-pod-network-latency-i-have-pod-a-talking-to-pod-b-over-service-b-and-i-want-to-introduce-latency-between-pod-a-and-service-b-what-would-go-into-specappinfo-section-pod-a-namespace-label-selector-and-kind-what-will-go-into-destination_ip-and-destination_host-service-b-details-what-are-the-target_pods","title":"In Pod network latency - I have pod A talking to Pod B over Service B. and I want to introduce latency between Pod A and Service B. What would go into spec.appInfo section? Pod A namespace, label selector and kind? What will go into DESTINATION_IP and DESTINATION_HOST? Service B details? What are the TARGET_PODS?","text":"

It will target a random subset of the matching pods (from 1 up to the total replica count, based on PODS_AFFECTED_PERC) with matching labels (appinfo.applabel) and namespace (appinfo.appns). But if you want to target specific pods, you can provide their names as a comma-separated list inside TARGET_PODS. Yes, you can provide Service B details inside DESTINATION_IPS or DESTINATION_HOSTS. The NETWORK_INTERFACE should be eth0. A minimal sketch is given below.
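A hedged ChaosEngine sketch for this scenario (the name, namespace, label, and the Service B ClusterIP are placeholders):

apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: pod-a-network-latency\n  namespace: <pod-A-namespace>\nspec:\n  engineState: 'active'\n  appinfo:\n    appns: '<pod-A-namespace>'\n    applabel: '<pod-A-label>'\n    appkind: 'deployment'\n  chaosServiceAccount: pod-network-latency-sa\n  experiments:\n    - name: pod-network-latency\n      spec:\n        components:\n          env:\n            - name: NETWORK_INTERFACE\n              value: 'eth0'\n            # latency in ms (illustrative)\n            - name: NETWORK_LATENCY\n              value: '2000'\n            # ClusterIP of Service B, so only traffic to it is delayed\n            - name: DESTINATION_IPS\n              value: '<service-B-clusterIP>'\n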

    "},{"location":"experiments/faq/experiments/#how-to-check-the-network_interface-and-socket_path-variable","title":"How to check the NETWORK_INTERFACE and SOCKET_PATH variable?","text":"

The NETWORK_INTERFACE is the interface name inside the pod/container that needs to be targeted. You can find it by exec-ing into the target pod and checking the available interfaces. You can try ip link, iwconfig, or ifconfig; depending on the tools installed in the pod, either of those could work.

The SOCKET_PATH by default takes the containerd socket path. If you are using something else like docker or crio, or have a different socket path by any chance, you can specify it. This is required to communicate with the container runtime of your cluster. In addition to this, if the container runtime is different, then provide the name of the container runtime inside the CONTAINER_RUNTIME ENV. It supports docker, containerd, and crio runtimes.
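For instance, to inspect the interfaces (the pod and namespace names are placeholders):

kubectl exec -n <app-ns> <target-pod> -- ip link\n

And a sketch of the corresponding runtime ENVs (containerd shown; adjust the values for docker/crio):

- name: CONTAINER_RUNTIME\n  value: 'containerd'\n- name: SOCKET_PATH\n  value: '/run/containerd/containerd.sock'\n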

    "},{"location":"experiments/faq/experiments/#what-are-the-different-ways-to-target-the-pods-and-nodes-for-chaos","title":"What are the different ways to target the pods and nodes for chaos?","text":"

    The different ways are:

    Pod Chaos:

    • Appinfo: Provide the target pod labels in the chaos engine appinfo section.
• TARGET_PODS: You can provide the target pod names as a comma-separated list, e.g., pod1,pod2.

    Node Chaos:

    • TARGET_NODE or TARGET_NODES: Provide the target node or nodes in these envs.
    • NODE_LABEL: Provide the label of the target nodes.
    "},{"location":"experiments/faq/experiments/#does-the-pod-affected-percentage-select-the-random-set-of-pods-from-the-total-pods-under-chaos","title":"Does the pod affected percentage select the random set of pods from the total pods under chaos?","text":"

Yes, it selects random pods based on the PODS_AFFECTED_PERC ENV. The pod-delete experiment selects random pods for each iteration of chaos, but the rest of the experiments (if they support iterations) select random pods once and use the same set of pods for the remaining iterations.

    "},{"location":"experiments/faq/experiments/#how-to-extract-the-chaos-start-time-and-end-time","title":"How to extract the chaos start time and end time?","text":"

We can use the chaos exporter metrics for this. One can also visualise these events, along with their times, in the chaos engine events.

    "},{"location":"experiments/faq/experiments/#how-do-we-check-the-mttr-mean-time-to-recovery-for-an-application-post-chaos","title":"How do we check the MTTR (Mean time to recovery) for an application post chaos?","text":"

The MTTR can be validated by using the status check timeout in the chaos engine. By default, its value is 180 seconds. We can also overwrite this using the ChaosEngine, as sketched below. For more details refer to this
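A hedged sketch of overriding the status-check timeout per experiment in the ChaosEngine (the values are illustrative):

experiments:\n  - name: pod-delete\n    spec:\n      components:\n        statusCheckTimeouts:\n          # interval between successive status checks (in sec)\n          delay: 2\n          # overall time allowed for the application to recover (in sec)\n          timeout: 180\n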

    "},{"location":"experiments/faq/experiments/#what-is-the-difference-between-ramp-time-and-chaos-interval","title":"What is the difference between Ramp Time and Chaos Interval?","text":"

The ramp time is the duration to wait before and after injection of chaos, in seconds. The chaos interval is the time interval (in seconds) between successive chaos iterations.

    "},{"location":"experiments/faq/experiments/#can-the-appkind-be-a-pod","title":"Can the appkind be a pod?","text":"

The appkind as pod is not supported explicitly. The supported appkinds are deployment, statefulset, replicaset, daemonset, rollout, and deploymentconfig. But we can target the pods in the following ways:

• provide labels and namespace at spec.appinfo.applabel and spec.appinfo.appns respectively, and provide spec.appinfo.appkind as empty
• provide pod names in the TARGET_PODS ENV and provide spec.appinfo as nil

    NOTE: The annotationCheck should be provided as false

    "},{"location":"experiments/faq/experiments/#what-type-of-chaos-experiments-are-supported-by-litmus","title":"What type of chaos experiments are supported by Litmus?","text":"

    Litmus broadly defines Kubernetes chaos experiments into two categories: application or pod-level chaos experiments and platform or infra-level chaos experiments. The former includes pod-delete, container-kill, pod-cpu-hog, pod-network-loss etc., while the latter includes node-drain, disk-loss, node-cpu-hog etc., The infra chaos experiments typically have a higher blast radius and impact more than one application deployed on the Kubernetes cluster. Litmus also categorizes experiments on the basis of the applications, with the experiments consisting of app-specific health checks. For a full list of supported chaos experiments, visit: https://hub.litmuschaos.io

    "},{"location":"experiments/faq/experiments/#what-are-the-permissions-required-to-run-litmus-chaos-experiments","title":"What are the permissions required to run Litmus Chaos Experiments?","text":"

    By default, the Litmus operator uses the \u201clitmus\u201d serviceaccount that is bound to a ClusterRole, in order to watch for the ChaosEngine resource across namespaces. However, the experiments themselves are associated with \u201cchaosServiceAccounts\u201d which are created by the developers with bare-minimum permissions necessary to execute the experiment in question. Visit the chaos-charts repo to view the experiment-specific rbac permissions. For example, here are the permissions for container-kill chaos.

    "},{"location":"experiments/faq/experiments/#what-is-the-scope-of-a-litmus-chaos-experiment","title":"What is the scope of a Litmus Chaos Experiment?","text":"

    The chaos CRs (chaosexperiment, chaosengine, chaosresults) themselves are namespace scoped and are installed in the same namespace as that of the target application. While most of the experiments can be executed with service accounts mapped to namespaced roles, some infra chaos experiments typically perform health checks of applications across namespaces & therefore need their serviceaccounts mapped to ClusterRoles.

    "},{"location":"experiments/faq/experiments/#to-get-started-with-running-chaos-experiments-using-litmus","title":"To get started with running chaos experiments using Litmus?","text":"

    Litmus has a low entry barrier and is easy to install/use. Typically, it involves installing the chaos-operator, chaos experiment CRs from the charthub, annotating an application for chaos and creating a chaosengine CR to map your application instance with a desired chaos experiment. Refer to the getting started documentation to learn more on how to run a simple chaos experiment.

    "},{"location":"experiments/faq/experiments/#how-to-view-and-interpret-the-results-of-a-chaos-experiment","title":"How to view and interpret the results of a chaos experiment?","text":"

    The results of a chaos experiment can be obtained from the verdict property of the chaosresult custom resource. If the verdict is Pass, it means that the application under test is resilient to the chaos injected. Alternatively, Fail reflects that the application is not resilient enough to the injected chaos, and indicates the need for a relook into the deployment sanity or possible application bugs/issues.

    kubectl describe chaosresult <chaosengine-name>-<chaos-experiment> -n <namespace>\n
    The status of the experiment can also be gauged by the \u201cstatus\u201d property of the ChaosEngine.

kubectl describe chaosengine <chaosengine-name> -n <namespace>\n
    "},{"location":"experiments/faq/experiments/#do-chaos-experiments-run-as-a-standard-set-of-pods","title":"Do chaos experiments run as a standard set of pods?","text":"

The chaos experiment workflow (triggered after creation of the ChaosEngine resource) consists of launching the \u201cchaos-runner\u201d pod, which is an umbrella executor of the different chaos experiments listed in the engine. The chaos-runner creates one pod (job) per experiment to run the actual experiment business logic, and also manages the lifecycle of these experiment pods (it performs functions such as experiment dependencies validation, job cleanup, patching of status back into the ChaosEngine etc.). Optionally, a monitor pod is created to export the chaos metrics. Together, these 3 pods are a standard set created upon execution of the experiment. The experiment job, in turn, may spawn dependent (helper) resources if necessary to run the experiments, but this depends on the experiment selected, the chaos libraries chosen etc.

    "},{"location":"experiments/faq/experiments/#is-it-mandatory-to-annotate-application-deployments-for-chaos","title":"Is it mandatory to annotate application deployments for chaos?","text":"

Typically, applications are expected to be annotated with litmuschaos.io/chaos=\"true\" to lend themselves to chaos. This is in order to support selection of the right applications among those with similar labels in a namespace, thereby isolating the application under test (AUT) & reducing the blast radius. It is also helpful for supporting automated execution (say, via cron) as a background service. However, in cases where the app deployment specifications are sacrosanct and not expected to be modified, or in cases where annotating a single application for chaos when the experiment itself is known to have a higher blast radius doesn\u2019t make sense (ex: infra chaos), the annotation check can be disabled via the ChaosEngine tunable annotationCheck (.spec.annotationCheck: false).

    "},{"location":"experiments/faq/experiments/#how-to-add-custom-annotations-as-chaos-filters","title":"How to add Custom Annotations as chaos filters?","text":"

Currently, Litmus allows you to set your own custom keys for annotation filters, the value being true/false. To use your custom annotation, add the key under an ENV named CUSTOM_ANNOTATION in the ChaosOperator deployment. A sample chaos-operator deployment spec is provided here for reference:

    view the manifest
---\napiVersion: apps/v1\nkind: Deployment\nmetadata:\nname: chaos-operator-ce\nnamespace: litmus\nspec:\nreplicas: 1\nselector:\n    matchLabels:\n    name: chaos-operator\ntemplate:\n    metadata:\n    labels:\n        name: chaos-operator\n    spec:\n    serviceAccountName: litmus\n    containers:\n        - name: chaos-operator\n        # 'latest' tag corresponds to the latest released image\n        image: litmuschaos/chaos-operator:latest\n        command:\n        - chaos-operator\n        imagePullPolicy: Always\n        env:\n            - name: CUSTOM_ANNOTATION\n            value: \"mayadata.io/chaos\"\n            - name: CHAOS_RUNNER_IMAGE\n            value: \"litmuschaos/chaos-runner:latest\"\n            - name: WATCH_NAMESPACE\n            value: \n            - name: POD_NAME\n            valueFrom:\n                fieldRef:\n                    fieldPath: metadata.name\n            - name: OPERATOR_NAME\n            value: \"chaos-operator\"\n
    "},{"location":"experiments/faq/experiments/#is-it-mandatory-for-the-chaosengine-and-chaos-experiment-resources-to-exist-in-the-same-namespace","title":"Is it mandatory for the chaosengine and chaos experiment resources to exist in the same namespace?","text":"

    Yes. As of today, the chaos resources are expected to co-exist in the same namespace, which typically is also the application's (AUT) namespace.

    "},{"location":"experiments/faq/experiments/#how-to-get-the-chaos-logs-in-litmus","title":"How to get the chaos logs in Litmus?","text":"

    The chaos logs can be viewed in the following manner. To view the successful launch/removal of chaos resources upon engine creation, for identification of application under test (AUT) etc., view the chaos-operator logs:

    kubectl logs -f <chaos-operator-(hash)-(hash)> -n <chaos_namespace>\n
    To view lifecycle management logs of a given (or set of) chaos experiments, view the chaos-runner logs:
    kubectl logs -f <chaosengine_name>-runner -n <chaos_namespace>\n
    To view the chaos logs itself (details of experiment chaos injection, application health checks et al), view the experiment pod logs:
    kubectl logs -f <experiment_name_(hash)_(hash)> -n <chaos_namespace>\n

    "},{"location":"experiments/faq/experiments/#does-litmus-support-generation-of-events-during-chaos","title":"Does Litmus support generation of events during chaos?","text":"

The chaos-operator generates Kubernetes events to signify the creation and removal of chaos resources over the course of a chaos experiment, which can be obtained by running the following command:

    kubectl describe chaosengine <chaosengine-name> -n <namespace>\n
    Note: Efforts are underway to add more events around chaos injection in subsequent releases.

    "},{"location":"experiments/faq/experiments/#how-to-stop-or-abort-a-chaos-experiment","title":"How to stop or abort a chaos experiment?","text":"

A chaos experiment can be stopped/aborted in flight by patching the .spec.engineState property of the chaosengine to stop. This will delete all the chaos resources associated with the engine/experiment at once.

    kubectl patch chaosengine <chaosengine-name> -n <namespace> --type merge --patch '{\"spec\":{\"engineState\":\"stop\"}}'\n
    The same effect will be caused by deleting the respective chaosengine resource.

    "},{"location":"experiments/faq/experiments/#can-a-chaos-experiment-be-resumed-once-stopped-or-aborted","title":"Can a chaos experiment be resumed once stopped or aborted?","text":"

Once stopped/aborted, patching the chaosengine .spec.engineState with active causes the experiment to be re-executed. Another way is to re-apply the ChaosEngine YAML; this will delete all stale chaos resources and restart the ChaosEngine lifecycle. However, support is yet to be added for saving state and resuming an in-flight experiment (i.e., executing pending iterations etc.).

    kubectl patch chaosengine <chaosengine-name> -n <namespace> --type merge --patch '{\"spec\":{\"engineState\":\"active\"}}'\n

    "},{"location":"experiments/faq/experiments/#how-to-restart-chaosengine-after-graceful-completion","title":"How to restart chaosengine after graceful completion?","text":"

To restart the chaosengine, check the .spec.engineState, which should be equal to stop, meaning your chaosengine has gracefully completed or was forcefully aborted. In this case, restarting is quite easy, as you can re-apply the chaosengine YAML to restart it. This will remove all stale chaos resources linked to this chaosengine and restart its own lifecycle.

    "},{"location":"experiments/faq/experiments/#does-litmus-support-any-chaos-metrics-for-experiments","title":"Does Litmus support any chaos metrics for experiments?","text":"

    Litmus provides a basic set of prometheus metrics indicating the total count of chaos experiments, passed/failed experiments and individual status of experiments specified in the ChaosEngine, which can be queried against the monitor pod. Work to enhance and improve this is underway.

    "},{"location":"experiments/faq/experiments/#does-litmus-track-any-usage-metrics-on-the-test-clusters","title":"Does Litmus track any usage metrics on the test clusters?","text":"

    By default, the installation count of chaos-operator & run count of a given chaos experiment is collected as part of general analytics to gauge user adoption & chaos trends. However, if you wish to inhibit this, please use the following ENV setting on the chaos-operator deployment:

env:\n  - name: ANALYTICS\n    value: 'FALSE'\n

    "},{"location":"experiments/faq/experiments/#what-to-choose-between-minchaosinterval-and-instancecount","title":"What to choose between minChaosInterval and instanceCount?","text":"

Ideally, only one of minChaosInterval and instanceCount should be chosen. However, if both are specified, minChaosInterval is given priority. minChaosInterval specifies the minimum interval that should be present between the launch of 2 chaosengines, while instanceCount specifies the exact number of chaosengines to be launched within the range (start and end time). So we can choose depending on our requirements.

    "},{"location":"experiments/faq/install/","title":"Install","text":""},{"location":"experiments/faq/install/#table-of-contents","title":"Table of Contents","text":"
    1. I encountered the concept of namespace and cluster scope during the installation. What is meant by the scopes, and how does it affect experiments to be performed outside or inside the litmus Namespace?

    2. Does Litmus 2.0 maintain backward compatibility with Kubernetes?

    3. Can I run LitmusChaos Outside of my Kubernetes clusters?

    4. What is the minimum system requirement to run Portal and agent together?

    5. Can I use LitmusChaos in Production?

    6. Why should I use Litmus? What is its distinctive feature?

    7. What licensing model does Litmus use?

    8. What are the prerequisites to get started with Litmus?

    9. How to Install Litmus on the Kubernetes Cluster?

    "},{"location":"experiments/faq/install/#i-encountered-the-concept-of-namespace-and-cluster-scope-during-the-installation-what-is-meant-by-the-scopes-and-how-does-it-affect-experiments-to-be-performed-outside-or-inside-the-litmus-namespace","title":"I encountered the concept of namespace and cluster scope during the installation. What is meant by the scopes, and how does it affect experiments to be performed outside or inside the litmus Namespace?","text":"

    The scope of control plane (portal) installation can be tuned by the env 'PORTAL_SCOPE' in the 'litmusportal-server' deployment. Its value can be kept as a \u201cnamespace\u201d if you want to provide restricted access to litmus. It is useful in strictly multi-tenant environments in which users have namespace-level permissions and need to set up their own chaos-center instances. This is also the case in certain popular SaaS environments like Okteto cloud.

    This setting can be used in combination with a flag, 'AGENT_SCOPE' in the 'litmus-portal-admin-config' ConfigMap to limit the purview of the corresponding self-agent (the execution plane pods on the cluster/namespace where the control plane is installed) to the current namespace, which means the user can perform chaos experiments only in chosen installation namespace. By default, both are set up for cluster-wide access, by which microservices across the cluster can be subjected to chaos.

    In case of external-agents, i.e., the targets being connected to the chaos-center, you can choose the agent\u2019s scope to either cluster or namespace via a 'litmusctl' flag (when using it in non-interactive mode) or by providing the appropriate input (in interactive mode).

    "},{"location":"experiments/faq/install/#does-litmus-20-maintain-backward-compatibility-with-kubernetes","title":"Does Litmus 2.0 maintain backward compatibility with Kubernetes?","text":"

    Yes, Litmus maintains a separate CRD manifest to support backward compatibility.

    "},{"location":"experiments/faq/install/#can-i-run-litmuschaos-outside-of-my-kubernetes-clusters","title":"Can I run LitmusChaos Outside of my Kubernetes clusters?","text":"

You can run the chaos experiments outside of the k8s cluster as a dockerized container. However, other components such as the chaos-operator, chaos-exporter, and runner are Kubernetes native. They require a k8s cluster to run on.

    "},{"location":"experiments/faq/install/#what-is-the-minimum-system-requirement-to-run-portal-and-agent-together","title":"What is the minimum system requirement to run Portal and agent together?","text":"

    To run LitmusPortal you need to have a minimum of 1 GiB memory and 1 core of CPU free.

    "},{"location":"experiments/faq/install/#can-i-use-litmuschaos-in-production","title":"Can I use LitmusChaos in Production?","text":"

Yes, you can use Litmuschaos in production. Litmus has a wide variety of experiments and is designed according to the principles of chaos engineering. However, if you are new to chaos engineering, we recommend first trying Litmus in your dev environment and, after gaining confidence, using it in production.

    "},{"location":"experiments/faq/install/#why-should-i-use-litmus-what-is-its-distinctive-feature","title":"Why should I use Litmus? What is its distinctive feature?","text":"

    Litmus is a toolset for performing cloud-native Chaos Engineering. Litmus provides tools to orchestrate chaos on Kubernetes to help developers and SREs find weaknesses in their application deployments. Litmus can be used to run chaos experiments initially in the staging environment and eventually in production to find bugs and vulnerabilities. Fixing the weaknesses leads to increased resilience of the system. Litmus adopts a \u201cKubernetes-native\u201d approach to define chaos intent in a declarative manner via custom resources.

    "},{"location":"experiments/faq/install/#what-licensing-model-does-litmus-use","title":"What licensing model does Litmus use?","text":"

    Litmus is developed under Apache License 2.0 license at the project level. Some components of the projects are derived from the other Open Source projects and are distributed under their respective licenses.

    "},{"location":"experiments/faq/install/#what-are-the-prerequisites-to-get-started-with-litmus","title":"What are the prerequisites to get started with Litmus?","text":"

To get started with Litmus, the only prerequisite is to have a Kubernetes 1.11+ cluster. While most pod/container level experiments are supported on any Kubernetes platform, some of the infrastructure chaos experiments are supported on specific platforms. To find the list of supported platforms for an experiment, view the \"Platforms\" section on the sidebar in the experiment page.

    "},{"location":"experiments/faq/install/#how-to-install-litmus-on-the-kubernetes-cluster","title":"How to Install Litmus on the Kubernetes Cluster?","text":"

    You can install/deploy stable litmus using this command:

    kubectl apply -f https://litmuschaos.github.io/litmus/litmus-operator-latest.yaml\n
    "},{"location":"experiments/faq/portal/","title":"Litmus Portal","text":""},{"location":"experiments/faq/portal/#table-of-contents","title":"Table of Contents","text":"
    1. Can we host MongoDB outside the cluster? What connection string is supported? Is SSL connection supported?

2. What does a failed status of a workflow mean in LitmusPortal?

    3. How can I setup a chaoshub of my gitlab repo in Litmus Portal?

    4. How to achieve High Availability of MongoDB and how can we add persistence to MongoDB?

    5. Can I create workflows without using a dashboard?

    6. Does Litmusctl support actions that are currently performed from the portal dashboard?

7. How is the resilience score calculated?

    "},{"location":"experiments/faq/portal/#can-we-host-mongodb-outside-the-cluster-what-connection-string-is-supported-is-ssl-connection-supported","title":"Can we host MongoDB outside the cluster? What connection string is supported? Is SSL connection supported?","text":"

Yes, we can host MongoDB outside the cluster; the mongo connection string can be updated accordingly, e.g., DataBaseServer: \"mongodb://mongo-service:27017\". We use the same connection string for both the authentication server and GraphQL server containers in the litmus portal-server deployment. There are also DB user and DB password keys that can be tuned in the secrets, like DB_USER: \"admin\" and DB_PASSWORD: \"1234\". We can connect with SSL if the certificate is optional. If our requirement is ca.cert auth for the SSL connection, then this is not available on the portal.

    "},{"location":"experiments/faq/portal/#what-does-failed-status-of-workflow-means-in-litmusportal","title":"What does failed status of workflow means in LitmusPortal?","text":"

A failed status indicates that either there is some misconfiguration in the workflow, or the default hypothesis of the experiment was disproved and some of the experiments in the workflow failed. In such cases, the resilience score will be less than 100.

    "},{"location":"experiments/faq/portal/#how-can-i-setup-a-chaoshub-of-my-gitlab-repo-in-litmus-portal","title":"How can I setup a chaoshub of my gitlab repo in Litmus Portal?","text":"

In the Litmus portal, when you go to the ChaosHub section and click on the Connect New Hub button, you can see that there are two modes of authentication, i.e., public mode and private mode. For public mode, you only have to provide the git URL and branch name. For private mode, there are two types of authentication: access token and SSH key. For the access token, go to the settings of GitLab and, in the Access Token section, add a token with read repository permission. After getting the token, go to the Litmus portal and provide the GitLab URL and branch name along with the access token. After submitting, your own chaos hub is connected to the Litmus portal. For the second mode of authentication, i.e., the SSH key: once you click on SSH, it will generate a public key. Take the public key and add it in the GitLab settings, under the SSH key section. After adding the public key, get the SSH-type URL of the git repository and put it in the Litmus portal along with the branch; after submitting, your chaos hub is connected to the Litmus portal.

    "},{"location":"experiments/faq/portal/#how-to-achieve-high-availability-of-mongodb-and-how-can-we-add-persistence-to-mongodb","title":"How to achieve High Availability of MongoDB and how can we add persistence to MongoDB?","text":"

Currently, the MongoDB instance is not HA; we can install the MongoDB operator along with mongo to achieve HA. This MongoDB CRD allows for specifying the desired size and version as well as several other advanced options. Along with the MongoDB operator, we will use the MongoDB sts with PV to add persistence.

    "},{"location":"experiments/faq/portal/#can-i-create-workflows-without-using-a-dashboard","title":"Can I create workflows without using a dashboard?","text":"

Currently, you can\u2019t, but we are working on it. Shortly, we will publish samples for doing this via API/SDK and litmusctl.

    "},{"location":"experiments/faq/portal/#does-litmusctl-support-actions-that-are-currently-performed-from-the-portal-dashboard","title":"Does Litmusctl support actions that are currently performed from the portal dashboard?","text":"

For now, you can create agents and projects, and also get the agent and project details, by using litmusctl. To know more about litmusctl, please refer to its documentation.

    "},{"location":"experiments/faq/portal/#how-is-resilience-score-is-calculated","title":"How is resilience score is Calculated?","text":"

The resilience score is calculated on the basis of the weightage and the probe success percentage of each experiment. The resilience score of a single experiment is the product of the weight given to that experiment and its probe success percentage. The total test result is obtained by adding the resilience scores of all the experiments. The final resilience score is calculated by dividing the total test result by the sum of the weights of all the experiments combined in the single workflow, as in the hypothetical example below. For more detail refer to this blog.
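A hypothetical worked example with two experiments in one workflow (all numbers are illustrative):

# exp1: weight 10, probe success 100%  -> 10 * 100 = 1000\n# exp2: weight 5,  probe success 60%   ->  5 * 60  = 300\n# resilience score = (1000 + 300) / (10 + 5) = 86.67%\n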

    "},{"location":"experiments/faq/scheduler/","title":"Chaos Scheduler","text":""},{"location":"experiments/faq/scheduler/#table-of-contents","title":"Table of Contents","text":"
    1. What is ChaosScheduler?

    2. How is ChaosScheduler different from ChaosOperator?

    3. What are the pre-requisites for ChaosScheduler?

    4. How to install ChaosScheduler?

    5. How to schedule the chaos using ChaosScheduler?

    6. What are the different techniques of scheduling the chaos?

    7. What fields of spec.schedule are to be specified with spec.schedule.type=now?

    8. What fields of spec.schedule are to be specified with spec.schedule.type=once?

    9. What fields of spec.schedule are to be specified with spec.schedule.type=repeat?

    10. How to run ChaosScheduler in Namespaced mode?

    "},{"location":"experiments/faq/scheduler/#what-is-chaosscheduler","title":"What is ChaosScheduler?","text":"

ChaosScheduler is an operator built on top of the operator-sdk framework. It keeps watching resources of kind ChaosSchedule and, based on the scheduling parameters, automates the formation of ChaosEngines (to be observed by ChaosOperator), instead of requiring the ChaosEngine to be formed manually every time we wish to inject chaos into the cluster.

    "},{"location":"experiments/faq/scheduler/#how-is-chaosscheduler-different-from-chaosoperator","title":"How is ChaosScheduler different from ChaosOperator?","text":"

ChaosOperator operates on chaosengines, while ChaosScheduler operates on chaosschedules, which in turn form chaosengines (through some scheduling technique) to be observed by ChaosOperator. So ChaosOperator is the basic building block used to inject chaos in a cluster, while ChaosScheduler is just a scheduling strategy that injects chaos in some pattern using ChaosOperator. ChaosScheduler cannot be used independently of ChaosOperator.

    "},{"location":"experiments/faq/scheduler/#what-are-the-pre-requisites-for-chaosscheduler","title":"What are the pre-requisites for ChaosScheduler?","text":"

    For getting started with ChaosScheduler, we should just have ChaosOperator and all the litmus infrastructure components installed in the cluster beforehand.

    "},{"location":"experiments/faq/scheduler/#how-to-install-chaosscheduler","title":"How to install ChaosScheduler?","text":"

    Firstly install the rbac and crd -

    kubectl apply -f https://raw.githubusercontent.com/litmuschaos/chaos-scheduler/master/deploy/rbac.yaml\nkubectl apply -f https://raw.githubusercontent.com/litmuschaos/chaos-scheduler/master/deploy/crds/chaosschedule_crd.yaml\n

    Install ChaosScheduler operator afterwards -

    kubectl apply -f https://raw.githubusercontent.com/litmuschaos/chaos-scheduler/master/deploy/chaos-scheduler.yaml\n

    "},{"location":"experiments/faq/scheduler/#how-to-schedule-the-chaos-using-chaosscheduler","title":"How to schedule the chaos using ChaosScheduler?","text":"

This depends on which type of schedule we want to use for injecting chaos. For a basic understanding, refer to constructing schedule.

    "},{"location":"experiments/faq/scheduler/#what-are-the-different-techniques-of-scheduling-the-chaos","title":"What are the different techniques of scheduling the chaos?","text":"

    As of now, there are 3 scheduling techniques which can be selected based on the parameter passed to spec.schedule.type

    • type=now
    • type=once
    • type=repeat
    "},{"location":"experiments/faq/scheduler/#what-fields-of-specschedule-are-to-be-specified-with-specscheduletypenow","title":"What fields of spec.schedule are to be specified with spec.schedule.type=now?","text":"

No fields need to be specified for this, as it launches the desired chaosengine immediately; see the sketch below.
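A minimal sketch, assuming the engineTemplateSpec is filled in just as in the repeat-type example later on this page (the name is illustrative):

apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\nspec:\n  schedule:\n    # launches the chaosengine immediately\n    now: true\n  engineTemplateSpec:\n    # ...same fields as a regular ChaosEngine spec\n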

    "},{"location":"experiments/faq/scheduler/#what-fields-of-specschedule-are-to-be-specified-with-specscheduletypeonce","title":"What fields of spec.schedule are to be specified with spec.schedule.type=once?","text":"

We just need to pass spec.schedule.executionTime. The scheduler will launch the chaosengine exactly at the point of time mentioned in this parameter.

    "},{"location":"experiments/faq/scheduler/#what-fields-of-specschedule-are-to-be-specified-with-specscheduletyperepeat","title":"What fields of spec.schedule are to be specified with spec.schedule.type=repeat?","text":"

All the fields of spec.schedule except spec.schedule.executionTime need to be specified.

    • startTime
    • endTime
    • minChaosInterval
    • includedHours
    • includedDays

It schedules chaosengines to be launched according to the parameters passed. It works just as a cronjob does, with additional functionality such as control over when the schedule will start and end.

    "},{"location":"experiments/faq/scheduler/#how-to-run-chaosscheduler-in-namespaced-mode","title":"How to run ChaosScheduler in Namespaced mode?","text":"

    Firstly install the crd -

    kubectl apply -f https://github.com/litmuschaos/litmus/tree/master/mkdocs/docs/litmus-namespaced-scope/litmus-scheduler-namespaced-crd.yaml\n

    Secondly install the rbac in the desired Namespace -

    kubectl apply -f https://github.com/litmuschaos/litmus/tree/master/mkdocs/docs/litmus-namespaced-scope/litmus-scheduler-ns-rbac.yaml -n <namespace>\n

Then install the ChaosScheduler operator in the desired namespace -

    kubectl apply -f https://github.com/litmuschaos/litmus/tree/master/mkdocs/docs/litmus-namespaced-scope/litmus-namespaced-scheduler.yaml -n <namespace>\n

Finally, execute ChaosScheduler with an experiment in the desired namespace.

Note: The ChaosServiceAccount used within the embedded ChaosEngine template needs to be chosen appropriately, depending on the experiment scope.

apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosSchedule\nmetadata:\n  name: schedule-nginx\n  namespace: <namespace>\nspec:\n  schedule:\n    repeat:\n      timeRange:\n        startTime: \"2020-05-12T05:47:00Z\"   #should be modified according to current UTC Time, for type=repeat\n        endTime: \"2020-09-13T02:58:00Z\"   #should be modified according to current UTC Time, for type=repeat\n      properties:\n        minChaosInterval: \"2m\"   #format should be like \"10m\" or \"2h\" accordingly for minutes and hours, for type=repeat\n      workHours:\n        includedHours: 0-12\n      workDays:\n        includedDays: \"Mon,Tue,Wed,Sat,Sun\"   #should be set for type=repeat\n  engineTemplateSpec:\n    appinfo:\n      appns: 'default'\n      applabel: 'app=nginx'\n      appkind: 'deployment'\n    # It can be true/false\n    annotationCheck: 'false'\n    # It can be active/stop\n    engineState: 'active'\n    #ex. values: ns1:name=percona,ns2:run=nginx\n    auxiliaryAppInfo: ''\n    chaosServiceAccount: pod-delete-sa\n    # It can be delete/retain\n    jobCleanUpPolicy: 'delete'\n    experiments:\n      - name: pod-delete\n        spec:\n          components:\n            env:\n              # set chaos duration (in sec) as desired\n              - name: TOTAL_CHAOS_DURATION\n                value: '30'\n\n              # set chaos interval (in sec) as desired\n              - name: CHAOS_INTERVAL\n                value: '10'\n\n              # pod failures without '--force' & default terminationGracePeriodSeconds\n              - name: FORCE\n                value: 'false'\n
    "},{"location":"experiments/troubleshooting/experiments/","title":"Litmus Experiments","text":""},{"location":"experiments/troubleshooting/experiments/#table-of-contents","title":"Table of Contents","text":"
    1. When I\u2019m executing an experiment the experiment's pod failed with the exec format error

    2. Nothing happens (no pods created) when the chaosengine resource is created?

    3. The chaos-runner pod enters completed state seconds after getting created. No experiment jobs are created?

    4. The experiment pod enters completed state w/o the desired chaos being injected?

    5. Observing experiment results using describe chaosresult is showing NotFound error?

    6. The helper pod is getting in a failed state due to container runtime issue

    7. Disk Fill fail with the error message

    8. Disk Fill failed with error

    9. Disk fill experiment fails with an error pointing to the helper pods being unable to finish in the given duration

    10. The infra experiments like node drain, node taint, kubelet service kill to act on the litmus pods only

    11. AWS experiments failed with the following error

    12. In AWS SSM Chaos I have provided the aws in secret but still not able to inject the SSM chaos on the target instance

    13. GCP VM Disk Loss experiment fails unexpectedly where the disk gets detached successfully but fails to attach back to the instance. What can be the reason?

    14. In pod level stress chaos experiments like pod memory hog or pod io stress after the chaos is injected successfully the helper fails with an error message

    15. Experiment failed for the istio enabled namespaces

    "},{"location":"experiments/troubleshooting/experiments/#when-im-executing-an-experiment-the-experiments-pod-failed-with-the-exec-format-error","title":"When I\u2019m executing an experiment the experiment's pod failed with the exec format error","text":"View the error message

    standard_init_linux.go:211: exec user process caused \"exec format error\":

There could be multiple reasons for this. The most common one is a mismatch between the binary and the platform on which it is running. Verify that the image you're using is built for the platform on which you're trying to run the experiment.
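As a quick check, you can compare the node architecture with the platforms supported by the experiment image. This is a minimal sketch; the image name is only an example, and docker manifest inspect assumes a reasonably recent docker CLI:

kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.architecture}'\ndocker manifest inspect litmuschaos/go-runner:latest | grep architecture\n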

    "},{"location":"experiments/troubleshooting/experiments/#nothing-happens-no-pods-created-when-the-chaosengine-resource-is-created","title":"Nothing happens (no pods created) when the chaosengine resource is created?","text":"

    If the ChaosEngine creation results in no action at all, perform the following checks:

    • Check the Kubernetes events generated against the chaosengine resource.

      kubectl describe chaosengine <chaosengine-name> -n <namespace>\n
Specifically, look for the event reason ChaosResourcesOperationFailed. These events typically carry messages pointing to the problem. Some of the common messages include:

      • Unable to filter app by specified info
      • Unable to get chaos resources
      • Unable to update chaosengine
• Check the logs of the chaos-operator pod using the following command to get more details on the failed creation of chaos resources. The below example uses the litmus namespace, which is the default for installation; provide the namespace into which the operator has actually been deployed:

kubectl logs -f <chaos-operator-(hash)-(hash)> -n litmus\n

    "},{"location":"experiments/troubleshooting/experiments/#some-of-the-possible-reasons-for-these-errors-include","title":"Some of the possible reasons for these errors include:","text":"
    • The annotationCheck is set to true in the ChaosEngine spec, but the application deployment (AUT) has not been annotated for chaos. If so, please add it using the following command:

      kubectl annotate <deploy-type>/<application_name> litmuschaos.io/chaos=\"true\"\n

    • The annotationCheck is set to true in the ChaosEngine spec and there are multiple chaos candidates that share the same label (as provided in the .spec.appinfo of the ChaosEngine) and are also annotated for chaos. If so, please provide a unique label for the AUT, or remove annotations on other applications with the same label. Litmus, by default, doesn't allow selection of multiple applications. If this is a requirement, set the annotationCheck to false.

      kubectl annotate <deploy-type>/<application_name> litmuschaos.io/chaos-\n

• The ChaosEngine has the .spec.engineState set to stop, which causes the operator to refrain from creating chaos resources. While unlikely, this can happen when a previously modified ChaosEngine manifest is reused.

    • Verify if the service account used by the Litmus ChaosOperator has enough permissions to launch pods/services (this is available by default if the manifests suggested by the docs have been used).

    "},{"location":"experiments/troubleshooting/experiments/#the-chaos-runner-pod-enters-completed-state-seconds-after-getting-created-no-experiment-jobs-are-created","title":"The chaos-runner pod enters completed state seconds after getting created. No experiment jobs are created?","text":"

    If the chaos-runner enters completed state immediately post creation, i.e., the creation of experiment resources is unsuccessful, perform the following checks:

    • Check the Kubernetes events generated against the chaosengine resource.
      kubectl describe chaosengine <chaosengine-name> -n <namespace>\n

    Look for one of these events: ExperimentNotFound, ExperimentDependencyCheck, EnvParseError

• Check the logs of the chaos-runner pod.
      kubectl logs -f <chaosengine_name>-runner -n <namespace>\n
    "},{"location":"experiments/troubleshooting/experiments/#some-of-the-possible-reasons-may-include","title":"Some of the possible reasons may include:","text":"
• The ChaosExperiment CR for the experiment (name) specified in the ChaosEngine .spec.experiments list is not installed. If so, please install the desired experiment from the chaoshub.

    • The dependent resources for the ChaosExperiment, such as ConfigMap & secret volumes (as specified in the ChaosExperiment CR or the ChaosEngine CR) may not be present in the cluster (or in the desired namespace). The runner pod doesn\u2019t proceed with creation of experiment resources if the dependencies are unavailable.

• The values provided for the ENV variables in the ChaosExperiment or the ChaosEngine CRs might be invalid.

    • The chaosServiceAccount specified in the ChaosEngine CR doesn\u2019t have sufficient permissions to create the experiment resources (For existing experiments, appropriate rbac manifests are already provided in chaos-charts/docs).

    "},{"location":"experiments/troubleshooting/experiments/#the-experiment-pod-enters-completed-state-wo-the-desired-chaos-being-injected","title":"The experiment pod enters completed state w/o the desired chaos being injected?","text":"

    If the experiment pod enters completed state immediately (or in a few seconds) after creation w/o injecting the desired chaos, perform the following checks:

    • Check the Kubernetes events generated against the ChaosEngine resource
    kubectl describe chaosengine <chaosengine-name> -n <namespace>\n

    Look for the event with reason Summary with message experiment has been failed

    • Check the logs of the chaos-experiment pod.
    kubectl logs -f <experiment_name_(hash)_(hash)> -n <namespace>\n
    "},{"location":"experiments/troubleshooting/experiments/#some-of-the-possible-reasons-may-include_1","title":"Some of the possible reasons may include:","text":"
    • The ChaosExperiment CR or the ChaosEngine CR doesn\u2019t include mandatory ENVs (or consists of incorrect values/info) needed by the experiment. Note that each experiment (see docs) specifies a mandatory set of ENVs along with some optional ones, which are necessary for successful execution of the experiment.

• The chaosServiceAccount specified in the ChaosEngine CR doesn't have sufficient permissions to create the experiment helper-resources (i.e., some experiments in turn create other K8s resources like Jobs/Daemonsets/Deployments; for existing experiments, appropriate rbac manifests are already provided in chaos-charts/docs).

    • The application's (AUT) unique label provided in the ChaosEngine is set only at the parent resource metadata but not propagated to the pod template spec. Note that the Operator uses this label to filter chaos candidates at the parent resource level (deployment/statefulset/daemonset) but the experiment pod uses this to pick application pods into which the chaos is injected.

    • The experiment pre-chaos checks have failed on account of application (AUT) or auxiliary application unavailability

    "},{"location":"experiments/troubleshooting/experiments/#observing-experiment-results-using-describe-chaosresult-is-showing-notfound-error","title":"Observing experiment results using describe chaosresult is showing NotFound error?","text":"

Observing the ChaosResult by executing the describe command given below may give a NotFound error.

    kubectl describe chaosresult <chaos-engine-name>-<chaos-experiment-name>  -n <namespace>\n

Alternatively, running the describe command without specifying the expected ChaosResult name might execute successfully but may not show any output.

kubectl describe chaosresult -n <namespace>\n

This can sometimes occur due to the time taken in pulling the image and starting the experiment pod (note that the ChaosResult resource is generated by the experiment). For the above commands to execute successfully, simply wait for the experiment pod to be created. The waiting time depends on the resources available (network bandwidth, space availability on the node filesystem, etc.).

    "},{"location":"experiments/troubleshooting/experiments/#the-helper-pod-is-getting-in-a-failed-state-due-to-container-runtime-issue","title":"The helper pod is getting in a failed state due to container runtime issue","text":"View the error message

    time=\"2021-07-15T10:26:04Z\" level=fatal msg=\"helper pod failed, err: Unable to run command, err: exit status 1; error output: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?\"

    OR

    time=\"2021-07-16T22:21:02Z\" level=error msg=\"[docker]: Failed to run docker inspect: []\\nError: No such object: 1807fec21ccad1101bbb63a7d412be15414f807316572f9e043b9f4a3e7c4acc\\n\" time=\"2021-07-16T22:21:02Z\" level=fatal msg=\"helper pod failed, err: exit status 1\"

The default values for the CONTAINER_RUNTIME & SOCKET_PATH ENVs are for the docker runtime. If the cluster runtime is other than docker, i.e., containerd or CRIO, update the above ENVs as follows:

    • For containerd runtime:

      • CONTAINER_RUNTIME: containerd
      • SOCKET_PATH: /run/containerd/containerd.sock
    • For CRIO runtime:

      • CONTAINER_RUNTIME: crio
      • SOCKET_PATH: /run/crio/crio.sock

    NOTE: The above values are the common ones and may vary based on the cluster you\u2019re using.
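For example, a minimal sketch of overriding these ENVs at the experiment level in the ChaosEngine, assuming a containerd cluster (the experiment name is illustrative):

experiments:\n- name: container-kill\n  spec:\n    components:\n      env:\n      # runtime used by the cluster nodes\n      - name: CONTAINER_RUNTIME\n        value: 'containerd'\n      # socket path of the container runtime on the node\n      - name: SOCKET_PATH\n        value: '/run/containerd/containerd.sock'\n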

    "},{"location":"experiments/troubleshooting/experiments/#disk-fill-fail-with-the-error-message","title":"Disk Fill fail with the error message","text":"View the error message

    time=\"2021-08-12T05:27:39Z\" level=fatal msg=\"helper pod failed, err: either provide ephemeral storage limit inside target container or define EPHEMERAL_STORAGE_MEBIBYTES ENV\"

The disk fill experiment requires either an ephemeral storage limit defined on the target application or a value in mebibytes provided via the EPHEMERAL_STORAGE_MEBIBYTES ENV in the ChaosEngine; one of the two is required. For more details refer to: FILL_PERCENTAGE and EPHEMERAL_STORAGE_MEBIBYTES
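If you choose to define the limit on the application instead, a minimal sketch of an ephemeral storage limit on the target container would be:

# in the target container spec of the application deployment\nresources:\n  limits:\n    ephemeral-storage: \"1Gi\"\n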

    "},{"location":"experiments/troubleshooting/experiments/#disk-fill-failed-with-error","title":"Disk Fill failed with error:","text":"View the error message

    time=\"2021-08-12T05:41:45Z\" level=error msg=\"du: /diskfill/8a1088e3fd50a31d5f0d383ae2258d9975f1df152ff92b3efd570a44e952a732: No such file or directory\\n\" time=\"2021-08-12T05:41:45Z\" level=fatal msg=\"helper pod failed, err: exit status 1\"

This could be due to multiple issues in filling the disk of a container; the most common one is an invalid CONTAINER_PATH ENV set in the ChaosEngine. The default container path, /var/lib/docker/containers, is suitable for most use-cases.

    "},{"location":"experiments/troubleshooting/experiments/#disk-fill-experiment-fails-with-an-error-pointing-to-the-helper-pods-being-unable-to-finish-in-the-given-duration","title":"Disk fill experiment fails with an error pointing to the helper pods being unable to finish in the given duration.","text":"

This can happen when the provided block size is quite small and the ephemeral storage value is high; in that case, filling the disk may need more time than the given chaos duration.

    "},{"location":"experiments/troubleshooting/experiments/#the-infra-experiments-like-node-drain-node-taint-kubelet-service-kill-to-act-on-the-litmus-pods-only","title":"The infra experiments like node drain, node taint, kubelet service kill to act on the litmus pods only.","text":"

These are infra-level experiments; cordon the target node so that the application pods don't get scheduled on it, and use a node selector in the ChaosEngine to specify the nodes on which the experiment pods should run, as sketched below. Refer to this to learn how to schedule experiments on a certain node.
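A minimal sketch of this approach (assuming the standard kubernetes.io/hostname node label and the ChaosEngine runner's nodeSelector support):

# keep application pods from being scheduled on the target node\nkubectl cordon <target-node>\n

and pin the chaos pods to a different node via the ChaosEngine:

spec:\n  components:\n    runner:\n      # schedule the chaos-runner away from the target node\n      nodeSelector:\n        kubernetes.io/hostname: <non-target-node>\n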

    "},{"location":"experiments/troubleshooting/experiments/#aws-experiments-failed-with-the-following-error","title":"AWS experiments failed with the following error","text":"View the error message

    time=\"2021-08-12T10:25:57Z\" level=error msg=\"failed perform ssm api calls, err: UnrecognizedClientException: The security token included in the request is invalid.\\n\\tstatus code: 400, request id: 68f0c2e8-a7ed-4576-8c75-0a3ed497efb9\"

The AWS experiments need authentication to connect to & perform actions on the AWS services. This can be provided with the help of a secret, as shown below:

    View the secret manifest
    apiVersion: v1\nkind: Secret\nmetadata:\n  name: cloud-secret\ntype: Opaque\nstringData:\n  cloud_config.yml: |-\n    # Add the cloud AWS credentials respectively\n    [default]\n    aws_access_key_id = XXXXXXXXXXXXXXXXXXX\n    aws_secret_access_key = XXXXXXXXXXXXXXX\n

Make sure your IAM has all the permissions required to perform the chaos operation on the given service. If you are running the experiment in an EKS cluster, you have another option besides creating a secret: you can map an IAM role to the service account; refer to this for more details, and see the sketch below.
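A minimal sketch of such a mapping using the standard IRSA annotation (all names and the role ARN are placeholders):

apiVersion: v1\nkind: ServiceAccount\nmetadata:\n  name: <experiment-service-account>\n  namespace: <namespace>\n  annotations:\n    # IAM role mapped to this service account via IRSA\n    eks.amazonaws.com/role-arn: arn:aws:iam::<account-id>:role/<role-name>\n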

    "},{"location":"experiments/troubleshooting/experiments/#in-aws-ssm-chaos-i-have-provided-the-aws-in-secret-but-still-not-able-to-inject-the-ssm-chaos-on-the-target-instance","title":"In AWS SSM Chaos I have provided the aws in secret but still not able to inject the SSM chaos on the target instance","text":"View the error message

    time='2021-08-13T09:30:47Z' level=error msg='failed perform ssm api calls, err: error: the instance id-qqw2-123-12- might not have suitable permission or IAM attached to it. use \\'aws ssm describe-instance-information\\' to check the available instances'

Ensure that you have the required AWS access and that your target EC2 instances have an IAM instance profile attached. To know more, check out the Systems Manager Docs.
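As the error message itself suggests, you can verify which instances are registered with SSM using the AWS CLI:

aws ssm describe-instance-information --query 'InstanceInformationList[*].InstanceId'\n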

    "},{"location":"experiments/troubleshooting/experiments/#gcp-vm-disk-loss-experiment-fails-unexpectedly-where-the-disk-gets-detached-successfully-but-fails-to-attach-back-to-the-instance-what-can-be-the-reason","title":"GCP VM Disk Loss experiment fails unexpectedly where the disk gets detached successfully but fails to attach back to the instance. What can be the reason?","text":"

The GCP VM Disk Loss experiment requires a GCP service account with Project Editor or higher permission to execute. This could be because of an issue in the GCP GoLang Compute Engine API, which fails to attach the disk using the attachDisk method when only Compute Admin or lower permission is granted.

    "},{"location":"experiments/troubleshooting/experiments/#in-pod-level-stress-chaos-experiments-like-pod-memory-hog-or-pod-io-stress-after-the-chaos-is-injected-successfully-the-helper-fails-with-an-error-message","title":"In pod level stress chaos experiments like pod memory hog or pod io stress after the chaos is injected successfully the helper fails with an error message","text":"View the error message

    Error: process exited before the actual cleanup

The error message indicates that the stress process inside the target container was removed before the actual cleanup. There could be multiple reasons for this; most commonly, the target container gets restarted because it cannot handle the excessive load, in which case the kubelet terminates that replica, launches a new one (if applicable), and reports an OOM event on the older one.
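To confirm this, check whether the target pod was OOM-killed and look for related events (a minimal sketch):

kubectl describe pod <target-pod> -n <namespace> | grep -iA2 'last state'\nkubectl get events -n <namespace> | grep -i oom\n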

    "},{"location":"experiments/troubleshooting/experiments/#experiment-failed-for-the-istio-enabled-namespaces","title":"Experiment failed for the istio enabled namespaces","text":"View the error message

    W0817 06:32:26.531145 1 client_config.go:541] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work. time=\"2021-08-17T06:32:26Z\" level=error msg=\"unable to get ChaosEngineUID, error: unable to get ChaosEngine name: pod-delete-chaos, in namespace: default, error: Get \\\"https://10.100.0.1:443/apis/litmuschaos.io/v1alpha1/namespaces/default/chaosengines/pod-delete-chaos\\\": dial tcp 10.100.0.1:443: connect: connection refused\"

If istio is enabled for the chaos namespace, the chaos-runner and chaos-experiment pods will be launched with the istio sidecar, which may block/delay the external traffic of those pods for the initial few seconds and can cause the experiment to fail.

This failure can be fixed by disabling istio sidecar injection for the chaos pods. Refer to the following manifest:

    View the ChaosEngine manifest with the required annotations
    apiVersion: litmuschaos.io/v1alpha1\nkind: ChaosEngine\nmetadata:\n  name: engine-nginx\nspec:\n  components:\n    runner:\n      # annotation for the chaos-runner\n      runnerAnnotations:\n        sidecar.istio.io/inject: \"false\"\n  engineState: \"active\"\n  annotationCheck: \"false\"\n  appinfo:\n    appns: \"default\"\n    applabel: \"app=nginx\"\n    appkind: \"deployment\"\n  chaosServiceAccount: container-kill-sa\n  experiments:\n  - name: container-kill\n    spec:\n      components:\n        #annotations for the experiment pod \n        experimentAnnotations:\n          sidecar.istio.io/inject: \"false\"\n        env:\n        - name: TOTAL_CHAOS_DURATION\n          value: '60'\n
    "},{"location":"experiments/troubleshooting/install/","title":"Install","text":""},{"location":"experiments/troubleshooting/install/#table-of-contents","title":"Table of Contents","text":"
    1. The Litmus ChaosOperator is seen to be in CrashLoopBackOff state immediately after installation?

    2. Litmus uninstallation is not successful and namespace is stuck in terminating state?

    "},{"location":"experiments/troubleshooting/install/#the-litmus-chaosoperator-is-seen-to-be-in-crashloopbackoff-state-immediately-after-installation","title":"The Litmus ChaosOperator is seen to be in CrashLoopBackOff state immediately after installation?","text":"

    Verify if the ChaosEngine custom resource definition (CRD) has been installed in the cluster. This can be verified with the following commands:

    kubectl get crds | grep chaos\n
    kubectl api-resources | grep chaos\n

    If not created, install it from here

    "},{"location":"experiments/troubleshooting/install/#litmus-uninstallation-is-not-successful-and-namespace-is-stuck-in-terminating-state","title":"Litmus uninstallation is not successful and namespace is stuck in terminating state?","text":"

    Under typical operating conditions, the ChaosOperator makes use of finalizers to ensure that the ChaosEngine is deleted only after chaos resources (chaos-runner, experiment pod, any other helper pods) are removed.

When uninstalling Litmus via the operator manifest (which contains the namespace, operator, and CRD specifications in a single YAML) without first deleting the existing chaosengine resources, the ChaosOperator deployment may get deleted before the CRD removal is attempted. Since the stale chaosengines still have the finalizer present on them, their deletion (triggered by the CRD delete) and, by consequence, the deletion of the chaosengine CRD itself gets \"stuck\".

    In such cases, manually remove the finalizer entries on the stale chaosengines to facilitate their successful delete. To get the chaosengine, run:

    kubectl get chaosengine -n <namespace>

    followed by:

    kubectl edit chaosengine <chaosengine-name> -n <namespace> and remove the finalizer entry chaosengine.litmuschaos.io/finalizer
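Alternatively, a minimal sketch that clears the finalizers in one step with kubectl patch:

kubectl patch chaosengine <chaosengine-name> -n <namespace> --type merge -p '{\"metadata\":{\"finalizers\":[]}}'\n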

Repeat this on all the stale chaosengine CRs to remove the CRDs successfully & complete the uninstallation process.

    If however, the litmus namespace deletion remains stuck despite the above actions, follow the procedure described here to complete the uninstallation.

    "},{"location":"experiments/troubleshooting/portal/","title":"Litmus Portal","text":""},{"location":"experiments/troubleshooting/portal/#table-of-contents","title":"Table of Contents","text":"
    1. We were setting up a Litmus Portal, however, Self-Agent status is showing pending. Any idea why is happening?

    2. After logging in for the first time to the portal, /get-started page kept loading after I provided the new password

    3. Subscriber is crashing with the error dial:websocket: bad handshake

    4. Not able to connect to the LitmusChaos Control Plane hosted on GKE cluster

    5. I forgot my Litmus portal password. How can I reset my credentials?

    6. While Uninstalling Litmus portal using helm, some components like subscriber, exporter, event, workflows, etc, are not removed

    7. Unable to Install Litmus portal using helm. Server pod and mongo pod are in CrashLoopBackOff state. Got this error while checking the logs of mongo container chown: changing ownership of '/data/db/.snapshot': Read-only file system

    8. Pre-defined workflow Bank Of Anthos showing bus error for accounts-db or ledger-db pod?

    "},{"location":"experiments/troubleshooting/portal/#we-were-setting-up-a-litmus-portal-however-self-agent-status-is-showing-pending-any-idea-why-is-happening","title":"We were setting up a Litmus Portal, however, Self-Agent status is showing pending. Any idea why is happening?","text":"

The litmusportal-server-service might not be reachable due to inbound rules. On GKE/EKS/AKS, you can enable traffic to it by adding the port to the inbound rules. Check the logs of the subscriber pod and expose the port mentioned there for communication with the server.

    "},{"location":"experiments/troubleshooting/portal/#after-logging-in-for-the-first-time-to-the-portal-get-started-page-kept-loading-after-i-provided-the-new-password","title":"After logging in for the first time to the portal, /get-started page kept loading after I provided the new password.","text":"

First, try clearing the browser cache and cookies and refreshing the page; this might solve the problem. If the problem persists, delete all the cluster role bindings, PVs, and PVCs used by Litmus and try to reinstall Litmus again.

    "},{"location":"experiments/troubleshooting/portal/#subscriber-is-crashing-with-the-error-dialwebsocket-bad-handshake","title":"Subscriber is crashing with the error dial:websocket: bad handshake","text":"

It is a network issue: the subscriber is unable to access the server. While installing the agent, a config called agent-config is created to store some metadata like the server endpoint, access key, etc. The server endpoint can be generated in several ways:

    • Ingress (If INGRESS=true in server deployment envs)
    • Loadbalancer (it generates lb type of IP based on the server svc type)
    • NodePort (it generates nodeport type of IP based on the server svc type)
    • ClusterIP (it generates clusterip type of IP based on the server svc type)

You can edit the agent-config and update the node IP; once edited, restart the subscriber (see the sketch below). We suggest using ingress, so that a change in the endpoint IP won't affect your agent.
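A minimal sketch, assuming the agent components run in the litmus namespace and the subscriber is a deployment:

# update the server endpoint / node IP stored in the agent metadata\nkubectl edit configmap agent-config -n litmus\n# restart the subscriber to pick up the change\nkubectl rollout restart deployment subscriber -n litmus\n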

    "},{"location":"experiments/troubleshooting/portal/#not-able-to-connect-to-the-litmuschaos-control-plane-hosted-on-gke-cluster","title":"Not able to connect to the LitmusChaos Control Plane hosted on GKE cluster.","text":"

In GKE, you have to set up a firewall rule to allow TCP traffic on the node port. You can use the following command: gcloud compute firewall-rules create test-node-port --allow tcp:<port>. Once this firewall rule is set up, the portal should be accessible at nodeIp:port, where nodeIp is the external IP address of your node.

    "},{"location":"experiments/troubleshooting/portal/#i-forgot-my-litmus-portal-password-how-can-i-reset-my-credentials","title":"I forgot my Litmus portal password. How can I reset my credentials?","text":"

You can reset it by running the following command:

    kubectl exec -it mongo-0 -n litmus -- mongo -u admin -p 1234 <<< $'use auth\\ndb.usercredentials.update({username:\"admin\"},{$set:{password:\"$2a$15$sNuQl9y/Ok92N19UORcro.3wulEyFi0FfJrnN/akOQe3uxTZAzQ0C\"}})\\nexit\\n'\n
Make sure to update the namespace and mongo pod name according to your setup; the rest should remain the same. This command will update the password to litmus.

    "},{"location":"experiments/troubleshooting/portal/#while-uninstalling-litmus-portal-using-helm-some-components-like-subscriber-exporter-event-workflows-etc-are-not-removed","title":"While Uninstalling Litmus portal using helm, some components like subscriber, exporter, event, workflows, etc, are not removed.","text":"

These are agent components, which are launched by the control plane server. First disconnect the agent from the portal, then uninstall the portal using helm.

    "},{"location":"experiments/troubleshooting/portal/#unable-to-install-litmus-portal-using-helm-server-pod-and-mongo-pod-are-in-crashloopbackoff-state-got-this-error-while-checking-the-logs-of-mongo-container-chown-changing-ownership-of-datadbsnapshot-read-only-file-system","title":"Unable to Install Litmus portal using helm. Server pod and mongo pod are in CrashLoopBackOff state. Got this error while checking the logs of mongo container chown: changing ownership of '/data/db/.snapshot': Read-only file system","text":"

It seems the directory existed before the Litmus installation and might be in use by some other application. Change the mount path from /consul/config to /consul/myconfig in the mongo statefulset; then you can successfully deploy Litmus.

    "},{"location":"experiments/troubleshooting/portal/#pre-defined-workflow-bank-of-anthos-showing-bus-error-for-accounts-db-or-ledger-db-pod","title":"Pre-defined workflow Bank Of Anthos showing bus error for accounts-db or ledger-db pod?","text":"

Bank of Anthos uses PostgreSQL, which doesn't fall back properly to not using huge pages. If the same scenario occurs, it can be resolved with one of the following possible solutions:

• Modify the docker image so that huge_pages = off can be set in /usr/share/postgresql/postgresql.conf.sample before initdb is run.
• Turn off huge page support on the system (vm.nr_hugepages = 0 in /etc/sysctl.conf).
• Fix Postgres's fallback mechanism when huge_pages = try is set (the default).
• Modify the k8s manifest to enable huge page support (https://kubernetes.io/docs/tasks/manage-hugepages/scheduling-hugepages/).
• Modify k8s to show that huge pages are not supported on the system when they are not enabled for a specific container.
    "},{"location":"experiments/troubleshooting/scheduler/","title":"Chaos Scheduler","text":""},{"location":"experiments/troubleshooting/scheduler/#table-of-contents","title":"Table of Contents","text":"
    1. Scheduler not creating chaosengines for type=repeat?
    "},{"location":"experiments/troubleshooting/scheduler/#scheduler-not-creating-chaosengines-for-typerepeat","title":"Scheduler not creating chaosengines for type=repeat?","text":"

If the ChaosSchedule has been created successfully in the cluster but no ChaosEngine is being formed, the most common problem is that the start or end time has been wrongly specified; verify the times. To identify whether this is the problem, change to type=now. If the ChaosEngine is formed successfully, the problem lies with the specified time ranges; if the ChaosEngine is still not formed, the problem is with the engineSpec.

    "}]} \ No newline at end of file