This project contains activities, such as probes and actions, you can call from your experiment through the Chaos Toolkit to perform Chaos Engineering against the Kubernetes API: killing a pod, removing a statefulset or node...
To be used from your experiment, this package must be installed in the Python environment where chaostoolkit already lives.
$ pip install chaostoolkit-kubernetes
To use the probes and actions from this package, add the following to your experiment file:
{
"title": "Do we remain available in face of pod going down?",
"description": "We expect Kubernetes to handle the situation gracefully when a pod goes down",
"tags": ["kubernetes"],
"steady-state-hypothesis": {
"title": "Verifying service remains healthy",
"probes": [
{
"name": "all-our-microservices-should-be-healthy",
"type": "probe",
"tolerance": true,
"provider": {
"type": "python",
"module": "chaosk8s.probes",
"func": "microservice_available_and_healthy",
"arguments": {
"name": "myapp"
}
}
}
]
},
"method": [
{
"type": "action",
"name": "terminate-db-pod",
"provider": {
"type": "python",
"module": "chaosk8s.pod.actions",
"func": "terminate_pods",
"arguments": {
"label_selector": "app=my-app",
"name_pattern": "my-app-[0-9]$",
"rand": true
}
},
"pauses": {
"after": 5
}
}
]
}
That's it! Notice how the action, with "rand" set to true, kills a single randomly chosen pod among those matching the label selector and name pattern.
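To make the selection concrete, here is a minimal Python sketch of how a pod list already filtered by label could be narrowed by a name pattern and reduced to one random pod. It is an illustration under assumptions, not the extension's actual implementation: the select_pods helper and the sample pod names are hypothetical, and applying the pattern with re.search is an assumption.

```python
import random
import re


def select_pods(pod_names, name_pattern=None, rand=False, rng=None):
    """Hypothetical helper: filter pod names by regex, optionally keep one at random."""
    rng = rng or random.Random()
    candidates = list(pod_names)
    if name_pattern:
        pattern = re.compile(name_pattern)
        # Assumption: the pattern is searched anywhere in the pod name.
        candidates = [name for name in candidates if pattern.search(name)]
    if rand and candidates:
        # Keep a single pod chosen at random among the matching ones.
        candidates = [rng.choice(candidates)]
    return candidates


# Sample names standing in for pods already matched by the label selector "app=my-app".
pods = ["my-app-0", "my-app-1", "my-app-canary", "my-app-12"]
matching = select_pods(pods, name_pattern="my-app-[0-9]$")
chosen = select_pods(pods, name_pattern="my-app-[0-9]$", rand=True)
print(matching, chosen)
```

Because the pattern ends with a single digit followed by "$", names such as my-app-12 or my-app-canary are left out in this sketch, which is worth checking before pointing a destructive action at a real cluster.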
Please explore the documentation to see existing probes and actions.
Note, for the network, cpu and memory stressors we rely on the fantastic Chaos Mesh project that provides a great interface to inject these faults.
You will need to install Chaos Mesh first in your cluster to use them.
If you have a valid entry in your ~/.kube/config file for the cluster you want to target, then there is nothing to be done.
You may set the KUBECONFIG environment variable to point to a different location.
$ export KUBECONFIG=/tmp/my-config
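The lookup order described above can be sketched in a few lines of Python. This is a hedged illustration rather than the extension's code: resolve_kubeconfig is a hypothetical helper, and it treats KUBECONFIG as a single path even though kubectl also accepts a list of paths in that variable.

```python
import os
from pathlib import Path


def resolve_kubeconfig(environ):
    """Hypothetical helper: return the kubeconfig path a client would likely read."""
    # Assumption: a non-empty KUBECONFIG wins over the default location.
    explicit = environ.get("KUBECONFIG")
    if explicit:
        return Path(explicit).expanduser()
    return Path.home() / ".kube" / "config"


# Show which file would be used in the current shell environment.
print(resolve_kubeconfig(os.environ))
```

Running this in the same shell as the chaos command is a quick way to confirm which configuration file an experiment is likely to pick up.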
Quite often, your Kubernetes configuration contains several entries, and you need to define the one to use as a default context when it isn't explicitly provided.
You may of course change your default using kubectl config use-context KUBERNETES_CONTEXT, but you can also be explicit in your experiment as follows:
{
"title": "Do we remain available in face of pod going down?",
"description": "We expect Kubernetes to handle the situation gracefully when a pod goes down",
"tags": ["kubernetes"],
"secrets": {
"k8s": {
"KUBERNETES_CONTEXT": "..."
}
},
"steady-state-hypothesis": {
"title": "Verifying service remains healthy",
"probes": [
{
"name": "all-our-microservices-should-be-healthy",
"type": "probe",
"tolerance": true,
"secrets": ["k8s"],
"provider": {
"type": "python",
"module": "chaosk8s.probes",
"func": "microservice_available_and_healthy",
"arguments": {
"name": "myapp"
}
}
}
]
},
"method": [
{
"type": "action",
"name": "terminate-db-pod",
"secrets": ["k8s"],
"provider": {
"type": "python",
"module": "chaosk8s.pod.actions",
"func": "terminate_pods",
"arguments": {
"label_selector": "app=my-app",
"name_pattern": "my-app-[0-9]$",
"rand": true
}
},
"pauses": {
"after": 5
}
}
]
}
You need to set the KUBERNETES_CONTEXT secret key to the name of the context you want the experiment to use. Make sure to also tell the actions and probes which secret entries they should be passed, with "secrets": ["k8s"].
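Forgetting the "secrets" entry on one activity is an easy mistake, so a small pre-flight check can help. The sketch below is not part of the extension or of Chaos Toolkit: activities_missing_secret is a hypothetical helper, the context name is a placeholder, and the sample experiment is trimmed down to the fields the check reads.

```python
def activities_missing_secret(experiment, secret_name):
    """Hypothetical check: names of probes/actions that do not reference secret_name."""
    probes = experiment.get("steady-state-hypothesis", {}).get("probes", [])
    activities = probes + experiment.get("method", []) + experiment.get("rollbacks", [])
    return [
        activity.get("name", "<unnamed>")
        for activity in activities
        if secret_name not in activity.get("secrets", [])
    ]


experiment = {
    "secrets": {"k8s": {"KUBERNETES_CONTEXT": "my-context"}},  # placeholder context name
    "steady-state-hypothesis": {
        "probes": [{"name": "all-our-microservices-should-be-healthy", "secrets": ["k8s"]}]
    },
    "method": [{"name": "terminate-db-pod"}],  # deliberately missing "secrets"
}
missing = activities_missing_secret(experiment, "k8s")
print(missing)
```

In practice the experiment dictionary would come from json.load on the experiment file, and a non-empty result points at activities that would not receive the context.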
When running from a pod (rather than your local machine or a CI, for instance), the ~/.kube/config file does not exist. Instead, the credentials can be found at /var/run/secrets/kubernetes.io/serviceaccount/token.
To let the extension know about this, simply set the CHAOSTOOLKIT_IN_POD environment variable in the pod specification:
env:
- name: CHAOSTOOLKIT_IN_POD
value: "true"
Finally, you may explicitly pass all required credentials to the experiment, using an API key, a username and password, or a certificate and key file, as follows:
{
"secrets": {
"kubernetes": {
"KUBERNETES_HOST": "http://somehost",
"KUBERNETES_API_KEY": {
"type": "env",
"key": "SOME_ENV_VAR"
}
}
}
}
{
"secrets": {
"kubernetes": {
"KUBERNETES_HOST": "http://somehost",
"KUBERNETES_USERNAME": {
"type": "env",
"key": "SOME_ENV_VAR"
},
"KUBERNETES_PASSWORD": {
"type": "env",
"key": "SOME_ENV_VAR"
}
}
}
}
{
"secrets": {
"kubernetes": {
"KUBERNETES_HOST": "http://somehost",
"KUBERNETES_CERT_FILE": {
"type": "env",
"key": "SOME_ENV_VAR"
},
"KUBERNETES_KEY_FILE": {
"type": "env",
"key": "SOME_ENV_VAR"
}
}
}
}
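Each {"type": "env", "key": ...} entry above stands for a value read from an environment variable at run time. The following Python sketch illustrates that substitution; resolve_secrets is a hypothetical helper written for this example, not the loader Chaos Toolkit uses, and the environment is faked with a plain dictionary.

```python
def resolve_secrets(secrets, environ):
    """Hypothetical resolver: replace {"type": "env", "key": K} entries with environ[K]."""
    resolved = {}
    for name, value in secrets.items():
        if isinstance(value, dict) and value.get("type") == "env":
            # Raises KeyError when the variable is not set, so the gap is visible early.
            resolved[name] = environ[value["key"]]
        elif isinstance(value, dict):
            # Descend into nested groups such as "kubernetes".
            resolved[name] = resolve_secrets(value, environ)
        else:
            # Inline values such as KUBERNETES_HOST are kept as they are.
            resolved[name] = value
    return resolved


declared = {
    "kubernetes": {
        "KUBERNETES_HOST": "http://somehost",
        "KUBERNETES_API_KEY": {"type": "env", "key": "SOME_ENV_VAR"},
    }
}
# Fake environment used for the demonstration only.
resolved = resolve_secrets(declared, {"SOME_ENV_VAR": "s3cr3t"})
print(resolved["kubernetes"]["KUBERNETES_HOST"])
```

Keeping only the variable name in the experiment file means the file can be committed or shared without exposing the credential itself.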
On some managed Kubernetes clusters, you also need to authenticate against the platform itself because the Kubernetes authentication is delegated to it.
In addition to your Kubernetes credentials (via the ~/.kube/config file), you need to authenticate against the Google Cloud Platform itself. Usually this is done via:
$ gcloud auth login
This can also be achieved by defining the GOOGLE_APPLICATION_CREDENTIALS environment variable.
If you wish to contribute more functions to this package, you are more than welcome to do so. Please, fork this project, write unit tests to cover the proposed changes, implement the changes, ensure they meet the formatting standards and then raise a PR to the repository for review.
Please refer to the formatting section for more information on the formatting standards.
The Chaos Toolkit projects require all contributors to sign a Developer Certificate of Origin on each commit they would like to merge into the master branch of the repository. Please make sure you can abide by the rules of the DCO before submitting a PR.
If you wish to develop on this project, first install PDM, then use it to install the project's development dependencies:
$ pdm install
Now you can edit the files, and your changes will automatically be seen by your environment, even when running the chaos command locally.
To run the tests for the project execute the following:
$ pdm run tests
We use ruff to both lint and format this repository's code.
Before raising a Pull Request, we recommend you run formatting against your code with:
$ pdm run format
This will automatically format any code that doesn't adhere to the formatting standards.
As some things are not picked up by the formatter, we also recommend you run:
$ pdm run lint
This ensures that unused import statements, overly long strings, and similar issues are also picked up.