Add doc index and improve installation with helm support #173

Open
wants to merge 4 commits into
base: main
56 changes: 56 additions & 0 deletions docs/index.md
@@ -0,0 +1,56 @@
## Cerberus watchdog Guide


### Table of Contents
- [Cerberus watchdog Guide](#cerberus-watchdog-guide)
- [Table of Contents](#table-of-contents)
- [Introduction](#introduction)
- [Tooling](#tooling)
- [Workflow](#workflow)
- [Using Cerberus as part of a tekton pipeline](#using-cerberus-as-part-of-a-tekton-pipeline)
- [Start as a single taskrun](#start-as-a-single-taskrun)
- [Start as a pipelinerun](#start-as-a-pipelinerun)


### Introduction

A key part of chaos testing an infrastructure is obtaining a reliable health status for the targeted cluster.
Cerberus is the central component that regularly observes various core components of the targeted cluster and returns an updated
signal of the global health of your cluster.

For more detail about chaos challenges, read the [introduction to chaos testing](https://github.com/chaos-kubox/krkn/blob/main/docs/index.md#introduction) in the krkn documentation.

### Tooling

In this section, we will go through how [cerberus](https://github.com/chaos-kubox/cerberus), a cluster watchdog, can help test the global health of an OpenShift cluster by tracking state changes and returning an updated global health signal.

#### Workflow
Let us start with the Cerberus workflow. The user runs cerberus against a specific OpenShift cluster, using a kubeconfig so that it can talk to the cluster and to the platform on which the cluster is hosted. This can be done through either the oc/kubectl API or the cloud API. Depending on its configuration, cerberus will [watch for nodes](https://github.com/startxfr/cerberus/blob/main/docs/config.md#watch-nodes),
[watch for cluster operators](https://github.com/startxfr/cerberus/blob/main/docs/config.md#watch-cluster-operators),
Collaborator

There is a list here of components that cerberus can watch. Could you update that check box list with the links to the docs/config section for each? Please be sure to add the link to the main chaos-kubox repo not your own thanks

[watch for master schedulable status](https://github.com/startxfr/cerberus/blob/main/docs/config.md#watch-master-schedulable-status),
[watch for defined namespaces](https://github.com/startxfr/cerberus/blob/main/docs/config.md#watch-namespaces) and
[watch for defined routes](https://github.com/startxfr/cerberus/blob/main/docs/config.md#watch-routes).
According to the results of these checks, cerberus returns a go/no-go signal representing the overall health of the cluster.

![Cerberus workflow](../media/cerberus-workflow.png)
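
As a rough illustration of how a script might consume that go/no-go signal, the sketch below assumes cerberus publishes the signal over HTTP as the plain text `True` or `False`. The helper names, the `CERBERUS_URL` variable and its default value are placeholders for illustration only and are not part of this change.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: function names, CERBERUS_URL and its default are assumptions.

# Map the raw signal text to an exit status: 0 = go, 1 = no-go.
# Anything other than the literal "True" is treated as no-go.
signal_is_go() {
  [ "$1" = "True" ]
}

# Fetch the signal from a running cerberus and evaluate it.
# A failed request (network error, HTTP error) is also treated as no-go.
check_cerberus() {
  local url="${CERBERUS_URL:-http://0.0.0.0:8080}"
  local signal
  signal="$(curl -fsS --max-time 10 "$url")" || return 1
  signal_is_go "$signal"
}
```

A caller could then gate a chaos scenario with something like `check_cerberus && echo "go" || echo "no-go"`. Treating every unexpected value as no-go is a deliberately conservative choice.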

### Using Cerberus as part of a tekton pipeline

The [cerberus-check](https://artifacthub.io/packages/tekton-task/startx-tekton-catalog/cerberus-check) `tekton-task`, available on
[artifacthub.io](https://artifacthub.io/packages/search?kind=7&ts_query_web=cerberus),
can be used to check a cerberus signal (and therefore the global health of a cluster) as part of a chaos pipeline.

To use this task, you must have **OpenShift Pipelines** enabled (or the Tekton CRDs loaded on Kubernetes clusters).
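
One way to confirm this prerequisite before applying the samples is to look for the Tekton CRDs. The following is a minimal sketch, assuming `oc` is already logged in to the target cluster; the function name `tekton_ready` is hypothetical.

```bash
# Hypothetical pre-flight check; assumes `oc` is logged in to the target cluster.
# tasks.tekton.dev and pipelines.tekton.dev are CRDs registered by Tekton Pipelines.
tekton_ready() {
  local crd
  for crd in tasks.tekton.dev pipelines.tekton.dev; do
    if ! oc get crd "$crd" >/dev/null 2>&1; then
      echo "missing CRD: $crd" >&2
      return 1
    fi
  done
}
```

Running `tekton_ready && echo "tekton is available"` reports the first missing CRD on standard error when the prerequisite is not met. On a plain Kubernetes cluster, `kubectl` can be substituted for `oc`.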

#### Start as a single taskrun

```bash
oc project default
oc apply -f https://github.com/startxfr/tekton-catalog/raw/stable/task/cerberus-check/0.1/samples/taskrun.yaml
```

#### Start as a pipelinerun

```bash
oc apply -f https://github.com/startxfr/tekton-catalog/raw/stable/task/cerberus-check/0.1/samples/pipelinerun.yaml
```
37 changes: 37 additions & 0 deletions docs/installation.md
@@ -78,3 +78,40 @@ To run Cerberus on Power (ppc64le) architecture, build and run a containerized v

## Run containerized Cerberus as a Kubernetes/OpenShift deployment
Refer to the [instructions](https://github.com/openshift-scale/cerberus/blob/master/containers/README.md#cerberus-as-a-kubernetesopenshift-application) for information on how to run cerberus as a Kubernetes or OpenShift application.

### Deploying Cerberus using a helm-chart

The [chaos-cerberus](https://artifacthub.io/packages/helm/startx/chaos-cerberus) `helm-chart`, available on
[artifacthub.io](https://artifacthub.io/packages/search?kind=0&ts_query_web=cerberus),
can be used to deploy a cerberus server.

The default configuration creates the following resources:

- 1 project named **chaos-cerberus**
- 1 scc with privileged context for **cerberus** deployment
- 1 configmap named **cerberus-config** with cerberus configuration
- 1 configmap named **cerberus-kubeconfig** with kubeconfig of the targeted cluster
- 2 networkpolicies to allow kraken and the route to consume the signal
- 1 deployment named **cerberus**
- 1 service to the cerberus pods
- 1 route to the cerberus service

```bash
# Install the startx helm repository
helm repo add startx https://startxfr.github.io/helm-repository/packages/
# Install the cerberus project
helm install --set project.enabled=true chaos-cerberus-project startx/chaos-cerberus
# Deploy the cerberus instance
helm install \
--set cerberus.enabled=true \
--set cerberus.kraken_allowed=true \
--set cerberus.kraken_ns="chaos-kraken" \
--set cerberus.kubeconfig.token.server="https://api.mycluster:6443" \
--set cerberus.kubeconfig.token.token="sha256~XXXXXXXXXX_PUT_YOUR_TOKEN_HERE_XXXXXXXXXXXX" \
-n chaos-cerberus \
chaos-cerberus-instance startx/chaos-cerberus
```

Refer to the [chaos-cerberus chart manpage](https://artifacthub.io/packages/helm/startx/chaos-cerberus)
and especially the [cerberus configuration values](https://artifacthub.io/packages/helm/startx/chaos-cerberus#chaos-cerberus-values-dictionary)
for details on how to configure this chart.
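
After the install, a quick check that the deployment rolled out and that a route exists can save some debugging. This is only a sketch: the namespace `chaos-cerberus` and the deployment name `cerberus` are taken from the default resources listed earlier, while the function names and the choice of the first route in the namespace are assumptions.

```bash
# Hypothetical post-install check; assumes `oc` is logged in to the cluster
# where the chart was installed with its default resource names.

# Print the host of the first route found in the chaos-cerberus namespace.
cerberus_route_host() {
  oc get route -n chaos-cerberus -o jsonpath='{.items[0].spec.host}'
}

verify_cerberus() {
  # Wait for the cerberus deployment to finish rolling out.
  oc rollout status deployment/cerberus -n chaos-cerberus --timeout=120s || return 1
  local host
  host="$(cerberus_route_host)" || return 1
  # An empty host means no route was found.
  [ -n "$host" ] || return 1
  echo "cerberus route host: ${host}"
}
```

Calling `verify_cerberus` prints the route host on success and returns a non-zero status if the rollout does not complete or no route is found.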