improve documentations for end users (#223)
* improve documentations for end users

* update manifests
sanposhiho authored Dec 18, 2023
1 parent f77218d commit 2a5a3c0
Showing 10 changed files with 218 additions and 90 deletions.
57 changes: 36 additions & 21 deletions README.md
@@ -1,20 +1,12 @@
# tortoise
# Tortoise

<img alt="Tortoise" src="docs/images/tortoise_big.jpg" width="400px"/>

Tortoise, they are living in the Kubernetes cluster.

Tortoise, you need to feed only very few parameters to them.

Tortoise, they will soon start to eat historical usage data of Pods.

Tortoise, once you start to live with them, you no longer need to configure autoscaling by yourself.
Get a cute Tortoise into your Kubernetes garden and say goodbye to the days of optimizing your rigid autoscalers.

## Install

Tortoise, you cannot get it from the breeder.

Tortoise, you need to get it from GitHub instead.
You cannot get it from a breeder; you need to get it from GitHub instead.

```shell
# Install CRDs into the K8s cluster specified in ~/.kube/config.
@@ -23,41 +15,64 @@ make install
make deploy
```

Tortoise, you don't need a rearing cage, but need VPA in your Kubernetes cluster before installing it.
You don't need a rearing cage, but you do need VPA in your Kubernetes cluster before installing it.

## Motivation

Many developers work at Mercari, and not all of them are Kubernetes experts.
The platform provides many tools and guides to simplify the task of optimizing resource requests,
but it still takes a lot of human effort, because the circumstances of each application change frequently and we have to re-optimize every time
(e.g., an implementation change alters the resource consumption, or the amount of traffic changes).

Also, there is another important component to optimize: HorizontalPodAutoscaler.
It is not simply a matter of setting the target utilization as high as possible;
there are many scenarios where the actual resource utilization doesn't reach the target resource utilization
(because of multiple containers, minReplicas, container size, etc.).

To reduce the human effort needed to keep workloads optimized,
the platform team introduced Tortoise, which is designed to simplify the interface of autoscaling.

It aims to move the responsibility for optimizing workloads from the application teams to tortoises.
Application teams just need to set up Tortoise, and the platform team will never bother them again about resource optimization;
all actual optimization is done by Tortoise automatically.

## Usage

Tortoise, they only need the deployment name.
Tortoise has a very simple interface:

```yaml
apiVersion: autoscaling.mercari.com/v1beta2
apiVersion: autoscaling.mercari.com/v1beta3
kind: Tortoise
metadata:
  name: lovely-tortoise
  namespace: zoo
spec:
  updateMode: Auto
  updateMode: Auto
  targetRefs:
    scaleTargetRef:
      kind: Deployment
      name: sample
```
Tortoise, then they'll prepare/keep adjusting HPA and VPA to achieve efficient autoscaling based on the past behavior of the workload.
Yet, beneath its unassuming shell, lies a wealth of historical resource usage data, cunningly harnessed
to deftly orchestrate HPA and VPA with finely-tuned parameters.
Please refer to [User guide](./docs/user-guide.md) for other parameters.
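
For teams that generate manifests programmatically, the same resource can be built as a plain data structure. The following is a minimal Python sketch, not part of Tortoise itself; the helper name `build_tortoise_manifest` and its parameters are hypothetical, and only the fields shown in the example above are set.

```python
import json


def build_tortoise_manifest(name: str, namespace: str, deployment: str,
                            update_mode: str = "Auto") -> dict:
    """Return a Tortoise manifest with the same fields as the README example.

    Hypothetical helper for illustration; it is not shipped with Tortoise.
    """
    return {
        "apiVersion": "autoscaling.mercari.com/v1beta3",
        "kind": "Tortoise",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "updateMode": update_mode,
            "targetRefs": {
                "scaleTargetRef": {"kind": "Deployment", "name": deployment},
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests, so this output can be saved to a file
    # and applied with `kubectl apply -f <file>`.
    manifest = build_tortoise_manifest("lovely-tortoise", "zoo", "sample")
    print(json.dumps(manifest, indent=2))
```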
## Documentations
- [Concept](./docs/concept.md): describes a brief overview of tortoise.
- [Horizontal scaling](./docs/horizontal.md): describes how the Tortoise does the horizontal autoscaling.
- [Vertical scaling](./docs/vertical.md): describes how the Tortoise does the vertical autoscaling.
- [User guide](./docs/user-guide.md): describes the minimum knowledge that end users need,
and how they can configure Tortoise so that tortoises autoscale their workloads.
- [Admin guide](./docs/admin-guide.md): describes how the cluster admin can configure the global behavior of tortoise.
- [Emergency mode](./docs/emergency.md): describes the emergency mode.
- [Configurations for admin](./docs/configuration.md): describes how the cluster admin can configure the global behavior via the configuration file.
- [Horizontal scaling](./docs/horizontal.md): describes how the Tortoise does the horizontal autoscaling internally.
- [Vertical scaling](./docs/vertical.md): describes how the Tortoise does the vertical autoscaling internally.
- [Technical details](./docs/internal.md): describes the technical details of Tortoise. (mostly for contributors)
- [Contributor guide](./docs/contributor-guide.md): describes other stuff for the contributor. (testing etc)
## API definition
- [Tortoise](./api/v1beta2/tortoise_types.go)
- [Tortoise](./api/v1beta3/tortoise_types.go)
## Contribution
2 changes: 1 addition & 1 deletion api/v1beta3/tortoise_types.go
@@ -141,7 +141,7 @@ type TargetRefs struct {
// HorizontalPodAutoscalerName is the name of the target HPA.
// The target of this HPA should be the same as the ScaleTargetRef above.
// The target HPA should have the ContainerResource type metric that refers to the container resource utilization.
// Please check out the document for more detail: https://github.com/mercari/tortoise/blob/master/docs/horizontal.md#supported-metrics-in-hpa
// Please check out the document for more detail: https://github.com/mercari/tortoise/blob/master/docs/horizontal.md#attach-your-hpa
// Also, note that you must not edit the HPA directly after you attach the HPA to the tortoise of Auto mode.
// Even if you edit your HPA in that case, tortoise will overwrite the HPA with the metrics/values.
//
2 changes: 1 addition & 1 deletion config/crd/bases/autoscaling.mercari.com_tortoises.yaml
@@ -660,7 +660,7 @@ spec:
HPA. The target of this HPA should be the same as the ScaleTargetRef
above. The target HPA should have the ContainerResource type
metric that refers to the container resource utilization. Please
check out the document for more detail: https://github.com/mercari/tortoise/blob/master/docs/horizontal.md#supported-metrics-in-hpa
check out the document for more detail: https://github.com/mercari/tortoise/blob/master/docs/horizontal.md#attach-your-hpa
Also, note that you must not edit the HPA directly after you
attach the HPA to the tortoise of Auto mode. Even if you edit
your HPA in that case, tortoise will overwrite the HPA with
8 changes: 5 additions & 3 deletions docs/configuration.md → docs/admin-guide.md
@@ -1,9 +1,11 @@
## Configuration for admin
## Admin guide

<img alt="Tortoise" src="images/eating.jpg" width="400px"/>

The cluster admin can set the global configurations via the configuration file.
The configuration file is passed via `--config` flag.
Tortoise exposes many flags to configure the behavior of tortoises in the cluster.

The cluster admin can set the global configuration via a configuration file,
which is passed via the `--config` flag.

```
RangeOfMinMaxReplicasRecommendationHours: The time (hours) range of minReplicas and maxReplicas recommendation (default: 1)
```
50 changes: 0 additions & 50 deletions docs/concept.md

This file was deleted.

9 changes: 4 additions & 5 deletions docs/emergency.md
@@ -11,17 +11,16 @@ you can turn on the emergency mode by setting `Emergency` on `.spec.UpdateMode`

### How emergency mode works

When emergency mode is enabled, tortoise increases the `minReplicas` to the same value as `maxReplicas`.
When emergency mode is enabled, tortoise increases the `minReplicas` of HPA to the same value as `maxReplicas`.

As described in [Horizontal scaling](./horizontal.md), `maxReplicas` is updated every hour to a fairly high value.
So, during emergency mode, the number of replicas is kept at a fairly high value calculated from past behavior, for safety.

### turning emergency mode off
### Turn off emergency mode

Also, for the safety, after reverting `UpdateMode` from `Emergency` to `Auto`,

Tortoise tries to reduce the number of replicas to the original value gradually.
(A sudden decrease is mostly dangerous.)
(A sudden decrease in the number of replicas is often dangerous.)

Specifically, the controller reduces `minReplicas` to the original value gradually by the following formula in one reconciliation:

@@ -33,5 +32,5 @@ During gradually reducing the `minReplicas`, the Tortoise is in the `BackToNorma

### Note

Emergency mode is available for tortoises with `Running` or `BackToNormal` phase.
Emergency mode is only available for tortoises in the `Running` or `BackToNormal` phase
(because it requires enough historical data to work with).
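
The emergency-mode rule described in this document can be summarized in a few lines. The following Python sketch is illustrative only and is not the controller's code: the function name `apply_emergency_mode` and the dict-shaped HPA spec are hypothetical, and raising an error for other phases is an assumption, since the document only states that emergency mode is unavailable there.

```python
# Phases in which emergency mode is available, as stated in the note above.
EMERGENCY_ALLOWED_PHASES = {"Running", "BackToNormal"}


def apply_emergency_mode(hpa_spec: dict, tortoise_phase: str) -> dict:
    """Return a copy of the HPA spec with minReplicas raised to maxReplicas."""
    if tortoise_phase not in EMERGENCY_ALLOWED_PHASES:
        # Assumption: this sketch rejects the request; the real controller
        # may handle this case differently.
        raise ValueError(
            f"emergency mode is not available in phase {tortoise_phase!r}"
        )
    updated = dict(hpa_spec)  # shallow copy so the input is left untouched
    updated["minReplicas"] = updated["maxReplicas"]
    return updated
```

Because `maxReplicas` is itself derived from historical usage, pinning `minReplicas` to it keeps the workload at a replica count that was safe in the past.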
21 changes: 14 additions & 7 deletions docs/horizontal.md
@@ -7,7 +7,18 @@ by setting `Horizontal` in `Spec.ResourcePolicy[*].AutoscalingPolicy`

For `Horizontal` resources, Tortoise keeps changing the corresponding HPA's fields with the recommendation value calculated from the historical usage.

Let's get into detail how each field gets changed.
### Configure Horizontal scaling

#### Attach your HPA

You can attach your HPA via `.spec.targetRefs.HorizontalPodAutoscalerName`.

Currently, Tortoise supports only `type: ContainerResource` metrics.

If the HPA has `type: Resource` metrics, Tortoise removes them because they would conflict with the `type: ContainerResource` metrics managed by Tortoise.
If the HPA has metrics other than `Resource` or `ContainerResource`, Tortoise keeps them as they are.
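
The handling of metrics on an attached HPA can be pictured as a simple filter. This Python sketch is only an illustration of the rules stated above, not Tortoise's implementation; `filter_attached_hpa_metrics` is a hypothetical name, the metrics are modeled as plain dicts shaped like HPA `spec.metrics` entries, and `queue_length` is a made-up metric name.

```python
from typing import List


def filter_attached_hpa_metrics(metrics: List[dict]) -> List[dict]:
    """Return the metrics that remain after attachment.

    - `Resource` metrics are dropped, because they would conflict with the
      `ContainerResource` metrics that Tortoise manages.
    - Every other metric type (e.g. `External`, `Pods`) is kept untouched.
    """
    return [m for m in metrics if m.get("type") != "Resource"]


attached = [
    {"type": "Resource", "resource": {"name": "cpu"}},
    {"type": "ContainerResource",
     "containerResource": {"name": "cpu", "container": "app"}},
    {"type": "External", "external": {"metric": {"name": "queue_length"}}},
]
kept_types = [m["type"] for m in filter_attached_hpa_metrics(attached)]
print(kept_types)
```

Note that the sketch only covers which metrics survive; in `Auto` mode Tortoise also overwrites the values of the `ContainerResource` metrics, which is not modeled here.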

### How Tortoise

### MaxReplicas

@@ -21,7 +32,7 @@ max{replica numbers at the same time on the same day of week} * MaxReplicasFacto
```
max{replica numbers at the same time} * MaxReplicasFactor
```

(refer to [configuration.md](./configuration.md) about each parameter)
(refer to [admin-guide.md](./admin-guide.md) about each parameter)

It only takes the number of replicas from the last 4 weeks into consideration.

@@ -37,7 +48,7 @@ max{replica numbers at the same time on the same day of week} * MinReplicasFacto
```
max{replica numbers at the same time} * MinReplicasFactor
```

(refer to [configuration.md](./configuration.md) about each parameter)
(refer to [admin-guide.md](./admin-guide.md) about each parameter)

It only takes the number of replicas from the last 4 weeks into consideration.
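
The MaxReplicas and MinReplicas formulas share one shape: take the largest replica count observed at the same time within the last 4 weeks, then multiply it by a factor. The following Python sketch illustrates that shape under stated assumptions and is not Tortoise's implementation: "the same time" is taken to mean the same hour of day, the result is rounded up, history is a list of `(timestamp, replicas)` pairs, and `recommend_replicas` and its parameters are hypothetical names.

```python
import math
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


def recommend_replicas(
    samples: List[Tuple[datetime, int]],
    now: datetime,
    factor: float,
    same_weekday: bool = True,
    weeks: int = 4,
) -> Optional[int]:
    """Return max{replicas at the same time} * factor, or None without data.

    same_weekday=True corresponds to the "same day of week" variant of the
    formula; same_weekday=False uses the same hour on any day.
    """
    cutoff = now - timedelta(weeks=weeks)  # only the last `weeks` weeks count
    candidates = [
        replicas
        for ts, replicas in samples
        if cutoff <= ts <= now
        and ts.hour == now.hour  # assumption: "same time" = same hour of day
        and (not same_weekday or ts.weekday() == now.weekday())
    ]
    if not candidates:
        return None
    # Assumption: round up so the recommendation is a whole number of replicas.
    return math.ceil(max(candidates) * factor)
```

Passing `MaxReplicasFactor` as `factor` gives the `maxReplicas` recommendation and passing `MinReplicasFactor` gives the `minReplicas` one; the actual factor values come from the admin configuration.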

@@ -72,10 +83,6 @@ Looking back the above formula,
- make all container's resource utilization below 100%.
- Thus, finally, `100 - (max{recommended resource usage from VPA}/{current resource request} - {current target utilization})` is the target utilization that gives only the bare minimum of additional resources.
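
As a worked illustration of the final expression, here is a small Python sketch. It assumes that the usage-to-request ratio is expressed as a percentage so that it is comparable with the target utilization (the expression above writes the ratio without units), and the function and parameter names are hypothetical.

```python
def recommended_target_utilization(
    max_recommended_usage: float,
    current_request: float,
    current_target_utilization: float,
) -> float:
    """Evaluate 100 - (max usage / request - current target), in percent.

    max_recommended_usage and current_request must use the same unit
    (e.g. millicores); current_target_utilization is a percentage.
    """
    # Peak utilization implied by the VPA recommendation, as a percentage.
    peak_utilization = 100 * max_recommended_usage / current_request
    # Headroom currently consumed above the target at peak.
    extra_needed = peak_utilization - current_target_utilization
    return 100 - extra_needed
```

For example, with a 1000m request, a 900m peak recommendation, and a current target of 70%, the peak sits 20 points above the target, so the expression yields 80%.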

#### Supported metrics in HPA

Currently, Tortoise supports only `type: ContainerResource` metric.

### The container right sizing

Although it says "Horizontal",