50 changes: 50 additions & 0 deletions docs/cloud/features/04_automatic_scaling/01_overview.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
---
sidebar_position: 1
sidebar_label: 'Overview'
slug: /manage/scaling
description: 'Overview of automatic scaling in ClickHouse Cloud'
keywords: ['autoscaling', 'auto scaling', 'scaling', 'horizontal', 'vertical', 'bursts']
title: 'Automatic scaling'
doc_type: 'guide'
---

import ScalePlanFeatureBadge from '@theme/badges/ScalePlanFeatureBadge'

# Automatic scaling

Scaling is the ability to adjust available resources to meet client demands. Services on the Scale tier, and on the Enterprise tier with the standard 1:4 profile, can be scaled horizontally, either programmatically through the API or by changing settings in the UI. These services can also be **autoscaled** vertically to meet application demands.

<ScalePlanFeatureBadge feature="Automatic vertical scaling"/>

:::note
The Scale and Enterprise tiers support both single-replica and multi-replica services, whereas the Basic tier supports only single-replica services. Single-replica services are meant to be fixed in size and don't allow vertical or horizontal scaling. You can upgrade to the Scale or Enterprise tier to scale your services.
:::

## How scaling works in ClickHouse Cloud {#how-scaling-works-in-clickhouse-cloud}

Currently, ClickHouse Cloud supports vertical autoscaling and manual horizontal scaling for Scale tier services.

For Enterprise tier services, scaling works as follows:

- **Horizontal scaling**: Manual horizontal scaling will be available across all standard and custom profiles on the Enterprise tier.
- **Vertical scaling**:
  - Standard profiles (1:4) will support vertical autoscaling.
  - Custom profiles (`highMemory` and `highCPU`) don't support vertical autoscaling or manual vertical scaling. However, these services can be scaled vertically by contacting support.

:::note
Scaling in ClickHouse Cloud uses what we call a ["Make Before Break" (MBB)](/cloud/features/mbb) approach.
This adds one or more replicas of the new size before removing the old replicas, preventing any loss of capacity during scaling operations.
By eliminating the gap between removing existing replicas and adding new ones, MBB creates a more seamless and less disruptive scaling process.
It is especially beneficial in scale-up scenarios, where high resource utilization triggers the need for additional capacity, since removing replicas prematurely would only exacerbate the resource constraints.
As part of this approach, we wait up to an hour to let any existing queries complete on the older replicas before removing them.
This balances the need for existing queries to complete, while at the same time ensuring that older replicas don't linger around for too long.
:::

## Learn more {#learn-more}

- [Vertical autoscaling](/cloud/features/autoscaling/vertical) — Automatic CPU and memory scaling based on usage
- [Horizontal scaling](/cloud/features/autoscaling/horizontal) — Manual replica scaling via API or UI
- [Make Before Break (MBB)](/cloud/features/mbb) — How ClickHouse Cloud performs seamless scaling operations
- [Automatic idling](/cloud/features/autoscaling/idling) — Cost savings through automatic service suspension
- [Scaling recommendations](/cloud/features/autoscaling/scaling-recommendations) — Understanding scaling recommendations
- [Scheduled scaling](/cloud/features/autoscaling/scheduled-scaling) — Time-based rules that define exactly when your service should scale up or down, independent of real-time metrics
@@ -0,0 +1,68 @@
---
sidebar_position: 2
sidebar_label: 'Vertical autoscaling'
slug: /cloud/features/autoscaling/vertical
description: 'Configuring vertical autoscaling in ClickHouse Cloud'
keywords: ['autoscaling', 'auto scaling', 'vertical', 'scaling', 'CPU', 'memory']
title: 'Vertical autoscaling'
doc_type: 'guide'
---

import Image from '@theme/IdealImage';
import auto_scaling from '@site/static/images/cloud/manage/AutoScaling.png';
import ScalePlanFeatureBadge from '@theme/badges/ScalePlanFeatureBadge'

## Vertical auto scaling {#vertical-auto-scaling}

<ScalePlanFeatureBadge feature="Automatic vertical scaling"/>

Scale and Enterprise tier services support autoscaling based on CPU and memory usage. Service usage is constantly monitored over a lookback window to make scaling decisions. If the usage rises above or falls below certain thresholds, the service is scaled appropriately to match the demand.

## CPU-based scaling {#cpu-based-scaling}

CPU scaling is based on target tracking, which calculates the CPU allocation needed to keep utilization at a target level. A scaling action is triggered only if current CPU utilization falls outside a defined band:

| Parameter | Value | Meaning |
|---|---|---|
| Target utilization | 53% | The utilization level ClickHouse aims to maintain |
| High watermark | 75% | Triggers scale-up when CPU exceeds this threshold |
| Low watermark | 37.5% | Triggers scale-down when CPU falls below this threshold |

The recommender evaluates CPU utilization based on historical usage, and determines a recommended CPU size using this formula:
```text
recommended_cpu = max_cpu_usage / target_utilization
```

If CPU utilization is between 37.5% and 75% of allocated capacity, no scaling action is taken. Outside that band, the recommender computes the size needed to land back at 53% utilization, and the service is scaled accordingly.

### Example {#cpu-scaling-example}

A service allocated 4 vCPU experiences a spike to 3.8 vCPU usage (~95% utilization), crossing the 75% high watermark. The recommender calculates `3.8 / 0.53 ≈ 7.2 vCPU` and rounds up to the next available size (8 vCPU). Once load subsides and usage drops below 37.5% of the new 8 vCPU allocation (3 vCPU), the recommender scales back down using the same formula.

## Memory-based scaling {#memory-based-scaling}

Memory-based autoscaling scales the cluster to 125% of the maximum memory usage, or up to 150% if out-of-memory (OOM) errors are encountered.

## Scaling decision {#scaling-decision}

The larger of the CPU and memory recommendations is picked, and the CPU and memory allocated to the service are scaled in lockstep, in increments of 1 CPU and 4 GiB of memory.
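
As a rough sketch of how the two recommendations might combine, the Python below converts the memory recommendation into the number of 1 CPU / 4 GiB steps it requires and takes the larger of the two. The function names are hypothetical, and using exactly 150% after an OOM is an assumption: the text above only states "up to 150%".

```python
import math

GIB_PER_CPU = 4  # lockstep increment: 1 CPU with 4 GiB of memory


def recommend_memory_gib(max_memory_usage_gib: float, oom_seen: bool) -> float:
    """Memory recommendation: 125% of peak usage, 150% after an OOM.

    Assumption: the full 150% is applied whenever an OOM was seen.
    """
    factor = 1.5 if oom_seen else 1.25
    return max_memory_usage_gib * factor


def scaling_decision(cpu_recommendation: float,
                     memory_recommendation_gib: float) -> tuple[int, int]:
    """Pick the larger recommendation and return (vCPU, GiB) in lockstep."""
    cpus_for_memory = math.ceil(memory_recommendation_gib / GIB_PER_CPU)
    cpus = max(math.ceil(cpu_recommendation), cpus_for_memory)
    return cpus, cpus * GIB_PER_CPU


if __name__ == "__main__":
    memory = recommend_memory_gib(20, oom_seen=False)  # 25.0 GiB
    print(scaling_decision(8, memory))  # CPU recommendation dominates
    print(scaling_decision(2, memory))  # memory recommendation dominates
```

In the first call the CPU recommendation of 8 vCPU wins, giving 8 vCPU and 32 GiB. In the second, 25 GiB of memory needs 7 steps, so the service gets 7 vCPU and 28 GiB even though CPU alone would need only 2.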

## Configuring vertical auto scaling {#configuring-vertical-auto-scaling}

The scaling of ClickHouse Cloud Scale or Enterprise services can be adjusted by organization members with the **Admin** role. To configure vertical autoscaling, go to the **Settings** tab for your service and adjust the minimum and maximum memory, along with CPU settings as shown below.

:::note
Single-replica services can't be scaled on any tier.
:::

<Image img={auto_scaling} size="lg" alt="Scaling settings page" border/>

Set the **Maximum memory** for your replicas at a higher value than the **Minimum memory**. The service will then scale as needed within those bounds. These settings are also available during the initial service creation flow. Each replica in your service will be allocated the same memory and CPU resources.

You can also set these two values to be the same, essentially "pinning" the service to a specific configuration. Doing so immediately forces scaling to the size you picked.

Note that pinning disables autoscaling on the cluster, so your service won't be protected against increases in CPU or memory usage beyond these settings.

:::note
For Enterprise tier services, standard 1:4 profiles will support vertical autoscaling. Custom profiles don’t support vertical autoscaling or manual vertical scaling. However, these services can be scaled vertically by contacting support.
:::
@@ -0,0 +1,77 @@
---
sidebar_position: 3
sidebar_label: 'Horizontal scaling'
slug: /cloud/features/autoscaling/horizontal
description: 'Manual horizontal scaling in ClickHouse Cloud'
keywords: ['horizontal scaling', 'scaling', 'replicas', 'manual scaling', 'spikes', 'bursts']
title: 'Horizontal scaling'
doc_type: 'guide'
---

import Image from '@theme/IdealImage';
import scaling_patch_request from '@site/static/images/cloud/manage/scaling-patch-request.png';
import scaling_patch_response from '@site/static/images/cloud/manage/scaling-patch-response.png';
import scaling_configure from '@site/static/images/cloud/manage/scaling-configure.png';
import scaling_memory_allocation from '@site/static/images/cloud/manage/scaling-memory-allocation.png';
import ScalePlanFeatureBadge from '@theme/badges/ScalePlanFeatureBadge'

## Manual horizontal scaling {#manual-horizontal-scaling}

<ScalePlanFeatureBadge feature="Manual horizontal scaling"/>

You can scale your service either by updating its scaling settings through the ClickHouse Cloud [public APIs](https://clickhouse.com/docs/cloud/manage/api/swagger#/paths/~1v1~1organizations~1:organizationId~1services~1:serviceId~1scaling/patch) or by adjusting the number of replicas from the cloud console.

**Scale** and **Enterprise** tiers also support single-replica services. Once scaled out, a service can be scaled back in to a minimum of a single replica. Note that single-replica services have reduced availability and aren't recommended for production usage.

:::note
Services can scale horizontally to a maximum of 20 replicas. If you need additional replicas, please contact our support team.
:::

### Horizontal scaling via API {#horizontal-scaling-via-api}

To horizontally scale a cluster, issue a `PATCH` request via the API to adjust the number of replicas. The screenshots below show an API call to scale out a `3` replica cluster to `6` replicas, and the corresponding response.

<Image img={scaling_patch_request} size="lg" alt="Scaling PATCH request" border/>

*`PATCH` request to update `numReplicas`*

<Image img={scaling_patch_response} size="md" alt="Scaling PATCH response" border/>

*Response from `PATCH` request*

If you issue a new scaling request, or several requests in succession, while one is already in progress, the scaling service ignores the intermediate states and converges on the final replica count.
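
For readers who prefer code to screenshots, the `PATCH` call can be sketched with the Python standard library. The path and the `numReplicas` field come from the API link and caption above. The base URL, the use of HTTP Basic authentication with an API key ID and secret, and the helper name `build_scaling_request` are assumptions, so check them against the API reference before relying on this.

```python
import base64
import json
import urllib.request

# Assumption: base URL of the ClickHouse Cloud API.
API_BASE = "https://api.clickhouse.cloud"


def build_scaling_request(org_id: str, service_id: str, num_replicas: int,
                          key_id: str, key_secret: str) -> urllib.request.Request:
    """Build (but don't send) a PATCH request that sets numReplicas."""
    url = f"{API_BASE}/v1/organizations/{org_id}/services/{service_id}/scaling"
    body = json.dumps({"numReplicas": num_replicas}).encode("utf-8")
    # Assumption: the API key ID and secret are sent as HTTP Basic credentials.
    token = base64.b64encode(f"{key_id}:{key_secret}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholder identifiers and credentials; nothing is sent over the network.
    request = build_scaling_request("my-org-id", "my-service-id", 6,
                                    "my-key-id", "my-key-secret")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the returned object to `urllib.request.urlopen` would send the request. The sketch stops short of that so it can be run without credentials.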

### Horizontal scaling via UI {#horizontal-scaling-via-ui}

To scale a service horizontally from the UI, you can adjust the number of replicas for the service on the **Settings** page.

<Image img={scaling_configure} size="md" alt="Scaling configuration settings" border/>

*Service scaling settings from the ClickHouse Cloud console*

Once the service has scaled, the metrics dashboard in the cloud console should show the correct allocation to the service. The screenshot below shows the cluster having scaled to total memory of `96 GiB`, which is `6` replicas, each with `16 GiB` memory allocation.

<Image img={scaling_memory_allocation} size="md" alt="Scaling memory allocation" border />

## Handling spikes in workload {#handling-bursty-workloads}

If you have an upcoming expected spike in your workload, you can use the
[ClickHouse Cloud API](/cloud/manage/api/api-overview) to
preemptively scale up your service to handle the spike and scale it down once
the demand subsides.

To understand the current CPU cores and memory in use for
each of your replicas, you can run the query below:

```sql
SELECT *
FROM clusterAllReplicas('default', view(
    SELECT
        hostname() AS server,
        anyIf(value, metric = 'CGroupMaxCPU') AS cpu_cores,
        formatReadableSize(anyIf(value, metric = 'CGroupMemoryTotal')) AS memory
    FROM system.asynchronous_metrics
))
ORDER BY server ASC
SETTINGS skip_unavailable_shards = 1
```
30 changes: 30 additions & 0 deletions docs/cloud/features/04_automatic_scaling/05_automatic_idling.md
@@ -0,0 +1,30 @@
---
sidebar_position: 5
sidebar_label: 'Automatic idling'
slug: /cloud/features/autoscaling/idling
description: 'Automatic idling and adaptive idling in ClickHouse Cloud'
keywords: ['idling', 'automatic idling', 'adaptive idling', 'cost savings', 'pause']
title: 'Automatic idling'
doc_type: 'guide'
---

## Automatic idling {#automatic-idling}
In the **Settings** page, you can also choose whether or not to allow automatic idling of your service when it is inactive for a certain duration (i.e. when the service isn't executing any user-submitted queries). Automatic idling reduces the cost of your service, as you're not billed for compute resources when the service is paused.

### Adaptive idling {#adaptive-idling}
ClickHouse Cloud implements adaptive idling to prevent disruptions while optimizing cost savings. The system evaluates several conditions before transitioning a service to idle. Adaptive idling overrides the idling duration setting when any of the following conditions are met:
- When the number of parts exceeds the maximum idle parts threshold (default: 10,000), the service isn't idled so that background maintenance can continue.
- When there are ongoing merge operations, the service isn't idled until those merges complete, to avoid interrupting critical data consolidation.
- The service also adapts the idle timeout based on server initialization time:
  - If server initialization time is less than 15 minutes, no adaptive timeout is applied and the customer-configured idle timeout is used.
  - If server initialization time is between 15 and 30 minutes, the idle timeout is set to 15 minutes.
  - If server initialization time is between 30 and 60 minutes, the idle timeout is set to 30 minutes.
  - If server initialization time is more than 60 minutes, the idle timeout is set to 1 hour.
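
The conditions above can be read as a small decision function. The Python sketch below is illustrative only: both function names are hypothetical, and because the listed ranges share their endpoints, the choice to treat exactly 30 minutes as part of the 30 to 60 minute range is an assumption.

```python
MAX_IDLE_PARTS = 10_000  # default threshold from the list above


def idling_blocked(part_count: int, active_merges: int) -> bool:
    """True when the service should not be idled yet."""
    return part_count > MAX_IDLE_PARTS or active_merges > 0


def effective_idle_timeout_minutes(init_minutes: float,
                                   configured_minutes: float) -> float:
    """Idle timeout after adapting to server initialization time.

    Assumption: each range includes its lower bound, so exactly
    30 minutes falls into the 30 to 60 minute range.
    """
    if init_minutes < 15:
        return configured_minutes  # no adaptive timeout applied
    if init_minutes < 30:
        return 15
    if init_minutes <= 60:
        return 30
    return 60


if __name__ == "__main__":
    print(idling_blocked(part_count=12_000, active_merges=0))
    print(effective_idle_timeout_minutes(5, configured_minutes=10))
    print(effective_idle_timeout_minutes(45, configured_minutes=10))
```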

:::note
The service may enter an idle state where it suspends refreshes of [refreshable materialized views](/materialized-view/refreshable-materialized-view), consumption from [S3Queue](/engines/table-engines/integrations/s3queue), and scheduling of new merges. Existing merge operations will complete before the service transitions to the idle state. To ensure continuous operation of refreshable materialized views and S3Queue consumption, disable the idle state functionality.
:::

:::danger When not to use automatic idling
Use automatic idling only if your use case can handle a delay before responding to queries, because when a service is paused, connections to the service will time out. Automatic idling is ideal for services that are used infrequently and where a delay can be tolerated. It isn't recommended for services that power customer-facing features that are used frequently.
:::
@@ -0,0 +1,21 @@
---
sidebar_position: 6
sidebar_label: 'Scaling recommendations'
slug: /cloud/features/autoscaling/scaling-recommendations
description: 'Understanding scaling recommendations in ClickHouse Cloud'
keywords: ['scaling recommendations', 'recommender', '2-window', 'autoscaling', 'optimization']
title: 'Scaling recommendations'
doc_type: 'guide'
---

ClickHouse Cloud automatically adjusts CPU and memory resources for each service based on real-time usage, ensuring stable performance while minimizing wasted resources. To balance responsiveness with stability, we use a two-window recommender system that monitors utilization over both a short 3-hour window and a longer 30-hour window. This lets us react quickly to changes while still basing decisions on longer-term trends.

When usage increases, the system references the long window so it can scale up in a single, decisive step to the highest observed load within the past 30 hours. This approach minimizes repeated scale events. Conversely, when traffic declines, the short window guides a quick scale-down within about three hours, conserving resources.

By integrating these two perspectives, the recommender intelligently balances responsiveness with stability.
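
One way to picture the two windows is as two peaks computed over the same usage history. The Python sketch below is a simplification under stated assumptions: usage is sampled once per hour, the function names are hypothetical, and the real recommender is certainly more involved than picking one of two maxima.

```python
SHORT_WINDOW_HOURS = 3
LONG_WINDOW_HOURS = 30


def window_peaks(hourly_usage: list[float]) -> tuple[float, float]:
    """Return (short_peak, long_peak) from hourly samples, oldest first."""
    short_peak = max(hourly_usage[-SHORT_WINDOW_HOURS:])
    long_peak = max(hourly_usage[-LONG_WINDOW_HOURS:])
    return short_peak, long_peak


def reference_peak(hourly_usage: list[float], scaling_up: bool) -> float:
    """Pick the peak that guides the decision.

    Scale-ups reference the long window so one step covers the highest
    load seen in the past 30 hours. Scale-downs follow the short window
    so capacity is released within about three hours.
    """
    short_peak, long_peak = window_peaks(hourly_usage)
    return long_peak if scaling_up else short_peak


if __name__ == "__main__":
    # 30 hourly samples: steady load, one spike, then a quiet tail.
    usage = [2.0] * 20 + [6.0] + [2.0] * 6 + [1.0] * 3
    print(reference_peak(usage, scaling_up=True))
    print(reference_peak(usage, scaling_up=False))
```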

## Benefits {#benefits}

- **Cost optimization:** Right-size your services to avoid paying for unused resources while maintaining performance.
- **Proactive scaling:** Get ahead of potential performance issues before they impact your workloads.
- **Balanced approach:** The two-window design prevents over-provisioning from transient spikes while still ensuring adequate headroom for real demand.
@@ -0,0 +1,54 @@
---
sidebar_position: 7
sidebar_label: 'Scheduled scaling'
slug: /cloud/features/autoscaling/scheduled-scaling
description: 'Article discussing the Scheduled Scaling feature in ClickHouse Cloud'
keywords: ['scheduled scaling']
title: 'Scheduled scaling'
doc_type: 'guide'
---

import PrivatePreviewBadge from '@theme/badges/PrivatePreviewBadge';
import Image from '@theme/IdealImage';
import scheduled_scaling_1 from '@site/static/images/cloud/features/autoscaling/scheduled-scaling-1.png';
import scheduled_scaling_2 from '@site/static/images/cloud/features/autoscaling/scheduled-scaling-2.png';

<PrivatePreviewBadge/>

ClickHouse Cloud services automatically scale based on CPU and memory utilization, but many workloads follow predictable patterns — daily ingestion spikes, batch jobs that run overnight, or traffic that drops sharply on weekends. For these use cases, Scheduled Scaling lets you define exactly when your service should scale up or down, independent of real-time metrics.

With Scheduled Scaling, you configure a set of time-based rules directly in the ClickHouse Cloud console. Each rule specifies a time, a recurrence (daily, weekly, or custom), and the target size — either the number of replicas (horizontal) or the memory tier (vertical). At the scheduled time, ClickHouse Cloud automatically applies the change, so your service is sized appropriately before demand arrives rather than reacting after the fact.

This is distinct from metric-based autoscaling, which responds dynamically to CPU and memory pressure. Scheduled Scaling is deterministic: you know exactly when the scaling will happen and to what size. The two approaches are complementary — a service can have a baseline scaling schedule and still benefit from autoscaling within that window if workloads fluctuate unexpectedly.

Scheduled Scaling is currently available in **Private Preview**. To enable it for your organization, contact the ClickHouse support team.

## Setting up a scaling schedule {#setting-up-a-scaling-schedule}

To configure a schedule, navigate to your service in the ClickHouse Cloud console and open the **Settings** page. From there, select **Schedule Override** and add a new rule.

<Image img={scheduled_scaling_1} size="md" alt="The Scaling Schedules interface in the ClickHouse Cloud console, showing time-based scaling rules" border/>

<Image img={scheduled_scaling_2} size="md" alt="Configuring a scheduled scaling rule in the ClickHouse Cloud console" border/>

Each rule requires:

- **Time:** When the scaling action should occur (in your local timezone)
- **Recurrence:** How often the rule repeats (e.g. every weekday, every Sunday)
- **Target size:** The number of replicas or memory allocation to scale to

Multiple rules can be combined to form a full weekly schedule. For example, you might scale out to 5 replicas every weekday at 6 AM and scale back to 2 replicas at 8 PM.
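
To make the weekday example concrete, the Python sketch below models rules as plain data and works out which replica count is in effect at a given moment by finding the most recently triggered rule. The `Rule` class and `replicas_at` function are hypothetical illustrations of the idea. They are not part of the ClickHouse Cloud console or API.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass(frozen=True)
class Rule:
    """Hypothetical representation of one scheduled scaling rule."""
    weekdays: frozenset  # 0 = Monday ... 6 = Sunday
    at: time
    replicas: int


def replicas_at(rules: list, now: datetime):
    """Return the replica count set by the most recently triggered rule."""
    latest = None  # (trigger time, replicas)
    # Look back over a full week so every weekly rule is considered.
    for days_back in range(8):
        day = (now - timedelta(days=days_back)).date()
        for rule in rules:
            if day.weekday() not in rule.weekdays:
                continue
            fired = datetime.combine(day, rule.at)
            if fired <= now and (latest is None or fired > latest[0]):
                latest = (fired, rule.replicas)
    return latest[1] if latest else None


WEEKDAYS = frozenset(range(5))
SCHEDULE = [
    Rule(WEEKDAYS, time(6, 0), 5),   # scale out at 6 AM on weekdays
    Rule(WEEKDAYS, time(20, 0), 2),  # scale back at 8 PM on weekdays
]

if __name__ == "__main__":
    print(replicas_at(SCHEDULE, datetime(2024, 1, 3, 10, 0)))  # Wednesday morning
    print(replicas_at(SCHEDULE, datetime(2024, 1, 6, 12, 0)))  # Saturday noon
```

Because the last rule to fire before the weekend is Friday's 8 PM scale-back, the service stays at 2 replicas until Monday at 6 AM.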

## Use cases {#use-cases}

**Batch and ETL workloads:** Scale up before a nightly ingest job runs and scale back down once it completes, avoiding over-provisioning during idle daytime hours.

**Predictable traffic patterns:** Services with consistent peak hours (e.g. business-hours query traffic) can be pre-scaled to handle load before it arrives, rather than waiting for autoscaling to react.

**Weekend scale-down:** Reduce replica count or memory tier over weekends when demand is lower, then restore capacity before the Monday morning surge.

**Cost control:** For teams managing ClickHouse Cloud spend, scheduled scale-downs during known low-utilization periods can meaningfully reduce resource consumption without any manual intervention.

:::note
If a scheduled scaling action coincides with an autoscaling recommendation, the schedule takes precedence at its trigger time.
:::
@@ -2,4 +2,4 @@
"label": "Automatic Scaling",
"collapsible": true,
"collapsed": true
}
}