Commit 37186d2

Update top-n-and-remain.md
Added Description
1 parent 93334bb commit 37186d2

1 file changed, +24 -6 lines changed

content/en/altinity-kb-queries-and-syntax/top-n-and-remain.md

Lines changed: 24 additions & 6 deletions
@@ -4,6 +4,12 @@ linkTitle: "Top N & Remain"
description: >
Top N & Remain
---
+
+When working with large datasets, you often need to compute the sum of values for the top N groups and aggregate the remainder into a single row. This article demonstrates several ways to do that in ClickHouse.
+
+Dataset Setup
+We'll start by creating a table `top_with_rest` and inserting data for demonstration purposes:
+
```sql
CREATE TABLE top_with_rest
(
@@ -18,7 +24,10 @@ INSERT INTO top_with_rest SELECT
FROM numbers_mt(10000);
```

-## Using UNION ALL
+This creates a table of 10,000 numbers, grouped into tens by integer division.
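As a quick sanity check of the grouping, the following sketch rebuilds the top groups directly from the `numbers` table function rather than from the table above. The INSERT expression is not visible in this diff, so the key expression `toString(intDiv(number, 10))` and the aliases `k` and `res` are assumptions chosen to match the names used in the later queries.

```sql
-- Sketch: top 10 groups by sum, computed straight from numbers(10000).
-- Assumption: the grouping key is intDiv(number, 10) rendered as a string.
SELECT
    toString(intDiv(number, 10)) AS k,
    sum(number) AS res
FROM numbers(10000)
GROUP BY k
ORDER BY res DESC
LIMIT 10;
```

Under that assumption each key covers ten consecutive numbers, so there are 1,000 groups, and the ten largest sums add up to 994950. That figure is consistent with the total (49995000) and the remainder (49000050) shown in the outputs of the methods that follow.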
+
+## Method 1: Using UNION ALL
+This approach retrieves the top 10 groups by sum and aggregates the remaining groups into a single extra row.

```sql
SELECT *
@@ -63,7 +72,9 @@ ORDER BY res ASC
└──────┴──────────┘
```

-## Using arrays
+
+## Method 2: Using Arrays
+In this method, we push the top 10 groups into an array and add a special row for the remainder.

```sql
WITH toUInt64(sumIf(sum, isNull(k)) - sumIf(sum, isNotNull(k))) AS total
@@ -98,7 +109,8 @@ ORDER BY res ASC
└──────┴──────────┘
```

-## Using window functions (starting from ClickHouse® 21.1)
+## Method 3: Using Window Functions
+Window functions, available since ClickHouse version 21.1 (initially as an experimental feature), provide a way to calculate the sum for the top N rows and the remainder.

```sql
SET allow_experimental_window_functions = 1;
@@ -139,7 +151,10 @@ ORDER BY res ASC
│ null │ 49000050 │
└──────┴──────────┘
```
+Window functions compute the grand total and the top-group sums in one query.

+## Method 4: Using Row Number and Grouping
+This approach calculates the row number (`rn`) for each group and replaces the keys of the remaining groups with NULL, so that they collapse into one row when grouped.
```sql
SELECT
    k,
@@ -183,10 +198,10 @@ ORDER BY res
│ null │ 49000050 │
└──────┴──────────┘
```
+This method uses ROW_NUMBER() to separate the top N groups from the rest.
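The row-number idea can be sketched as a self-contained query. This is not the article's query, whose body is elided in the diff. It is a minimal illustration that reads from `numbers(10000)` and assumes the grouping key is `toString(intDiv(number, 10))`. The aliases `grp` and `total` are hypothetical names, picked to avoid clashing with the inner columns `k` and `res`.

```sql
-- Sketch: number the groups by descending sum, keep the key for the top 10,
-- and map every other group to NULL so they aggregate into one row.
SELECT
    if(rn <= 10, k, NULL) AS grp,   -- hypothetical alias
    sum(res) AS total               -- hypothetical alias
FROM
(
    SELECT
        k,
        res,
        row_number() OVER (ORDER BY res DESC) AS rn
    FROM
    (
        -- Assumption: same grouping as the demo dataset.
        SELECT
            toString(intDiv(number, 10)) AS k,
            sum(number) AS res
        FROM numbers(10000)
        GROUP BY k
    )
)
GROUP BY grp
ORDER BY total ASC;
```

On ClickHouse versions where window functions are still experimental, the `SET allow_experimental_window_functions = 1;` statement shown in Method 3 would likely be needed first.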

-## Using WITH TOTALS
-
-The total number will include the top rows as well so the remainder must be calculated by the application
+## Method 5: Using WITH TOTALS
+This method returns a totals row that covers all groups, including the top N, so you calculate the remainder on the application side.

```
SELECT
@@ -216,3 +231,6 @@ Totals:
│   │ 49995000 │
└───┴──────────┘
```
+You would subtract the sum of the top rows from the totals row in your application.
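The subtraction can be checked against the figures in this article. The totals row reports 49995000, and the other methods report a remainder of 49000050, which implies that the top 10 rows sum to 994950. The literal values below are copied or derived from those outputs. In practice the application would read them from the result set.

```sql
-- Worked check using literal values taken from the outputs above.
-- 49995000: the Totals row; 994950: sum of the top 10 rows (derived).
SELECT 49995000 - 994950 AS remainder;  -- 49000050
```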
+
+These methods offer different approaches to reporting the top N groups and aggregating the remainder in ClickHouse: UNION ALL, arrays, window functions, row numbers, or WITH TOTALS. Choose the one that best fits your requirements.
