content/en/altinity-kb-queries-and-syntax/top-n-and-remain.md
24 additions & 6 deletions
@@ -4,6 +4,12 @@ linkTitle: "Top N & Remain"
description: >
    Top N & Remain
---
+
+When working with large datasets, you may often need to compute the sum of values for the top N groups and aggregate the remainder separately. This article demonstrates several methods to achieve that in ClickHouse.
+
+Dataset Setup
+We'll start by creating a table top_with_rest and inserting data for demonstration purposes:
+
```sql
CREATE TABLE top_with_rest
(
@@ -18,7 +24,10 @@ INSERT INTO top_with_rest SELECT
FROM numbers_mt(10000);
```

-## Using UNION ALL
+This populates the table with 10,000 numbers, grouped into tens.
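As a sanity check on the data shape, the grouping can be modeled outside ClickHouse. The Python sketch below assumes the group key is the number integer-divided by 10; the INSERT's SELECT list is not visible in this hunk, so that expression is an assumption, and the helper name `group_sums` is hypothetical.

```python
from typing import Dict


def group_sums(n: int = 10_000, width: int = 10) -> Dict[int, int]:
    """Sum the integers 0..n-1 per group, where group key = number // width.

    Assumption: this mirrors how top_with_rest is populated; the actual
    key expression is not visible in the diff.
    """
    sums: Dict[int, int] = {}
    for number in range(n):
        key = number // width
        sums[key] = sums.get(key, 0) + number
    return sums


sums = group_sums()
group_count = len(sums)           # number of distinct groups
grand_total = sum(sums.values())  # sum of all inserted numbers
```

Under that assumption there are 1,000 groups and the grand total is 49,995,000, which agrees with the Totals row shown in the WITH TOTALS output.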
+
+## Method 1: Using UNION ALL
+This approach retrieves the top 10 groups by sum and aggregates the remaining groups as a separate row.

```sql
SELECT *
@@ -63,7 +72,9 @@ ORDER BY res ASC
└──────┴──────────┘
```

-## Using arrays
+
+## Method 2: Using Arrays
+In this method, we push the top 10 groups into an array and add a special row for the remainder.

```sql
WITH toUInt64(sumIf(sum, isNull(k)) - sumIf(sum, isNotNull(k))) AS total
@@ -98,7 +109,8 @@ ORDER BY res ASC
└──────┴──────────┘
```

-## Using window functions (starting from ClickHouse® 21.1)
+## Method 3: Using Window Functions
+Window functions, available from ClickHouse version 21.1, provide an efficient way to calculate the sum for the top N rows and the remainder.

```sql
SET allow_experimental_window_functions = 1;
@@ -139,7 +151,10 @@ ORDER BY res ASC
│ null │ 49000050 │
└──────┴──────────┘
```
+Window functions allow efficient summation of the total and top groups in one query.

+## Method 4: Using Row Number and Grouping
+This approach calculates the row number (rn) for each group and replaces the key of every group outside the top N with NULL.
```sql
SELECT
    k,
@@ -183,10 +198,10 @@ ORDER BY res
│ null │ 49000050 │
└──────┴──────────┘
```
+This method uses ROW_NUMBER() to segregate the top N from the rest.
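The bucketing idea behind this method can be modeled outside the database. The Python sketch below is an illustration, not the query from the article: it ranks groups by descending sum, keeps the key for the first N ranks, and folds everything else under `None`, which stands in for the NULL key. The function name `top_n_with_rest` and the data shape (10,000 numbers keyed by `number // 10`) are assumptions.

```python
from typing import Dict, List, Optional, Tuple


def top_n_with_rest(sums: Dict[int, int], n: int = 10) -> List[Tuple[Optional[int], int]]:
    """Keep the n largest groups and fold all others into a None-keyed row."""
    # Rank groups by descending sum, similar to a row number ordered by sum DESC.
    ranked = sorted(sums.items(), key=lambda kv: kv[1], reverse=True)
    result: Dict[Optional[int], int] = {}
    for rn, (k, s) in enumerate(ranked, start=1):
        bucket = k if rn <= n else None  # rows past the top n lose their key
        result[bucket] = result.get(bucket, 0) + s
    # Order the final rows by sum ascending, like ORDER BY res.
    return sorted(result.items(), key=lambda kv: kv[1])


# Assumed data shape: 10,000 numbers keyed by number // 10 (not shown in the diff).
sums: Dict[int, int] = {}
for number in range(10_000):
    key = number // 10
    sums[key] = sums.get(key, 0) + number

rows = top_n_with_rest(sums)
```

With this data the last row is `(None, 49000050)`, the same remainder value that appears in the query output.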

-## Using WITH TOTALS
-
-The total number will include the top rows as well so the remainder must be calculated by the application
+## Method 5: Using WITH TOTALS
+This method includes totals for all groups, and you calculate the remainder on the application side.

```
SELECT
@@ -216,3 +231,6 @@ Totals:
│   │ 49995000 │
└───┴──────────┘
```
+You would subtract the sum of the top rows from the totals in your application.
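A minimal sketch of that application-side step, in Python. The function name `remainder_from_totals` is hypothetical, and the top rows are reconstructed under the assumption that the group key is `number // 10` (so group g sums to 100 * g + 45); a real client would read both the rows and the total from the query response instead.

```python
from typing import Iterable, Tuple


def remainder_from_totals(top_rows: Iterable[Tuple[str, int]], total: int) -> int:
    """Return the total minus the sums of the rows already returned as top N."""
    return total - sum(res for _key, res in top_rows)


# Hypothetical client-side values: the ten largest groups under the assumed
# grouping (key = number // 10, so group g sums to 100 * g + 45).
top_rows = [(str(g), 100 * g + 45) for g in range(999, 989, -1)]
# Grand total as shown in the Totals row of the query output.
total = 49_995_000

remainder = remainder_from_totals(top_rows, total)
```

Here `remainder` comes out to 49,000,050, matching the remainder row that the other methods compute in SQL.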
+
+These methods offer different approaches for handling the top N rows and aggregating the remainder in ClickHouse. Which one to use (UNION ALL, arrays, window functions, row numbers, or WITH TOTALS) depends on your requirements.