Enhancement
Given a query like this:
select
    array_agg(col1 order by colx, coly, colz),
    array_agg(col2 order by colx, coly, colz),
    ...
    array_agg(coln order by colx, coly, colz)
from table
group by id
Notice that array_agg is called N times with the same ordering columns: colx, coly, colz.
Currently, during pre-aggregation, the ordering columns are copied into each array_agg result, exchanged to other nodes, sorted, and only then dropped. This leads to high memory consumption, because the ordering columns are duplicated for every invocation of array_agg (N times in the example).
SR should store the ordering columns once and reuse them when multiple array_agg calls share the same ordering. I believe this is already done for window aggregations.
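To make the memory argument concrete, here is a minimal Python sketch of the two strategies for a single group. It is only an illustration under simplifying assumptions, not SR's actual BE implementation: the function names `agg_duplicated` and `agg_shared` and the row layout are hypothetical, and the exchange step between nodes is left out.

```python
from typing import Any

Row = dict[str, Any]


def agg_duplicated(rows: list[Row], value_cols: list[str], order_cols: list[str]):
    """One state per array_agg call; every state keeps its own copy of the ordering key."""
    states: dict[str, list[tuple[tuple, Any]]] = {c: [] for c in value_cols}
    for row in rows:
        key = tuple(row[o] for o in order_cols)
        for c in value_cols:
            states[c].append((key, row[c]))  # the key is stored once per value column
    # Each state is sorted independently, then the keys are dropped.
    result = {
        c: [value for _, value in sorted(state, key=lambda kv: kv[0])]
        for c, state in states.items()
    }
    stored_keys = sum(len(state) for state in states.values())
    return result, stored_keys


def agg_shared(rows: list[Row], value_cols: list[str], order_cols: list[str]):
    """Ordering keys are stored once; one sort permutation is reused by all value columns."""
    keys = [tuple(row[o] for o in order_cols) for row in rows]
    perm = sorted(range(len(rows)), key=keys.__getitem__)  # single stable sort
    result = {c: [rows[i][c] for i in perm] for c in value_cols}
    return result, len(keys)


if __name__ == "__main__":
    rows = [
        {"col1": "a", "col2": 10, "colx": 2, "coly": 0, "colz": 0},
        {"col1": "b", "col2": 20, "colx": 1, "coly": 5, "colz": 0},
        {"col1": "c", "col2": 30, "colx": 1, "coly": 3, "colz": 0},
    ]
    print(agg_duplicated(rows, ["col1", "col2"], ["colx", "coly", "colz"]))
    print(agg_shared(rows, ["col1", "col2"], ["colx", "coly", "colz"]))
```

Both functions return the same arrays, but the first stores N copies of each ordering key (N being the number of array_agg calls), while the second stores one.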
Interesting, how would this work in the optimizer? I mainly looked at the aggregation implementation in the BE, and there it looked quite hard to share aggregation state between multiple array_agg functions.