Summary
This issue tracks adding vector math and array aggregate scalar functions to DataFusion. These close gaps versus DuckDB and LanceDB for vector search and array analytics workloads.
Replaces #21371 and #21376, which were requested to be split into function-per-PR submissions (per @alamb's review).
Functions
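Vector math (with shared vector_math.rs primitives)
| Function | Signature | Reference |
|---|---|---|
| cosine_distance | (array, array) → float64 | array_cosine_similarity |
| inner_product | (array, array) → float64 | array_inner_product |
| array_normalize | (array) → array | |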
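To make the intended semantics of cosine_distance, inner_product, and array_normalize concrete, here is a minimal sketch over plain `f64` slices. It is not the proposed DataFusion implementation: the real functions would operate on Arrow list arrays, and the helper `l2_norm`, the choice of cosine distance as 1 minus cosine similarity, and returning `None` for mismatched lengths or zero-norm inputs are all assumptions made for illustration.

```rust
/// Dot product of two equal-length slices.
/// Assumption: mismatched lengths yield `None` rather than an error.
fn inner_product(a: &[f64], b: &[f64]) -> Option<f64> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.iter().zip(b).map(|(x, y)| x * y).sum())
}

/// Euclidean (L2) norm. Hypothetical shared helper, named for this sketch only.
fn l2_norm(a: &[f64]) -> f64 {
    a.iter().map(|x| x * x).sum::<f64>().sqrt()
}

/// Cosine distance, taken here as 1 - cosine similarity.
/// Assumption: a zero-norm input yields `None` instead of NaN.
fn cosine_distance(a: &[f64], b: &[f64]) -> Option<f64> {
    let dot = inner_product(a, b)?;
    let denom = l2_norm(a) * l2_norm(b);
    if denom == 0.0 {
        return None;
    }
    Some(1.0 - dot / denom)
}

/// Scale a vector to unit L2 norm.
/// Assumption: a zero vector yields `None`.
fn array_normalize(a: &[f64]) -> Option<Vec<f64>> {
    let norm = l2_norm(a);
    if norm == 0.0 {
        return None;
    }
    Some(a.iter().map(|x| x / norm).collect())
}
```

Under these assumptions, orthogonal vectors have a cosine distance of 1 and identical non-zero vectors have a distance of 0. Sharing the dot product and norm helpers is one plausible reading of what the shared primitives would cover.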
Array element-wise math
| Function | Signature | Reference |
|---|---|---|
| array_add | (array, array) → array | Element-wise addition |
| array_subtract | (array, array) → array | Element-wise subtraction |
| array_scale | (array, scalar) → array | Scalar multiply |
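The element-wise functions can be illustrated with a short sketch on `f64` slices. This is a simplified stand-in, not the DataFusion code: the hypothetical helper `zip_with` and the choice to return `None` when the two inputs differ in length are assumptions for illustration, and the issue does not specify how length mismatches or nulls are handled.

```rust
/// Apply a binary operation pairwise to two equal-length slices.
/// Hypothetical helper; assumption: mismatched lengths yield `None`.
fn zip_with(a: &[f64], b: &[f64], op: impl Fn(f64, f64) -> f64) -> Option<Vec<f64>> {
    if a.len() != b.len() {
        return None;
    }
    Some(a.iter().zip(b).map(|(x, y)| op(*x, *y)).collect())
}

/// Element-wise addition: result[i] = a[i] + b[i].
fn array_add(a: &[f64], b: &[f64]) -> Option<Vec<f64>> {
    zip_with(a, b, |x, y| x + y)
}

/// Element-wise subtraction: result[i] = a[i] - b[i].
fn array_subtract(a: &[f64], b: &[f64]) -> Option<Vec<f64>> {
    zip_with(a, b, |x, y| x - y)
}

/// Scalar multiply: result[i] = a[i] * k.
fn array_scale(a: &[f64], k: f64) -> Vec<f64> {
    a.iter().map(|x| x * k).collect()
}
```

For example, adding `[1, 2]` and `[3, 4]` gives `[4, 6]`, and scaling `[1, 2]` by 2 gives `[2, 4]`.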
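Array aggregate scalars
| Function | Signature | Reference |
|---|---|---|
| array_sum / list_sum | (array) → numeric | list_sum |
| array_product / list_product | (array) → numeric | list_product |
| array_avg / list_avg | (array) → float64 | list_avg |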
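As a rough illustration of array_sum, array_product, and array_avg, the sketch below reduces a slice of nullable integers to a single value. The null handling is an assumption borrowed from common SQL aggregate behavior (nulls are skipped, and an empty or all-null array yields `None`); the issue itself does not state these rules. The helper `non_null` is hypothetical, and integer overflow handling is omitted.

```rust
/// Collect the non-null elements. Hypothetical helper for this sketch.
fn non_null(a: &[Option<i64>]) -> Vec<i64> {
    a.iter().flatten().copied().collect()
}

/// Sum of non-null elements; `None` if there are none (assumed semantics).
fn array_sum(a: &[Option<i64>]) -> Option<i64> {
    let vals = non_null(a);
    if vals.is_empty() {
        None
    } else {
        Some(vals.iter().sum())
    }
}

/// Product of non-null elements; `None` if there are none (assumed semantics).
fn array_product(a: &[Option<i64>]) -> Option<i64> {
    let vals = non_null(a);
    if vals.is_empty() {
        None
    } else {
        Some(vals.iter().product())
    }
}

/// Mean of non-null elements as float64; `None` if there are none.
fn array_avg(a: &[Option<i64>]) -> Option<f64> {
    let vals = non_null(a);
    if vals.is_empty() {
        None
    } else {
        Some(vals.iter().sum::<i64>() as f64 / vals.len() as f64)
    }
}
```

Returning float64 from the average regardless of the input type matches the signature listed for array_avg; the sum and product keep the input's numeric type in this sketch.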
Alias fix
| Fix | Description |
|---|---|
| list_min | Missing alias on ArrayMin (parity with existing list_max on ArrayMax) |
Submission plan
One PR per function, submitted serially. Each PR will reference this issue.
References