accelerate vector ops with vek #23245

aunjgr · 2025-12-09T10:16:23Z

User description

What type of PR is this?

Which issue(s) this PR fixes:

What this PR does / why we need it:

The L2 distance benchmark is more than 15x faster with vek (6490 ns/op down to 405.2 ns/op), and per-call allocations drop to zero.
Vanilla Go:
Benchmark_L2Distance/L2_Distance-24 181933 6490 ns/op 4096 B/op 1 allocs/op
Benchmark_L2Distance/Normalize_L2-24 717655 1670 ns/op 0 B/op 0 allocs/op
Benchmark_L2Distance/L2_Distance(v1,_NormalizeL2)-24 277056 4254 ns/op 4096 B/op 1 allocs/op

SIMD version of vek:
Benchmark_L2Distance/L2_Distance-24 2925302 405.2 ns/op 0 B/op 0 allocs/op
Benchmark_L2Distance/Normalize_L2-24 3214724 374.1 ns/op 0 B/op 0 allocs/op
Benchmark_L2Distance/L2_Distance(v1,_NormalizeL2)-24 1973931 607.6 ns/op 0 B/op 0 allocs/op
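
The drop from 4096 B/op and 1 allocs/op to zero is consistent with removing a per-call temporary buffer: 4096 bytes is exactly what a 1024-element float32 slice occupies (1024 × 4 bytes). The sketch below illustrates that difference in plain Go, assuming equal-length float32 inputs. The names l2WithTemp and l2Fused are hypothetical, and neither function is the PR's code; vek does the fused computation with SIMD instructions.

```go
package main

import (
	"fmt"
	"math"
)

// l2WithTemp mirrors the older pattern: materialize v1-v2 in a fresh slice,
// then take its Euclidean norm. The make call typically costs one heap
// allocation of 4*len(v1) bytes per call.
// Assumes len(v2) >= len(v1); no length validation is done here.
func l2WithTemp(v1, v2 []float32) float32 {
	diff := make([]float32, len(v1))
	for i := range v1 {
		diff[i] = v1[i] - v2[i]
	}
	var sum float32
	for _, d := range diff {
		sum += d * d
	}
	return float32(math.Sqrt(float64(sum)))
}

// l2Fused accumulates the squared differences in a single pass and needs
// no temporary slice.
func l2Fused(v1, v2 []float32) float32 {
	var sum float32
	for i := range v1 {
		d := v1[i] - v2[i]
		sum += d * d
	}
	return float32(math.Sqrt(float64(sum)))
}

func main() {
	a := []float32{0, 0}
	b := []float32{3, 4}
	fmt.Println(l2WithTemp(a, b), l2Fused(a, b))
}
```

Both functions return the same value; only the allocation behavior differs.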

PR Type

Enhancement

Description

Replace BLAS operations with vek SIMD library for vector distance calculations
Achieve 15x performance improvement in L2 distance benchmarks
Simplify code by removing manual vector allocations and loops
Update dependencies to include vek library and update third-party versions

Diagram Walkthrough

flowchart LR
  A["BLAS-based<br/>distance functions"] -->|Replace with| B["vek SIMD<br/>operations"]
  B -->|Improves| C["15x faster<br/>L2 distance"]
  B -->|Reduces| D["Memory allocations<br/>and loops"]

File Walkthrough

Relevant files

Enhancement

distance_func.go: Replace BLAS with vek SIMD operations (+54/-87)
pkg/vectorindex/metric/distance_func.go
  Replace BLAS operations with vek SIMD library functions for L2Distance, L2DistanceSq, L1Distance, InnerProduct, and CosineDistance
  Simplify L2DistanceSq to use vek.Distance instead of manual loop
  Refactor NormalizeL2 to use vek.Norm and vek.DivNumber_Inplace for in-place normalization
  Fix InnerProduct to return positive dot product instead of negated value
  Update ResolveDistanceFn to use L2Distance instead of L2DistanceSq for Metric_L2Distance

Tests

distance_func_bench_test.go: Update benchmarks for vek float32 operations (+14/-14)
pkg/vectorindex/metric/distance_func_bench_test.go
  Increase benchmark dimension from 128 to 1024 for more realistic testing
  Change benchmark vectors from float64 to float32 for vek optimization
  Update loop syntax to use range over b.N
  Simplify randomVectors function to use float32 and range syntax

Dependencies

go.mod: Add vek SIMD library dependency (+4/-1)
go.mod
  Add github.com/viterin/vek v0.4.3 dependency for SIMD vector operations
  Add github.com/viterin/partial v1.1.0 as indirect dependency
  Update github.com/unum-cloud/usearch/golang to v0.0.0-20251130095425-a2f175991017

go.sum: Update dependency checksums (+8/-2)
go.sum
  Add checksums for github.com/viterin/vek v0.4.3
  Add checksums for github.com/viterin/partial v1.1.0
  Add checksums for github.com/chewxy/math32 v1.10.1 (transitive dependency)
  Update checksums for github.com/unum-cloud/usearch/golang

Configuration changes

Makefile: Update third-party library versions (+4/-4)
thirdparties/Makefile
  Update USearch from version 2.21.1 to 2.21.4
  Update StringZilla from version 4.2.1 to 4.4.2
  Update SimSIMD from version 6.5.3 to 6.5.5
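
Two of the changes listed for distance_func.go affect how results are ordered rather than how fast they are computed: ResolveDistanceFn switching from L2DistanceSq to L2Distance, and the sign of InnerProduct. The following plain-Go sketch, with hypothetical helper names and made-up sample vectors, suggests why the first is harmless for ranking (the square root is monotonic) while the second reverses a smallest-first ranking.

```go
package main

import (
	"fmt"
	"math"
)

// l2Sq returns the squared Euclidean distance between a and b.
func l2Sq(a, b []float32) float32 {
	var sum float32
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// l2 returns the Euclidean distance, i.e. the square root of l2Sq.
func l2(a, b []float32) float32 {
	return float32(math.Sqrt(float64(l2Sq(a, b))))
}

// dot returns the plain inner product of a and b.
func dot(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// negDot returns the negated inner product, so that a larger similarity
// maps to a smaller "distance".
func negDot(a, b []float32) float32 {
	return -dot(a, b)
}

// nearest returns the index of the candidate with the smallest distance to q.
func nearest(q []float32, cands [][]float32, dist func(a, b []float32) float32) int {
	best := 0
	for i := 1; i < len(cands); i++ {
		if dist(q, cands[i]) < dist(q, cands[best]) {
			best = i
		}
	}
	return best
}

func main() {
	q := []float32{1, 0}
	cands := [][]float32{{0, 1}, {2, 0}, {-1, 0}}
	fmt.Println(nearest(q, cands, l2Sq), nearest(q, cands, l2))
	fmt.Println(nearest(q, cands, negDot), nearest(q, cands, dot))
}
```

The ranking is the same for l2Sq and l2, but the returned values differ, so any caller that compares a distance against a fixed threshold would need to be checked separately.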

qodo-code-review · 2025-12-09T10:16:57Z

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🔴	Divide-by-zero risk
	Description: NormalizeL2 divides by the vector norm without guarding against zero-norm (all-zero) vectors, so the division can produce NaN/Inf output that corrupts downstream computations.
	distance_func.go [165-183]
	Referred Code:
	    switch any(v1).(type) {
	    case []float32:
	        _v1 := any(v1).([]float32)
	        _normalized := any(normalized).([]float32)
	        copy(_normalized, _v1)
	        vek32.DivNumber_Inplace(_normalized, vek32.Norm(_v1))
	    case []float64:
	        _v1 := any(v1).([]float64)
	        _normalized := any(normalized).([]float64)
	        copy(_normalized, _v1)
	        vek.DivNumber_Inplace(_normalized, vek.Norm(_v1))
	    default:
	        return moerr.NewInternalErrorNoCtx("NormalizeL2 type not supported")
	    }
	Broken benchmark loops
	Description: Benchmarks iterate using 'for i := range b.N' and 'for range dim', which are invalid patterns and could mask realistic performance or safety regressions by not actually executing the intended workload.
	distance_func_bench_test.go [34-36]
	Referred Code:
	    for i := range b.N {
	        _, _ = L2Distance(v1[i], v2[i])
	    }
⚪	Invalid cosine for zero vector
	Description: CosineDistance now relies on vek.CosineSimilarity without explicit checks for zero vectors; if either input is all zeros, this can produce NaN and propagate invalid distances.
	distance_func.go [107-122]
	Referred Code:
	    switch any(v1).(type) {
	    case []float32:
	        _v1 := any(v1).([]float32)
	        _v2 := any(v2).([]float32)
	        return T(1 - vek32.CosineSimilarity(_v1, _v2)), nil
	    case []float64:
	        _v1 := any(v1).([]float64)
	        _v2 := any(v2).([]float64)
	        return T(1 - vek.CosineSimilarity(_v1, _v2)), nil
	    default:
	        return 0, moerr.NewInternalErrorNoCtx("CosineDistance type not supported")
	    }
Ticket Compliance
⚪	🎫 No ticket provided
Codebase Duplication Compliance
⚪	Codebase context is not defined Follow the guide to enable codebase context checks.
Custom Compliance
🟢	Generic: Meaningful Naming and Self-Documenting Code
	Objective: Ensure all identifiers clearly express their purpose and intent, making code self-documenting
	Status: Passed
	Learn more about managing compliance generic rules or creating your own custom rules
	Generic: Secure Error Handling
	Objective: To prevent the leakage of sensitive system information through error messages while providing sufficient detail for internal debugging.
	Status: Passed
	Generic: Secure Logging Practices
	Objective: To ensure logs are useful for debugging and auditing without exposing sensitive information like PII, PHI, or cardholder data.
	Status: Passed
🔴	Generic: Robust Error Handling and Edge Case Management
	Objective: Ensure comprehensive error handling that provides meaningful context and graceful degradation
	Status: Missing edge checks: Functions like L2Distance, L1Distance, and NormalizeL2 no longer validate input lengths or handle zero-length/zero-norm vectors, leading to potential panics or NaNs.
	Referred Code:
	    func L2Distance[T types.RealNumbers](v1, v2 []T) (T, error) {
	        switch any(v1).(type) {
	        case []float32:
	            _v1 := any(v1).([]float32)
	            _v2 := any(v2).([]float32)
	            return T(vek32.Distance(_v1, _v2)), nil
	        case []float64:
	            _v1 := any(v1).([]float64)
	            _v2 := any(v2).([]float64)
	            return T(vek.Distance(_v1, _v2)), nil
	        default:
	            return 0, moerr.NewInternalErrorNoCtx("L2Distance type not supported")
	        }
	    }

	    func L2DistanceSq[T types.RealNumbers](v1, v2 []T) (T, error) {
	        switch any(v1).(type) {
	    ... (clipped 137 lines)
⚪	Generic: Comprehensive Audit Trails
	Objective: To create a detailed and reliable record of critical system actions for security analysis and compliance.
	Status: No audit logs: The new vector distance functions add or change computation paths without any logging of critical actions, but these math utilities may not be in the scope of audit requirements.
	Referred Code: the same L2Distance / L2DistanceSq excerpt as in the Robust Error Handling item above (clipped 137 lines)
	Generic: Security-First Input Validation and Data Handling
	Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent vulnerabilities
	Status: Input validation: The updated distance and normalization functions assume equal-length numeric slices without explicit validation, which might be acceptable for internal math utilities but should be confirmed.
	Referred Code: the same L2Distance / L2DistanceSq excerpt as in the Robust Error Handling item above (clipped 137 lines)

Compliance status legend

🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

qodo-code-review · 2025-12-09T10:18:13Z

PR Code Suggestions ✨

Explore these optional code suggestions:

Category	Suggestion	Impact
Possible issue	✅ ~~Correct inverted inner product metric~~	High
	Suggestion Impact: The commit updated InnerProduct to return the negated dot product for both float32 and float64 cases, matching the suggestion. It also refactored SphericalDistance to use vek dot products instead of gonum, but the key impact is the sign change in InnerProduct.
	code diff:
	    @@ -90,13 +88,13 @@
	     _v1 := any(v1).([]float32)
	     _v2 := any(v2).([]float32)

	    - return T(vek32.Dot(_v1, _v2)), nil
	    -
	    - case []float64:
	    - _v1 := any(v1).([]float64)
	    - _v2 := any(v2).([]float64)
	    -
	    - return T(vek.Dot(_v1, _v2)), nil
	    + return T(-vek32.Dot(_v1, _v2)), nil
	    +
	    + case []float64:
	    + _v1 := any(v1).([]float64)
	    + _v2 := any(v2).([]float64)
	    +
	    + return T(-vek.Dot(_v1, _v2)), nil

	     default:
	     return 0, moerr.NewInternalErrorNoCtx("InnerProduct type not supported")
	Correct the `InnerProduct` function to return the negative dot product. The current implementation returns a positive value, which inverts the logic of the distance metric.
	pkg/vectorindex/metric/distance_func.go [87-104]
	    func InnerProduct[T types.RealNumbers](v1, v2 []T) (T, error) {
	        switch any(v1).(type) {
	        case []float32:
	            _v1 := any(v1).([]float32)
	            _v2 := any(v2).([]float32)

	    -       return T(vek32.Dot(_v1, _v2)), nil
	    +       return T(-vek32.Dot(_v1, _v2)), nil

	        case []float64:
	            _v1 := any(v1).([]float64)
	            _v2 := any(v2).([]float64)

	    -       return T(vek.Dot(_v1, _v2)), nil
	    +       return T(-vek.Dot(_v1, _v2)), nil

	        default:
	            return 0, moerr.NewInternalErrorNoCtx("InnerProduct type not supported")
	        }
	    }
	`[Suggestion processed]`
	Suggestion importance [1-10]: 10
	Why: The suggestion correctly identifies a critical logic error where the `InnerProduct` distance metric was inverted. This would cause vector searches to return the most dissimilar vectors instead of the most similar, fundamentally breaking the feature.
	Prevent division by zero error	Medium
	In `NormalizeL2`, add a check to prevent division by zero when the input vector's norm is zero. If the norm is zero, do not perform the division, thus avoiding `NaN` or `Inf` results.
	pkg/vectorindex/metric/distance_func.go [166-178]
	    case []float32:
	        _v1 := any(v1).([]float32)
	        _normalized := any(normalized).([]float32)

	        copy(_normalized, _v1)
	    -   vek32.DivNumber_Inplace(_normalized, vek32.Norm(_v1))
	    +   if norm := vek32.Norm(_v1); norm > 0 {
	    +       vek32.DivNumber_Inplace(_normalized, norm)
	    +   }

	    case []float64:
	        _v1 := any(v1).([]float64)
	        _normalized := any(normalized).([]float64)

	        copy(_normalized, _v1)
	    -   vek.DivNumber_Inplace(_normalized, vek.Norm(_v1))
	    +   if norm := vek.Norm(_v1); norm > 0 {
	    +       vek.DivNumber_Inplace(_normalized, norm)
	    +   }
	Suggestion importance [1-10]: 8
	Why: The suggestion correctly identifies a regression where a check for a zero-norm vector was removed, which can lead to a division-by-zero error and `NaN`/`Inf` propagation. This is a valid and important correctness fix.
	Handle zero vectors in cosine distance	Medium
	In `CosineDistance`, check if the result of `vek.CosineSimilarity` is `NaN`. If it is, return an error to handle cases with zero-magnitude vectors, restoring the previous error-handling behavior.
	pkg/vectorindex/metric/distance_func.go [106-124]
	    func CosineDistance[T types.RealNumbers](v1, v2 []T) (T, error) {
	        switch any(v1).(type) {
	        case []float32:
	            _v1 := any(v1).([]float32)
	            _v2 := any(v2).([]float32)

	    -       return T(1 - vek32.CosineSimilarity(_v1, _v2)), nil
	    +       similarity := vek32.CosineSimilarity(_v1, _v2)
	    +       if math.IsNaN(float64(similarity)) {
	    +           return 0, moerr.NewInternalErrorNoCtx("cannot compute cosine similarity with zero vector")
	    +       }
	    +       return T(1 - similarity), nil

	        case []float64:
	            _v1 := any(v1).([]float64)
	            _v2 := any(v2).([]float64)

	    -       return T(1 - vek.CosineSimilarity(_v1, _v2)), nil
	    +       similarity := vek.CosineSimilarity(_v1, _v2)
	    +       if math.IsNaN(similarity) {
	    +           return 0, moerr.NewInternalErrorNoCtx("cannot compute cosine similarity with zero vector")
	    +       }
	    +       return T(1 - similarity), nil

	        default:
	            return 0, moerr.NewInternalErrorNoCtx("CosineDistance type not supported")
	        }
	    }
	Suggestion importance [1-10]: 8
	Why: The suggestion correctly points out that a check for zero-magnitude vectors was removed, which can lead to `NaN` results from `vek.CosineSimilarity`. Reintroducing an error for this case prevents silent failures and improves the function's robustness.
High-level	The PR introduces a new dependency	Low
	The suggestion advises considering the long-term maintenance burden and risk of adding the new `vek` dependency, which appears to be maintained by a single developer, despite its significant performance improvements.
	Examples:
	    go.mod [95]
	        github.com/viterin/vek v0.4.3
	    pkg/vectorindex/metric/distance_func.go [22-23]
	        "github.com/viterin/vek"
	        "github.com/viterin/vek/vek32"
	Solution Walkthrough:
	Before:
	    // go.mod
	    // ... (no 'vek' dependency)

	    // pkg/vectorindex/metric/distance_func.go
	    import "gonum.org/v1/gonum/blas/blas32"

	    func L2Distance[T types.RealNumbers](v1, v2 []T) (T, error) {
	        // ...
	        diff := blas32.Vector{
	            Data: make([]float32, len(_v1)),
	            // ...
	        }
	        for i := range _v1 {
	            diff.Data[i] = _v1[i] - _v2[i]
	        }
	        return T(blas32.Nrm2(diff)), nil
	    }
	After:
	    // go.mod
	    require (
	        // ...
	        github.com/viterin/vek v0.4.3
	    )

	    // pkg/vectorindex/metric/distance_func.go
	    import "github.com/viterin/vek/vek32"

	    func L2Distance[T types.RealNumbers](v1, v2 []T) (T, error) {
	        // ...
	        _v1 := any(v1).([]float32)
	        _v2 := any(v2).([]float32)
	        return T(vek32.Distance(_v1, _v2)), nil
	    }
	Suggestion importance [1-10]: 6
	Why: The suggestion raises a valid strategic concern about adding a new dependency and its long-term maintenance risk, which is an important consideration for the project's health, even if it's not a code defect.
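
The two zero-vector suggestions in this table follow the same guard pattern, which can be sketched in plain Go without vek. This is an illustration only: normalizeL2 and cosineDistance are hypothetical stand-ins rather than the PR's functions, errZeroVector is an assumed error value, and the sketch assumes equal-length inputs with an output slice at least as long as the input.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// errZeroVector is an assumed sentinel error for this sketch.
var errZeroVector = errors.New("cannot compute cosine distance with zero vector")

// norm returns the Euclidean norm of v.
func norm(v []float32) float32 {
	var sum float32
	for _, x := range v {
		sum += x * x
	}
	return float32(math.Sqrt(float64(sum)))
}

// normalizeL2 writes v scaled to unit length into out. When the norm is
// zero the division is skipped and v is copied unchanged, so no NaN or Inf
// can appear in out.
func normalizeL2(v, out []float32) {
	n := norm(v)
	for i, x := range v {
		if n > 0 {
			out[i] = x / n
		} else {
			out[i] = x
		}
	}
}

// cosineDistance returns 1 - cos(a, b). It reports an error instead of
// returning NaN when either input has zero magnitude.
func cosineDistance(a, b []float32) (float32, error) {
	na, nb := norm(a), norm(b)
	if na == 0 || nb == 0 {
		return 0, errZeroVector
	}
	var dot float32
	for i := range a {
		dot += a[i] * b[i]
	}
	return 1 - dot/(na*nb), nil
}

func main() {
	out := make([]float32, 2)
	normalizeL2([]float32{0, 0}, out)
	fmt.Println(out)

	_, err := cosineDistance([]float32{1, 0}, []float32{0, 0})
	fmt.Println(err)
}
```

Checking the norm before dividing, as here, is an alternative to the suggested NaN check after the call; which is preferable in the real code depends on how vek handles zero inputs.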

fengttt

It seems vek only has AVX and does not have NEON?

This is a tough decision

fengttt

If the op we benchmarked takes 400+ ns, cgo is probably acceptable. What about implementing the L2 distance function in C and calling it through cgo? That should give us the best perf on both AMD64 and ARM.

Besides, I trust GCC/Clang more than hand-written assembly to be bug-free.

accelerate vector ops with vek

034e8bb

aunjgr requested review from XuPeng-SH, fengttt and zhangxu19830126 as code owners December 9, 2025 10:16

aunjgr had a problem deploying to ci December 9, 2025 10:16 — with GitHub Actions Failure

aunjgr temporarily deployed to ci December 9, 2025 10:16 — with GitHub Actions Inactive

aunjgr had a problem deploying to ci December 9, 2025 10:16 — with GitHub Actions Failure

matrix-meow added the size/M Denotes a PR that changes [100,499] lines label Dec 9, 2025

qodo-code-review bot added the Review effort 2/5 label Dec 9, 2025

mergify bot added the kind/enhancement label Dec 9, 2025

fix ut and bvt

29abc8d

aunjgr requested a review from heni02 as a code owner December 9, 2025 14:41

aunjgr temporarily deployed to ci December 9, 2025 14:41 — with GitHub Actions Inactive

aunjgr had a problem deploying to ci December 9, 2025 14:41 — with GitHub Actions Failure

aunjgr temporarily deployed to ci December 9, 2025 14:41 — with GitHub Actions Inactive

aunjgr had a problem deploying to ci December 10, 2025 09:32 — with GitHub Actions Failure

aunjgr added 3 commits December 11, 2025 11:05

Merge branch 'main' into simd_dist

297030a

Merge branch 'main' into simd_dist

b3318cc

fix ut/bvt

2240d52

aunjgr had a problem deploying to ci December 11, 2025 08:41 — with GitHub Actions Error

aunjgr had a problem deploying to ci December 11, 2025 08:41 — with GitHub Actions Failure

aunjgr temporarily deployed to ci December 11, 2025 08:41 — with GitHub Actions Inactive

aunjgr had a problem deploying to ci December 11, 2025 08:41 — with GitHub Actions Failure

aunjgr temporarily deployed to ci December 11, 2025 08:41 — with GitHub Actions Inactive

aunjgr had a problem deploying to ci December 11, 2025 08:41 — with GitHub Actions Failure

aunjgr temporarily deployed to ci December 11, 2025 08:41 — with GitHub Actions Inactive

aunjgr added 2 commits December 11, 2025 17:26

fix ut

d85f5cb

Merge branch 'main' into simd_dist

f5655ab

aunjgr had a problem deploying to ci December 11, 2025 09:30 — with GitHub Actions Failure

aunjgr temporarily deployed to ci December 11, 2025 09:30 — with GitHub Actions Inactive

aunjgr had a problem deploying to ci December 11, 2025 09:30 — with GitHub Actions Failure

aunjgr temporarily deployed to ci December 11, 2025 09:30 — with GitHub Actions Inactive

fengttt reviewed Dec 11, 2025
