@aunjgr (Contributor) commented Dec 9, 2025

User description

What type of PR is this?

  • API-change
  • BUG
  • Improvement
  • Documentation
  • Feature
  • Test and CI
  • Code Refactoring

Which issue(s) this PR fixes:

issue #22508

What this PR does / why we need it:

The L2 distance benchmark is roughly 15x faster (6490 ns/op down to 405.2 ns/op) and no longer allocates.
Vanilla Go:
Benchmark_L2Distance/L2_Distance-24 181933 6490 ns/op 4096 B/op 1 allocs/op
Benchmark_L2Distance/Normalize_L2-24 717655 1670 ns/op 0 B/op 0 allocs/op
Benchmark_L2Distance/L2_Distance(v1,_NormalizeL2)-24 277056 4254 ns/op 4096 B/op 1 allocs/op

SIMD version using vek:
Benchmark_L2Distance/L2_Distance-24 2925302 405.2 ns/op 0 B/op 0 allocs/op
Benchmark_L2Distance/Normalize_L2-24 3214724 374.1 ns/op 0 B/op 0 allocs/op
Benchmark_L2Distance/L2_Distance(v1,_NormalizeL2)-24 1973931 607.6 ns/op 0 B/op 0 allocs/op


PR Type

Enhancement


Description

  • Replace BLAS operations with vek SIMD library for vector distance calculations

  • Achieve 15x performance improvement in L2 distance benchmarks

  • Simplify code by removing manual vector allocations and loops

  • Update dependencies to include vek library and update third-party versions


Diagram Walkthrough

flowchart LR
  A["BLAS-based<br/>distance functions"] -->|Replace with| B["vek SIMD<br/>operations"]
  B -->|Improves| C["15x faster<br/>L2 distance"]
  B -->|Reduces| D["Memory allocations<br/>and loops"]

File Walkthrough

Relevant files
Enhancement
distance_func.go
Replace BLAS with vek SIMD operations                                       

pkg/vectorindex/metric/distance_func.go

  • Replace BLAS operations with vek SIMD library functions for
    L2Distance, L2DistanceSq, L1Distance, InnerProduct, and CosineDistance
  • Simplify L2DistanceSq to use vek.Distance instead of manual loop
  • Refactor NormalizeL2 to use vek.Norm and vek.DivNumber_Inplace for
    in-place normalization
  • Fix InnerProduct to return positive dot product instead of negated
    value
  • Update ResolveDistanceFn to use L2Distance instead of L2DistanceSq for
    Metric_L2Distance
+54/-87 
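As a point of reference for the functions listed above, here is a minimal plain-Go sketch of squared and non-squared L2 distance. It is not the PR's code and does not use vek; the names `l2DistanceSq` and `l2Distance` are illustrative only. It also shows why switching `ResolveDistanceFn` from the squared to the non-squared form changes the returned values while preserving nearest-neighbour ordering, since the square root is monotonic.

```go
package main

import (
	"fmt"
	"math"
)

// l2DistanceSq is a scalar reference for squared Euclidean distance.
// It assumes len(a) == len(b); callers are expected to validate that.
func l2DistanceSq(a, b []float32) float32 {
	var sum float32
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// l2Distance is the square root of l2DistanceSq.
func l2Distance(a, b []float32) float32 {
	return float32(math.Sqrt(float64(l2DistanceSq(a, b))))
}

func main() {
	a := []float32{0, 0}
	b := []float32{3, 4}
	fmt.Println(l2DistanceSq(a, b), l2Distance(a, b)) // prints "25 5"
}
```

A scalar version like this can serve as a correctness oracle in tests when comparing against a SIMD implementation, allowing for small floating-point differences.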
Tests
distance_func_bench_test.go
Update benchmarks for vek float32 operations                         

pkg/vectorindex/metric/distance_func_bench_test.go

  • Increase benchmark dimension from 128 to 1024 for more realistic
    testing
  • Change benchmark vectors from float64 to float32 for vek optimization
  • Update loop syntax to use range over b.N
  • Simplify randomVectors function to use float32 and range syntax
+14/-14 
Dependencies
go.mod
Add vek SIMD library dependency                                                   

go.mod

  • Add github.com/viterin/vek v0.4.3 dependency for SIMD vector
    operations
  • Add github.com/viterin/partial v1.1.0 as indirect dependency
  • Update github.com/unum-cloud/usearch/golang to
    v0.0.0-20251130095425-a2f175991017
+4/-1     
go.sum
Update dependency checksums                                                           

go.sum

  • Add checksums for github.com/viterin/vek v0.4.3
  • Add checksums for github.com/viterin/partial v1.1.0
  • Add checksums for github.com/chewxy/math32 v1.10.1 (transitive
    dependency)
  • Update checksums for github.com/unum-cloud/usearch/golang
+8/-2     
Configuration changes
Makefile
Update third-party library versions                                           

thirdparties/Makefile

  • Update USearch from version 2.21.1 to 2.21.4
  • Update StringZilla from version 4.2.1 to 4.4.2
  • Update SimSIMD from version 6.5.3 to 6.5.5
+4/-4     

qodo-code-review bot commented Dec 9, 2025

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
🔴
Divide-by-zero risk

Description: NormalizeL2 divides by the vector norm without guarding against zero-norm
(all-zero) vectors. Dividing by a zero norm yields NaN/Inf output that can corrupt
downstream computations.
distance_func.go [165-183]

Referred Code
switch any(v1).(type) {
case []float32:
	_v1 := any(v1).([]float32)
	_normalized := any(normalized).([]float32)

	copy(_normalized, _v1)
	vek32.DivNumber_Inplace(_normalized, vek32.Norm(_v1))

case []float64:
	_v1 := any(v1).([]float64)
	_normalized := any(normalized).([]float64)

	copy(_normalized, _v1)
	vek.DivNumber_Inplace(_normalized, vek.Norm(_v1))

default:
	return moerr.NewInternalErrorNoCtx("NormalizeL2 type not supported")
}
Broken benchmark loops

Description: Benchmarks iterate using 'for i := range b.N' and 'for range dim', which are invalid
patterns and could mask realistic performance or safety regressions by not actually
executing the intended workload.
distance_func_bench_test.go [34-36]

Referred Code
for i := range b.N {
	_, _ = L2Distance(v1[i], v2[i])
}
Invalid cosine for zero vector

Description: CosineDistance now relies on vek.CosineSimilarity without explicit checks for zero
vectors; if either input is all zeros, this can produce NaN and propagate invalid
distances.
distance_func.go [107-122]

Referred Code
switch any(v1).(type) {
case []float32:
	_v1 := any(v1).([]float32)
	_v2 := any(v2).([]float32)

	return T(1 - vek32.CosineSimilarity(_v1, _v2)), nil

case []float64:
	_v1 := any(v1).([]float64)
	_v2 := any(v2).([]float64)

	return T(1 - vek.CosineSimilarity(_v1, _v2)), nil

default:
	return 0, moerr.NewInternalErrorNoCtx("CosineDistance type not supported")
}
Ticket Compliance
🎫 No ticket provided
Codebase Duplication Compliance
Codebase context is not defined


Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed


Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed


🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Missing edge checks: Functions like L2Distance, L1Distance, and NormalizeL2 no longer validate input lengths or
handle zero-length/zero-norm vectors, leading to potential panics or NaNs.

Referred Code
func L2Distance[T types.RealNumbers](v1, v2 []T) (T, error) {
	switch any(v1).(type) {
	case []float32:
		_v1 := any(v1).([]float32)
		_v2 := any(v2).([]float32)

		return T(vek32.Distance(_v1, _v2)), nil

	case []float64:
		_v1 := any(v1).([]float64)
		_v2 := any(v2).([]float64)

		return T(vek.Distance(_v1, _v2)), nil

	default:
		return 0, moerr.NewInternalErrorNoCtx("L2Distance type not supported")
	}
}

func L2DistanceSq[T types.RealNumbers](v1, v2 []T) (T, error) {
	switch any(v1).(type) {


 ... (clipped 137 lines)
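One way to address the missing length check is a thin validating wrapper in front of the distance kernel. The sketch below is an assumption-laden illustration, not the PR's code: `realNumbers` stands in for `types.RealNumbers`, `errDimMismatch` is a made-up error value, and the kernel is a scalar loop rather than a vek call.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// realNumbers is a stand-in for the PR's types.RealNumbers constraint (assumed).
type realNumbers interface {
	float32 | float64
}

// errDimMismatch is a hypothetical error used only in this sketch.
var errDimMismatch = errors.New("vector dimension mismatch")

// checkedL2Distance validates lengths before computing the distance,
// so mismatched inputs surface as an error instead of a panic.
func checkedL2Distance[T realNumbers](v1, v2 []T) (T, error) {
	if len(v1) != len(v2) {
		return 0, errDimMismatch
	}
	var sum float64
	for i := range v1 {
		d := float64(v1[i]) - float64(v2[i])
		sum += d * d
	}
	return T(math.Sqrt(sum)), nil
}

func main() {
	d, err := checkedL2Distance([]float32{0, 0}, []float32{3, 4})
	fmt.Println(d, err) // prints "5 <nil>"

	_, err = checkedL2Distance([]float32{0, 0}, []float32{1})
	fmt.Println(err) // prints "vector dimension mismatch"
}
```

The length comparison is a single integer check, so its cost should be negligible next to the distance computation for 1024-dimensional vectors.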


Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
No audit logs: The new vector distance functions add or change computation paths without any logging of
critical actions, but these math utilities may not be in the scope of audit requirements.

Referred Code
Same L2Distance / L2DistanceSq excerpt as shown under "Generic: Robust Error Handling and Edge Case Management".


Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status:
Input validation: The updated distance and normalization functions assume equal-length numeric slices
without explicit validation, which might be acceptable for internal math utilities but
should be confirmed.

Referred Code
Same L2Distance / L2DistanceSq excerpt as shown under "Generic: Robust Error Handling and Edge Case Management".


Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

qodo-code-review bot commented Dec 9, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

Category    Suggestion    Impact
Possible issue
Correct inverted inner product metric
Suggestion Impact:The commit updated InnerProduct to return the negated dot product for both float32 and float64 cases, matching the suggestion. It also refactored SphericalDistance to use vek dot products instead of gonum, but the key impact is the sign change in InnerProduct.

code diff:

@@ -90,13 +88,13 @@
 		_v1 := any(v1).([]float32)
 		_v2 := any(v2).([]float32)
 
-		return T(vek32.Dot(_v1, _v2)), nil
-
-	case []float64:
-		_v1 := any(v1).([]float64)
-		_v2 := any(v2).([]float64)
-
-		return T(vek.Dot(_v1, _v2)), nil
+		return T(-vek32.Dot(_v1, _v2)), nil
+
+	case []float64:
+		_v1 := any(v1).([]float64)
+		_v2 := any(v2).([]float64)
+
+		return T(-vek.Dot(_v1, _v2)), nil
 
 	default:
 		return 0, moerr.NewInternalErrorNoCtx("InnerProduct type not supported")

Correct the InnerProduct function to return the negative dot product. The
current implementation returns a positive value, which inverts the logic of the
distance metric.

pkg/vectorindex/metric/distance_func.go [87-104]

 func InnerProduct[T types.RealNumbers](v1, v2 []T) (T, error) {
 	switch any(v1).(type) {
 	case []float32:
 		_v1 := any(v1).([]float32)
 		_v2 := any(v2).([]float32)
 
-		return T(vek32.Dot(_v1, _v2)), nil
+		return T(-vek32.Dot(_v1, _v2)), nil
 
 	case []float64:
 		_v1 := any(v1).([]float64)
 		_v2 := any(v2).([]float64)
 
-		return T(vek.Dot(_v1, _v2)), nil
+		return T(-vek.Dot(_v1, _v2)), nil
 
 	default:
 		return 0, moerr.NewInternalErrorNoCtx("InnerProduct type not supported")
 	}
 }

[Suggestion processed]

Suggestion importance[1-10]: 10


Why: The suggestion correctly identifies a critical logic error where the InnerProduct distance metric was inverted. This would cause vector searches to return the most dissimilar vectors instead of the most similar, fundamentally breaking the feature.

High
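To make the sign convention concrete: when inner product is used as a distance, negating the dot product makes the most similar vector the one with the smallest value, which is what an ascending sort expects. The sketch below is a scalar illustration with made-up names (`innerProductDistance`, `rankByDistance`); it does not use vek or the PR's types.

```go
package main

import (
	"fmt"
	"sort"
)

// dot is a scalar dot product; it assumes len(a) == len(b).
func dot(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// innerProductDistance negates the dot product so that
// a larger similarity maps to a smaller distance.
func innerProductDistance(a, b []float32) float32 {
	return -dot(a, b)
}

// rankByDistance returns candidate indices ordered by ascending distance to q.
func rankByDistance(q []float32, cands [][]float32) []int {
	idx := make([]int, len(cands))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(x, y int) bool {
		return innerProductDistance(q, cands[idx[x]]) < innerProductDistance(q, cands[idx[y]])
	})
	return idx
}

func main() {
	q := []float32{1, 0}
	cands := [][]float32{{-1, 0}, {0.5, 0}, {2, 0}}
	fmt.Println(rankByDistance(q, cands)) // prints "[2 1 0]"
}
```

With the positive dot product, the same ascending sort would return `[0 1 2]`, placing the vector pointing away from the query first.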
Prevent division by zero error

In NormalizeL2, add a check to prevent division by zero when the input vector's
norm is zero. If the norm is zero, do not perform the division, thus avoiding
NaN or Inf results.

pkg/vectorindex/metric/distance_func.go [166-178]

 	case []float32:
 		_v1 := any(v1).([]float32)
 		_normalized := any(normalized).([]float32)
 
 		copy(_normalized, _v1)
-		vek32.DivNumber_Inplace(_normalized, vek32.Norm(_v1))
+		if norm := vek32.Norm(_v1); norm > 0 {
+			vek32.DivNumber_Inplace(_normalized, norm)
+		}
 
 	case []float64:
 		_v1 := any(v1).([]float64)
 		_normalized := any(normalized).([]float64)
 
 		copy(_normalized, _v1)
-		vek.DivNumber_Inplace(_normalized, vek.Norm(_v1))
+		if norm := vek.Norm(_v1); norm > 0 {
+			vek.DivNumber_Inplace(_normalized, norm)
+		}
Suggestion importance[1-10]: 8


Why: The suggestion correctly identifies a regression where a check for a zero-norm vector was removed, which can lead to a division-by-zero error and NaN/Inf propagation. This is a valid and important correctness fix.

Medium
Handle zero vectors in cosine distance

In CosineDistance, check if the result of vek.CosineSimilarity is NaN. If it is,
return an error to handle cases with zero-magnitude vectors, restoring the
previous error-handling behavior.

pkg/vectorindex/metric/distance_func.go [106-124]

 func CosineDistance[T types.RealNumbers](v1, v2 []T) (T, error) {
 	switch any(v1).(type) {
 	case []float32:
 		_v1 := any(v1).([]float32)
 		_v2 := any(v2).([]float32)
 
-		return T(1 - vek32.CosineSimilarity(_v1, _v2)), nil
+		similarity := vek32.CosineSimilarity(_v1, _v2)
+		if math.IsNaN(float64(similarity)) {
+			return 0, moerr.NewInternalErrorNoCtx("cannot compute cosine similarity with zero vector")
+		}
+		return T(1 - similarity), nil
 
 	case []float64:
 		_v1 := any(v1).([]float64)
 		_v2 := any(v2).([]float64)
 
-		return T(1 - vek.CosineSimilarity(_v1, _v2)), nil
+		similarity := vek.CosineSimilarity(_v1, _v2)
+		if math.IsNaN(similarity) {
+			return 0, moerr.NewInternalErrorNoCtx("cannot compute cosine similarity with zero vector")
+		}
+		return T(1 - similarity), nil
 
 	default:
 		return 0, moerr.NewInternalErrorNoCtx("CosineDistance type not supported")
 	}
 
 }
Suggestion importance[1-10]: 8


Why: The suggestion correctly points out that a check for zero-magnitude vectors was removed, which can lead to NaN results from vek.CosineSimilarity. Reintroducing an error for this case prevents silent failures and improves the function's robustness.

Medium
High-level
The PR introduces a new dependency

The suggestion advises considering the long-term maintenance burden and risk of
adding the new vek dependency, which appears to be maintained by a single
developer, despite its significant performance improvements.

Examples:

go.mod [95]
	github.com/viterin/vek v0.4.3
pkg/vectorindex/metric/distance_func.go [22-23]
	"github.com/viterin/vek"
	"github.com/viterin/vek/vek32"

Solution Walkthrough:

Before:

// go.mod
// ... (no 'vek' dependency)

// pkg/vectorindex/metric/distance_func.go
import "gonum.org/v1/gonum/blas/blas32"

func L2Distance[T types.RealNumbers](v1, v2 []T) (T, error) {
    // ...
    diff := blas32.Vector{
        Data: make([]float32, len(_v1)),
        // ...
    }
    for i := range _v1 {
        diff.Data[i] = _v1[i] - _v2[i]
    }
    return T(blas32.Nrm2(diff)), nil
}

After:

// go.mod
require (
    // ...
    github.com/viterin/vek v0.4.3
)

// pkg/vectorindex/metric/distance_func.go
import "github.com/viterin/vek/vek32"

func L2Distance[T types.RealNumbers](v1, v2 []T) (T, error) {
    // ...
    _v1 := any(v1).([]float32)
    _v2 := any(v2).([]float32)
    return T(vek32.Distance(_v1, _v2)), nil
}
Suggestion importance[1-10]: 6


Why: The suggestion raises a valid strategic concern about adding a new dependency and its long-term maintenance risk, which is an important consideration for the project's health, even if it's not a code defect.

Low

Contributor

@fengttt fengttt left a comment

It seems vek only has AVX and does not have NEON?

This is a tough decision.

Contributor

@fengttt fengttt left a comment

If the op we benchmarked is at 400+ ns, cgo call overhead is probably acceptable. What about implementing the L2 distance function in C and calling it through cgo? That should give us the best performance on both AMD64 and ARM.

Besides, I trust GCC/Clang more than hand-written assembly to be bug-free.
