[feat] Repo registries and RAG workflows #154

sroussey · 2026-01-03T18:28:07Z

No description provided.

…ved input data handling for tests - Replaced structuredClone and JSON methods with a new smartClone function that deep-clones plain objects and arrays while preserving class instances by reference. - Quick function versions of tasks now pass input to run rather than to the constructor, so no defaults are applied and the input is not cloned

…ng additional input properties.

Copilot

Pull request overview

This PR introduces a new VectorQuantizeTask for efficient vector quantization and refactors vector utilities into reusable modules. The changes improve code organization by extracting common vector operations from VectorSimilarityTask into dedicated utility files.

- New VectorQuantizeTask supporting multiple quantization types (INT8, UINT8, INT16, UINT16, FLOAT16, FLOAT32, FLOAT64)
- Refactored vector utilities into VectorUtils and VectorSimilarityUtils modules for reusability
- Updated VectorSimilarityTask to use the new utility functions and renamed the `similarity` parameter to `method`

Reviewed changes

Copilot reviewed 17 out of 18 changed files in this pull request and generated 10 comments.


File	Description
packages/util/src/vector/VectorUtils.ts	New utility module providing magnitude, inner product, and normalize functions for vector operations
packages/util/src/vector/VectorSimilarityUtils.ts	New utility module with cosine, Jaccard, and Hamming similarity/distance calculations
packages/util/src/vector/TypedArray.ts	Type definitions and JSON schemas for supported typed array types (Float16/32/64, Int8/16, Uint8/16)
packages/util/src/vector/Tensor.ts	Schema definitions for tensor/vector data structures with type, data, shape, and normalization properties
packages/util/src/json-schema/SchemaValidation.ts	Updated import to use @sroussey/json-schema-library package
packages/util/src/common.ts	Added exports for new vector utility modules
packages/util/package.json	Updated dependency from json-schema-library to @sroussey/json-schema-library
packages/test/src/test/task/VectorQuantizeTask.test.ts	Comprehensive test suite for VectorQuantizeTask covering all quantization types and edge cases
packages/task-graph/src/task/Task.ts	Updated stripSymbols to preserve TypedArrays by detecting ArrayBuffer views
packages/ai/src/task/index.ts	Added export for VectorQuantizeTask
packages/ai/src/task/base/AiTaskSchemas.ts	Refactored to import TypedArray and related types from @workglow/util, removed duplicate definitions
packages/ai/src/task/VectorSimilarityTask.ts	Refactored to use imported similarity functions from @workglow/util, removed local implementations, renamed `similarity` parameter to `method`
packages/ai/src/task/VectorQuantizeTask.ts	New task implementing vector quantization with normalization and multiple target type support
packages/ai/src/task/TextEmbeddingTask.ts	Updated imports to use TypedArraySchema from @workglow/util
packages/ai/src/task/ImageEmbeddingTask.ts	Updated imports to use TypedArraySchema from @workglow/util
packages/ai-provider/src/hf-transformers/common/HFT_JobRunFns.ts	Updated to import TypedArray from @workglow/util instead of @workglow/ai
packages/ai-provider/README.md	Updated comment to use "Vector" instead of "TypedArray" in code example
bun.lock	Updated lockfile with new @sroussey/json-schema-library dependency
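
The Task.ts row in the table above says stripSymbols now preserves TypedArrays by detecting ArrayBuffer views. The PR's actual implementation is not shown here, so the following is only a minimal sketch of that idea. The name `stripSymbolsSketch` and its recursion strategy are assumptions made for illustration; the only call relied on is the standard `ArrayBuffer.isView`.

```typescript
// Hypothetical sketch; the real stripSymbols in Task.ts may differ.
function stripSymbolsSketch(value: unknown): unknown {
  // Typed arrays (and DataView) are ArrayBuffer views: return them untouched
  // instead of walking them as if they were plain objects.
  if (ArrayBuffer.isView(value)) return value;

  // Recurse into arrays element by element.
  if (Array.isArray(value)) return value.map((item) => stripSymbolsSketch(item));

  if (value !== null && typeof value === "object") {
    const result: Record<string, unknown> = {};
    // Object.keys only returns string keys, so symbol-keyed
    // properties are dropped from the copy.
    for (const key of Object.keys(value)) {
      result[key] = stripSymbolsSketch((value as Record<string, unknown>)[key]);
    }
    return result;
  }

  // Primitives pass through unchanged.
  return value;
}
```

Without the `ArrayBuffer.isView` check, a `Float32Array` would fall into the object branch and come back as a plain object keyed by index, which is presumably the failure the change guards against.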


packages/util/src/vector/Tensor.ts

Copilot · 2026-01-03T19:19:56Z

packages/ai/src/task/VectorQuantizeTask.ts

+  private quantizeToUint8(values: number[]): Uint8Array {
+    // Find min/max for scaling
+    const min = Math.min(...values);
+    const max = Math.max(...values);
+    const range = max - min || 1;
+
+    // Scale to [0, 255]
+    return new Uint8Array(values.map((v) => Math.round(((v - min) / range) * 255)));
+  }


The quantizeToUint8 and quantizeToUint16 methods spread the entire values array into Math.min and Math.max. For large vectors this is inefficient: the array is traversed twice, and every element is passed as a separate function argument, which can exceed the JavaScript engine's argument limit and throw a RangeError. Consider using a single loop that tracks both min and max, which would be more performant and memory-efficient.
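
As a rough illustration of that suggestion, here is one way the single-loop version could look. This is a sketch, not the PR's code: `minMax` is a hypothetical helper, and `quantizeToUint8` is written as a standalone function rather than a private method so the block runs on its own. The scaling formula is copied from the hunk above.

```typescript
// Hypothetical helper, not part of the PR: finds min and max in one pass
// without spreading the array into function arguments.
function minMax(values: ArrayLike<number>): { min: number; max: number } {
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < values.length; i++) {
    const v = values[i];
    if (v < min) min = v;
    if (v > max) max = v;
  }
  return { min, max };
}

// Standalone variant of the reviewed method, using the same scaling to [0, 255].
function quantizeToUint8(values: ArrayLike<number>): Uint8Array {
  const out = new Uint8Array(values.length);
  if (values.length === 0) return out;

  const { min, max } = minMax(values);
  // Fall back to 1 for constant vectors to avoid dividing by zero.
  const range = max - min || 1;

  // Write directly into the output, so no intermediate number[] is created.
  for (let i = 0; i < values.length; i++) {
    out[i] = Math.round(((values[i] - min) / range) * 255);
  }
  return out;
}
```

The same `minMax` helper could back a Uint16 variant by swapping the output type and scaling by 65535 instead of 255. Accepting `ArrayLike<number>` is a design choice made here so the function works on both plain arrays and typed arrays.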