feat: full mocking #507

Draft
wants to merge 46 commits into
base: main
71311ef
feat: add hybrid router to sync tests
jamescalam Jan 1, 2025
69c2e9b
fix: allow types to work between pinecone and hybrid
jamescalam Jan 1, 2025
73af1ed
fix: sparse emb index type support
jamescalam Jan 2, 2025
36d6928
Merge branch 'main' into james/full-hybrid-support
jamescalam Jan 2, 2025
da7ea31
fix: wrong host on some async methods
jamescalam Jan 3, 2025
4788c61
fix: async usage and tests
jamescalam Jan 3, 2025
414f41b
chore: lint
jamescalam Jan 3, 2025
0f5ffd1
fix: missing router_cls
jamescalam Jan 3, 2025
84d9a7d
chore: lint
jamescalam Jan 3, 2025
ec4c216
feat: simplify and align routers call methods
jamescalam Jan 3, 2025
d6a2058
fix: deprecated multiple routes query
jamescalam Jan 3, 2025
2db93bb
fix: improved openai mocking
jamescalam Jan 4, 2025
dd01f68
fix: hybrid router encoder score tweak
jamescalam Jan 4, 2025
7eafd8f
chore: lint
jamescalam Jan 4, 2025
a4c593f
feat: modify pytest to exit on first fail
jamescalam Jan 4, 2025
5eebdf5
feat: modify pytest to exit on first fail
jamescalam Jan 4, 2025
2c351c8
feat: simplify sr test and remove hybrid router tests
jamescalam Jan 4, 2025
0b72dfb
chore: increase pinecone wait time
jamescalam Jan 4, 2025
7862d39
fix: missing assertion logic
jamescalam Jan 4, 2025
2608dcb
fix: openai mock
jamescalam Jan 4, 2025
4b890e9
fix: openai mock
jamescalam Jan 4, 2025
ca25f7f
chore: lint
jamescalam Jan 4, 2025
8250e32
fix: pinecone delays
jamescalam Jan 4, 2025
bdbbfcb
chore: lint
jamescalam Jan 4, 2025
0d617ac
chore: lint
jamescalam Jan 4, 2025
1a06ae6
fix: pinecone delays
jamescalam Jan 4, 2025
d38d03f
chore: lint
jamescalam Jan 4, 2025
c30652d
fix: pinecone delays
jamescalam Jan 4, 2025
1141d73
fix: pinecone delays
jamescalam Jan 5, 2025
4930d46
fix: pinecone delays
jamescalam Jan 5, 2025
540087b
fix: remove value error for default index
jamescalam Jan 6, 2025
6ce753e
fix: raise error if index not initialized at router level
jamescalam Jan 6, 2025
e12f8eb
fix: vector only test
jamescalam Jan 6, 2025
90fe4d1
feat: modify index readiness checks
jamescalam Jan 6, 2025
5791bfa
fix: sparse vector testing
jamescalam Jan 6, 2025
0c96bf6
fix: tests
jamescalam Jan 6, 2025
9912522
fix: try-except logic
jamescalam Jan 7, 2025
0ebc827
fix: try-except and new is_ready test
jamescalam Jan 7, 2025
2ea6b9f
fix: no vector test
jamescalam Jan 7, 2025
a2db91a
fix: increase timeout for no vector test
jamescalam Jan 7, 2025
76a68e0
fix: optimize router testing
jamescalam Jan 7, 2025
0326e74
fix: threshold checks in tests for hybrid
jamescalam Jan 7, 2025
d93fcf3
fix: add more waits for pc stability
jamescalam Jan 7, 2025
fe2e74f
fix: remaining RouterOnly tests and cleanup for score_threshold checks
jamescalam Jan 7, 2025
1b06252
fix: fit and eval for hybrid router
jamescalam Jan 7, 2025
caede38
feat: v1 of openai and bm25 mock
jamescalam Jan 8, 2025
8 changes: 4 additions & 4 deletions Makefile
@@ -12,11 +12,11 @@ lint lint_diff:
 	poetry run mypy $(PYTHON_FILES)

 test:
-	poetry run pytest -vv --cov=semantic_router --cov-report=term-missing --cov-report=xml
+	poetry run pytest -vv --cov=semantic_router --cov-report=term-missing --cov-report=xml --exitfirst --maxfail=1

 test_functional:
-	poetry run pytest -vv -n 20 tests/functional
+	poetry run pytest -vv --exitfirst --maxfail=1 tests/functional
 test_unit:
-	poetry run pytest -vv -n 20 tests/unit
+	poetry run pytest -vv --exitfirst --maxfail=1 tests/unit
 test_integration:
-	poetry run pytest -vv -n 20 tests/integration
+	poetry run pytest -vv --exitfirst --maxfail=1 tests/integration
56 changes: 0 additions & 56 deletions docs/00-introduction.ipynb
@@ -279,62 +279,6 @@
     "sr(\"I'm interested in learning about llama 2\")"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "metadata": {
-    "id": "dDZF2eN4f3p4"
-   },
-   "source": [
-    "We can also retrieve multiple routes with its associated score:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "id": "n27I7kmpf3p4",
-    "outputId": "2138e077-190b-41b7-a3eb-4fd76e2f59c2"
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "[RouteChoice(name='politics', function_call=None, similarity_score=0.8595844842560181),\n",
-       " RouteChoice(name='chitchat', function_call=None, similarity_score=0.8356704527362284)]"
-      ]
-     },
-     "execution_count": 9,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "sr.retrieve_multiple_routes(\"Hi! How are you doing in politics??\")"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {
-    "id": "zi4XJ7Amf3p4",
-    "outputId": "cf05cd65-d4f4-454a-ef05-77f16f37cc8f"
-   },
-   "outputs": [
-    {
-     "data": {
-      "text/plain": [
-       "[]"
-      ]
-     },
-     "execution_count": 10,
-     "metadata": {},
-     "output_type": "execute_result"
-    }
-   ],
-   "source": [
-    "sr.retrieve_multiple_routes(\"I'm interested in learning about llama 2\")"
-   ]
-  },
   {
    "cell_type": "markdown",
    "metadata": {},
1 change: 1 addition & 0 deletions semantic_router/encoders/bm25.py
@@ -57,6 +57,7 @@ def fit(self, routes: List[Route]):
         self.model.fit(corpus=utterances)

     def __call__(self, docs: List[str]) -> list[SparseEmbedding]:
+        print(f"JBTEMP: {docs}")
         if self.model is None:
             raise ValueError("Model or index mapping is not initialized.")
         if len(docs) == 1:
25 changes: 21 additions & 4 deletions semantic_router/index/base.py
@@ -15,6 +15,12 @@
 RETRY_WAIT_TIME = 2.5


+class IndexConfig(BaseModel):
+    type: str
+    dimensions: int
+    vectors: int
+
+
 class BaseIndex(BaseModel):
     """
     Base class for indices using Pydantic's BaseModel.
@@ -38,6 +44,7 @@ def add(
         utterances: List[Any],
         function_schemas: Optional[List[Dict[str, Any]]] = None,
         metadata_list: List[Dict[str, Any]] = [],
+        **kwargs,
     ):
         """Add embeddings to the index.
         This method should be implemented by subclasses.
@@ -51,6 +58,7 @@ async def aadd(
         utterances: List[str],
         function_schemas: Optional[Optional[List[Dict[str, Any]]]] = None,
         metadata_list: List[Dict[str, Any]] = [],
+        **kwargs,
     ):
         """Add vectors to the index asynchronously.
         This method should be implemented by subclasses.
@@ -62,6 +70,7 @@ async def aadd(
             utterances=utterances,
             function_schemas=function_schemas,
             metadata_list=metadata_list,
+            **kwargs,
         )

     def get_utterances(self) -> List[Utterance]:
@@ -143,10 +152,17 @@ def delete(self, route_name: str):
         """
         raise NotImplementedError("This method should be implemented by subclasses.")

-    def describe(self) -> Dict:
+    def describe(self) -> IndexConfig:
         """
+        Returns an IndexConfig object with index details such as type, dimensions, and
+        total vector count.
+        This method should be implemented by subclasses.
+        """
+        raise NotImplementedError("This method should be implemented by subclasses.")
+
+    def is_ready(self) -> bool:
+        """
-        Returns a dictionary with index details such as type, dimensions, and total
-        vector count.
+        Checks if the index is ready to be used.
         This method should be implemented by subclasses.
         """
         raise NotImplementedError("This method should be implemented by subclasses.")
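
The readiness hook added to the base class can be illustrated with a minimal sketch. It uses plain Python classes rather than the pydantic models in the PR, and `ToyIndex` and `require_ready` are hypothetical names chosen for illustration; only `is_ready()` and its `NotImplementedError` default come from the diff.

```python
from typing import List, Optional


class BaseIndex:
    """Hypothetical, trimmed-down base class with only the readiness hook."""

    def is_ready(self) -> bool:
        raise NotImplementedError("This method should be implemented by subclasses.")


class ToyIndex(BaseIndex):
    """Hypothetical subclass that keeps vectors and route names in plain lists."""

    def __init__(self) -> None:
        self.index: Optional[List[List[float]]] = None
        self.routes: Optional[List[str]] = None

    def add(self, embeddings: List[List[float]], routes: List[str]) -> None:
        # Create the containers on first use, then append to them.
        self.index = (self.index or []) + embeddings
        self.routes = (self.routes or []) + routes

    def is_ready(self) -> bool:
        # Ready once both containers have been initialised.
        return self.index is not None and self.routes is not None


def require_ready(index: BaseIndex) -> None:
    """Hypothetical caller-side guard; the name is not taken from the PR."""
    if not index.is_ready():
        raise ValueError("Index is not ready.")
```

With a guard like this, a caller fails fast on an uninitialised index instead of querying it: `require_ready(ToyIndex())` raises `ValueError`, while the same call after `add(...)` returns normally.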
@@ -238,7 +254,7 @@ async def _async_read_config(
         :return: The config parameter that was read.
         :rtype: ConfigParameter
         """
-        logger.warning("Async method not implemented.")
+        logger.warning("_async_read_config method not implemented.")
         return self._read_config(field=field, scope=scope)

     def _write_config(self, config: ConfigParameter) -> ConfigParameter:
@@ -353,6 +369,7 @@ async def alock(
         """Lock/unlock the index for a given scope (if applicable). If index
         already locked/unlocked, raises ValueError.
         """
+        logger.warning(f"JBTEMP alock method called with {value=} {wait=} {scope=}")
         start_time = datetime.now()
         while True:
             if await self._ais_locked(scope=scope) != value:
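
The `alock` hunk shows only the first lines of its polling loop. The sketch below is one plausible way such a loop could continue, not the PR's implementation: `ToyLockable`, the in-memory `_locked` flag, and the `poll` parameter are assumptions made for illustration, and the real method also takes a `scope` argument and reads the lock state from the index.

```python
import asyncio
from datetime import datetime


class ToyLockable:
    """Hypothetical in-memory stand-in for an index with a lock flag."""

    def __init__(self) -> None:
        self._locked = False

    async def _ais_locked(self) -> bool:
        return self._locked

    async def alock(self, value: bool, wait: float = 0.0, poll: float = 0.01) -> bool:
        start_time = datetime.now()
        while True:
            if await self._ais_locked() != value:
                # The flag differs from the requested state, so it can be set.
                break
            # Already in the requested state: keep polling until `wait` elapses.
            if (datetime.now() - start_time).total_seconds() >= wait:
                state = "locked" if value else "unlocked"
                raise ValueError(f"Index is already {state}.")
            await asyncio.sleep(poll)
        self._locked = value
        return self._locked
```

With `wait=0` a second `alock(True)` raises `ValueError` straight away; with a positive `wait`, the call keeps polling and succeeds if another task releases the lock in time.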
8 changes: 1 addition & 7 deletions semantic_router/index/hybrid_local.py
@@ -25,6 +25,7 @@ def add(
         function_schemas: Optional[List[Dict[str, Any]]] = None,
         metadata_list: List[Dict[str, Any]] = [],
         sparse_embeddings: Optional[List[SparseEmbedding]] = None,
+        **kwargs,
     ):
         if sparse_embeddings is None:
             raise ValueError("Sparse embeddings are required for HybridLocalIndex.")
@@ -66,13 +67,6 @@ def get_utterances(self) -> List[Utterance]:
             return []
         return [Utterance.from_tuple(x) for x in zip(self.routes, self.utterances)]

-    def describe(self) -> Dict:
-        return {
-            "type": self.type,
-            "dimensions": self.index.shape[1] if self.index is not None else 0,
-            "vectors": self.index.shape[0] if self.index is not None else 0,
-        }
-
     def _sparse_dot_product(
         self, vec_a: dict[int, float], vec_b: dict[int, float]
     ) -> float:
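
The `**kwargs` additions appear to let the base `aadd` forward index-specific arguments such as `sparse_embeddings` to a subclass `add`. A minimal sketch of that pattern follows; the class bodies are simplified, `ToyHybridIndex` is a hypothetical name, and sparse vectors are modelled as `Dict[int, float]` instead of the library's `SparseEmbedding` type.

```python
import asyncio
from typing import Any, Dict, List, Optional


class BaseIndex:
    """Hypothetical, trimmed-down base: the async path defers to the sync one."""

    def add(self, embeddings: List[List[float]], **kwargs: Any) -> None:
        raise NotImplementedError("This method should be implemented by subclasses.")

    async def aadd(self, embeddings: List[List[float]], **kwargs: Any) -> None:
        # Forward unknown keyword arguments untouched so subclasses can use them.
        return self.add(embeddings=embeddings, **kwargs)


class ToyHybridIndex(BaseIndex):
    """Hypothetical hybrid index that stores dense and sparse vectors in lists."""

    def __init__(self) -> None:
        self.dense: List[List[float]] = []
        self.sparse: List[Dict[int, float]] = []

    def add(
        self,
        embeddings: List[List[float]],
        sparse_embeddings: Optional[List[Dict[int, float]]] = None,
        **kwargs: Any,
    ) -> None:
        if sparse_embeddings is None:
            raise ValueError("Sparse embeddings are required for HybridLocalIndex.")
        self.dense.extend(embeddings)
        self.sparse.extend(sparse_embeddings)
```

Accepting `**kwargs` in every `add` signature also means a dense-only index can ignore `sparse_embeddings` when a shared caller passes it, at the cost of silently swallowing misspelled keyword arguments.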
21 changes: 14 additions & 7 deletions semantic_router/index/local.py
@@ -3,7 +3,7 @@
 import numpy as np

 from semantic_router.schema import ConfigParameter, SparseEmbedding, Utterance
-from semantic_router.index.base import BaseIndex
+from semantic_router.index.base import BaseIndex, IndexConfig
 from semantic_router.linear import similarity_matrix, top_scores
 from semantic_router.utils.logger import logger
 from typing import Any
@@ -26,6 +26,7 @@ def add(
         utterances: List[str],
         function_schemas: Optional[List[Dict[str, Any]]] = None,
         metadata_list: List[Dict[str, Any]] = [],
+        **kwargs,
     ):
         embeds = np.array(embeddings) # type: ignore
         routes_arr = np.array(routes)
@@ -74,12 +75,18 @@ def get_utterances(self) -> List[Utterance]:
             return []
         return [Utterance.from_tuple(x) for x in zip(self.routes, self.utterances)]

-    def describe(self) -> Dict:
-        return {
-            "type": self.type,
-            "dimensions": self.index.shape[1] if self.index is not None else 0,
-            "vectors": self.index.shape[0] if self.index is not None else 0,
-        }
+    def describe(self) -> IndexConfig:
+        return IndexConfig(
+            type=self.type,
+            dimensions=self.index.shape[1] if self.index is not None else 0,
+            vectors=self.index.shape[0] if self.index is not None else 0,
+        )
+
+    def is_ready(self) -> bool:
+        """
+        Checks if the index is ready to be used.
+        """
+        return self.index is not None and self.routes is not None

     def query(
         self,
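
Because `describe()` now returns an `IndexConfig` instead of a dict, callers that subscripted the result (`info["vectors"]`) would need to switch to attribute access (`info.vectors`). The sketch below shows the change under stated assumptions: a dataclass stands in for the pydantic `IndexConfig`, a list of rows stands in for the NumPy array, and `vector_count` is a hypothetical shim that is not part of the PR.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Union


@dataclass
class IndexConfig:
    # Dataclass stand-in for the pydantic IndexConfig (same three fields).
    type: str
    dimensions: int
    vectors: int


def describe(index: Optional[List[List[float]]], index_type: str = "local") -> IndexConfig:
    """Mirror of the new describe(), with a list of rows instead of an ndarray."""
    # A missing or empty index reports zero dimensions and zero vectors.
    return IndexConfig(
        type=index_type,
        dimensions=len(index[0]) if index else 0,
        vectors=len(index) if index else 0,
    )


def vector_count(info: Union[IndexConfig, Dict[str, Any]]) -> int:
    """Hypothetical shim for callers that may still receive the old dict."""
    if isinstance(info, dict):
        return int(info["vectors"])
    return info.vectors
```

A typed return value lets mypy, which the `lint` target already runs, flag a misspelled field name, something the earlier `Dict` annotation could not do.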