-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add dimension support #72
Changes from 3 commits
be4428e
e7366c9
989d3bf
de53f79
45b7c23
5025e16
190f3df
164c0b9
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,5 @@ | ||
import os | ||
from typing import Union, List, Any, Optional | ||
from typing import Union, List, Any, Optional, Dict | ||
from pinecone_text.dense.base_dense_ecoder import BaseDenseEncoder | ||
|
||
try: | ||
|
@@ -38,6 +38,8 @@ class OpenAIEncoder(BaseDenseEncoder): | |
def __init__( | ||
self, | ||
model_name: str = "text-embedding-ada-002", | ||
*, | ||
dimension: Optional[int] = None, | ||
**kwargs: Any, | ||
): | ||
if not _openai_installed: | ||
|
@@ -46,7 +48,9 @@ def __init__( | |
"dependencies by running: " | ||
"`pip install pinecone-text[openai]" | ||
) | ||
super().__init__(dimension=dimension) | ||
self._model_name = model_name | ||
self._dimension = dimension | ||
acatav marked this conversation as resolved.
Show resolved
Hide resolved
|
||
self._client = self._create_client(**kwargs) | ||
|
||
@staticmethod | ||
|
@@ -76,9 +80,13 @@ def _encode( | |
) | ||
|
||
try: | ||
response = self._client.embeddings.create( | ||
input=texts_input, model=self._model_name | ||
params: Dict[str, Any] = dict( | ||
input=texts_input, | ||
model=self._model_name, | ||
) | ||
if self._dimension: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. either we should have here There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right fixing |
||
params["dimensions"] = self._dimension | ||
response = self._client.embeddings.create(**params) | ||
except OpenAIError as e: | ||
# TODO: consider wrapping external provider errors | ||
raise e | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that if we put dimension as optional param for the base class, we should either allow it in all the child classes or not in the base at all. Right now it seems we only going to allow it for OpenAI, so maybe better to not put it in the base class? WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to have dimension accessible to all the child classes since it will be weird if only OpenAI exposes. You are right in the fact that in the base init we should not receive this parameter. I create a private property with a default now and override in OpenAI now. I only define as an initialization parameter in OpenAI one.