[MLOB-1954] feat(langchain): generically patch embeddings to enable tracing all embeddings calls #4970
What does this PR do?
Generically patches LangChain embeddings. We previously hadn't done this because there was no way to patch the prototype of the Embeddings class: it is an abstract class transpiled from TypeScript, and the abstract functions we wanted to patch, embedQuery and embedDocuments, do not end up on the transpiled class.
Instead, we will patch the embeddings exports. The Embeddings constructor will now try to shim the embedQuery and embedDocuments functions, as all implementers of Embeddings should implement these methods. While this isn't ideal, it does free us from having to add a provider-specific patch for each embeddings call (as we previously did specifically for @langchain/openai). Any implementer of Embeddings from @langchain/core/embeddings will be patched.
To go along with this, I added a small test for another embeddings provider (Google Gemini), which revealed some mistakes in the logic that grabs the API key to truncate and tag. One downside of this approach is that we may run into more cases like this, since Embeddings implementations do not have a strictly typed config or config properties.
Motivation
Reduce noise/line count in the langchain patching for future Embeddings support; just support all instances of Embeddings instead.
Additional Notes
I will try to rework the langchain version matrixing for these tests... it's a bit ugly atm.
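For reviewers who want a picture of the constructor shim described under "What does this PR do?", here is a minimal sketch of the idea. It is not the code in this PR: coreExports, patchEmbeddingsExport, FakeEmbeddings, and traced are hypothetical names used only for illustration, the base class is a stand-in for the transpiled Embeddings export, and a real integration would start a span where the sketch only records the method name.

```typescript
// Hypothetical stand-in for the exports of @langchain/core/embeddings.
// As in the transpiled output, the base class carries no embedQuery or
// embedDocuments on its prototype.
const coreExports = {
  Embeddings: class Embeddings {},
};

// Names of calls seen by the shim; a real tracer would start a span instead.
const traced: string[] = [];

type EmbeddingsCtor = new (...args: any[]) => object;

// Replace the exported class with a subclass whose constructor wraps
// whatever embedQuery/embedDocuments the concrete provider supplies.
function patchEmbeddingsExport(moduleExports: { Embeddings: EmbeddingsCtor }): void {
  const Base = moduleExports.Embeddings;
  moduleExports.Embeddings = class extends Base {
    constructor(...args: any[]) {
      super(...args);
      for (const name of ["embedQuery", "embedDocuments"]) {
        // Resolved through the prototype chain of the provider subclass.
        const original = (this as any)[name];
        if (typeof original !== "function") continue; // nothing to shim
        (this as any)[name] = (...callArgs: unknown[]) => {
          traced.push(name);
          return original.apply(this, callArgs);
        };
      }
    }
  };
}

patchEmbeddingsExport(coreExports);

// Hypothetical provider, defined after patching so it extends the patched export.
class FakeEmbeddings extends coreExports.Embeddings {
  async embedQuery(text: string): Promise<number[]> {
    return [text.length];
  }
  async embedDocuments(texts: string[]): Promise<number[][]> {
    return texts.map((t) => [t.length]);
  }
}

const provider = new FakeEmbeddings();
provider.embedQuery("hello").then((vector) => console.log(traced, vector));
```

One limitation of this style of shim, at least in the sketch: a provider that assigns embedQuery as an instance field (for example an arrow-function property) does so after the base constructor returns, which would overwrite the wrapper.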