Fails to execute a python script using FastEmbed #315
Comments
Did you try with the source generator as well? |
Yeah, in Visual Studio it works just fine, both using SourceGen and when I take the SourceGen code and set it up explicitly. The same code does not work in LINQPad. I cannot think of why the same generated code works in VS but not in LINQPad. FYI, the following works fine in LINQPad using your test_basic example; I am not sure why the Python code using FastEmbed does not.
public sealed class ExampleDirectIntegration : IDisposable
{
    private readonly PyObject module;
    private readonly ILogger<IPythonEnvironment> logger;
    internal ExampleDirectIntegration(IPythonEnvironment env)
    {
        this.logger = env.Logger;
        using (GIL.Acquire())
        {
            logger.LogInformation("Importing module {ModuleName}", "test_basic");
            module = Import.ImportModule("test_basic");
        }
    }
    public void Dispose()
    {
        logger.LogInformation("Disposing module");
        module.Dispose();
    }
    public double TestIntFloat(long a, double b)
    {
        using (GIL.Acquire())
        {
            logger.LogInformation("Invoking Python function: {FunctionName}", "test_int_float");
            using var __underlyingPythonFunc = this.module.GetAttr("test_int_float");
            using PyObject a_pyObject = PyObject.From(a);
            using PyObject b_pyObject = PyObject.From(b);
            using var __result_pyObject = __underlyingPythonFunc.Call(a_pyObject, b_pyObject);
            return __result_pyObject.As<double>();
        }
    }
}
Great project btw, very much appreciated. |
I'll spin up a project and see what's happening. I've never used LINQPad before though! Please comment with your thoughts on this issue too-- |
Just a side note. The return type of your function could be more specific:
def generate_embeddings(documents: list[str]) -> list:
I would specify:
def generate_embeddings(documents: list[str]) -> list[list[float]]:
if they are lists. If they are sequences but not necessarily lists, you can use the |
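To illustrate the side note above: a tool that reads type hints can only map what the annotation actually says. The following is a minimal sketch using only the standard library. The two function names are hypothetical, and this does not claim to reproduce how the project's source generator inspects signatures.

```python
from typing import get_args, get_origin, get_type_hints


def embeddings_loose(documents: list[str]) -> list:
    """Bare `list`: the element type is unknown to any tool reading the hint."""
    return [[0.0] for _ in documents]


def embeddings_specific(documents: list[str]) -> list[list[float]]:
    """Nested hint: both the outer and inner element types are recoverable."""
    return [[0.0] for _ in documents]


loose = get_type_hints(embeddings_loose)["return"]
specific = get_type_hints(embeddings_specific)["return"]

# A bare `list` carries no type arguments at all.
print(get_origin(loose), get_args(loose))

# The specific hint exposes the chain list -> list -> float.
inner = get_args(specific)[0]
print(get_origin(specific), get_origin(inner), get_args(inner))
```

With the bare `list` there is nothing to tell a caller on the .NET side what the elements are, whereas the nested hint names a concrete element type at each level.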
I added your code to the Simple Demo project and it worked as expected; I can't see anything obvious. When you get the exception, look at the InnerException: it contains the Python stack trace as a property, which will give you more info. |
Yeah, as I mentioned, this works in Visual Studio but not in LINQPad. I am getting this exception:
This is the integration I am using:
public sealed class FastEmbedIntegration : IDisposable
{
    private readonly PyObject module;
    private readonly ILogger logger;
    internal FastEmbedIntegration(ILogger<IPythonEnvironment> logger)
    {
        this.logger = logger;
        using (GIL.Acquire())
        {
            logger.LogInformation("Importing module {ModuleName}", "fast_embed");
            module = Import.ImportModule("fast_embed");
        }
    }
    public void Dispose()
    {
        logger.LogInformation("Disposing module");
        module.Dispose();
    }
    public IReadOnlyList<IReadOnlyList<double>> GenerateEmbeddings(IReadOnlyList<string> items)
    {
        using (GIL.Acquire())
        {
            logger.LogInformation("Invoking Python function: {FunctionName}", "generate_embeddings");
            using PyObject __underlyingPythonFunc = this.module.GetAttr("generate_embeddings");
            using PyObject a_pyObject = PyObject.From(items);
            using PyObject __result_pyObject = __underlyingPythonFunc.Call(a_pyObject);
            return __result_pyObject.As<IReadOnlyList<IReadOnlyList<double>>>();
        }
    }
}
And this is the Python:
from fastembed import TextEmbedding
embedding_model = TextEmbedding()
def generate_embeddings(documents: list[str]) -> list[list[float]]:
    embeddings_list = list(embedding_model.embed(documents))
    return embeddings_list
documents = [
    "This is built to be faster and lighter than other embedding libraries e.g. Transformers, Sentence-Transformers, etc.",
    "fastembed is supported by and maintained by Qdrant.",
]
embeddings = generate_embeddings(documents)
print(embeddings)
|
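One thing worth noting about the module above: everything at module level, including the sample call and the `print`, runs whenever the module is imported, not only when the function is invoked from the host. Below is a hedged sketch of the usual Python idiom for keeping demo code out of the import path. `_StubModel` is a hypothetical stand-in so the snippet runs without fastembed installed; with the real library you would construct `TextEmbedding()` instead. This is not claimed to be the cause of the LINQPad failure.

```python
class _StubModel:
    """Hypothetical stand-in for fastembed.TextEmbedding (not the real API)."""

    def embed(self, documents):
        # Yield one small fixed-size vector per document, lazily.
        for text in documents:
            yield [float(len(text)), 0.0, 1.0]


embedding_model = _StubModel()


def generate_embeddings(documents: list[str]) -> list[list[float]]:
    """Materialise the generator so the caller receives concrete lists."""
    return [list(vector) for vector in embedding_model.embed(documents)]


def main() -> None:
    documents = ["first document", "second"]
    print(generate_embeddings(documents))


# Runs only when executed as a script, not when the module is imported.
if __name__ == "__main__":
    main()
```

Importing this module only constructs the model and defines the function; the sample documents are embedded and printed only when the file is run directly.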
Also, a side question if I may: are classes supported? E.g.:
from fastembed import TextEmbedding
class EmbeddingService:
    """
    Service class for generating embeddings using fastembed.
    Ensures the embedding model is initialized only once.
    """
    def __init__(self):
        self.embedding_model = TextEmbedding()
    def generate_embeddings(self, documents: list[str]) -> list[list[float]]:
        """
        Generate embeddings for a list of documents.
        Args:
            documents (list[str]): List of strings to embed.
        Returns:
            list[list[float]]: A list of embeddings, where each embedding is a list of floats.
        """
        embeddings_list = list(self.embedding_model.embed(documents))
        return embeddings_list
service = EmbeddingService()
documents = [
    "This is built to be faster and lighter than other embedding libraries e.g. Transformers, Sentence-Transformers, etc.",
    "fastembed is supported by and maintained by Qdrant.",
]
embeddings = service.generate_embeddings(documents)
print(embeddings)
|
Calling a method on a class instance? Or casting a Python class back to a CLR type? |
Whichever I can get working, I guess. Running CodeGen on the above class-based Python resulted in the following integration:
public static class FastEmbedClassExtensions
{
    private static IFastEmbedClass? instance;
    private static ReadOnlySpan<byte> HotReloadHash => "fffa817c9a4729f2c6a7536b2b15f1d4"u8;
    public static IFastEmbedClass FastEmbedClass(this IPythonEnvironment env)
    {
        if (instance is null)
        {
            instance = new FastEmbedClassInternal(env.Logger);
        }
        Debug.Assert(!env.IsDisposed());
        return instance;
    }
    public static void UpdateApplication(Type[]? updatedTypes) {
        instance?.ReloadModule();
    }
    private class FastEmbedClassInternal : IFastEmbedClass
    {
        private PyObject module;
        private readonly ILogger<IPythonEnvironment> logger;
        internal FastEmbedClassInternal(ILogger<IPythonEnvironment> logger)
        {
            this.logger = logger;
            using (GIL.Acquire())
            {
                logger.LogDebug("Importing module {ModuleName}", "fast_embed_class");
                module = Import.ImportModule("fast_embed_class");
            }
        }
        void IReloadableModuleImport.ReloadModule() {
            logger.LogDebug("Reloading module {ModuleName}", "fast_embed_class");
            using (GIL.Acquire())
            {
                Import.ReloadModule(ref module);
                // Dispose old functions
            }
        }
        public void Dispose()
        {
            logger.LogDebug("Disposing module {ModuleName}", "fast_embed_class");
            module.Dispose();
        }
    }
}
|
We don't do any codegen for classes (yet), so what you'd need to do is:
It would look roughly like this:
public IReadOnlyList<IReadOnlyList<double>> GenerateEmbeddings(IReadOnlyList<string> items)
{
    using (GIL.Acquire())
    {
        logger.LogInformation("Invoking Python function: {FunctionName}", "EmbeddingService.generate_embeddings");
        // Get the class
        using PyObject __underlyingPythonType = this.module.GetAttr("EmbeddingService");
        // Run __init__ (provide args if required; Call takes params PyObject[])
        using PyObject __instance = __underlyingPythonType.Call();
        // Convert the argument to a PyObject
        using PyObject a_pyObject = PyObject.From(items);
        // Get the method
        using PyObject __underlyingPythonMethod = __instance.GetAttr("generate_embeddings");
        using PyObject __result_pyObject = __underlyingPythonMethod.Call(a_pyObject);
        return __result_pyObject.As<IReadOnlyList<IReadOnlyList<double>>>();
    }
}
For improved performance, I would store the |
I am trying to execute the following module using Python 3.12.4 on .NET 8:
I also have the following integration:
Which I call using:
Which throws:
If I comment out the lines #1 and #2, then I correctly receive the arrays.