HuggingFace Transformers and Diffusers as nuget packages #296

Open
AshD opened this issue Oct 24, 2024 · 11 comments

Comments

@AshD

AshD commented Oct 24, 2024

CSnakes looks very impressive after seeing the Sneak Peek video https://youtu.be/U4-95gMT_UA

Is it possible to package HuggingFace Transformers and Diffusers as nuget packages that can be pulled into .NET projects, with wrappers to call them from C#?

This should take care of a lot of Gen AI use cases.

@minesworld

minesworld commented Oct 30, 2024

I think that isn't within the scope of CSnakes itself, but of projects which use CSnakes.

Maybe those who use it that way can publish the sources for

  • a class which prepares a folder with a requirements.txt, creates the venv in it and populates it with pip
  • the code which provides the autogenerated C# calls
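The first bullet could look roughly like this on the Python side. This is only a sketch using the standard library `venv` module and `pip`; the function name `prepare_env`, the `.venv` folder name and the `install` flag are assumptions made for illustration, not anything CSnakes ships.

```python
import subprocess
import sys
import venv
from pathlib import Path


def prepare_env(folder, requirements, install=True):
    """Write requirements.txt into `folder`, create a venv there and
    optionally install the requirements with pip.

    Returns the path to the venv's Python executable.
    """
    root = Path(folder)
    root.mkdir(parents=True, exist_ok=True)

    # One requirement specifier per line, as pip expects.
    req_file = root / "requirements.txt"
    req_file.write_text("\n".join(requirements) + "\n", encoding="utf-8")

    # Hypothetical layout: the venv lives in a ".venv" subfolder.
    venv_dir = root / ".venv"
    venv.EnvBuilder(with_pip=install).create(venv_dir)

    # The interpreter location differs between Windows and POSIX venvs.
    if sys.platform == "win32":
        python = venv_dir / "Scripts" / "python.exe"
    else:
        python = venv_dir / "bin" / "python"

    if install:
        # Run pip through the venv's own interpreter so that packages
        # land in the venv and not in the global site-packages.
        subprocess.run(
            [str(python), "-m", "pip", "install", "-r", str(req_file)],
            check=True,
        )
    return python
```

Passing `install=False` only lays out the folder and the venv, which is handy for checking the layout without network access.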

It could be nicer, but that depends on how such code (in a nuget package) should be used. Also, up to now there is only one virtual environment. That means:

  • only one such nuget package within a single C# project
  • no other Python code which wants to use different modules within the created venv

Otherwise the nuget package might run into issues, and no nuget package provider wants to deal with that kind of user feedback.

To be able to change that, CSnakes would have to split the PythonEnvironmentBuilder code into "usable" chunks, like VirtualEnvironment and pip classes that could easily be used without the builder context, for example from a nuget package whose "main" class is given the path where the user wants the Python stuff to be installed.

Multiple virtual environments could be created that way from C# and be used via a python wrapper which creates a subinterpreter for each one...

This solves one problem but adds a new one: Python objects cannot be passed between subinterpreters, so the result of one nuget package cannot be used as the input of another nuget package without serialization/deserialization...
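To make the serialization point concrete, here is a minimal sketch of handing a result from one isolated package to another as bytes. JSON is just one possible wire format, and the function names and the example payload are made up for illustration.

```python
import json


def export_result(result: dict) -> bytes:
    """Turn a result into plain bytes that can cross an interpreter boundary."""
    return json.dumps(result).encode("utf-8")


def import_result(payload: bytes) -> dict:
    """Rebuild the result on the receiving side from the transferred bytes."""
    return json.loads(payload.decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical output of one package, consumed by another one.
    produced = {"prompt": "a red bicycle", "steps": 30}
    wire = export_result(produced)
    consumed = import_result(wire)
    print(consumed == produced)
```

Only data that the chosen format can represent survives the round trip, so things like model objects or tensors would need their own encoding.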

@tonybaloney
Owner

We might tackle this, see discussion in #270 so far.

@AshD
Author

AshD commented Oct 31, 2024

Thanks for the replies.

I added Transformers and Diffusers to a .NET 9 console project using CSnakes, and while it works, it needs separate Python envs for Transformers and Diffusers, as @minesworld suggested, to make sure they don't break each other's dependencies.

To explain my use case in more detail: Fusion Quill is a Windows WPF app that supports multiple AI providers, including local providers using Llama.cpp GGUF and Onnx models. I want to add support for the Transformers and Diffusers libraries to it, and CSnakes is a good way to integrate them. My idea is to put the code into separate .NET libraries that each have their own CSnakes Python env, so they don't mess each other up. I am also considering open sourcing these libraries so that other .NET devs can use them.

I also looked at integrating with the exl2 Python library (https://github.com/turboderp/exllamav2), since it is one of the fastest, but I think it requires async for parallel queries.
https://github.com/turboderp/exllamav2/blob/master/examples/inference_async.py

Thanks @tonybaloney for CSnakes. My experience with it has been great so far.

@tonybaloney
Owner

@AshD did you use the nuget locator or another one? I think it's possible to have a standalone package without any system dependencies using the nuget locator but it'll only be distributable on Windows. For Linux or Mac, the user will have to install separately. I'm looking into UV, which has a system for pulling a Python binary via API.

Regarding async, I'm looking into that to see if it were possible to await a Python coroutine, but in the interim you'd need to write a sync function as a wrapper.
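A sync wrapper of the kind described here can be as small as a call to `asyncio.run`. The sketch below assumes a made-up coroutine `generate_async` standing in for the real async library call; only the wrapper pattern is the point.

```python
import asyncio


async def generate_async(prompt: str) -> str:
    """Stand-in for an async library call (hypothetical)."""
    await asyncio.sleep(0)  # pretend to wait for I/O or a model
    return prompt.upper()


def generate(prompt: str) -> str:
    """Plain sync function the host can call.

    asyncio.run creates an event loop, runs the coroutine to completion
    and closes the loop again before returning the result.
    """
    return asyncio.run(generate_async(prompt))


if __name__ == "__main__":
    print(generate("hello"))
```

Each call blocks the caller until the coroutine has finished, so this gives no parallelism across calls by itself.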

@FlatlinerDOA

@tonybaloney not to trivialise this but, wouldn't adding async "just" be a case of implementing INotifyCompletion or ICriticalNotifyCompletion?

@tonybaloney
Owner

> @tonybaloney not to trivialise this but, wouldn't adding async "just" be a case of implementing INotifyCompletion or ICriticalNotifyCompletion?

I wish. The .NET side is simple enough, the problem is that the Python coroutines need an event loop and there isn't a C-API for that.

We could wrap the asyncio library event loop in Python but the internals of it aren't easy to access.

@AshD
Author

AshD commented Nov 4, 2024

> @AshD did you use the nuget locator or another one? I think it's possible to have a standalone package without any system dependencies using the nuget locator but it'll only be distributable on Windows. For Linux or Mac, the user will have to install separately. I'm looking into UV, which has a system for pulling a Python binary via API.

I am using nuget locator for now. Will try to move it into a separate assembly and test it in the next few days.

> Regarding async, I'm looking into that to see if it were possible to await a Python coroutine, but in the interim you'd need to write a sync function as a wrapper.

I need some guidance on how to stream results back to the .NET host when the async function is running inside a sync Python wrapper function.
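One possible shape for this, sketched purely on the Python side: a sync generator that owns its own event loop and pulls items out of an async generator one at a time. The async generator `token_stream` is a made-up stand-in, and whether and how the .NET host can consume a Python generator needs to be checked against the CSnakes documentation.

```python
import asyncio
from typing import AsyncIterator, Iterator


async def token_stream(prompt: str) -> AsyncIterator[str]:
    """Stand-in for an async streaming API (hypothetical)."""
    for word in prompt.split():
        await asyncio.sleep(0)  # pretend to wait for the next token
        yield word


def stream_tokens(prompt: str) -> Iterator[str]:
    """Sync generator that drives the async generator step by step."""
    loop = asyncio.new_event_loop()
    agen = token_stream(prompt)
    try:
        while True:
            try:
                # Run the loop just long enough to produce the next item.
                item = loop.run_until_complete(agen.__anext__())
            except StopAsyncIteration:
                break
            yield item
    finally:
        # Also reached when the consumer stops iterating early.
        loop.run_until_complete(agen.aclose())
        loop.close()


if __name__ == "__main__":
    for token in stream_tokens("streaming works one token at a time"):
        print(token)
```

The caller receives each item as soon as it is produced, while the async code only makes progress when the next item is requested.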

@minesworld

minesworld commented Nov 5, 2024

> I wish. The .NET side is simple enough, the problem is that the Python coroutines need an event loop and there isn't a C-API for that.
>
> We could wrap the asyncio library event loop in Python but the internals of it aren't easy to access.

Isn't such a C-API only needed if both async IO "engines" would be mixed together? For those who might need that, it should be possible to use Pipes, sockets or whatever both sides can use from the operating system...

For the "rest of us", running C# async Tasks and the CPython async loop independently of each other should work and be sufficient.

Wrapping like #307.

We would "only" need callbacks from CPython to C#...

Besides that: having a class/module which would provide a virtual socket between C# and CPython would be nice to have.
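The "independent loops plus callbacks" idea could be sketched on the Python side like this: an asyncio loop runs in its own thread, and a plain callable is invoked when a coroutine finishes. The class name `BackgroundLoop` and its methods are made up for illustration, and a Python callable stands in for the callback into C#, which is the part that would still have to exist.

```python
import asyncio
import threading


class BackgroundLoop:
    """Runs an asyncio event loop in a dedicated thread (sketch)."""

    def __init__(self):
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(
            target=self._loop.run_forever, daemon=True
        )
        self._thread.start()

    def submit(self, coro, on_done):
        """Schedule `coro` on the loop; call `on_done(result)` when done.

        Returns a concurrent.futures.Future that the caller may also
        block on. The callback is invoked on the loop thread.
        """
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        future.add_done_callback(lambda f: on_done(f.result()))
        return future

    def stop(self):
        """Stop the loop, wait for its thread and release the loop."""
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def slow_double(x: int) -> int:
    """Stand-in for real async work (hypothetical)."""
    await asyncio.sleep(0.01)
    return x * 2


if __name__ == "__main__":
    bg = BackgroundLoop()
    fut = bg.submit(slow_double(21), lambda r: print("callback got", r))
    fut.result(timeout=5)
    bg.stop()
```

Because the callback runs on the loop thread, anything it touches on the host side has to be safe to call from that thread.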

@AshD
Author

AshD commented Nov 5, 2024

I published a preliminary project for HF Diffusers and Transformers integration. There are two library projects for them, each with its own Python env. On their own they work fine!
https://github.com/AshD/CSnakesIntegrations

I am having trouble getting two Python envs when using them together, because the environment is registered as a singleton.
https://github.com/AshD/CSnakesIntegrations/blob/main/CSnakesIntegrations/Program.cs

@tonybaloney
Owner

@AshD check out the redistributable locator, we created it for this purpose https://tonybaloney.github.io/CSnakes/reference/#redistributable-locator

@AshD
Author

AshD commented Jan 28, 2025

How do I support multiple Python envs using the nuget package? I want a different env for Diffusers, Transformers, etc.

The nuget package is the easiest way to install Python for Windows apps.


4 participants