Hugging Face Inference API Support #18

Open

PSchmiedmayer opened this issue Jul 18, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@PSchmiedmayer
Member

Problem

Similar to our support for OpenAI models, it would be great if SpeziML could support a wider variety of models, such as the ones hosted on Hugging Face.

Solution

Integrating a light wrapper around the hosted Hugging Face Inference API would be a great way to interact with, e.g., the different LLMs available on Hugging Face.

More details about the Hugging Face API can be found here: https://huggingface.co/docs/api-inference/index

To abstract different API types in an LLM-agnostic manner, we should add a general protocol structure that both the Hugging Face and the OpenAI modules conform to. This would allow developers to easily transition between different providers.
The Swift protocol should ideally accommodate different model types, starting with typical LLM interactions, as this is the main use case of our OpenAI component at this point.
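
A minimal sketch of what such a shared protocol could look like (all type and method names here are hypothetical, not an existing SpeziML API):

import Foundation

/// Hypothetical provider-agnostic abstraction; names are illustrative only.
public protocol LLMProvider {
    /// Sends a prompt to the underlying model and returns the generated text.
    func complete(prompt: String) async throws -> String
}

/// The OpenAI and Hugging Face modules would each conform to the protocol,
/// so application code can swap providers without changing call sites:
func generateSummary(using provider: any LLMProvider, of text: String) async throws -> String {
    try await provider.complete(prompt: "Summarize: \(text)")
}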

Additional context

Initial focus should probably be on the Text Generation task: https://huggingface.co/docs/api-inference/detailed_parameters#text-generation-task. We should also investigate whether there are already any suitable Swift packages that abstract the Hugging Face API.
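
For reference, a minimal sketch of a Text Generation call against the hosted Inference API in Swift, assuming the endpoint and response shape from the linked documentation (the wrapper type is hypothetical):

import Foundation

/// Sketch of a Text Generation task request; error handling and generation
/// parameters are intentionally omitted.
struct HuggingFaceTextGeneration {
    let model: String     // e.g. "gpt2"
    let apiToken: String

    private struct GeneratedText: Decodable {
        let generated_text: String
    }

    func generate(_ inputs: String) async throws -> String {
        var request = URLRequest(url: URL(string: "https://api-inference.huggingface.co/models/\(model)")!)
        request.httpMethod = "POST"
        request.setValue("Bearer \(apiToken)", forHTTPHeaderField: "Authorization")
        request.setValue("application/json", forHTTPHeaderField: "Content-Type")
        request.httpBody = try JSONSerialization.data(withJSONObject: ["inputs": inputs])

        let (data, _) = try await URLSession.shared.data(for: request)
        // The Text Generation task returns an array of candidate completions.
        return try JSONDecoder().decode([GeneratedText].self, from: data).first?.generated_text ?? ""
    }
}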

Code of Conduct

  • I agree to follow this project's Code of Conduct and Contributing Guidelines
@ishaan-jaff

Hi @PSchmiedmayer @philippzagar, I'm the maintainer of LiteLLM (an abstraction layer to call 100+ LLMs). We allow you to create a proxy server that exposes all of them behind a single API, and I think it can solve your problem (I'd love your feedback if it does not).

Try it here: https://docs.litellm.ai/docs/proxy_server
https://github.com/BerriAI/litellm

Using LiteLLM Proxy Server

import openai

openai.api_base = "http://0.0.0.0:8000"  # point the OpenAI SDK at the local LiteLLM proxy
print(openai.ChatCompletion.create(model="test", messages=[{"role": "user", "content": "Hey!"}]))

Creating a proxy server

Ollama models

$ litellm --model ollama/llama2 --api_base http://localhost:11434

Hugging Face Models

$ export HUGGINGFACE_API_KEY=my-api-key # [OPTIONAL]
$ litellm --model huggingface/bigcode/starcoder # model name is illustrative; any Inference API model works

Anthropic

$ export ANTHROPIC_API_KEY=my-api-key
$ litellm --model claude-instant-1

PaLM

$ export PALM_API_KEY=my-palm-key
$ litellm --model palm/chat-bison

@philippzagar removed their assignment Mar 7, 2024
Projects
Status: Backlog

3 participants