Using LiteLLM in Gradio App? #631
Replies: 2 comments 10 replies
-
Hey @dcruiz01 what's the issue you're facing?
-
import litellm

def inference(message, history):
    try:
        # Flatten the Gradio history (a list of [user, assistant] pairs) into one prompt string
        flattened_history = [item for sublist in history for item in sublist]
        full_message = " ".join(flattened_history + [message])
        messages = [{"role": "user", "content": full_message}]

        partial_message = ""
        for chunk in litellm.completion(
                model="<my-model>",  # replace with your model name
                messages=messages,
                max_tokens=512,
                temperature=0.7,
                top_p=0.9,
                repetition_penalty=1.18,
                stream=True):
            # Each streamed chunk follows the OpenAI delta format; the content field may be None
            delta = chunk.choices[0].delta.content or ""
            partial_message += delta
            yield partial_message
    except Exception as e:
        # Print the exception to the console for debugging
        print("Exception encountered:", str(e))
        # Optionally, you can yield a message to the user
        yield "An error occurred, please 'Clear' the error and try your question again"
Answer selected by ZQ-Dev8
-
Hello! Thank you for the awesome library. I just stumbled upon LiteLLM and want to integrate it with existing projects to remove the overhead of dealing with prompt templates. I'm attempting to integrate LiteLLM with an existing Gradio app but am running into issues. Is there an existing example out there for reference?
Specifically, I am seeking to modify the inference() function you typically find in Gradio chatbot demos to include LiteLLM prompt formatting, e.g. I want to modify the for loop below to stream the response from LiteLLM, rather than the typical Gradio way:
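For reference, the "typical Gradio way" usually looks something like the sketch below: a hand-built prompt template streamed through a provider-specific client. This is an illustrative example only; the huggingface_hub client, model name, and prompt format are assumptions, not code from the original post.

from huggingface_hub import InferenceClient

client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")  # example endpoint, assumption

def inference(message, history):
    # Hand-rolled prompt template: exactly the overhead LiteLLM is meant to remove
    prompt = ""
    for user_turn, bot_turn in history:
        prompt += f"<|user|>\n{user_turn}\n<|assistant|>\n{bot_turn}\n"
    prompt += f"<|user|>\n{message}\n<|assistant|>\n"

    partial_message = ""
    # Stream raw tokens from the endpoint and yield the growing message back to Gradio
    for token in client.text_generation(prompt, max_new_tokens=512, stream=True):
        partial_message += token
        yield partial_message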