
Using LiteLLM in Gradio App? #631

Answered by krrishdholakia
ZQ-Dev8 asked this question in Q&A
def inference(message, history):
    try:
        # history arrives as [[user, assistant], ...]; flatten it into one prompt string
        flattened_history = [item for sublist in history for item in sublist]
        full_message = " ".join(flattened_history + [message])
        messages = [{"role": "user", "content": full_message}]
        partial_message = ""
        for chunk in litellm.completion(
            model=<my-model>,
            messages=messages,
            max_tokens=512,
            temperature=0.7,
            top_p=0.9,
            repetition_penalty=1.18,
            stream=True):
            # each streamed chunk is a response object, not a string;
            # the new text lives in the delta and may be None
            partial_message += chunk.choices[0].delta.content or ""
            yield partial_message
    except Exception as e:
        # Print the exception to the console for debugging
        print("Exception encountered:", str(e))
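The non-obvious part of the snippet above is the streaming loop: each chunk from `litellm.completion(..., stream=True)` is a response object whose new text sits at `chunk.choices[0].delta.content` and can be `None` between tokens. Here is a minimal sketch of that accumulation logic, using stand-in dicts in place of live API chunks so it runs without a key (the dict shape mirrors the streamed response, but the helper name is illustrative, not part of litellm):

```python
def accumulate_deltas(chunks):
    """Yield the running concatenation of delta contents, skipping None deltas."""
    partial = ""
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"].get("content") or ""
        partial += delta
        yield partial

# stand-in chunks mimicking the streamed response shape
fake_stream = [
    {"choices": [{"delta": {"content": "Hel"}}]},
    {"choices": [{"delta": {"content": None}}]},  # empty keep-alive delta
    {"choices": [{"delta": {"content": "lo"}}]},
]

print(list(accumulate_deltas(fake_stream)))  # → ['Hel', 'Hel', 'Hello']
```

Because `inference` yields the growing `partial_message` string, wiring it into Gradio is just `gr.ChatInterface(inference).launch()` — `ChatInterface` accepts a generator function and re-renders the reply on each yield.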


Answer selected by ZQ-Dev8