Replies: 1 comment
-
hey @harpomaxx the ollama server handles prompt templating when calling their chat endpoint (based on my understanding). You can call it via that endpoint; see the sketch below.
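A sketch of what that call could look like, assuming the `ollama_chat/` model prefix (which, as far as I know, routes requests to Ollama's `/api/chat` endpoint instead of `/api/generate`) and a placeholder model name:

```python
# Sketch: route the request through Ollama's chat endpoint so that Ollama
# applies the model's own prompt template (model name is a placeholder).
import litellm

response = litellm.completion(
    model="ollama_chat/mistral",        # "ollama_chat/" targets Ollama's /api/chat
    messages=[{"role": "user", "content": "Hello, how are you?"}],
    api_base="http://localhost:11434",  # default Ollama address
)
print(response.choices[0].message.content)
```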
-
I have encountered the following issue when using the litellm proxy with ollama models. According to the logs obtained with the `--debug` parameter, the `### User` and `### Response` prompt template markers are incorrectly sent to the ollama server regardless of the model specified.

From my understanding, the ollama server should manage prompt templates itself, forwarding the correct format directly to the model without requiring any modification from the `litellm` proxy.

I'm not sure if I'm misunderstanding the setup or if there is an error in how litellm handles the prompts. Here is the code to reproduce the issue:
First I start the litellm proxy like this:
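Roughly, the command looks like this (the model name is a placeholder): `litellm --model ollama/mistral --debug`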
Then I use the following script to test the model:
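A minimal sketch of such a test script, assuming the proxy listens on port 8000 and the OpenAI Python client is pointed at it (both the port and the model name are assumptions):

```python
# Minimal test against the LiteLLM proxy; port and model name are assumptions.
from openai import OpenAI

# The proxy exposes an OpenAI-compatible API, so the standard client works.
client = OpenAI(base_url="http://localhost:8000", api_key="anything")

response = client.chat.completions.create(
    model="ollama/mistral",  # placeholder; use the model the proxy was started with
    messages=[{"role": "user", "content": "Hello, how are you?"}],
)
print(response.choices[0].message.content)
```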
Here are the relevant logs indicating the issue:
Any insight on whether this behavior is intended, or confirmation that it is a configuration error on my side, would be highly appreciated.