
Ollama llama3.x models do not work with LangChain chat/tool integration #12780

Open
tkarna opened this issue Feb 6, 2025 · 3 comments

tkarna commented Feb 6, 2025

Llama3.x models run through ipex-llm's ollama do not work with LangChain chat/tool integration.

Here's a minimal example of a chat model with tools:

# pip install langchain langchain-ollama
# ollama pull llama3.2:3b-instruct-q4_K_M
from langchain_core.tools import tool
from langchain_ollama.chat_models import ChatOllama


@tool
def get_weather(location: str):
    """Call to get the current weather."""
    if location.lower() in ["sf", "san francisco"]:
        return "It's 60 degrees and foggy."
    else:
        return "It's 90 degrees and sunny."


model = ChatOllama(
    model= "llama3.2:3b-instruct-q4_K_M",
    num_predict=50,  # limit number of tokens to stop hallucination
)

tools = [get_weather]
model_with_tools = model.bind_tools(tools)

res = model_with_tools.invoke("what's the weather in sf?")
res.pretty_print()
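
Beyond pretty-printing, the result can be checked programmatically; a minimal sketch, assuming the standard tool_calls field on the AIMessage returned by langchain-core:

# A successful tool-calling response carries the call in res.tool_calls
# (a list of dicts with "name", "args" and "id"); an empty list together with
# free-form text in res.content means the model fell back to plain generation.
if res.tool_calls:
    call = res.tool_calls[0]
    print(call["name"], call["args"])  # expected: get_weather {'location': 'sf'}
else:
    print("no tool call; raw content:", res.content)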

This example runs correctly with standard ollama. Expected response with tool arguments:

================================== Ai Message ==================================
Tool Calls:
  get_weather (328879e5-247c-48d1-9013-39f0e1b65539)
 Call ID: 328879e5-247c-48d1-9013-39f0e1b65539
  Args:
    location: sf

Actual output with ipex-llm's ollama shows that the model just hallucinates:

================================== Ai Message ==================================

I hope that a new
$ has several times this would be =_._ _-level was an item 8/<< is not only to the best
To view= (or is a significant but also knows are still allow [or)

Tested with:

Ubuntu 22.04.5 LTS
oneapi/2025.0
python 3.10.0
ipex-llm 2.2.0b20250105, 2.2.0b20250123
langchain 0.3.17
langchain-ollama 0.2.3
GPU: Intel(R) Data Center GPU Max 1100
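
For completeness, the client above uses the default local endpoint; if the ipex-llm ollama server listens on a different host or port, the same test can be pointed at it explicitly (a sketch, assuming ChatOllama's base_url parameter; the address shown is just the default and may need adjusting):

# Hypothetical explicit endpoint; replace with wherever the ipex-llm ollama server listens.
model = ChatOllama(
    model="llama3.2:3b-instruct-q4_K_M",
    base_url="http://localhost:11434",  # default ollama address, spelled out here
    num_predict=50,
)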


tkarna commented Feb 6, 2025

The ollama server is, however, able to generate structured JSON output. This command

curl -X POST http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
  "model": "llama3.2:1b-instruct-q4_K_M",
  "messages": [{"role": "user", "content": "Tell me about Canada."}],
  "stream": false,
  "format": {
    "type": "object",
    "properties": {
      "name": {
        "type": "string"
      },
      "capital": {
        "type": "string"
      },
      "languages": {
        "type": "array",
        "items": {
          "type": "string"
        }
      }
    },
    "required": [
      "name",
      "capital",
      "languages"
    ]
  }
}'

produces a correct result:

{
  "model":"llama3.2:1b-instruct-q4_K_M",
  "created_at":"2025-02-06T13:21:51.954049417Z",
  "message":{
    "role":"assistant",
    "content":"{ \"capital\": \"Ottawa\", \"languages\": [\"English\", \"French\"], \"name\": \"Canada\" }"
  },
  "done_reason":"stop",
  "done":true,
  "total_duration":3687525616,
  "load_duration":2630099684,
  "prompt_eval_count":30,
  "prompt_eval_duration":580000000,
  "eval_count":33,
  "eval_duration":473000000
}
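
The same request can be scripted from Python; a minimal sketch using requests that mirrors the curl call above (the model name and schema are taken from that command):

# pip install requests
import requests

payload = {
    "model": "llama3.2:1b-instruct-q4_K_M",
    "messages": [{"role": "user", "content": "Tell me about Canada."}],
    "stream": False,
    "format": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "capital": {"type": "string"},
            "languages": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["name", "capital", "languages"],
    },
}

resp = requests.post("http://localhost:11434/api/chat", json=payload)
print(resp.json()["message"]["content"])  # should print the JSON object shown above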


sgwhat commented Feb 7, 2025

Hi @tkarna, may I ask what "standard ollama" refers to? Is it "ollama run" or the community version of ollama?


tkarna commented Feb 7, 2025

> Hi @tkarna, may I ask what "standard ollama" refers to? Is it "ollama run" or the community version of ollama?

Community ollama running on CPU. I have also compiled ollama 3.13 with Intel GPU support. The above test case works with both of these.
