Add providers refactor #56

Status: Merged (33 commits, May 12, 2024)

Changes from all commits:
- bf85787: Prepare additional providers and refactor. (Maximilian-Winter, May 8, 2024)
- ef86537: Refactor structured output settings. (Maximilian-Winter, May 8, 2024)
- 733f15f: Further Refactor and Reformat (Maximilian-Winter, May 9, 2024)
- 24339a8: Finished provider implementation (Maximilian-Winter, May 9, 2024)
- f43a767: Split providers (Maximilian-Winter, May 9, 2024)
- 159db2b: Refactor agent, include new providers (Maximilian-Winter, May 9, 2024)
- c714c36: Fixed Function Calling Agent (Maximilian-Winter, May 9, 2024)
- c2cd14e: Fixed more examples. (Maximilian-Winter, May 9, 2024)
- 40a387e: Update chain.py (Maximilian-Winter, May 9, 2024)
- 4257b9d: Update function_calling_agent.py (Maximilian-Winter, May 10, 2024)
- 3eb3a73: Fix function calling (Maximilian-Winter, May 10, 2024)
- 18237e7: Fixed parsing of objects. Performance improvements JSON schema (Maximilian-Winter, May 10, 2024)
- e2241d2: Update book_dataset_creation.py (Maximilian-Winter, May 10, 2024)
- 5e9b3b7: Update dataframe_creation.py (Maximilian-Winter, May 10, 2024)
- ce4ae3f: Update Examples. (Maximilian-Winter, May 10, 2024)
- a375dee: Create chatbot_using_llama_cpp_python_server.py (Maximilian-Winter, May 10, 2024)
- 4eec900: Further Example Update (Maximilian-Winter, May 10, 2024)
- 25b9a25: Update ReadMe.md (Maximilian-Winter, May 10, 2024)
- 72c024b: Moved Agents (Maximilian-Winter, May 10, 2024)
- 9fb57bd: Further Refactor (Maximilian-Winter, May 10, 2024)
- 831f33e: Added get started guide (Maximilian-Winter, May 10, 2024)
- 78bfb1e: Updated examples (Maximilian-Winter, May 10, 2024)
- 641d5ba: Finished get started (Maximilian-Winter, May 10, 2024)
- ae55d9a: Update get-started.md (Maximilian-Winter, May 10, 2024)
- 35b25dd: Delete knowledge_graph.png (Maximilian-Winter, May 11, 2024)
- 1ed28a1: Further refactoring of agent, use of new ChatHistory class, reformat … (Maximilian-Winter, May 11, 2024)
- 4f3d53e: Extended test script. (Maximilian-Winter, May 11, 2024)
- f7ce858: Prepared last things for merging with master (Maximilian-Winter, May 12, 2024)
- 19f97cf: Update get-started.md (Maximilian-Winter, May 12, 2024)
- bbf6f7c: Update use_llama_index_query_engine_as_tool.py (Maximilian-Winter, May 12, 2024)
- 2730387: Corrected all imports and remove grammar and spelling mistakes. (Maximilian-Winter, May 12, 2024)
- 71924be: Update get-started.md (Maximilian-Winter, May 12, 2024)
- 2449dc7: Updated docs (Maximilian-Winter, May 12, 2024)
ReadMe.md (37 changes: 7 additions & 30 deletions)
````diff
@@ -26,15 +26,19 @@
 - [FAQ](#faq)
 
 ## Introduction
-The llama-cpp-agent framework is a tool designed to simplify interactions with Large Language Models (LLMs). It provides an interface for chatting with LLMs, executing function calls, generating structured output, performing retrieval augmented generation, and processing text using agentic chains with tools. The framework integrates seamlessly with the llama.cpp server, llama-cpp-python and OpenAI endpoints that support grammar, offering flexibility and extensibility.
+The llama-cpp-agent framework is a tool designed to simplify interactions with Large Language Models (LLMs). It provides an interface for chatting with LLMs, executing function calls, generating structured output, performing retrieval augmented generation, and processing text using agentic chains with tools.
+
+The framework uses guided sampling to constrain the model output to the user-defined structures. This way, even models that are not fine-tuned for function calling and JSON output are able to do it.
+
+The framework is compatible with the llama.cpp server, llama-cpp-python and its server, and with TGI and vllm servers.
 
 ## Key Features
 - **Simple Chat Interface**: Engage in seamless conversations with LLMs.
 - **Structured Output**: Generate structured output (objects) from LLMs.
 - **Single and Parallel Function Calling**: Execute functions using LLMs.
 - **RAG - Retrieval Augmented Generation**: Perform retrieval augmented generation with colbert reranking.
 - **Agent Chains**: Process text using agent chains with tools, supporting Conversational, Sequential, and Mapping Chains.
-- **Compatibility**: Works with llama-index tools and OpenAI tool schemas.
+- **Compatibility**: Works with python functions, pydantic tools, llama-index tools, and OpenAI tool schemas.
 - **Flexibility**: Suitable for various applications, from casual chatting to specific function executions.
 
 ## Installation
@@ -46,34 +50,7 @@ pip install llama-cpp-agent
 You can find the latest documentation [here!](https://llama-cpp-agent.readthedocs.io/en/latest/)
 
 ## Getting Started
-1. Ensure you have the required dependencies installed, including pydantic and llama-cpp-python.
-2. Import the necessary classes and functions from the llama-cpp-agent framework.
-3. Set up your LLM provider (e.g., llama-cpp-python Llama class, OpenAI endpoint with grammar support like llama-cpp-python server, or llama.cpp server).
-4. Create an instance of the desired agent class (e.g., LlamaCppAgent, FunctionCallingAgent, StructuredOutputAgent).
-5. Interact with the agent using the provided methods and examples.
-
-Here's a basic example of using the LlamaCppAgent for a simple chat:
-
-```python
-from llama_cpp_agent.llm_agent import LlamaCppAgent
-from llama_cpp_agent.messages_formatter import MessagesFormatterType
-from llama_cpp_agent.providers.llama_cpp_endpoint_provider import LlamaCppEndpointSettings
-
-main_model = LlamaCppEndpointSettings(
-    completions_endpoint_url="http://127.0.0.1:8080/completion"
-)
-
-llama_cpp_agent = LlamaCppAgent(main_model, debug_output=False,
-                                system_prompt="You are Dolphin, an uncensored and unbiased AI assistant.",
-                                predefined_messages_formatter_type=MessagesFormatterType.CHATML)
-
-while True:
-    user_input = input("User: ")
-    if user_input == "exit":
-        break
-    user_input = llama_cpp_agent.get_chat_response(user_input, temperature=0.7)
-    print("AI: " + user_input)
-```
+You can find the get started guide [here!](https://llama-cpp-agent.readthedocs.io/en/latest/)
 
 ## Discord Community
 Join the Discord Community [here](https://discord.gg/6tGznupZGX)
````
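For reference, a minimal chat setup against the refactored provider API might look like the sketch below. This is not code from the diff: the `TGIServerProvider` import and constructor are taken from the new function-calling example in this PR, while the `LlamaCppAgent` import path, constructor keywords, and `get_chat_response` call are assumed to carry over from the pre-refactor example removed above.

```python
# Sketch only: a provider from this PR combined with the pre-refactor
# LlamaCppAgent constructor, whose keyword arguments are assumed unchanged.
from llama_cpp_agent.llm_agent import LlamaCppAgent
from llama_cpp_agent import MessagesFormatterType
from llama_cpp_agent.providers import TGIServerProvider

# Any of the supported providers (llama.cpp server, llama-cpp-python,
# TGI, vllm) should be usable here; TGI is what the PR's example uses.
model = TGIServerProvider("http://localhost:8080")

agent = LlamaCppAgent(
    model,
    system_prompt="You are a helpful assistant.",
    predefined_messages_formatter_type=MessagesFormatterType.CHATML,
)

while True:
    user_input = input("User: ")
    if user_input == "exit":
        break
    print("AI: " + agent.get_chat_response(user_input))
```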
docs/agents-api-reference.md (6 changes: 5 additions & 1 deletion)
````diff
@@ -1,11 +1,15 @@
 ---
-title: API Reference
+title: Agents Reference
 ---
 
 ## Agents
 
 ::: llama_cpp_agent.llm_agent
 
+## Structured Output Settings
+
+::: llama_cpp_agent.llm_output_settings.settings
+
 ### Function Calling Agent
 
 ::: llama_cpp_agent.function_calling_agent
````
docs/function-calling-agent.md (168 changes: 58 additions & 110 deletions)
````diff
@@ -3,43 +3,31 @@ This example shows how to use the FunctionCallingAgent for function calling with
 
 ```python
 # Example that uses the FunctionCallingAgent class to create a function calling agent.
-import json
+import datetime
 from enum import Enum
-from typing import Union, Any
+from typing import Union, Optional
 
 from pydantic import BaseModel, Field
 
-from llama_cpp_agent.llm_settings import LlamaLLMSettings, LlamaLLMGenerationSettings
-from llama_cpp_agent.function_calling_agent import FunctionCallingAgent
+from llama_cpp_agent import LlamaCppFunctionTool
+from llama_cpp_agent import FunctionCallingAgent
+from llama_cpp_agent import MessagesFormatterType
+from llama_cpp_agent.providers import TGIServerProvider
+
+model = TGIServerProvider("http://localhost:8080")
 
 
 # llama-cpp-agent supports type hinted function definitions for function calling.
-# Write to file function that can be used by the agent. Docstring will be used in system prompt.
-def write_to_file(chain_of_thought: str, file_path: str, file_content: str):
-    """
-    Write file to the user filesystem.
-    :param chain_of_thought: Your chain of thought while writing the file.
-    :param file_path: The file path includes the filename and file ending.
-    :param file_content: The actual content to write.
-    """
-    print(chain_of_thought)
-    with open(file_path, mode="w", encoding="utf-8") as file:
-        file.write(file_content)
-    return f"File {file_path} successfully written."
-
-
-# Read file function that can be used by the agent. Docstring will be used in system prompt.
-def read_file(file_path: str):
-    """
-    Read file from the user filesystem.
-    :param file_path: The file path includes the filename and file ending.
-    :return: File content.
-    """
-    output = ""
-    with open(file_path, mode="r", encoding="utf-8") as file:
-        output = file.read()
-    return f"Content of file '{file_path}':\n\n{output}"
+# Simple tool for the agent, to get the current date and time in a specific format.
+def get_current_datetime(output_format: Optional[str] = None):
+    """
+    Get the current date and time in the given format.
+
+    Args:
+        output_format: formatting string for the date and time, defaults to '%Y-%m-%d %H:%M:%S'
+    """
+    if output_format is None:
+        output_format = '%Y-%m-%d %H:%M:%S'
+    return datetime.datetime.now().strftime(output_format)
 
 
 # Enum for the calculator tool.
@@ -50,15 +38,14 @@ class MathOperation(Enum):
     DIVIDE = "divide"
 
 
-# llama-cpp-agent also supports "Instructor" library like function definitions as Pydantic models for function calling.
 # Simple pydantic calculator tool for the agent that can add, subtract, multiply, and divide. Docstring and description of fields will be used in system prompt.
-class Calculator(BaseModel):
+class calculator(BaseModel):
     """
     Perform a math operation on two numbers.
     """
-    number_one: Any = Field(..., description="First number.")
+    number_one: Union[int, float] = Field(..., description="First number.")
     operation: MathOperation = Field(..., description="Math operation to perform.")
-    number_two: Any = Field(..., description="Second number.")
+    number_two: Union[int, float] = Field(..., description="Second number.")
 
     def run(self):
         if self.operation == MathOperation.ADD:
@@ -74,73 +61,64 @@ class Calculator(BaseModel):
 
 
 # Example function based on an OpenAI example.
-# llama-cpp-agent also supports OpenAI like dictionaries for function definition.
+# llama-cpp-agent supports OpenAI like schemas for function definition.
 def get_current_weather(location, unit):
     """Get the current weather in a given location"""
     if "London" in location:
-        return json.dumps({"location": "London", "temperature": "42", "unit": unit.value})
+        return f"Weather in {location}: {22}° {unit.value}"
     elif "New York" in location:
-        return json.dumps({"location": "New York", "temperature": "24", "unit": unit.value})
+        return f"Weather in {location}: {24}° {unit.value}"
     elif "North Pole" in location:
-        return json.dumps({"location": "North Pole", "temperature": "-42", "unit": unit.value})
+        return f"Weather in {location}: {-42}° {unit.value}"
     else:
-        return json.dumps({"location": location, "temperature": "unknown"})
+        return f"Weather in {location}: unknown"
 
 
 # Here is a function definition in OpenAI style
-tools = [
-    {
-        "type": "function",
-        "function": {
-            "name": "get_current_weather",
-            "description": "Get the current weather in a given location",
-            "parameters": {
-                "type": "object",
-                "properties": {
-                    "location": {
-                        "type": "string",
-                        "description": "The city and state, e.g. San Francisco, CA",
-                    },
-                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
-                },
-                "required": ["location"],
-            },
-        },
-    }
-]
-# To make the OpenAI function callable for the function calling agent we need a list with actual function in it:
-tool_functions = [get_current_weather]
+open_ai_tool_spec = {
+    "type": "function",
+    "function": {
+        "name": "get_current_weather",
+        "description": "Get the current weather in a given location",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "location": {
+                    "type": "string",
+                    "description": "The city and state, e.g. San Francisco, CA",
+                },
+                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
+            },
+            "required": ["location", "unit"],
+        },
+    },
+}
 
 
 # Callback for receiving messages for the user.
 def send_message_to_user_callback(message: str):
     print(message)
 
-generation_settings = LlamaLLMGenerationSettings(temperature=0.65, top_p=0.5, tfs_z=0.975)
-
-# Can be saved and loaded like that:
-# generation_settings.save("generation_settings.json")
-# generation_settings = LlamaLLMGenerationSettings.load_from_file("generation_settings.json")
-
+
+# First we create the calculator tool.
+calculator_function_tool = LlamaCppFunctionTool(calculator)
+
+# Next we create the current datetime tool.
+current_datetime_function_tool = LlamaCppFunctionTool(get_current_datetime)
+
+# The from_openai_tool function of the LlamaCppFunctionTool class converts an OpenAI tool schema and a callable function into a LlamaCppFunctionTool.
+get_weather_function_tool = LlamaCppFunctionTool.from_openai_tool(open_ai_tool_spec, get_current_weather)
+
+# Create the function calling agent. We are passing the provider, the tool list, the send-message-to-user callback and the chat message formatter. Also, we allow parallel function calling.
 function_call_agent = FunctionCallingAgent(
-    # Can be lama-cpp-python Llama class, llama_cpp_agent.llm_settings.LlamaLLMSettings class or llama_cpp_agent.providers.llama_cpp_server_provider.LlamaCppServerLLMSettings.
-    LlamaLLMSettings.load_from_file("openhermes-2.5-mistral-7b.Q8_0.json"),
-    # llama_cpp_agent.llm_settings.LlamaLLMGenerationSettings class or llama_cpp_agent.providers.llama_cpp_server_provider.LlamaCppServerGenerationSettings.
-    llama_generation_settings=generation_settings,
-    # A tuple of the OpenAI style function definitions and the actual functions
-    open_ai_functions=(tools, tool_functions),
-    # Just a list of type hinted functions for normal Python functions
-    python_functions=[write_to_file, read_file],
-    # Just a list of pydantic types
-    pydantic_functions=[Calculator],
-    # Callback for receiving messages for the user.
-    send_message_to_user_callback=send_message_to_user_callback, debug_output=True)
-
-while True:
-    user_input = input(">")
-    function_call_agent.generate_response(user_input)
-    function_call_agent.save("function_calling_agent.json")
+    model,
+    llama_cpp_function_tools=[calculator_function_tool, current_datetime_function_tool, get_weather_function_tool],
+    send_message_to_user_callback=send_message_to_user_callback,
+    allow_parallel_function_calling=True,
+    messages_formatter_type=MessagesFormatterType.CHATML)
+
+user_input = '''Get the date and time in '%d-%m-%Y %H:%M' format. Get the current weather in celsius in London, New York and at the North Pole. Solve the following calculations: 42 * 42, 74 + 26, 7 * 26, 4 + 6 and 96/8.'''
+function_call_agent.generate_response(user_input)
 ```
````
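One note on the pydantic tool above: because sampling is constrained to the tool's JSON schema, the agent can always parse the model output back into the `calculator` model and execute it. The framework handles this internally; the hand-written illustration below (with a hypothetical parsed-arguments dict, not framework internals) only shows the parse-and-run step conceptually:

```python
# Illustration only: what conceptually happens to a parsed tool call.
# `calculator` is the pydantic tool defined in the example above; the
# argument dict stands in for hypothetical constrained model output.
parsed_arguments = {"number_one": 42, "operation": "multiply", "number_two": 42}

tool_call = calculator(**parsed_arguments)  # pydantic validates fields and the enum
print(tool_call.run())                      # -> 1764
```

Pydantic coerces the string "multiply" into the `MathOperation` enum during validation, which is why the schema-constrained JSON can round-trip into a typed, executable call.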
````diff
@@ -150,44 +128,14 @@
 Example Input 1
 ```text
 What is 42 * 42?
 ```
 Example output 1
-```json
-
-{
-  "function": "calculator",
-  "function-parameters": {
-    "number_one": 42,
-    "operation": "multiply",
-    "number_two": 42
-  }
-}
-{
-  "function": "send-message-to-user",
-  "function-parameters": {
-    "message": "Function Call Result: 1764"
-  }
-}
-Function Call Result: 1764
-```
+```text
+The result of 42 * 42 is 1764.
+```
 Example Input 2
 ```text
 What is the current weather in London celsius?
 ```
 Example output 2
-```json
-
-{
-  "function": "get-current-weather",
-  "function-parameters": {
-    "location": "London",
-    "unit": "celsius"
-  }
-}
-{
-  "function": "send-message-to-user",
-  "function-parameters": {
-    "message": "The current temperature in London is 42 degrees Celsius."
-  }
-}
-
-The current temperature in London is 42 degrees Celsius.
+```text
+The current weather in London is 22° celsius.
+```
````
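The plain-text model outputs above are possible because generation runs under guided sampling, as the updated ReadMe describes. Independent of llama-cpp-agent, the underlying idea can be sketched with llama-cpp-python's grammar support. This is a minimal, assumed setup: the model path is a placeholder, and the GBNF grammar is illustrative rather than anything the framework actually generates.

```python
# Minimal guided-sampling sketch with llama-cpp-python (not llama-cpp-agent code).
# The GBNF grammar restricts sampling so the completion can only be digits.
from llama_cpp import Llama, LlamaGrammar

grammar = LlamaGrammar.from_string("root ::= [0-9] [0-9]*")

llm = Llama(model_path="path/to/model.gguf")  # placeholder model path

result = llm("Q: What is 42 * 42?\nA: ", grammar=grammar, max_tokens=8)
print(result["choices"][0]["text"])  # only digit tokens could be sampled
```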