Add providers refactor #56

Status: Merged (33 commits, May 12, 2024)

Changes from all commits:
- bf85787: Prepare additional providers and refactor. (Maximilian-Winter, May 8, 2024)
- ef86537: Refactor structured output settings. (Maximilian-Winter, May 8, 2024)
- 733f15f: Further Refactor and Reformat (Maximilian-Winter, May 9, 2024)
- 24339a8: Finished provider implementation (Maximilian-Winter, May 9, 2024)
- f43a767: Split providers (Maximilian-Winter, May 9, 2024)
- 159db2b: Refactor agent, include new providers (Maximilian-Winter, May 9, 2024)
- c714c36: Fixed Function Calling Agent (Maximilian-Winter, May 9, 2024)
- c2cd14e: Fixed more examples. (Maximilian-Winter, May 9, 2024)
- 40a387e: Update chain.py (Maximilian-Winter, May 9, 2024)
- 4257b9d: Update function_calling_agent.py (Maximilian-Winter, May 10, 2024)
- 3eb3a73: Fix function calling (Maximilian-Winter, May 10, 2024)
- 18237e7: Fixed parsing of objects. Performance improvements JSON schema (Maximilian-Winter, May 10, 2024)
- e2241d2: Update book_dataset_creation.py (Maximilian-Winter, May 10, 2024)
- 5e9b3b7: Update dataframe_creation.py (Maximilian-Winter, May 10, 2024)
- ce4ae3f: Update Examples. (Maximilian-Winter, May 10, 2024)
- a375dee: Create chatbot_using_llama_cpp_python_server.py (Maximilian-Winter, May 10, 2024)
- 4eec900: Further Example Update (Maximilian-Winter, May 10, 2024)
- 25b9a25: Update ReadMe.md (Maximilian-Winter, May 10, 2024)
- 72c024b: Moved Agents (Maximilian-Winter, May 10, 2024)
- 9fb57bd: Further Refactor (Maximilian-Winter, May 10, 2024)
- 831f33e: Added get started guide (Maximilian-Winter, May 10, 2024)
- 78bfb1e: Updated examples (Maximilian-Winter, May 10, 2024)
- 641d5ba: Finished get started (Maximilian-Winter, May 10, 2024)
- ae55d9a: Update get-started.md (Maximilian-Winter, May 10, 2024)
- 35b25dd: Delete knowledge_graph.png (Maximilian-Winter, May 11, 2024)
- 1ed28a1: Further refactoring of agent, use of new ChatHistory class, reformat … (Maximilian-Winter, May 11, 2024)
- 4f3d53e: Extended test script. (Maximilian-Winter, May 11, 2024)
- f7ce858: Prepared last things for merging with master (Maximilian-Winter, May 12, 2024)
- 19f97cf: Update get-started.md (Maximilian-Winter, May 12, 2024)
- bbf6f7c: Update use_llama_index_query_engine_as_tool.py (Maximilian-Winter, May 12, 2024)
- 2730387: Corrected all imports and remove grammar and spelling mistakes. (Maximilian-Winter, May 12, 2024)
- 71924be: Update get-started.md (Maximilian-Winter, May 12, 2024)
- 2449dc7: Updated docs (Maximilian-Winter, May 12, 2024)
ReadMe.md (37 changes: 7 additions & 30 deletions)
````diff
@@ -26,15 +26,19 @@
 - [FAQ](#faq)
 
 ## Introduction
-The llama-cpp-agent framework is a tool designed to simplify interactions with Large Language Models (LLMs). It provides an interface for chatting with LLMs, executing function calls, generating structured output, performing retrieval augmented generation, and processing text using agentic chains with tools. The framework integrates seamlessly with the llama.cpp server, llama-cpp-python and OpenAI endpoints that support grammar, offering flexibility and extensibility.
+The llama-cpp-agent framework is a tool designed to simplify interactions with Large Language Models (LLMs). It provides an interface for chatting with LLMs, executing function calls, generating structured output, performing retrieval augmented generation, and processing text using agentic chains with tools.
+
+The framework uses guided sampling to constrain the model output to the user-defined structures. This way, even models that are not fine-tuned for function calling and JSON output are able to do it.
+
+The framework is compatible with the llama.cpp server, llama-cpp-python and its server, and with TGI and vllm servers.
 
 ## Key Features
 - **Simple Chat Interface**: Engage in seamless conversations with LLMs.
 - **Structured Output**: Generate structured output (objects) from LLMs.
 - **Single and Parallel Function Calling**: Execute functions using LLMs.
 - **RAG - Retrieval Augmented Generation**: Perform retrieval augmented generation with colbert reranking.
 - **Agent Chains**: Process text using agent chains with tools, supporting Conversational, Sequential, and Mapping Chains.
-- **Compatibility**: Works with llama-index tools and OpenAI tool schemas.
+- **Compatibility**: Works with python functions, pydantic tools, llama-index tools, and OpenAI tool schemas.
 - **Flexibility**: Suitable for various applications, from casual chatting to specific function executions.
 
 ## Installation
@@ -46,34 +50,7 @@ pip install llama-cpp-agent
 You can find the latest documentation [here!](https://llama-cpp-agent.readthedocs.io/en/latest/)
 
 ## Getting Started
-1. Ensure you have the required dependencies installed, including pydantic and llama-cpp-python.
-2. Import the necessary classes and functions from the llama-cpp-agent framework.
-3. Set up your LLM provider (e.g., llama-cpp-python Llama class, OpenAI endpoint with grammar support like llama-cpp-python server, or llama.cpp server).
-4. Create an instance of the desired agent class (e.g., LlamaCppAgent, FunctionCallingAgent, StructuredOutputAgent).
-5. Interact with the agent using the provided methods and examples.
-
-Here's a basic example of using the LlamaCppAgent for a simple chat:
-
-```python
-from llama_cpp_agent.llm_agent import LlamaCppAgent
-from llama_cpp_agent.messages_formatter import MessagesFormatterType
-from llama_cpp_agent.providers.llama_cpp_endpoint_provider import LlamaCppEndpointSettings
-
-main_model = LlamaCppEndpointSettings(
-    completions_endpoint_url="http://127.0.0.1:8080/completion"
-)
-
-llama_cpp_agent = LlamaCppAgent(main_model, debug_output=False,
-                                system_prompt="You are Dolphin, an uncensored and unbiased AI assistant.",
-                                predefined_messages_formatter_type=MessagesFormatterType.CHATML)
-
-while True:
-    user_input = input("User: ")
-    if user_input == "exit":
-        break
-    user_input = llama_cpp_agent.get_chat_response(user_input, temperature=0.7)
-    print("AI: " + user_input)
-```
+You can find the get started guide [here!](https://llama-cpp-agent.readthedocs.io/en/latest/)
 
 ## Discord Community
 Join the Discord Community [here](https://discord.gg/6tGznupZGX)
````
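For reference, a minimal chat setup against the refactored provider API might look like the sketch below. This is not code from the diff: the `TGIServerProvider` import and constructor are taken from the new function-calling example in this PR, while the `LlamaCppAgent` import path, constructor keywords, and `get_chat_response` call are assumed to carry over from the pre-refactor example removed above.

```python
# Sketch only: a provider from this PR combined with the pre-refactor
# LlamaCppAgent constructor, whose keyword arguments are assumed unchanged.
from llama_cpp_agent.llm_agent import LlamaCppAgent
from llama_cpp_agent import MessagesFormatterType
from llama_cpp_agent.providers import TGIServerProvider

# Any of the supported providers (llama.cpp server, llama-cpp-python,
# TGI, vllm) should be usable here; TGI is what the PR's example uses.
model = TGIServerProvider("http://localhost:8080")

agent = LlamaCppAgent(
    model,
    system_prompt="You are a helpful assistant.",
    predefined_messages_formatter_type=MessagesFormatterType.CHATML,
)

while True:
    user_input = input("User: ")
    if user_input == "exit":
        break
    print("AI: " + agent.get_chat_response(user_input))
```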
docs/agents-api-reference.md (6 changes: 5 additions & 1 deletion)
````diff
@@ -1,11 +1,15 @@
 ---
-title: API Reference
+title: Agents Reference
 ---
 
 ## Agents
 
 ::: llama_cpp_agent.llm_agent
 
+## Structured Output Settings
+
+::: llama_cpp_agent.llm_output_settings.settings
+
 ### Function Calling Agent
 
 ::: llama_cpp_agent.function_calling_agent
````
docs/function-calling-agent.md (168 changes: 58 additions & 110 deletions)
````diff
@@ -3,43 +3,31 @@ This example shows how to use the FunctionCallingAgent for function calling with
 
 ```python
 # Example that uses the FunctionCallingAgent class to create a function calling agent.
-import json
+import datetime
 from enum import Enum
-from typing import Union, Any
+from typing import Union, Optional
 
 from pydantic import BaseModel, Field
 
-from llama_cpp_agent.llm_settings import LlamaLLMSettings, LlamaLLMGenerationSettings
-from llama_cpp_agent.function_calling_agent import FunctionCallingAgent
+from llama_cpp_agent import LlamaCppFunctionTool
+from llama_cpp_agent import FunctionCallingAgent
+from llama_cpp_agent import MessagesFormatterType
+from llama_cpp_agent.providers import TGIServerProvider
+
+model = TGIServerProvider("http://localhost:8080")
 
 
 # llama-cpp-agent supports type hinted function definitions for function calling.
-# Write to file function that can be used by the agent. Docstring will be used in system prompt.
-def write_to_file(chain_of_thought: str, file_path: str, file_content: str):
-    """
-    Write file to the user filesystem.
-    :param chain_of_thought: Your chain of thought while writing the file.
-    :param file_path: The file path includes the filename and file ending.
-    :param file_content: The actual content to write.
-    """
-    print(chain_of_thought)
-    with open(file_path, mode="w", encoding="utf-8") as file:
-        file.write(file_content)
-    return f"File {file_path} successfully written."
-
-
-# Read file function that can be used by the agent. Docstring will be used in system prompt.
-def read_file(file_path: str):
-    """
-    Read file from the user filesystem.
-    :param file_path: The file path includes the filename and file ending.
-    :return: File content.
-    """
-    output = ""
-    with open(file_path, mode="r", encoding="utf-8") as file:
-        output = file.read()
-    return f"Content of file '{file_path}':\n\n{output}"
+# Simple tool for the agent, to get the current date and time in a specific format.
+def get_current_datetime(output_format: Optional[str] = None):
+    """
+    Get the current date and time in the given format.
+
+    Args:
+        output_format: formatting string for the date and time, defaults to '%Y-%m-%d %H:%M:%S'
+    """
+    if output_format is None:
+        output_format = '%Y-%m-%d %H:%M:%S'
+    return datetime.datetime.now().strftime(output_format)
 
 
 # Enum for the calculator tool.
@@ -50,15 +38,14 @@ class MathOperation(Enum):
     DIVIDE = "divide"
 
 
-# llama-cpp-agent also supports "Instructor" library like function definitions as Pydantic models for function calling.
 # Simple pydantic calculator tool for the agent that can add, subtract, multiply, and divide. Docstring and description of fields will be used in system prompt.
-class Calculator(BaseModel):
+class calculator(BaseModel):
     """
     Perform a math operation on two numbers.
     """
-    number_one: Any = Field(..., description="First number.")
+    number_one: Union[int, float] = Field(..., description="First number.")
     operation: MathOperation = Field(..., description="Math operation to perform.")
-    number_two: Any = Field(..., description="Second number.")
+    number_two: Union[int, float] = Field(..., description="Second number.")
 
     def run(self):
         if self.operation == MathOperation.ADD:
@@ -74,73 +61,64 @@ class Calculator(BaseModel):
 
 
 # Example function based on an OpenAI example.
-# llama-cpp-agent also supports OpenAI like dictionaries for function definition.
+# llama-cpp-agent supports OpenAI like schemas for function definition.
 def get_current_weather(location, unit):
     """Get the current weather in a given location"""
     if "London" in location:
-        return json.dumps({"location": "London", "temperature": "42", "unit": unit.value})
+        return f"Weather in {location}: {22}° {unit.value}"
     elif "New York" in location:
-        return json.dumps({"location": "New York", "temperature": "24", "unit": unit.value})
+        return f"Weather in {location}: {24}° {unit.value}"
     elif "North Pole" in location:
-        return json.dumps({"location": "North Pole", "temperature": "-42", "unit": unit.value})
+        return f"Weather in {location}: {-42}° {unit.value}"
     else:
-        return json.dumps({"location": location, "temperature": "unknown"})
+        return f"Weather in {location}: unknown"
 
 
 # Here is a function definition in OpenAI style
-tools = [
-    {
-        "type": "function",
-        "function": {
-            "name": "get_current_weather",
-            "description": "Get the current weather in a given location",
-            "parameters": {
-                "type": "object",
-                "properties": {
-                    "location": {
-                        "type": "string",
-                        "description": "The city and state, e.g. San Francisco, CA",
-                    },
-                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
-                },
-                "required": ["location"],
-            },
-        },
-    }
-]
-# To make the OpenAI function callable for the function calling agent we need a list with actual function in it:
-tool_functions = [get_current_weather]
+open_ai_tool_spec = {
+    "type": "function",
+    "function": {
+        "name": "get_current_weather",
+        "description": "Get the current weather in a given location",
+        "parameters": {
+            "type": "object",
+            "properties": {
+                "location": {
+                    "type": "string",
+                    "description": "The city and state, e.g. San Francisco, CA",
+                },
+                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
+            },
+            "required": ["location", "unit"],
+        },
+    },
+}
 
 
 # Callback for receiving messages for the user.
 def send_message_to_user_callback(message: str):
     print(message)
 
-generation_settings = LlamaLLMGenerationSettings(temperature=0.65, top_p=0.5, tfs_z=0.975)
-
-# Can be saved and loaded like that:
-# generation_settings.save("generation_settings.json")
-# generation_settings = LlamaLLMGenerationSettings.load_from_file("generation_settings.json")
-
+
+# First we create the calculator tool.
+calculator_function_tool = LlamaCppFunctionTool(calculator)
+
+# Next we create the current datetime tool.
+current_datetime_function_tool = LlamaCppFunctionTool(get_current_datetime)
+
+# The from_openai_tool function of the LlamaCppFunctionTool class converts an OpenAI tool schema and a callable function into a LlamaCppFunctionTool.
+get_weather_function_tool = LlamaCppFunctionTool.from_openai_tool(open_ai_tool_spec, get_current_weather)
+
+# Create the function calling agent. We are passing the provider, the tool list, the send-message-to-user callback and the chat message formatter. Also, we allow parallel function calling.
 function_call_agent = FunctionCallingAgent(
-    # Can be lama-cpp-python Llama class, llama_cpp_agent.llm_settings.LlamaLLMSettings class or llama_cpp_agent.providers.llama_cpp_server_provider.LlamaCppServerLLMSettings.
-    LlamaLLMSettings.load_from_file("openhermes-2.5-mistral-7b.Q8_0.json"),
-    # llama_cpp_agent.llm_settings.LlamaLLMGenerationSettings class or llama_cpp_agent.providers.llama_cpp_server_provider.LlamaCppServerGenerationSettings.
-    llama_generation_settings=generation_settings,
-    # A tuple of the OpenAI style function definitions and the actual functions
-    open_ai_functions=(tools, tool_functions),
-    # Just a list of type hinted functions for normal Python functions
-    python_functions=[write_to_file, read_file],
-    # Just a list of pydantic types
-    pydantic_functions=[Calculator],
-    # Callback for receiving messages for the user.
-    send_message_to_user_callback=send_message_to_user_callback, debug_output=True)
-
-while True:
-    user_input = input(">")
-    function_call_agent.generate_response(user_input)
-    function_call_agent.save("function_calling_agent.json")
+    model,
+    llama_cpp_function_tools=[calculator_function_tool, current_datetime_function_tool, get_weather_function_tool],
+    send_message_to_user_callback=send_message_to_user_callback,
+    allow_parallel_function_calling=True,
+    messages_formatter_type=MessagesFormatterType.CHATML)
+
+user_input = '''Get the date and time in '%d-%m-%Y %H:%M' format. Get the current weather in celsius in London, New York and at the North Pole. Solve the following calculations: 42 * 42, 74 + 26, 7 * 26, 4 + 6 and 96/8.'''
+function_call_agent.generate_response(user_input)
 ```
````
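One note on the pydantic tool above: because sampling is constrained to the tool's JSON schema, the agent can always parse the model output back into the `calculator` model and execute it. The framework handles this internally; the hand-written illustration below (with a hypothetical parsed-arguments dict, not framework internals) only shows the parse-and-run step conceptually:

```python
# Illustration only: what conceptually happens to a parsed tool call.
# `calculator` is the pydantic tool defined in the example above; the
# argument dict stands in for hypothetical constrained model output.
parsed_arguments = {"number_one": 42, "operation": "multiply", "number_two": 42}

tool_call = calculator(**parsed_arguments)  # pydantic validates fields and the enum
print(tool_call.run())                      # -> 1764
```

Pydantic coerces the string "multiply" into the `MathOperation` enum during validation, which is why the schema-constrained JSON can round-trip into a typed, executable call.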
````diff
@@ -150,44 +128,14 @@
 Example Input 1
 ```text
 What is 42 * 42?
 ```
 Example output 1
-```json
-
-{
-  "function": "calculator",
-  "function-parameters": {
-    "number_one": 42,
-    "operation": "multiply",
-    "number_two": 42
-  }
-}
-{
-  "function": "send-message-to-user",
-  "function-parameters": {
-    "message": "Function Call Result: 1764"
-  }
-}
-Function Call Result: 1764
-```
+```text
+The result of 42 * 42 is 1764.
+```
 Example Input 2
 ```text
 What is the current weather in London celsius?
 ```
 Example output 2
-```json
-
-{
-  "function": "get-current-weather",
-  "function-parameters": {
-    "location": "London",
-    "unit": "celsius"
-  }
-}
-{
-  "function": "send-message-to-user",
-  "function-parameters": {
-    "message": "The current temperature in London is 42 degrees Celsius."
-  }
-}
-
-The current temperature in London is 42 degrees Celsius.
+```text
+The current weather in London is 22° celsius.
+```
````
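The plain-text model outputs above are possible because generation runs under guided sampling, as the updated ReadMe describes. Independent of llama-cpp-agent, the underlying idea can be sketched with llama-cpp-python's grammar support. This is a minimal, assumed setup: the model path is a placeholder, and the GBNF grammar is illustrative rather than anything the framework actually generates.

```python
# Minimal guided-sampling sketch with llama-cpp-python (not llama-cpp-agent code).
# The GBNF grammar restricts sampling so the completion can only be digits.
from llama_cpp import Llama, LlamaGrammar

grammar = LlamaGrammar.from_string("root ::= [0-9] [0-9]*")

llm = Llama(model_path="path/to/model.gguf")  # placeholder model path

result = llm("Q: What is 42 * 42?\nA: ", grammar=grammar, max_tokens=8)
print(result["choices"][0]["text"])  # only digit tokens could be sampled
```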