Function calling for OpenAI backend #573

Yiyun-Liang · 2024-06-29T18:37:42Z

Adding skeleton code for function calling with Open API models.
Example output:

system : You are a helpful assistant.
user : What's the weather like in San Francisco, Tokyo, Paris, and Beijing?
assistant : The current weather in San Francisco is 72°F, in Tokyo it is 10°C, and in Paris it is 22°C. Unfortunately, I couldn't retrieve the weather information for Beijing at the moment.

Ying1123

Thanks for the PR! I left a few comments. The review is still in progress.

examples/quick_start/openai_example_func_call.py

Ying1123 · 2024-06-30T08:34:17Z

python/sglang/lang/interpreter.py

@@ -23,6 +23,7 @@
    SglFunction,
    SglGen,
    SglImage,
+    SglFuncCall,


use dictionary order instead.

Removing this given we are moving it to be part of SglGen.

Ying1123 · 2024-06-30T08:44:25Z

examples/quick_start/openai_example_func_call.py

+def multi_turn_question(s, question_1, functions=[]):
+    s += sgl.system("You are a helpful assistant.")
+    s += sgl.user(question_1)
+    s += sgl.func_call("func_call_1", tools=functions, tool_choice="auto")


We may also want to retrieve the results from function call.
Add tests for state["func_call_1"] in the function single().

Ying1123 · 2024-06-30T08:50:21Z

python/sglang/backend/openai.py

+                    # Open AI model requires function call information to be sent to the model
+                    # along with the prompt.
+                    for function_call in s.function_calls:
+                        prompt.append(function_call)
                else:


s.messages_ should be updated after function call finished rather than in generate, and the append logic should happen in interpreter.py, see _execute_role_end() as a reference.

Additionally, changes prompt implicitly changes s.messages_. This is not safe. Changes s.messages_ then set prompt = s.messages_ is better.

Restructured the code a little bit based on your suggestions (with some minor tweaks but I can update if you think it's still better to move the function call generate call outside of generate (we will just have a simpler generate call):

Within openai.py

build_function_call_messages(): a new function which builds function call messages. Given function signature is specific to open ai models, keeping the logic to parse inputs and produce function call messages within the backend code.

generate(): Given prompt is local to the generate() call, I directly added function_call_messages to it so that we can call with function call messages during the current completion call's prompt. The main intuition is to try resuing the generate call logic and it also only appends function call response (comp) without intermediate messages into the final text/messages.

Within interpreter.py

Updated _execute_gen() logic to include building function call messages if tools are provided, and handle both parallel function calling and non-parallel function calling by either calling backend.generate one time for parallel function call supported models, or multiple times if parallel call is not supported.

Ying1123 · 2024-06-30T08:51:34Z

python/sglang/backend/openai.py

+            "gpt-3.5-turbo-0613",
+        ]:
+            raise RuntimeError(
+                "This model currently does not support function calling."


keep in mind that the set of models that support function calling and parallel function calling are different.

Thanks for pointing out! Updated to have different handling logic.

Ying1123 · 2024-06-30T09:00:26Z

python/sglang/backend/openai.py

+            cur_tool_choice = (
+                tool_choice
+                if tool_choice in ["auto", "required", "none"]
+                else {"type": "function", "function": {"name": tool_choice}}


In this case, assert tool_choice is in names of candidate functions.

Ying1123 · 2024-06-30T09:03:34Z

python/sglang/backend/openai.py

+        tool_calls = response_message.tool_calls
+        # Check if the model wanted to call a function
+        ret_messages = []
+        if tool_calls:
+            # Call the function
+            # Note: the JSON response may not always be valid; be sure to handle errors
+            available_functions = {}
+            for tool in tools:
+                available_functions[tool.__name__] = tool
+            ret_messages.append(response_message)
+            # Send the info for each function call and function response to the model
+            for tool_call in tool_calls:
+                function_name = tool_call.function.name
+                function_to_call = available_functions[function_name]
+                function_args = json.loads(tool_call.function.arguments)
+                function_response = function_to_call(**function_args)
+                ret_messages.append(
+                    {
+                        "tool_call_id": tool_call.id,
+                        "role": "tool",
+                        "name": function_name,
+                        "content": str(function_response),
+                    }
+                )
+        return ret_messages


I think it is better to put the logic of real function call into the interpreter, so that it can be reused when we develop the feature for local models.
And remember to handle the logic of appending s.messages_ and s.text_ in the interpreter.

Ying1123 · 2024-06-30T09:06:05Z

python/sglang/lang/interpreter.py

@@ -554,6 +560,12 @@ def _execute_select(self, expr: SglSelect):
            self.variable_event[name].set()
        self.text_ += decision

+    def _execute_func_call(self, expr: SglFuncCall):
+        # TODO: Should we clear the previous function call states for the next function call


I think yes, by default. Although accumulating functions could be an option.

Ying1123 · 2024-06-30T17:06:01Z

examples/quick_start/openai_example_func_call.py

+def multi_turn_question(s, question_1, functions=[]):
+    s += sgl.system("You are a helpful assistant.")
+    s += sgl.user(question_1)
+    s += sgl.func_call("func_call_1", tools=functions, tool_choice="auto")


A design suggestion is that it might be better to just have sgl.gen with func_call as an argument.

I agree, I think it's more straightforward to have it as part of sgl.gen. Would it make sense to have something like sgl.gen("answer_1", max_tokens=256, sgl.func_call(...)) or simply expose parameters directly to sgl.gen like sgl.gen("answer_1", max_tokens=256, tools=[...])?

Let's simply expose parameters directly to sgl.gen.

initial function calling skeleton

36ac1cf

Yiyun-Liang force-pushed the func-call branch 3 times, most recently from fb46602 to 0b7d7f1 Compare June 29, 2024 21:57

add call_func api

fbe44c2

Yiyun-Liang force-pushed the func-call branch from 0b7d7f1 to fbe44c2 Compare June 29, 2024 22:00

Ying1123 mentioned this pull request Jun 30, 2024

Development Roadmap (Deprecated) #157

Closed

Ying1123 requested changes Jun 30, 2024

View reviewed changes

Ying1123 self-assigned this Jun 30, 2024

Ying1123 changed the title ~~Func call~~ Function calling for OpenAI backend Jun 30, 2024

Ying1123 reviewed Jun 30, 2024

View reviewed changes

Yiyun-Liang force-pushed the func-call branch from bec9b73 to fcd6be5 Compare June 30, 2024 18:37

add more comments

071cedf

Yiyun-Liang force-pushed the func-call branch 2 times, most recently from e859d3e to edc30d2 Compare June 30, 2024 23:44

Ying1123 force-pushed the main branch from d530a1c to c7709d3 Compare July 3, 2024 21:59

merrymercy force-pushed the main branch 2 times, most recently from 2463404 to d737da5 Compare July 4, 2024 07:57

Yiyun-Liang force-pushed the func-call branch 8 times, most recently from dd4b1b6 to 5b3918a Compare July 9, 2024 05:11

update function call code structure

075b053

Yiyun-Liang force-pushed the func-call branch from 5b3918a to 075b053 Compare July 9, 2024 05:26

merrymercy force-pushed the main branch 2 times, most recently from 7eda0c8 to 41d1f67 Compare July 16, 2024 03:44

Ying1123 mentioned this pull request Jul 17, 2024

Development Roadmap (2024 Q3) #634

Open

27 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Function calling for OpenAI backend #573

Function calling for OpenAI backend #573

Yiyun-Liang commented Jun 29, 2024 •

edited

Loading

Ying1123 left a comment

Ying1123 Jun 30, 2024

Yiyun-Liang Jul 9, 2024

Ying1123 Jun 30, 2024

Ying1123 Jun 30, 2024 •

edited

Loading

Yiyun-Liang Jul 9, 2024

Ying1123 Jun 30, 2024

Yiyun-Liang Jul 9, 2024

Ying1123 Jun 30, 2024

Ying1123 Jun 30, 2024

Ying1123 Jun 30, 2024

Ying1123 Jun 30, 2024

Yiyun-Liang Jun 30, 2024 •

edited

Loading

Ying1123 Jun 30, 2024

Function calling for OpenAI backend #573

Are you sure you want to change the base?

Function calling for OpenAI backend #573

Conversation

Yiyun-Liang commented Jun 29, 2024 • edited Loading

Ying1123 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Ying1123 Jun 30, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Yiyun-Liang Jun 30, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Yiyun-Liang commented Jun 29, 2024 •

edited

Loading

Ying1123 Jun 30, 2024 •

edited

Loading

Yiyun-Liang Jun 30, 2024 •

edited

Loading