Josh Patole edited this page Aug 21, 2025 · 1 revision

vLLM Functionality Overview

1. Tool (Function) Calling

vLLM offers advanced tool calling capabilities through its chat completion API, enabling integration with external functionalities:

Named Function Calling: You can pin the model to a specific function by setting tool_choice to that function's name. This leverages guided decoding, ensuring the response conforms to the structured JSON schema defined for the tool.
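As a sketch, a request payload for vLLM's OpenAI-compatible chat completions endpoint that pins the model to one function might look like this. The `get_weather` tool, its schema, and the model name are illustrative assumptions, not taken from this page:

```python
# Illustrative tool definition; the name and schema are hypothetical.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

payload = {
    "model": "my-model",  # placeholder model name
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
    # Pin the model to this specific function; guided decoding constrains
    # the generated arguments to the tool's JSON schema.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```

The same payload shape works whether you POST it directly to `/v1/chat/completions` or pass the fields to an OpenAI-compatible client library.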

Automatic (Auto) Tool Choice: Set tool_choice="auto" to let the model autonomously decide whether to call a function and which one, based on the conversation. This requires starting the server with --enable-auto-tool-choice, specifying a tool parser via --tool-call-parser, and optionally a chat template via --chat-template to orchestrate tool calls.
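A minimal sketch of the client side of auto tool choice, assuming the server flags named above. The "hermes" parser name, model identifier, and tool are illustrative choices, not prescribed by this page:

```python
# Server side (flags quoted from the text above; parser name is an
# illustrative assumption):
#
#   vllm serve <model> \
#     --enable-auto-tool-choice \
#     --tool-call-parser hermes
#
# Client side, the request simply sets tool_choice="auto" and lets the
# model decide whether any of the listed tools is relevant.
payload = {
    "model": "my-model",  # placeholder
    "messages": [{"role": "user", "content": "Find flights to Paris."}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "search_flights",  # hypothetical tool
                "parameters": {
                    "type": "object",
                    "properties": {"destination": {"type": "string"}},
                },
            },
        }
    ],
    "tool_choice": "auto",  # model may call a tool, or answer directly
}
```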

Required Tool Calling: With tool_choice="required", the model must generate one or more tool calls based on the definitions in the tools list, and the output strictly conforms to the provided schemas. Note that this is currently supported only with the V0 engine using guided decoding; support for other backends is planned.
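The required mode can be sketched as the same payload with tool_choice="required"; the model is then forced to emit at least one call drawn from the tools list. The `convert_currency` tool and model name below are hypothetical:

```python
# With tool_choice="required", the model must call at least one of the
# declared tools, and the arguments are constrained to their schemas.
# Tool name and schema here are illustrative assumptions.
payload = {
    "model": "my-model",  # placeholder
    "messages": [{"role": "user", "content": "Convert 10 USD to EUR."}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "convert_currency",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "amount": {"type": "number"},
                        "from": {"type": "string"},
                        "to": {"type": "string"},
                    },
                    "required": ["amount", "from", "to"],
                },
            },
        }
    ],
    "tool_choice": "required",  # at least one tool call must be produced
}
```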

Disabling Tool Calling: With tool_choice="none", the model will not call any tools, even if they are defined in the request. By default the tool definitions are still included in the prompt; pass the --exclude-tools-when-tool-choice-none server flag to omit them entirely.
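The interaction between tool_choice="none" and the --exclude-tools-when-tool-choice-none flag can be sketched with a small helper. This function is purely illustrative (it is not vLLM code); only the flag and tool_choice names come from the text:

```python
def build_request(messages, tools, tool_choice, exclude_tools_when_none=False):
    """Hypothetical sketch: mimic how tool_choice="none" interacts with the
    --exclude-tools-when-tool-choice-none server flag described above."""
    payload = {
        "model": "my-model",  # placeholder
        "messages": messages,
        "tool_choice": tool_choice,
    }
    # With tool_choice="none", tool definitions are normally still sent to
    # the model; the exclude flag drops them from the prompt entirely.
    if not (tool_choice == "none" and exclude_tools_when_none):
        payload["tools"] = tools
    return payload
```

For example, with the flag set and tool_choice="none", the tools never reach the model; without the flag they remain in the prompt even though no call will be generated.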
