Josh Patole edited this page Aug 21, 2025 · 1 revision

vLLM Functionality Overview

1. Tool (Function) Calling

vLLM offers advanced tool calling capabilities through its chat completion API, enabling integration with external functionalities:

Named Function Calling: You can pin the model to a specific function by setting tool_choice to that function's name. This leverages guided decoding, ensuring the response conforms to the structured JSON schema defined for the tool.
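As a sketch, a request payload for vLLM's OpenAI-compatible chat completions endpoint that pins the model to one function might look like this. The `get_weather` tool, its schema, and the model name are illustrative assumptions, not taken from this page:

```python
# Illustrative tool definition; the name and schema are hypothetical.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

payload = {
    "model": "my-model",  # placeholder model name
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
    # Pin the model to this specific function; guided decoding constrains
    # the generated arguments to the tool's JSON schema.
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}
```

The same payload shape works whether you POST it directly to `/v1/chat/completions` or pass the fields to an OpenAI-compatible client library.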

Automatic (Auto) Tool Choice: Set tool_choice="auto" to let the model autonomously decide whether to call a function and which one, based on the conversation. This requires starting the server with --enable-auto-tool-choice, specifying a tool parser via --tool-call-parser, and optionally a chat template via --chat-template to orchestrate tool calls.
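A minimal sketch of the client side of auto tool choice, assuming the server flags named above. The "hermes" parser name, model identifier, and tool are illustrative choices, not prescribed by this page:

```python
# Server side (flags quoted from the text above; parser name is an
# illustrative assumption):
#
#   vllm serve <model> \
#     --enable-auto-tool-choice \
#     --tool-call-parser hermes
#
# Client side, the request simply sets tool_choice="auto" and lets the
# model decide whether any of the listed tools is relevant.
payload = {
    "model": "my-model",  # placeholder
    "messages": [{"role": "user", "content": "Find flights to Paris."}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "search_flights",  # hypothetical tool
                "parameters": {
                    "type": "object",
                    "properties": {"destination": {"type": "string"}},
                },
            },
        }
    ],
    "tool_choice": "auto",  # model may call a tool, or answer directly
}
```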

Required Tool Calling: With tool_choice="required", the model must generate one or more tool calls based on the definitions in the tools list, and the output strictly conforms to the provided schemas. Note that this is currently supported only with the V0 engine using guided decoding; support for other backends is planned.
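The required mode can be sketched as the same payload with tool_choice="required"; the model is then forced to emit at least one call drawn from the tools list. The `convert_currency` tool and model name below are hypothetical:

```python
# With tool_choice="required", the model must call at least one of the
# declared tools, and the arguments are constrained to their schemas.
# Tool name and schema here are illustrative assumptions.
payload = {
    "model": "my-model",  # placeholder
    "messages": [{"role": "user", "content": "Convert 10 USD to EUR."}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "convert_currency",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "amount": {"type": "number"},
                        "from": {"type": "string"},
                        "to": {"type": "string"},
                    },
                    "required": ["amount", "from", "to"],
                },
            },
        }
    ],
    "tool_choice": "required",  # at least one tool call must be produced
}
```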

Disabling Tool Calling: With tool_choice="none", the model will not call any tools, even if they are defined in the request. By default the tool definitions are still included in the prompt; pass the --exclude-tools-when-tool-choice-none server flag to omit them entirely.
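The interaction between tool_choice="none" and the --exclude-tools-when-tool-choice-none flag can be sketched with a small helper. This function is purely illustrative (it is not vLLM code); only the flag and tool_choice names come from the text:

```python
def build_request(messages, tools, tool_choice, exclude_tools_when_none=False):
    """Hypothetical sketch: mimic how tool_choice="none" interacts with the
    --exclude-tools-when-tool-choice-none server flag described above."""
    payload = {
        "model": "my-model",  # placeholder
        "messages": messages,
        "tool_choice": tool_choice,
    }
    # With tool_choice="none", tool definitions are normally still sent to
    # the model; the exclude flag drops them from the prompt entirely.
    if not (tool_choice == "none" and exclude_tools_when_none):
        payload["tools"] = tools
    return payload
```

For example, with the flag set and tool_choice="none", the tools never reach the model; without the flag they remain in the prompt even though no call will be generated.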
