
400 - {'detail': 'Invalid value: Non supported ToolPromptFormat ToolPromptFormat.json'} - with default tool_prompt_format 3.3/3.2 #695

Closed
aidando73 opened this issue Dec 30, 2024 · 4 comments

@aidando73
Contributor

aidando73 commented Dec 30, 2024

System Info

PyTorch version: 2.5.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 15.2 (arm64)
GCC version: Could not collect
Clang version: 16.0.0 (clang-1600.0.26.6)
CMake version: version 3.31.0
Libc version: N/A

Python version: 3.10.16 (main, Dec 11 2024, 10:22:29) [Clang 14.0.6 ] (64-bit runtime)
Python platform: macOS-15.2-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M3 Max

Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.26.4
[pip3] onnxruntime==1.20.1
[pip3] torch==2.5.1
[conda] numpy                     1.26.4                   pypi_0    pypi
[conda] torch                     2.5.1                    pypi_0    pypi

Information

  • The official example scripts
  • My own modified scripts

🐛 Describe the bug

Using the Fireworks provider:

from llama_stack_client import LlamaStackClient
import os
MODEL_ID = "meta-llama/Llama-3.3-70B-Instruct"
client = LlamaStackClient(base_url=f"http://localhost:{os.environ['LLAMA_STACK_PORT']}")
response = client.inference.chat_completion(
    model_id=MODEL_ID,
    messages=[
        {
            "role": "user",
            "content": "What's the weather today?",
        }
    ],
    tools=[{
        "tool_name": "get_weather",
        "description": "Get the weather for a given location",
        "parameters": {
            "location": {
                "description": "The location to get the weather for",
                "param_type": "string",
                "required": True,
            }
        }
    }],
)
print(response)

This returns a 400 error

Error logs

Returns

Traceback (most recent call last):
  File "/Users/aidand/dev/hello-swe-bench/repro.py", line 5, in <module>
    response = client.inference.chat_completion(
  File "/Users/aidand/dev/hello-swe-bench/env/lib/python3.10/site-packages/llama_stack_client/_utils/_utils.py", line 275, in wrapper
    return func(*args, **kwargs)
  File "/Users/aidand/dev/hello-swe-bench/env/lib/python3.10/site-packages/llama_stack_client/resources/inference.py", line 218, in chat_completion
    self._post(
  File "/Users/aidand/dev/hello-swe-bench/env/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 1263, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
  File "/Users/aidand/dev/hello-swe-bench/env/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 955, in request
    return self._request(
  File "/Users/aidand/dev/hello-swe-bench/env/lib/python3.10/site-packages/llama_stack_client/_base_client.py", line 1058, in _request
    raise self._make_status_error_from_response(err.response) from None
llama_stack_client.BadRequestError: Error code: 400 - {'detail': 'Invalid value: Non supported ToolPromptFormat ToolPromptFormat.json'}

Expected behavior

The correct tool_prompt_format is python_list. Note that I can pass it in explicitly:

response = client.inference.chat_completion(
    ...
    tool_prompt_format="python_list",
    ...
)

But that requires an understanding of llama-stack internals and tool prompt formats, which most users won't have.
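
Putting those together, the full repro from above with the explicit tool_prompt_format applied looks like this (same call as the original script, untested sketch):

from llama_stack_client import LlamaStackClient
import os

client = LlamaStackClient(base_url=f"http://localhost:{os.environ['LLAMA_STACK_PORT']}")

response = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.3-70B-Instruct",
    messages=[
        {
            "role": "user",
            "content": "What's the weather today?",
        }
    ],
    tools=[{
        "tool_name": "get_weather",
        "description": "Get the weather for a given location",
        "parameters": {
            "location": {
                "description": "The location to get the weather for",
                "param_type": "string",
                "required": True,
            }
        }
    }],
    # Explicit format for Llama 3.3; without this the server falls back to json and returns 400.
    tool_prompt_format="python_list",
)
print(response)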

@cheesecake100201
Contributor

cheesecake100201 commented Jan 4, 2025

I think that's because tool_prompt_format defaults to json in the llama-stack API, but why isn't json working when that's the default prompt format?
[screenshot: the default tool_prompt_format in the llama-stack inference API]
This shouldn't fail in the first place; json is supposed to be supported here, and that itself is bugging me.
Just curious, which inference provider are you using?

Update: for all the 3.1 models, using tool_prompt_format as json is correct, but for 3.3 the tool_prompt_format is supposed to be python_list only. Attached below is a screenshot of the code.

[screenshot: the code mapping Llama models to their tool prompt formats]

But why aren't other formats like json supported for 3.3?
@ashwinb

@cheesecake100201
Contributor

To solve this we can just add a check in the inference APIs, along the lines of:
tool_prompt_format: Optional[ToolPromptFormat] = ToolPromptFormat.json if "Llama-3.1" in model_id or "Llama-3.2" in model_id else ToolPromptFormat.python_list
What do you think?
@aidando73 @ashwinb
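
For illustration, a minimal sketch of what that default selection could look like as a helper; ToolPromptFormat is stubbed out rather than imported, and none of these names are the actual llama-stack internals:

from enum import Enum

class ToolPromptFormat(Enum):
    # Stand-in for llama-stack's ToolPromptFormat enum; illustrative only.
    json = "json"
    python_list = "python_list"

def default_tool_prompt_format(model_id: str) -> ToolPromptFormat:
    # The default proposed in this comment: json for Llama 3.1/3.2 model IDs,
    # python_list otherwise. (The eventual fix treats 3.2 like 3.3; see the PR below.)
    if "Llama-3.1" in model_id or "Llama-3.2" in model_id:
        return ToolPromptFormat.json
    return ToolPromptFormat.python_list

print(default_tool_prompt_format("meta-llama/Llama-3.3-70B-Instruct"))  # ToolPromptFormat.python_list
print(default_tool_prompt_format("meta-llama/Llama-3.1-8B-Instruct"))   # ToolPromptFormat.json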

@aidando73
Contributor Author

@cheesecake100201 yeah, that's what I'm thinking: if the user didn't specify the tool_prompt_format, we should set the correct default for them.

@dineshyv
Contributor

Thanks for filing this @aidando73. We will need to infer the right format instead of always defaulting to json. I will work on fixing this.

dineshyv added a commit that referenced this issue Jan 10, 2025
…742)

# What does this PR do?
We are setting a default value of json for the tool prompt format, which
conflicts with Llama 3.2/3.3 models since they use python_list. This PR
changes the default to None and, in the code, we infer the default based on
the model.

Addresses: #695 
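
Roughly, the pattern described here looks like the following; these names and signatures are illustrative stand-ins, not the actual llama-stack diff:

from typing import Optional

def infer_default_tool_prompt_format(model_id: str) -> str:
    # Per the thread above: 3.1 models expect json, while 3.2/3.3 expect python_list.
    if "Llama-3.2" in model_id or "Llama-3.3" in model_id:
        return "python_list"
    return "json"

def chat_completion(model_id: str, tool_prompt_format: Optional[str] = None, **kwargs):
    # The API-level default is now None rather than json; a format is only
    # inferred when the caller did not pass one explicitly.
    if tool_prompt_format is None:
        tool_prompt_format = infer_default_tool_prompt_format(model_id)
    ...  # hand the resolved format to the provider (omitted)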

Tests:
❯ LLAMA_STACK_BASE_URL=http://localhost:5000 pytest -v tests/client-sdk/inference/test_inference.py -k "test_text_chat_completion"

pytest llama_stack/providers/tests/inference/test_prompt_adapter.py