
When using LiteLlm with a model that can produce structured output (e.g. gpt-4o), adk doesn't seem to be passing the output schema to the model #217

Open
erdincyilmazel opened this issue Apr 16, 2025 · 12 comments · May be fixed by #580

erdincyilmazel commented Apr 16, 2025

I have a very basic agent that uses openai/gpt-4o. I am trying to get structured output from it. However, the model either doesn't return JSON or it returns a JSON response that doesn't follow the given schema.

I am not observing this when I use a model provided by Google, such as gemini-2.0-flash.

Here is sample code to reproduce the issue:

from google.adk.agents import Agent
from google.adk.models.lite_llm import LiteLlm
from typing import List
from pydantic import BaseModel, Field

class Album(BaseModel):
    """Structured data representing an album."""
    name: str = Field(description="The name of the album.")
    artist: str = Field(description="The artist of the album.")
    year: int = Field(description="The year of the album.")
    genre: str = Field(description="The genre of the album.")

class Albums(BaseModel):
    """Structured data representing a list of albums."""
    albums: List[Album] = Field(description="A list of albums.")

root_agent = Agent(
    name="my_basic_agent",
    model=LiteLlm(model="openai/gpt-4o"),
    description="An album recommender",
    instruction="You are an album recommender. You will recommend albums to the user based on their favorite genre. Return a list of albums as a JSON object.",
    output_schema=Albums,
    output_key="albums",
)

When the response is received, Pydantic throws a validation error.
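
For reference, the validation failure can be reproduced in isolation: validating a plain-text reply against the schema raises the same kind of error. This is a minimal sketch using the Albums model above; the reply string is made up.

from pydantic import ValidationError

# Hypothetical plain-text reply, like what gpt-4o returns when no schema is sent.
reply = "Here's a list of some of the best classic rock albums: ..."

try:
    # Roughly what ADK does with output_schema on the final response text.
    albums = Albums.model_validate_json(reply)
except ValidationError as e:
    print(e)  # Invalid JSON: expected value at line 1 column 1 ...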

erdincyilmazel edited the title twice on Apr 16, 2025: first to clarify that the output schema is not passed to the model, then to correct "LlmLite" to "LiteLlm".
hangfei (Collaborator) commented Apr 17, 2025

What errors are you seeing? Does the above agent definition reproduce the error?

hangfei self-assigned this Apr 17, 2025
@erdincyilmazel (Author)

Yes, the above agent definition reproduces the error every single time.

Below is what I am seeing in the terminal. Note that the response from OpenAI is not a JSON string, and Pydantic throws a validation error because of it. The fact that OpenAI's response is plain text makes me think the underlying API call isn't requesting structured output or passing the schema for the Pydantic models.

It works as expected when I use a Gemini model directly. This is only a problem when I use LiteLlm.

Summary of what I am observing:

  • gpt-4o: Doesn't return JSON output
  • gpt-4.1: Returns a JSON response, following the system instruction, but the returned JSON object's schema doesn't match the Pydantic model
  • gemini-2.0-flash: Works as expected.
LLM Request:
-----------------------------------------------------------
System Instruction:
You are an album recommender. You will recommend albums to the user based on their favorite genre. Return a list of albums as a JSON object.

You are an agent. Your internal name is "my_basic_agent".

 The description about you is "An album recommender"
-----------------------------------------------------------
Contents:
{"parts":[{"text":"what are the best classic rock albums?"}],"role":"user"}
-----------------------------------------------------------
Functions:

-----------------------------------------------------------

09:28:27 - LiteLLM:INFO: utils.py:3085 - 
LiteLLM completion() model= gpt-4o; provider = openai
2025-04-17 09:28:27,729 - INFO - utils.py:3085 - 
LiteLLM completion() model= gpt-4o; provider = openai
2025-04-17 09:28:29,749 - INFO - _client.py:1740 - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
09:28:29 - LiteLLM:INFO: cost_calculator.py:636 - selected model name for cost calculation: openai/gpt-4o-2024-08-06
2025-04-17 09:28:29,759 - INFO - cost_calculator.py:636 - selected model name for cost calculation: openai/gpt-4o-2024-08-06
2025-04-17 09:28:29,764 - ERROR - fast_api.py:616 - Error in event_generator: 1 validation error for Albums
  Invalid JSON: expected value at line 1 column 1 [type=json_invalid, input_value='Here\'s a list of some o...ng Stones"\n  ]\n}\n```', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/json_invalid
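
For comparison, a direct LiteLLM call that does request structured output looks roughly like this (a sketch, not ADK code; recent LiteLLM versions accept a Pydantic model class as response_format for providers that support it, while older ones need a JSON-schema dict):

import litellm

# Assumes the Albums model from the snippet above.
response = litellm.completion(
    model="openai/gpt-4o",
    messages=[
        {"role": "system", "content": "You are an album recommender."},
        {"role": "user", "content": "what are the best classic rock albums?"},
    ],
    response_format=Albums,  # this is what the ADK LiteLlm wrapper never sends
)
print(response.choices[0].message.content)  # JSON that matches the Albums schema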


danieleforberghi commented Apr 20, 2025

I've had much the same issue: instead of the parsed object, a JSON string is returned. I get this error with gemini-flash-002.

@smilovanovic

Same here: LiteLLM with Ollama isn't passing the JSON schema correctly, which causes an invalid response. This is the main reason I'm moving away from ADK, even though it seemed like the most complete framework and I've tried them all, literally 😢

@trungpq27

My workaround for this is to pass the schema directly into LiteLlm:

from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm

# RoutingSchema, ROUTING_INSTRUCTION, and setting are defined elsewhere in my project.
routing_agent = LlmAgent(
    name="routing_agent",
    model=LiteLlm(
        api_base=setting.OLLAMA_API_BASE,
        model="ollama_chat" + "/" + setting.ROUTING_MODEL,
        format=RoutingSchema.model_json_schema(),
    ),
    instruction=ROUTING_INSTRUCTION,
    output_schema=RoutingSchema,
    output_key="routing_agent",
    disallow_transfer_to_parent=True,
    disallow_transfer_to_peers=True,
)

I define both the output schema and the format (in my case the parameter is named format because I'm using Ollama).
When I checked the LiteLlm code, I found that the wrapper’s arguments are:

Args:
    model: The name of the LiteLlm model.
    **kwargs: Additional arguments to pass to the litellm completion api.

This means I can pass extra parameters through **kwargs to LiteLlm, just like when calling the API manually, and this approach seems to work every time.

ADK is still a relatively new framework, so bugs like this are somewhat expected. However, output schema handling is a pretty fundamental feature, so I hope they will fix this issue soon.


smilovanovic commented Apr 28, 2025

@trungpq27 nice workaround, thanks.
But I'm still getting the error

10:51:54 - LiteLLM:INFO: utils.py:3108 - 
LiteLLM completion() model= qwen2.5:3b; provider = ollama_chat
10:51:55 - LiteLLM:INFO: cost_calculator.py:636 - selected model name for cost calculation: ollama_chat/qwen2.5:3b
[intent_recognition_agent]: { "type": "Booking a space" }
llm_request model='ollama_chat/qwen2.5:3b' contents=[Content(parts=[Part(video_metadata=None, thought=None, code_execution_result=None, executable_code=None, file_data=None, function_call=None, function_response=None, inline_data=None, text='book an office')], role='user'), Content(parts=[Part(video_metadata=None, th
ought=None, code_execution_result=None, executable_code=None, file_data=None, function_call=None, function_response=None, inline_data=None, text='For context:'), Part(video_metadata=None, thought=None, code_execution_result=None, executable_code=None, file_data=None, function_call=None, function_response=None, inline
_data=None, text='[intent_recognition_agent] said: { "type": "Booking a space" }')], role='user')] config=GenerateContentConfig(http_options=None, system_instruction='You are a helpful assistant. Use available tools if needed\n\nYou are an agent. Your internal name is "output_agent".', temperature=None, top_p=None, t
op_k=None, candidate_count=None, max_output_tokens=None, stop_sequences=None, response_logprobs=None, logprobs=None, presence_penalty=None, frequency_penalty=None, seed=None, response_mime_type=None, response_schema=None, routing_config=None, model_selection_config=None, safety_settings=None, tools=None, tool_config=
None, labels=None, cached_content=None, response_modalities=None, media_resolution=None, speech_config=None, audio_timestamp=None, automatic_function_calling=None, thinking_config=None) live_connect_config=LiveConnectConfig(generation_config=None, response_modalities=None, temperature=None, top_p=None, top_k=None, ma
x_output_tokens=None, media_resolution=None, seed=None, speech_config=None, system_instruction=None, tools=None, session_resumption=None, input_audio_transcription=None, output_audio_transcription=None, realtime_input_config=None, context_window_compression=None) tools_dict={}

...

litellm.exceptions.APIConnectionError: litellm.APIConnectionError: Ollama_chatException - {"error":"json: cannot unmarshal array into Go struct field ChatRequest.messages.content of type string"}

'[intent_recognition_agent] said: { "type": "Booking a space" }'

It looks like the agent identifier prefix is added to the text, and because of it the content can't be JSON-decoded.

@danieleforberghi

This is not limited to using LiteLlm with non-Gemini models; the issue exists even for Gemini.

@trungpq27

(quoting smilovanovic's comment and log above)

Can you provide the context in which you encountered the issue? Is this an agent or a sub-agent?

@rohan3107

Here is a workaround that works:

model = LiteLlm(model='openai/gpt-4o', response_format={"type": "json_object"})
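
Note that {"type": "json_object"} only guarantees syntactically valid JSON, not adherence to the Pydantic schema. If your LiteLLM version supports passing a Pydantic model class as response_format (OpenAI structured outputs), a stricter variant of the same workaround looks like this (a sketch, assuming the Albums model from the original report):

from google.adk.models.lite_llm import LiteLlm

# Forwarded to litellm.completion() via **kwargs; requires a LiteLLM version
# that accepts a Pydantic model class as response_format.
model = LiteLlm(model="openai/gpt-4o", response_format=Albums)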


smilovanovic commented Apr 28, 2025

I'm primarily using Ollama (for local) and Bedrock (for remote) as my LLM providers and the following works in both cases:

import os
from typing import Type, TypeVar
from google.adk.models.lite_llm import LiteLlm
from pydantic import BaseModel

PROIVDER = "bedrock"  # "ollama" | "bedrock"

if PROVIDER == "bedrock":
    os.environ["AWS_REGION_NAME"] = "us-east-1"

T = TypeVar("T", bound=BaseModel)


def get_model(output_model: Type[T] | None = None):
    kwargs = {}
    if PROVIDER == "bedrock":
        kwargs["model"] = "bedrock/us.anthropic.claude-3-5-haiku-20241022-v1:0"
        if output_model is not None:
            kwargs["response_format"] = output_model
    else:
        kwargs["model"] = "ollama_chat/qwen2.5:3b"
        kwargs["api_base"] = "http://localhost:11434"
        if output_model is not None:
            kwargs["format"] = output_model.model_json_schema()

    return LiteLlm(
        **kwargs,
    )

Then, for the agent init, I use this model getter together with output_schema.
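
For illustration, the agent init then looks roughly like this (a sketch; Booking, the instruction text, and the agent/output names are placeholders):

from google.adk.agents import LlmAgent
from pydantic import BaseModel, Field

class Booking(BaseModel):
    """Placeholder output model for illustration."""
    type: str = Field(description="The kind of booking requested.")

booking_agent = LlmAgent(
    name="booking_agent",
    model=get_model(Booking),   # provider-specific schema kwargs are set inside get_model
    instruction="Classify the user's booking request and return JSON.",
    output_schema=Booking,      # ADK still validates the final response text against this
    output_key="booking",
)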

whoisarpit linked a pull request (#580) on May 6, 2025 that will close this issue
@whoisarpit

Fixed in #580

@whoisarpit

What errors are you seeing? Does the above agent definition reproduce the error?

@hangfei When the LiteLlm call is made with output_schema, src/google/adk/agents/llm_agent.py:309 throws an "unable to parse" error. This happens because the schema isn't being passed through to LiteLlm, so the model returns output that doesn't adhere to the schema and parsing fails.
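
Conceptually, the fix has to forward the request's response schema into the kwargs LiteLLM sends to the provider, something along these lines (a rough illustration only, not the actual code in #580; completion_args stands in for whatever kwargs dict the wrapper builds):

# Rough illustration: when building the LiteLLM request, pass the ADK
# response schema through as response_format so the provider is asked
# for structured output.
if llm_request.config and llm_request.config.response_schema is not None:
    completion_args["response_format"] = llm_request.config.response_schema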
