Pydantic validation error when using dicts content in user message #341

Open
off6atomic opened this issue Jun 21, 2024 · 8 comments
Labels
Bug Something isn't working external dependance Dependent on an issue in an external library

Comments

@off6atomic
Contributor

off6atomic commented Jun 21, 2024

Description

I ran the following code:

import pprint

from mirascope.openai import OpenAICall
from openai.types.chat import ChatCompletionMessageParam


class Librarian(OpenAICall):
    prompt_template = """
    SYSTEM: You are the world's greatest librarian.
    MESSAGES: {history}
    """

    history: list[ChatCompletionMessageParam] = []


history = [
    {
        "role": "user",
        "content": [{"type": "text", "text": "What fantasy book should I read?"}],
    },
]
librarian = Librarian(history=history)
pprint.pprint(librarian.messages(), indent=2)

And got this message:

[ {'content': "You are the world's greatest librarian.", 'role': 'system'},
  { 'content': ValidatorIterator(index=0, schema=Some(Union(UnionValidator { mode: Smart, choices: [(TypedDict(TypedDictValidator { fields: [TypedDictField { name: "text", lookup_key: Simple { key: "text", py_key: Py(0x117aa4fb0), path: LookupPath([S("text", Py(0x1179e4af0))]) }, name_py: Py(0x104520b70), required: true, validator: Str(StrValidator { strict: false, coerce_numbers_to_str: false }) }, TypedDictField { name: "type", lookup_key: Simple { key: "type", py_key: Py(0x1179e4db0), path: LookupPath([S("type", Py(0x1179e52f0))]) }, name_py: Py(0x104500d70), required: true, validator: Literal(LiteralValidator { lookup: LiteralLookup { expected_bool: None, expected_int: None, expected_str: Some({"text": 0}), expected_py_dict: None, expected_py_list: None, values: [Py(0x104520b70)] }, expected_repr: "'text'", name: "literal['text']" }) }], extra_behavior: Ignore, extras_validator: None, strict: false, loc_by_alias: true }), None), (TypedDict(TypedDictValidator { fields: [TypedDictField { name: "image_url", lookup_key: Simple { key: "image_url", py_key: Py(0x117ac8b70), path: LookupPath([S("image_url", Py(0x117ac8bb0))]) }, name_py: Py(0x1168f7db0), required: true, validator: TypedDict(TypedDictValidator { fields: [TypedDictField { name: "url", lookup_key: Simple { key: "url", py_key: Py(0x1179f44f0), path: LookupPath([S("url", Py(0x117ac8ab0))]) }, name_py: Py(0x104728970), required: true, validator: Str(StrValidator { strict: false, coerce_numbers_to_str: false }) }, TypedDictField { name: "detail", lookup_key: Simple { key: "detail", py_key: Py(0x117ac8af0), path: LookupPath([S("detail", Py(0x117ac8b30))]) }, name_py: Py(0x107828530), required: false, validator: Literal(LiteralValidator { lookup: LiteralLookup { expected_bool: None, expected_int: None, expected_str: Some({"high": 2, "auto": 0, "low": 1}), expected_py_dict: None, expected_py_list: None, values: [Py(0x1046a95f0), Py(0x1049f0f70), Py(0x1049f0fb0)] }, expected_repr: "'auto', 'low' or 'high'", name: 
"literal['auto','low','high']" }) }], extra_behavior: Ignore, extras_validator: None, strict: false, loc_by_alias: true }) }, TypedDictField { name: "type", lookup_key: Simple { key: "type", py_key: Py(0x117ac8bf0), path: LookupPath([S("type", Py(0x117ac8c30))]) }, name_py: Py(0x104500d70), required: true, validator: Literal(LiteralValidator { lookup: LiteralLookup { expected_bool: None, expected_int: None, expected_str: Some({"image_url": 0}), expected_py_dict: None, expected_py_list: None, values: [Py(0x1168f7db0)] }, expected_repr: "'image_url'", name: "literal['image_url']" }) }], extra_behavior: Ignore, extras_validator: None, strict: false, loc_by_alias: true }), None)], custom_error: None, strict: false, name: "union[typed-dict,typed-dict]" }))),
    'role': 'user'}]

Notice the ValidatorIterator object in the content field. It should not be there, should it?

I strongly suspect this is what causes the Chat Completion section to be missing when logging with Logfire.

By Chat Completion section, I mean the formatted log view shown in the following image:
image

Debugging tips

  • If you change the content field to a string instead of a list of dicts, the ValidatorIterator does not appear.
  • Wrap the Librarian class with the with_logfire decorator to see the Chat Completion section disappear when the content is a list of dicts.
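
A quick way to check whether a message list is affected is to walk it and flag any content that is neither a plain string nor a plain list. The sketch below is only an illustration: `find_unmaterialized_content` is a hypothetical helper (not part of Mirascope or Pydantic), and a generator stands in for the lazy iterator object seen in the output above.

```python
from typing import Any


def find_unmaterialized_content(messages: list[dict[str, Any]]) -> list[int]:
    """Return indices of messages whose content is not a str, list, or None.

    Hypothetical helper for debugging: a lazy iterator (such as the
    ValidatorIterator shown above) would be flagged here.
    """
    return [
        index
        for index, message in enumerate(messages)
        if not isinstance(message.get("content"), (str, list, type(None)))
    ]


# A generator is used as a stand-in for a lazily validated content value.
lazy_parts = (part for part in [{"type": "text", "text": "hello"}])

messages = [
    {"role": "system", "content": "You are the world's greatest librarian."},
    {"role": "user", "content": lazy_parts},
]

suspect_indices = find_unmaterialized_content(messages)
print(suspect_indices)
```

Running this against the result of librarian.messages() should point at the user message when the problem occurs.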

Python, Mirascope & OS Versions, related packages (not required)

mirascope=0.17.0
pydantic==2.7.1
python=3.10.11
os=Mac
@off6atomic off6atomic added the Bug Something isn't working label Jun 21, 2024
@off6atomic off6atomic changed the title from "Pydantic validation error when using dict content in user message" to "Pydantic validation error when using dicts content in user message" Jun 21, 2024
@willbakst
Contributor

I believe this is the same issue as pydantic/pydantic#9467

For now, the workaround is to wrap the type of your history field in SkipValidation:

from pydantic import SkipValidation
...

class Librarian(OpenAICall):
    prompt_template = """
    SYSTEM: You are the world's greatest librarian.
    MESSAGES: {history}
    """

    history: SkipValidation[list[ChatCompletionMessageParam]] = []

...
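
For anyone who wants to see the effect without Mirascope or the OpenAI types, here is a minimal sketch using plain Pydantic. The TypedDicts `TextPart` and `UserMessage` and the models `Validated` and `Skipped` are made up for illustration; they only imitate the shape of a message whose content is a union of str and an Iterable of parts, which is my assumption about what triggers the lazy iterator.

```python
from typing import Iterable, Literal, Union

from pydantic import BaseModel, SkipValidation
from typing_extensions import TypedDict


class TextPart(TypedDict):
    # Illustrative stand-in for a text content part.
    type: Literal["text"]
    text: str


class UserMessage(TypedDict):
    # Illustrative stand-in for a user message with str-or-parts content.
    role: Literal["user"]
    content: Union[str, Iterable[TextPart]]


class Validated(BaseModel):
    # The Iterable branch is validated lazily by Pydantic.
    history: list[UserMessage] = []


class Skipped(BaseModel):
    # SkipValidation stores the value as given, without validating it.
    history: SkipValidation[list[UserMessage]] = []


raw = [{"role": "user", "content": [{"type": "text", "text": "hi"}]}]

validated = Validated(history=raw)
skipped = Skipped(history=raw)

# On affected Pydantic versions the first type is a lazy iterator, not a list.
print(type(validated.history[0]["content"]).__name__)
print(type(skipped.history[0]["content"]).__name__)
```

The trade-off is that SkipValidation turns off all validation for that field, so malformed messages will no longer be caught by the model.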

@willbakst willbakst added the external dependance Dependent on an issue in an external library label Jun 21, 2024
@off6atomic
Contributor Author

off6atomic commented Jun 25, 2024

Thanks @willbakst
It solves the ValidatorIterator problem, but it doesn't solve the problem of the missing Chat Completion section.

Here is the minimal code that shows missing Chat Completion section:

import pprint

import logfire
from dotenv import load_dotenv
from mirascope.logfire import with_logfire
from mirascope.openai import OpenAICall
from openai.types.chat import ChatCompletionMessageParam
from pydantic import SkipValidation

load_dotenv()
logfire.configure()


@with_logfire
class Librarian(OpenAICall):
    prompt_template = """
    SYSTEM: You are the world's greatest librarian. You answer very concisely.
    MESSAGES: {history}
    """

    history: SkipValidation[list[ChatCompletionMessageParam]] = []


history = [
    {
        "role": "user",
        # uncomment below line to see LLM Chat Completions section
        # "content": "What fantasy book should I read?",
        "content": [{"type": "text", "text": "What fantasy book should I read?"}],
    },
]
librarian = Librarian(history=history)
pprint.pprint(librarian.messages(), indent=2)
print()

print(librarian.call().content)

If you uncomment the string content line and comment out the list one, the LLM Chat Completions section shows up properly on Logfire.

What is the workaround to this bug?

@willbakst
Contributor

willbakst commented Jun 25, 2024

Oh, I believe this is an issue with Logfire's LLM integration: it doesn't handle a content array, only a single string content.

I will take a deeper look and likely file a bug on their repo if I can confirm it happens without Mirascope :)
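
Since string content is reported above to render correctly, one possible stopgap for text-only histories is to collapse the content parts into a single string before passing the history to the call. This is just a sketch: `flatten_text_content` is a hypothetical helper, not part of Mirascope or Logfire, and it is no help for image parts, which it only replaces with a placeholder.

```python
from typing import Any, Union


def flatten_text_content(content: Union[str, list[dict[str, Any]]]) -> str:
    """Collapse a list of content parts into one string.

    Hypothetical helper: text parts are joined with newlines and any
    other part type is replaced by a short placeholder.
    """
    if isinstance(content, str):
        return content
    pieces = []
    for part in content:
        if part.get("type") == "text":
            pieces.append(part.get("text", ""))
        else:
            pieces.append(f"[{part.get('type', 'unknown')} part omitted]")
    return "\n".join(pieces)


history = [
    {
        "role": "user",
        "content": [{"type": "text", "text": "What fantasy book should I read?"}],
    },
]

# Build a copy of the history where every content value is a plain string.
flattened_history = [
    {**message, "content": flatten_text_content(message["content"])}
    for message in history
]
print(flattened_history)
```

The flattened list could then be passed as history in place of the original one.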

@off6atomic
Contributor Author

If it's buggy like that, then maybe it's not a good idea for me to use it, because I'll also have to show text along with images. Do you have any alternatives that you like?

I used LangSmith with LangChain in the past and it worked fine, but it doesn't allow arbitrary logging like Logfire does.

Maybe LangFuse is a good alternative?

@willbakst
Contributor

I don’t believe LangFuse allows for arbitrary logging, but I would have to check.

From my understanding of Logfire, the bug should be a simple UI fix to properly render the content. All of the content is still there; it’s just not rendered properly.

@off6atomic
Contributor Author

OK. I'll wait for them to fix the bug then. Thank you for helping me report the bug on their repo. I appreciate it very much.

@off6atomic
Contributor Author

@willbakst Has this bug already been reported on the upstream repo? I just want to keep track of it.

@willbakst
Contributor

@off6atomic thank you for the reminder, I totally blanked on posting this. It's posted now:

pydantic/logfire#297
