Skip to content

A comparison of structured output performance among popular open and closed source large language models.

License

Notifications You must be signed in to change notification settings

mattflo/structured-output-performance

Repository files navigation

Structured Output Performance

Open In Colab

A comparison of structured output performance among popular open and closed source large language models.

Local Setup

Install dependencies

poetry install --no-root

API Keys

Copy .env.example to .env and add your API keys.

Potential Improvements

  • Measure tokens per second
  • Add Anthropic models
  • Update Groq to use with_structured_output
  • Break out GPT 4 versions
  • Ensure instance of class isn't empty
  • Add ability to mix in different Pydantic objects and prompts
  • Parallelize
  • Create a version that doesn't use LangChain

Caveats

This analysis is performed with one, fairly simple prompt template, run with just 10 samples per model. More complicated prompts/tasks will negatively impact consistency. And tinkering with different prompt strategies will improve consistency. It's also important to note, quality/accuracy of output is not considered here, only consistency and latency.

About

A comparison of structured output performance among popular open and closed source large language models.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published