
Add LiteLLM+instructor (for structured output) backend for curator #141

Merged · 74 commits into dev · Dec 4, 2024

Conversation

@CharlieJCJ (Contributor) commented Nov 18, 2024

Closes: #74
Closes #179
Closes #164

Changes:

  • Added two example scripts illustrating litellm usage in plain prompting and structured output modes:
    /examples/litellm_recipe_prompting.py
    /examples/litellm_recipe_structured_output.py (Note: OpenAI and Anthropic API keys must be set in the environment.)
  • Added a backend parameter to Prompter (code link); it currently defaults to OpenAI. See the usage sketch after the tested-models list below.
  • Integrated with instructor for structured output support across a wide range of models. For models that can't use litellm + instructor for structured output, a try/except block runs before dataset generation to check whether instructor works on a simple example (code link); see the first sketch after this list.
  • Added time and cost logging (cost via litellm.completion_cost, when the model's pricing is in the community-maintained mapping here).
  • Added estimate_total_tokens, which includes estimate_output_tokens; the latter is derived from get_max_tokens, the maximum output tokens of the specified model (code link). See the second sketch after this list.
  • Uses the same async retry strategy as the OpenAI Online Request Processor.
  • Reads litellm rate limits from the hidden params dict, using the x-ratelimit-limit-requests and x-ratelimit-limit-tokens headers for rpm and tpm.
  • litellm refactoring base online request processor #188
    • Includes the breakdown of the new abstract class, and instructions for any new subclass of OnlineRequestProcessor.
  • Implemented a robust check_structured_output_support for the OpenAI online request processor.
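
Below is a minimal sketch of the pre-generation structured output check described above, assuming instructor's from_litellm wrapper; the TestResponse model and function shape are illustrative, not the PR's actual code:

```python
# Illustrative sketch of the pre-generation structured output probe; not the
# PR's actual implementation. Requires the `litellm` and `instructor` packages.
import instructor
import litellm
from pydantic import BaseModel


class TestResponse(BaseModel):
    # Hypothetical minimal schema, used only to probe structured output support.
    message: str


def check_structured_output_support(model: str) -> bool:
    """Return True if litellm + instructor yields structured output for `model`."""
    client = instructor.from_litellm(litellm.completion)
    try:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "Say 'hi'."}],
            response_model=TestResponse,
        )
        return isinstance(response, TestResponse)
    except Exception:
        # Any failure means this model/provider combination does not support
        # structured output through litellm + instructor.
        return False
```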
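
A second sketch covers the token estimation, cost logging, and rate-limit discovery described above. litellm.get_max_tokens and litellm.completion_cost are real litellm utilities; the "additional_headers" key inside the hidden params dict is an assumption about litellm's response metadata layout:

```python
import litellm


def estimate_output_tokens(model: str) -> int:
    # get_max_tokens returns the model's maximum output tokens when the model
    # is present in litellm's community-maintained model map.
    return litellm.get_max_tokens(model) or 0


def estimate_total_tokens(model: str, input_tokens: int) -> int:
    # Total = actual input tokens + worst-case output tokens for the model.
    return input_tokens + estimate_output_tokens(model)


response = litellm.completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)

# Cost logging works only if the model's pricing is in the mapping.
cost = litellm.completion_cost(completion_response=response)

# Rate limits come from the hidden params dict; the "additional_headers" key
# is an assumption about where litellm surfaces raw response headers.
headers = getattr(response, "_hidden_params", {}).get("additional_headers", {})
rpm = headers.get("x-ratelimit-limit-requests")
tpm = headers.get("x-ratelimit-limit-tokens")
print(f"cost=${cost:.6f} rpm={rpm} tpm={tpm}")
```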

Future Works:

  • Support a baseline structured output strategy for models from inference platforms whose structured output litellm does not support. Right now, the user is told directly that the current model doesn't support structured output. Note that litellm's structured output model/provider coverage isn't good (link); this needs additional research.
  • More performance optimization for auto rate limiting / retry strategies; needs battle testing, experiments, and comparisons.

Example curator-viewer view: [screenshot]

Tested on the following models; all work with litellm + instructor structured output.

"claude-3-5-sonnet-20240620", # https://docs.litellm.ai/docs/providers/anthropic # anthropic has a different hidden param tokens structure. 
"claude-3-5-haiku-20241022",
"claude-3-haiku-20240307",
"claude-3-opus-20240229",
"claude-3-sonnet-20240229",
"gpt-4o-mini", # https://docs.litellm.ai/docs/providers/openai
"gpt-4o-2024-08-06",
"gpt-4-0125-preview",
"gpt-3.5-turbo-1106",
"gemini/gemini-1.5-flash", # https://docs.litellm.ai/docs/providers/gemini; https://ai.google.dev/gemini-api/docs/models # 20-30 iter/s
"gemini/gemini-1.5-pro", # 20-30 iter/s
"together_ai/meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo", # https://docs.together.ai/docs/serverless-models
"together_ai/meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo",
"together_ai/mistralai/Mixtral-8x7B-Instruct-v0.1",

Note that the following models do not support structured output (i.e. response_format in Prompter):

# "together_ai/nvidia/Llama-3.1-Nemotron-70B-Instruct-HF", # instructor not supported
# "deepinfra/nvidia/Llama-3.1-Nemotron-70B-Instruct" # instructor not supported

@CharlieJCJ CharlieJCJ changed the title Curator 28 add a lite llm backend for curator Add LiteLLM backend for curator Nov 18, 2024
@CharlieJCJ CharlieJCJ changed the base branch from main to dev November 18, 2024 08:05
@CharlieJCJ CharlieJCJ changed the title Add LiteLLM backend for curator Add LiteLLM+instructor (for structured output) backend for curator Nov 18, 2024
@CharlieJCJ (Contributor Author):

Works for Claude model_name="claude-3-opus-20240229"

@CharlieJCJ (Contributor Author): [two screenshots]

@CharlieJCJ (Contributor Author) commented Nov 20, 2024:

After #149 is merged (resolving #145), do a performance comparison of the litellm vs. OpenAI request processors.

@CharlieJCJ (Contributor Author) commented Nov 21, 2024:

#159 has been merged; costs are now logged appropriately. litellm also supports cost logging now.

@CharlieJCJ (Contributor Author) commented Nov 21, 2024:

TODO

@CharlieJCJ (Contributor Author):

Need to add a better default timeout.


@CharlieJCJ (Contributor Author):

Requesting review/approval: @RyanMarten @vutrung96

@RyanMarten (Contributor) left a review:

Small changes - let me know when they are addressed and I'll do another review


@vutrung96 (Contributor) left a review:

LGTM!

@CharlieJCJ merged commit 860b6b9 into dev on Dec 4, 2024 · 2 checks passed