Add LiteLLM+instructor (for structured output) backend for curator #141

Merged · 74 commits · Dec 4, 2024

Changes from 66 commits

Commits (74)
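This PR adds a LiteLLM backend (paired with instructor for structured output) to curator, selectable through a new `backend` argument on curator.Prompter or auto-detected from the model name. A minimal sketch of the text-output path, adapted from examples/litellm_recipe_prompting.py added in this PR (assumes a Gemini API key is configured for LiteLLM):

from datasets import Dataset
from bespokelabs import curator

# Two cuisines are enough to show the shape of the data; the full example uses ten.
cuisines = Dataset.from_list([{"cuisine": c} for c in ["Chinese", "Italian"]])

recipe_prompter = curator.Prompter(
    model_name="gemini/gemini-1.5-flash",
    prompt_func=lambda row: f"Generate a random {row['cuisine']} recipe.",
    parse_func=lambda row, response: {"recipe": response, "cuisine": row["cuisine"]},
    backend="litellm",  # explicit here; omit to let curator pick a backend
)

recipes = recipe_prompter(cuisines)
print(recipes.to_pandas())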
a51716b
litellm online request processer init commit
CharlieJCJ Nov 18, 2024
cbc8797
poetry add litellm package
CharlieJCJ Nov 18, 2024
14a5dcc
claude example
CharlieJCJ Nov 18, 2024
7c4f82e
Merge branch 'dev' into CURATOR-28-add-a-lite-llm-backend-for-curator
CharlieJCJ Nov 18, 2024
c8623cf
use client.chat.completions.create_with_completion in litellm such th…
CharlieJCJ Nov 18, 2024
d72d917
ckpt current progress
CharlieJCJ Nov 19, 2024
959c005
Merge branch 'CURATOR-43-add-time-logging-and-curator-viewer-show-dis…
CharlieJCJ Nov 20, 2024
db6ebef
add time related logging for litellm
CharlieJCJ Nov 20, 2024
3ef9cf6
Merge branch 'CURATOR-43-add-time-logging-and-curator-viewer-show-dis…
CharlieJCJ Nov 20, 2024
feb042d
add backend as part of hash
CharlieJCJ Nov 20, 2024
c8d5d05
Merge branch 'dev' into CURATOR-28-add-a-lite-llm-backend-for-curator
CharlieJCJ Nov 20, 2024
a9c65a4
update poetry lock
CharlieJCJ Nov 20, 2024
4f3437b
Merge branch 'CURATOR-44-add-cost-logging-in-openai-and-show-it-in-vi…
CharlieJCJ Nov 20, 2024
f755b73
lock revamp
CharlieJCJ Nov 20, 2024
f7be7e9
add cost and token calculations
CharlieJCJ Nov 21, 2024
6e41905
Merge branch 'dev' into CURATOR-28-add-a-lite-llm-backend-for-curator
CharlieJCJ Nov 21, 2024
973fca4
ckpt example with many models and model providers
CharlieJCJ Nov 22, 2024
3e2eefa
Merge branch 'dev' into CURATOR-28-add-a-lite-llm-backend-for-curator
CharlieJCJ Nov 22, 2024
e94ad4a
fix example
CharlieJCJ Nov 22, 2024
74e6375
revamped async litellm from yesterday checkpoint
CharlieJCJ Nov 23, 2024
4f0e1ce
unused import
CharlieJCJ Nov 23, 2024
e045cda
ckpt example
CharlieJCJ Nov 23, 2024
cddf820
revert back to the old litellm backend, with fixes. This is more stable
CharlieJCJ Nov 23, 2024
56e2338
add default timeout
CharlieJCJ Nov 24, 2024
e571343
add simple litellm example
CharlieJCJ Nov 24, 2024
ad0fd1f
resume pbar display
CharlieJCJ Nov 25, 2024
50053c9
add token based rate limiting (openai api like token usage, i.e. prom…
CharlieJCJ Nov 26, 2024
3e843f8
parallel retry on litellm
CharlieJCJ Nov 26, 2024
8f25ed9
unused input param
CharlieJCJ Nov 26, 2024
9065f55
renamed examples
CharlieJCJ Nov 27, 2024
2c80195
added check instructor + litellm coverage before using instructor's s…
CharlieJCJ Nov 27, 2024
a8c547b
rename examples
CharlieJCJ Nov 27, 2024
5c88363
rm commented litellm debug message
CharlieJCJ Nov 27, 2024
4115176
run black
CharlieJCJ Nov 27, 2024
80229e8
add model init logging info
CharlieJCJ Nov 27, 2024
b0cae1f
cleanup examples
CharlieJCJ Nov 28, 2024
b438e2c
rename example files
CharlieJCJ Nov 28, 2024
50f631b
add parse func fields in example
CharlieJCJ Nov 28, 2024
8a89204
litellm refactoring base online request processor
CharlieJCJ Dec 1, 2024
10ea1f4
remove unused unused function that is been refactored
CharlieJCJ Dec 1, 2024
f834531
black
CharlieJCJ Dec 1, 2024
e62bd9d
base request processor try except
CharlieJCJ Dec 1, 2024
52be211
original api_endpoint_from_url implementation
CharlieJCJ Dec 1, 2024
cee80b0
Merge pull request #188 from bespokelabsai/litellm-refactor
CharlieJCJ Dec 1, 2024
f358210
Merge branch 'dev' into CURATOR-28-add-a-lite-llm-backend-for-curator
CharlieJCJ Dec 1, 2024
f56d766
cleanup after merging from dev
CharlieJCJ Dec 1, 2024
8dbf9e9
status is retrieved from the raw response_obj, not parsed
CharlieJCJ Dec 1, 2024
6fe5fd1
consistant typing and 10* constant introduced in openai online
CharlieJCJ Dec 1, 2024
e261517
black
CharlieJCJ Dec 1, 2024
17766b5
remove unused imports
CharlieJCJ Dec 2, 2024
adc04b4
rm confusing docstring
CharlieJCJ Dec 3, 2024
b0e0f25
refactor, to have a `handle_single_request_with_retries` and `call_si…
CharlieJCJ Dec 3, 2024
d272b59
raise ValueError instead of assert
CharlieJCJ Dec 3, 2024
a3a4d74
add logging when don't have capacity
CharlieJCJ Dec 3, 2024
f9ae234
renamed online_request_processor -> base_online_request_processor
CharlieJCJ Dec 3, 2024
4ec368d
black
CharlieJCJ Dec 3, 2024
2831914
bring back the resume logging and temp file write logic
CharlieJCJ Dec 3, 2024
bd99464
bring better logging for try except in base online
CharlieJCJ Dec 3, 2024
0d5292d
set logger to info for openai
CharlieJCJ Dec 3, 2024
87b58ea
baseonlinerequestprocessor
CharlieJCJ Dec 3, 2024
e24d493
default model backend choose if backend is None, and support openai s…
CharlieJCJ Dec 3, 2024
6d8da45
black
CharlieJCJ Dec 3, 2024
73b7ab4
debug imports
CharlieJCJ Dec 3, 2024
29ad907
revert retry logic to process during the end
CharlieJCJ Dec 4, 2024
c56a5ed
changed the default litellm model to be gemini
CharlieJCJ Dec 4, 2024
f79226d
reverse the order of litellm models
CharlieJCJ Dec 4, 2024
d9b3cf4
typing imports
CharlieJCJ Dec 4, 2024
ec1a1e9
rm duplicating code
CharlieJCJ Dec 4, 2024
2ba4bdf
add key formatted instructions
CharlieJCJ Dec 4, 2024
232dab5
more specific try except
CharlieJCJ Dec 4, 2024
50ee9e7
avoid sequentially process the retried entries. do parallel async
CharlieJCJ Dec 4, 2024
cdfe4a2
black
CharlieJCJ Dec 4, 2024
5cf2c7f
black
CharlieJCJ Dec 4, 2024
2e0ac2c
remove the short async timeout
CharlieJCJ Dec 4, 2024
45 changes: 45 additions & 0 deletions examples/litellm_recipe_prompting.py
@@ -0,0 +1,45 @@
from typing import List
from pydantic import BaseModel, Field
from bespokelabs import curator
from datasets import Dataset


def main():
# List of cuisines to generate recipes for
cuisines = [
{"cuisine": cuisine}
for cuisine in [
"Chinese",
"Italian",
"Mexican",
"French",
"Japanese",
"Indian",
"Thai",
"Korean",
"Vietnamese",
"Brazilian",
]
]
cuisines = Dataset.from_list(cuisines)

# Create prompter using LiteLLM backend
recipe_prompter = curator.Prompter(
model_name="gemini/gemini-1.5-flash",
prompt_func=lambda row: f"Generate a random {row['cuisine']} recipe. Be creative but keep it realistic.",
parse_func=lambda row, response: {
"recipe": response,
"cuisine": row["cuisine"],
},
backend="litellm",
)

# Generate recipes for all cuisines
recipes = recipe_prompter(cuisines)

# Print results
print(recipes.to_pandas())


if __name__ == "__main__":
main()
59 changes: 59 additions & 0 deletions examples/litellm_recipe_structured_output.py
@@ -0,0 +1,59 @@
from typing import List
from pydantic import BaseModel, Field
from bespokelabs import curator
import logging

logger = logging.getLogger(__name__)


# Define response format using Pydantic
class Recipe(BaseModel):
title: str = Field(description="Title of the recipe")
ingredients: List[str] = Field(description="List of ingredients needed")
instructions: List[str] = Field(description="Step by step cooking instructions")
prep_time: int = Field(description="Preparation time in minutes")
cook_time: int = Field(description="Cooking time in minutes")
servings: int = Field(description="Number of servings")


class Cuisines(BaseModel):
cuisines_list: List[str] = Field(description="A list of cuisines.")


def main():
# We define a prompter that generates cuisines
cuisines_generator = curator.Prompter(
prompt_func=lambda: f"Generate 10 diverse cuisines.",
model_name="claude-3-5-haiku-20241022",
response_format=Cuisines,
parse_func=lambda _, cuisines: [{"cuisine": t} for t in cuisines.cuisines_list],
backend="litellm",
)
cuisines = cuisines_generator()
print(cuisines.to_pandas())

recipe_prompter = curator.Prompter(
model_name="gemini/gemini-1.5-flash",
prompt_func=lambda row: f"Generate a random {row['cuisine']} recipe. Be creative but keep it realistic.",
parse_func=lambda row, response: {
"title": response.title,
"ingredients": response.ingredients,
"instructions": response.instructions,
"prep_time": response.prep_time,
"cook_time": response.cook_time,
"servings": response.servings,
"cuisine": row["cuisine"],
},
response_format=Recipe,
backend="litellm",
)

# Generate recipes for all cuisines
recipes = recipe_prompter(cuisines)

# Print results
print(recipes.to_pandas())


if __name__ == "__main__":
main()
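Both examples above assume provider credentials are already exported; this PR does not add key handling itself. A hedged sketch of the usual setup (the variable names are the providers' and LiteLLM's conventional ones, not something defined in this PR):

import os

# gemini/gemini-1.5-flash via LiteLLM typically reads GEMINI_API_KEY;
# claude-3-5-haiku-20241022 typically reads ANTHROPIC_API_KEY.
os.environ.setdefault("GEMINI_API_KEY", "<your Google AI Studio key>")
os.environ.setdefault("ANTHROPIC_API_KEY", "<your Anthropic key>")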
536 changes: 272 additions & 264 deletions poetry.lock

Large diffs are not rendered by default.

98 changes: 81 additions & 17 deletions src/bespokelabs/curator/prompter/prompter.py
@@ -21,6 +21,9 @@
from bespokelabs.curator.request_processor.openai_online_request_processor import (
OpenAIOnlineRequestProcessor,
)
from bespokelabs.curator.request_processor.litellm_online_request_processor import (
LiteLLMOnlineRequestProcessor,
)

_CURATOR_DEFAULT_CACHE_DIR = "~/.cache/curator"
T = TypeVar("T")
@@ -31,6 +34,40 @@
class Prompter:
"""Interface for prompting LLMs."""

@staticmethod
def _determine_backend(
model_name: str, response_format: Optional[Type[BaseModel]] = None
) -> str:
"""Determine which backend to use based on model name and response format.

Args:
model_name (str): Name of the model
response_format (Optional[Type[BaseModel]]): Response format if specified

Returns:
str: Backend to use ("openai" or "litellm")
"""
model_name = model_name.lower()

# GPT-4o models with response format should use OpenAI
if (
response_format
and OpenAIOnlineRequestProcessor(model_name).check_structured_output_support()
):
logger.info(f"Requesting structured output from {model_name}, using OpenAI backend")
return "openai"

# GPT models and O1 models without response format should use OpenAI
if not response_format and any(x in model_name for x in ["gpt-", "o1-preview", "o1-mini"]):
logger.info(f"Requesting text output from {model_name}, using OpenAI backend")
return "openai"

# Default to LiteLLM for all other cases
logger.info(
f"Requesting {f'structured' if response_format else 'text'} output from {model_name}, using LiteLLM backend"
)
return "litellm"

def __init__(
self,
model_name: str,
@@ -45,6 +82,7 @@ def __init__(
]
] = None,
response_format: Optional[Type[BaseModel]] = None,
backend: Optional[str] = None,
batch: bool = False,
batch_size: Optional[int] = None,
temperature: Optional[float] = None,
@@ -62,6 +100,7 @@ def __init__(
response object and returns the parsed output
response_format (Optional[Type[BaseModel]]): A Pydantic model specifying the
response format from the LLM.
backend (Optional[str]): The backend to use ("openai" or "litellm"). If None, will be auto-determined
batch (bool): Whether to use batch processing
batch_size (Optional[int]): The size of the batch to use, only used if batch is True
temperature (Optional[float]): The temperature to use for the LLM, only used if batch is False
@@ -86,32 +125,56 @@ def __init__(
model_name, prompt_func, parse_func, response_format
)
self.batch_mode = batch
if batch:
if batch_size is None:
batch_size = 1_000
logger.info(
f"batch=True but no batch_size provided, using default batch_size of {batch_size:,}"
)
self._request_processor = OpenAIBatchRequestProcessor(
model=model_name,
batch_size=batch_size,
temperature=temperature,
top_p=top_p,
presence_penalty=presence_penalty,
frequency_penalty=frequency_penalty,
)

# Auto-determine backend if not specified
# Use provided backend or auto-determine based on model and format
if backend is not None:
self.backend = backend
else:
if batch_size is not None:
self.backend = self._determine_backend(model_name, response_format)

# Select request processor based on backend
if self.backend == "openai":
if batch:
if batch_size is None:
batch_size = 1_000
logger.info(
f"batch=True but no batch_size provided, using default batch_size of {batch_size:,}"
)
self._request_processor = OpenAIBatchRequestProcessor(
model=model_name,
batch_size=batch_size,
temperature=temperature,
top_p=top_p,
presence_penalty=presence_penalty,
frequency_penalty=frequency_penalty,
)
else:
if batch_size is not None:
logger.warning(
f"Prompter argument `batch_size` {batch_size} is ignored because `batch` is False"
)
self._request_processor = OpenAIOnlineRequestProcessor(
model=model_name,
temperature=temperature,
top_p=top_p,
presence_penalty=presence_penalty,
frequency_penalty=frequency_penalty,
)
elif self.backend == "litellm":
if batch:
logger.warning(
f"Prompter argument `batch_size` {batch_size} is ignored because `batch` is False"
"Batch mode is not supported with LiteLLM backend, ignoring batch=True"
)
self._request_processor = OpenAIOnlineRequestProcessor(
self._request_processor = LiteLLMOnlineRequestProcessor(
model=model_name,
temperature=temperature,
top_p=top_p,
presence_penalty=presence_penalty,
frequency_penalty=frequency_penalty,
)
else:
raise ValueError(f"Unknown backend: {self.backend}")

def __call__(self, dataset: Optional[Iterable] = None, working_dir: str = None) -> Dataset:
"""
@@ -176,6 +239,7 @@ def _completions(
else "text"
),
str(self.batch_mode),
str(self.backend),
]
)

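When no backend is passed, Prompter._determine_backend (shown above) picks one from the model name and response_format: models whose OpenAI processor reports structured-output support, and GPT/o1 model names without a response_format, go to the OpenAI backend; everything else falls through to LiteLLM. A hedged illustration under those rules (the model names are examples, and backend detection may probe the provider APIs, so the relevant API keys are assumed to be set):

from pydantic import BaseModel
from bespokelabs import curator

class Answer(BaseModel):
    text: str

# Structured output from an OpenAI model: check_structured_output_support()
# should succeed, so this resolves to the OpenAI backend.
openai_prompter = curator.Prompter(
    model_name="gpt-4o-mini",
    prompt_func=lambda: "Say hi.",
    parse_func=lambda _, answer: {"text": answer.text},
    response_format=Answer,
)

# A gemini/ model is not matched by either OpenAI rule, so it falls through
# to the new LiteLLM backend without needing backend="litellm".
litellm_prompter = curator.Prompter(
    model_name="gemini/gemini-1.5-flash",
    prompt_func=lambda: "Say hi.",
)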