Merge pull request #255 from bespokelabsai/mahesh/refactor_llm
Add a SimpleLLM interface, and update documentation.
madiator authored Dec 14, 2024
2 parents 092d2d2 + c14233d commit 3096a29
Showing 9 changed files with 155 additions and 122 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1,4 +1,5 @@
.venv
.DS_Store
__pycache__
.vscode

76 changes: 61 additions & 15 deletions README.md
@@ -29,7 +29,7 @@
</a>
</p>

### Overview
## Overview

Bespoke Curator makes it very easy to create high-quality synthetic data at scale, which you can use to finetune models or use for structured data extraction at scale.

@@ -38,56 +38,99 @@ Bespoke Curator is an open-source project:
* A Curator Viewer which makes it easy to view the datasets, thus aiding in the dataset creation.
* We will also be releasing high-quality datasets that should move the needle on post-training.

### Key Features
## Key Features

1. **Programmability and Structured Outputs**: Synthetic data generation is a lot more than just using a single prompt -- it involves calling LLMs multiple times and orchestrating control flow. Curator treats structured outputs as first-class citizens and helps you design complex pipelines (a short sketch follows this list).
2. **Built-in Performance Optimization**: We often see users calling LLMs in loops or implementing multi-threading inefficiently. We have baked in performance optimizations so that you don't need to worry about those!
3. **Intelligent Caching and Fault Recovery**: Since LLM calls can add up in cost and time, failures are undesirable but sometimes unavoidable. We cache LLM requests and responses so that it is easy to recover from a failure. Moreover, when working on a multi-stage pipeline, per-stage caching makes it easy to iterate.
4. **Native HuggingFace Dataset Integration**: Work directly on HuggingFace Dataset objects throughout your pipeline. Your synthetic data is immediately ready for fine-tuning!
5. **Interactive Curator Viewer**: Improve and iterate on your prompts using our built-in viewer. Inspect LLM requests and responses in real time, and refine your data generation strategy with immediate feedback.
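
To make points 1 and 3 concrete, here is a minimal sketch of what a two-stage pipeline can look like. The prompts, class names, and topic count here are illustrative (loosely modeled on the poem example later in this README); because requests and responses are cached, re-running the script reuses completed stages:

```python
from typing import List
from pydantic import BaseModel, Field
from bespokelabs import curator

class Topics(BaseModel):
    topics_list: List[str] = Field(description="A list of topics.")

# Stage 1: generate a small dataset of topics as structured output.
topic_generator = curator.LLM(
    prompt_func=lambda: "Generate 5 diverse topics suitable for short poems.",
    model_name="gpt-4o-mini",
    response_format=Topics,
    parse_func=lambda _, topics: [{"topic": t} for t in topics.topics_list],
)

# Stage 2: write a poem for each topic produced by stage 1.
poet = curator.LLM(
    prompt_func=lambda row: f"Write a poem about {row['topic']}.",
    model_name="gpt-4o-mini",
)

topics = topic_generator()  # cached after the first run
poems = poet(topics)        # consumes stage 1's output dataset
print(poems.to_pandas())
```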

### Installation
## Installation

```bash
pip install bespokelabs-curator
```

### Usage
## Usage
To run the examples below, make sure to set your OpenAI API key in
the environment variable `OPENAI_API_KEY` by running `export OPENAI_API_KEY=sk-...` in your terminal.

### Hello World with `SimpleLLM`: A simple interface for calling LLMs

```python
from bespokelabs import curator
llm = curator.SimpleLLM(model_name="gpt-4o-mini")
poem = llm("Write a poem about the importance of data in AI.")
print(poem)
# Or you can pass a list of prompts to generate multiple responses.
poems = llm(["Write a poem about the importance of data in AI.",
"Write a haiku about the importance of data in AI."])
print(poems)
```
Note that retries and caching are enabled by default, so if you run the same prompt again, you will get the same response almost instantly.
You can delete the cache at `~/.cache/curator`.
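For example, to start from a clean slate you can remove the cache directory:

```bash
rm -rf ~/.cache/curator
```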

#### Use the LiteLLM backend for calling other models
You can use the [LiteLLM](https://docs.litellm.ai/docs/providers) backend to call models from other providers.

```python
from bespokelabs import curator
llm = curator.SimpleLLM(model_name="claude-3-5-sonnet-20240620", backend="litellm")
poem = llm("Write a poem about the importance of data in AI.")
print(poem)
```
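
As with the OpenAI examples, the provider's API key needs to be available in your environment; for Anthropic models served via LiteLLM this is typically the `ANTHROPIC_API_KEY` variable (see the LiteLLM provider docs for other providers):

```bash
export ANTHROPIC_API_KEY=sk-ant-...
```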

### Visualize in Curator Viewer
Run `curator-viewer` on the command line to see the dataset in the viewer.

You can click on a run and then click on a specific row to see the LLM request and response.
![Curator Responses](docs/curator-responses.png)
More examples below.

### `LLM`: A more powerful interface for synthetic data generation

Let's use structured outputs to generate poems.
```python
from bespokelabs import curator
from datasets import Dataset
from pydantic import BaseModel, Field
from typing import List

# Create a dataset object for the topics you want to create poems about.
topics = Dataset.from_dict({"topic": [
    "Urban loneliness in a bustling city",
    "Beauty of Bespoke Labs's Curator library"
]})
```

# Define a class to encapsulate a list of poems.
Define a class to encapsulate a list of poems.
```python
class Poem(BaseModel):
    poem: str = Field(description="A poem.")

class Poems(BaseModel):
    poems_list: List[Poem] = Field(description="A list of poems.")
```


# We define an `LLM` object that generates poems which gets applied to the topics dataset.
We define an `LLM` object that generates poems which gets applied to the topics dataset.
```python
poet = curator.LLM(
    # `prompt_func` takes a row of the dataset as input.
    # `row` is a dictionary with a single key 'topic' in this case.
    prompt_func=lambda row: f"Write two poems about {row['topic']}.",
    model_name="gpt-4o-mini",
    response_format=Poems,
    # `row` is the input row, and `poems` is the `Poems` class which
    # is parsed from the structured output from the LLM.
    parse_func=lambda row, poems: [
        {"topic": row["topic"], "poem": p.poem} for p in poems.poems_list
    ],
)
```
Here:
* `prompt_func` takes a row of the dataset as input and returns the prompt for the LLM.
* `response_format` is the structured output class we defined above.
* `parse_func` takes the input (`row`) and the structured output (`poems`) and converts them to a list of dictionaries, so that the output can easily be converted to a HuggingFace Dataset object.
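
The lambdas above are just for brevity; `prompt_func` and `parse_func` can be ordinary named functions, which is often clearer once the parsing logic grows. A small sketch of the same poet written that way (function names here are illustrative):

```python
def poem_prompt(row):
    # `row` is a dictionary with a single key 'topic'.
    return f"Write two poems about {row['topic']}."

def poem_parse(row, poems):
    # Flatten the structured `Poems` output into one dictionary per poem.
    return [{"topic": row["topic"], "poem": p.poem} for p in poems.poems_list]

poet = curator.LLM(
    prompt_func=poem_prompt,
    model_name="gpt-4o-mini",
    response_format=Poems,
    parse_func=poem_parse,
)
```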

Now we can apply the `LLM` object to the dataset, which reads like idiomatic Python.
```python
poem = poet(topics)
print(poem.to_pandas())
# Example output:
@@ -102,9 +145,6 @@ and we can scale this up to create tens of thousands of diverse poems.
You can see a more detailed example in the [examples/poem.py](https://github.com/bespokelabsai/curator/blob/mahesh/update_doc/examples/poem.py) file,
and other examples in the [examples](https://github.com/bespokelabsai/curator/blob/mahesh/update_doc/examples) directory.

To run the examples, make sure to set your OpenAI API key in
the environment variable `OPENAI_API_KEY` by running `export OPENAI_API_KEY=sk-...` in your terminal.

See the [docs](https://docs.bespokelabs.ai/) for more details as well as
for troubleshooting information.

@@ -118,6 +158,12 @@ curator-viewer

This will pop up a browser window with the viewer running on `127.0.0.1:3000` by default if you haven't specified a different host and port.

The dataset viewer shows all the different runs you have made.
![Curator Runs](docs/curator-runs.png)

You can also see the dataset and the responses from the LLM.
![Curator Dataset](docs/curator-dataset.png)


Optional parameters to run the viewer on a different host and port:
```bash
Binary file added docs/curator-dataset.png
Binary file added docs/curator-responses.png
Binary file added docs/curator-runs.png
25 changes: 25 additions & 0 deletions examples/simple_poem.py
@@ -0,0 +1,25 @@
"""Curator example that uses `SimpleLLM` to generate poems.
Please see the poem.py for more complex use cases.
"""

from bespokelabs import curator

# Use GPT-4o-mini for this example.
llm = curator.SimpleLLM(model_name="gpt-4o-mini")
poem = llm("Write a poem about the importance of data in AI.")
print(poem)

# Use Claude 3.5 Sonnet for this example.
llm = curator.SimpleLLM(model_name="claude-3-5-sonnet-20240620", backend="litellm")
poem = llm("Write a poem about the importance of data in AI.")
print(poem)

# Note that we can also pass a list of prompts to generate multiple responses.
poems = llm(
    [
        "Write a sonnet about the importance of data in AI.",
        "Write a haiku about the importance of data in AI.",
    ]
)
print(poems)
1 change: 1 addition & 0 deletions src/bespokelabs/curator/__init__.py
@@ -1,2 +1,3 @@
from .dataset import Dataset
from .llm.llm import LLM
from .llm.simple_llm import SimpleLLM
141 changes: 34 additions & 107 deletions src/bespokelabs/curator/llm/llm.py
@@ -37,113 +37,6 @@
class LLM:
"""Interface for prompting LLMs."""

    def __init__(
        self,
        model_name: str,
        prompt_func: Callable[[Union[Dict[str, Any], BaseModel]], Dict[str, str]],
        parse_func: Optional[
            Callable[
                [
                    _DictOrBaseModel,
                    _DictOrBaseModel,
                ],
                T,
            ]
        ] = None,
        response_format: Optional[Type[BaseModel]] = None,
        backend: Optional[str] = None,
        max_requests_per_minute: Optional[int] = None,
        max_tokens_per_minute: Optional[int] = None,
        temperature: Optional[float] = None,
        top_p: Optional[float] = None,
        presence_penalty: Optional[float] = None,
        frequency_penalty: Optional[float] = None,
        max_retries: Optional[int] = None,
        require_all_responses: Optional[bool] = None,
    ):
        """Initialize a LLM.
        Args:
            model_name: The name of the LLM to use
            prompt_func: A function that takes a single row
                and returns either a string (assumed to be a user prompt) or messages list
            parse_func: A function that takes the input row and
                response object and returns the parsed output
            response_format: A Pydantic model specifying the
                response format from the LLM.
            backend: The backend to use ("openai" or "litellm"). If None, will be auto-determined
            max_requests_per_minute: Maximum requests per minute (not supported in batch mode)
            max_tokens_per_minute: Maximum tokens per minute (not supported in batch mode)
            temperature: The temperature to use for the LLM
            top_p: The top_p to use for the LLM
            presence_penalty: The presence_penalty to use for the LLM
            frequency_penalty: The frequency_penalty to use for the LLM
            max_retries: The maximum number of retries to use for the LLM. If 0, will only try a request once.
            require_all_responses: Whether to require all responses
        """
        self.prompt_formatter = PromptFormatter(
            model_name, prompt_func, parse_func, response_format
        )

        # Initialize context manager state
        self._batch_config = None
        self._original_request_processor = None

        # Store model parameters
        self.temperature = temperature
        self.top_p = top_p
        self.presence_penalty = presence_penalty
        self.frequency_penalty = frequency_penalty
        self.model_name = model_name

        # Auto-determine backend if not specified
        if backend is not None:
            self.backend = backend
        else:
            self.backend = self._determine_backend(model_name, response_format)

        # Initialize request processor
        self._setup_request_processor(
            max_requests_per_minute=max_requests_per_minute,
            max_tokens_per_minute=max_tokens_per_minute,
            max_retries=max_retries,
            require_all_responses=require_all_responses,
        )

    @staticmethod
    def _determine_backend(
        model_name: str, response_format: Optional[Type[BaseModel]] = None
    ) -> str:
        """Determine which backend to use based on model name and response format.
        Args:
            model_name (str): Name of the model
            response_format (Optional[Type[BaseModel]]): Response format if specified
        Returns:
            str: Backend to use ("openai" or "litellm")
        """
        model_name = model_name.lower()

        # GPT-4o models with response format should use OpenAI
        if (
            response_format
            and OpenAIOnlineRequestProcessor(model_name).check_structured_output_support()
        ):
            logger.info(f"Requesting structured output from {model_name}, using OpenAI backend")
            return "openai"

        # GPT models and O1 models without response format should use OpenAI
        if not response_format and any(x in model_name for x in ["gpt-", "o1-preview", "o1-mini"]):
            logger.info(f"Requesting text output from {model_name}, using OpenAI backend")
            return "openai"

        # Default to LiteLLM for all other cases
        logger.info(
            f"Requesting {f'structured' if response_format else 'text'} output from {model_name}, using LiteLLM backend"
        )
        return "litellm"

    def __init__(
        self,
        model_name: str,
@@ -262,6 +155,40 @@ def __init__(
        else:
            raise ValueError(f"Unknown backend: {self.backend}")

    @staticmethod
    def _determine_backend(
        model_name: str, response_format: Optional[Type[BaseModel]] = None
    ) -> str:
        """Determine which backend to use based on model name and response format.
        Args:
            model_name (str): Name of the model
            response_format (Optional[Type[BaseModel]]): Response format if specified
        Returns:
            str: Backend to use ("openai" or "litellm")
        """
        model_name = model_name.lower()

        # GPT-4o models with response format should use OpenAI
        if (
            response_format
            and OpenAIOnlineRequestProcessor(model_name).check_structured_output_support()
        ):
            logger.info(f"Requesting structured output from {model_name}, using OpenAI backend")
            return "openai"

        # GPT models and O1 models without response format should use OpenAI
        if not response_format and any(x in model_name for x in ["gpt-", "o1-preview", "o1-mini"]):
            logger.info(f"Requesting text output from {model_name}, using OpenAI backend")
            return "openai"

        # Default to LiteLLM for all other cases
        logger.info(
            f"Requesting {f'structured' if response_format else 'text'} output from {model_name}, using LiteLLM backend"
        )
        return "litellm"

    def __call__(
        self,
        dataset: Optional[Iterable] = None,
33 changes: 33 additions & 0 deletions src/bespokelabs/curator/llm/simple_llm.py
@@ -0,0 +1,33 @@
from bespokelabs.curator.llm.llm import LLM
from datasets import Dataset
from typing import Union, List


class SimpleLLM:
    """A simpler interface for the LLM class.
    Usage:
        llm = SimpleLLM(model_name="gpt-4o-mini")
        llm("Do you know about the bitter lesson?")
        llm(["What is the capital of France?", "What is the capital of Germany?"])
    For more complex use cases (e.g. structured outputs and custom prompt functions), see the LLM class.
    """

    def __init__(self, model_name: str, backend: str = "openai"):
        self._model_name = model_name
        self._backend = backend

    def __call__(self, prompt: Union[str, List[str]]) -> Union[str, List[str]]:
        prompt_list = [prompt] if isinstance(prompt, str) else prompt
        dataset: Dataset = Dataset.from_dict({"prompt": prompt_list})

        llm = LLM(
            prompt_func=lambda row: row["prompt"],
            model_name=self._model_name,
            response_format=None,
            backend=self._backend,
        )
        response = llm(dataset)
        if isinstance(prompt, str):
            return response["response"][0]
        return response["response"]
