[NA] [DOCS] Improve filter_string parameter documentation (#3217)

Lothiraldan · web-flow · commit 214ea21d9d7d · 2025-09-07T15:45:29.000+01:00
diff --git a/apps/opik-documentation/documentation/fern/docs/evaluation/evaluate_threads.mdx b/apps/opik-documentation/documentation/fern/docs/evaluation/evaluate_threads.mdx
@@ -1,6 +1,7 @@
 ---
 subtitle: Step-by-step guide on how to evaluate conversation threads
 ---
+
 When you are running multi-turn conversations using frameworks that support LLM agents, the Opik integration will
 automatically group related traces into conversation threads using parameters suitable for each framework.
 
@@ -57,20 +58,39 @@ filter_string='id = "0197ad2a" AND status = "inactive"'
 
 **Supported filter fields and operators**
 
-The `evaluate_threads` function supports the following filter fields in the `filter_string` and
-operators to be applied to the corresponding fields:
-
-| Field              | Type     | Operators                     |
-|--------------------|----------|-------------------------------|
-| id                 | string   | ``=, contains, not_contains`` |
-| status             | string   | ``=, contains, not_contains`` |
-| start_time         | datetime | ``=, >, <, >=, <=``           |
-| end_time           | datetime | ``=, >, <, >=, <=``           |
-| feedback_scores    | dict     | ``=, >, <, >=, <=``           |
-| tags               | list     | ``contains``                  |
-| duration           | number   | ``=, >, <, >=, <=``           |
-| number_of_messages | number   | ``=, >, <, >=, <=``           |
-| created_by         | string   | ``=, contains, not_contains`` |
+The `filter_string` passed to `evaluate_threads` is written in Opik Query Language (OQL) and supports the fields and operators listed below.
+These are the same fields and operators supported by `search_traces` and `search_spans`:
+
+| Field                     | Type       | Operators                                                                   |
+| ------------------------- | ---------- | --------------------------------------------------------------------------- |
+| `id`                      | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `name`                    | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `created_by`              | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `thread_id`               | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `type`                    | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `model`                   | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `provider`                | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `status`                  | String     | `=`, `contains`, `not_contains`                                             |
+| `start_time`              | DateTime   | `=`, `>`, `<`, `>=`, `<=`                                                   |
+| `end_time`                | DateTime   | `=`, `>`, `<`, `>=`, `<=`                                                   |
+| `input`                   | String     | `=`, `contains`, `not_contains`                                             |
+| `output`                  | String     | `=`, `contains`, `not_contains`                                             |
+| `metadata`                | Dictionary | `=`, `contains`, `>`, `<`                                                   |
+| `feedback_scores`         | Numeric    | `=`, `>`, `<`, `>=`, `<=`                                                   |
+| `tags`                    | List       | `contains`                                                                  |
+| `usage.total_tokens`      | Numeric    | `=`, `!=`, `>`, `<`, `>=`, `<=`                                             |
+| `usage.prompt_tokens`     | Numeric    | `=`, `!=`, `>`, `<`, `>=`, `<=`                                             |
+| `usage.completion_tokens` | Numeric    | `=`, `!=`, `>`, `<`, `>=`, `<=`                                             |
+| `duration`                | Numeric    | `=`, `!=`, `>`, `<`, `>=`, `<=`                                             |
+| `number_of_messages`      | Numeric    | `=`, `!=`, `>`, `<`, `>=`, `<=`                                             |
+| `total_estimated_cost`    | Numeric    | `=`, `!=`, `>`, `<`, `>=`, `<=`                                             |
+
+**Rules:**
+
+- String values must be wrapped in double quotes.
+- DateTime fields require ISO 8601 format (e.g., `"2024-01-01T00:00:00Z"`).
+- Use dot notation to address a key inside a nested field: `metadata.model`, `feedback_scores.accuracy`.
+- Multiple conditions can be combined with `AND`; `OR` is not supported.
 
 The `feedback_scores` field is a dictionary where the keys are the metric names and the values are the metric values.
 You can use it to filter threads based on their feedback scores. For example, if you want to evaluate only threads
@@ -95,9 +115,10 @@ Once the evaluation is complete, you can access the evaluation results in the Op
   Threads are automatically marked as inactive after the timeout period and you can also manually mark a thread as inactive via UI or via SDK.
 
   You can only evaluate/score threads that are inactive.
+
 </Note>
 
 ## Next steps
 
 For more details on what metrics can be used to score conversational threads, refer to
-the [conversational metrics](/evaluation/metrics/conversation_threads_metrics) page.
+the [conversational metrics](/evaluation/metrics/conversation_threads_metrics) page.
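
Note: the rules added to `evaluate_threads.mdx` (double-quoted strings, ISO 8601 datetimes, dot notation, `AND` only) can be sketched as a small string builder. This is a minimal illustration, not part of the Opik SDK: `oql_literal` and `build_filter` are hypothetical names, and because the docs do not describe how to escape a double quote inside a value, the sketch refuses such values rather than guessing.

```python
from datetime import datetime, timezone


def oql_literal(value):
    """Render a Python value as an OQL literal (hypothetical helper, not an Opik API)."""
    if isinstance(value, bool):
        raise TypeError("booleans are not covered by the documented rules")
    if isinstance(value, datetime):
        if value.tzinfo is None:
            raise ValueError("use a timezone-aware datetime")
        # DateTime fields require ISO 8601, e.g. "2024-01-01T00:00:00Z".
        stamp = value.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        return f'"{stamp}"'
    if isinstance(value, (int, float)):
        # Numeric values are written without quotes.
        return str(value)
    text = str(value)
    if '"' in text:
        # Assumption: escaping is undocumented, so refuse rather than guess.
        raise ValueError("embedded double quotes are not handled by this sketch")
    # String values must be wrapped in double quotes.
    return f'"{text}"'


def build_filter(*conditions):
    """Join (column, operator, value) triples with AND, the only documented connective."""
    return " AND ".join(
        f"{column} {operator} {oql_literal(value)}"
        for column, operator, value in conditions
    )


thread_filter = build_filter(
    ("status", "=", "inactive"),
    ("feedback_scores.accuracy", ">", 0.8),
    ("start_time", ">=", datetime(2024, 1, 1, tzinfo=timezone.utc)),
)
print(thread_filter)
```

The resulting string can then be passed as the `filter_string` argument described in the docs above.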
diff --git a/apps/opik-documentation/documentation/fern/docs/prompt_engineering/prompt_management.mdx b/apps/opik-documentation/documentation/fern/docs/prompt_engineering/prompt_management.mdx
@@ -171,6 +171,25 @@ for prompt in filtered:
     print(prompt.name, prompt.commit, prompt.prompt)
 ```
 
+The `filter_string` parameter uses Opik Query Language (OQL) with the format:
+`"<COLUMN> <OPERATOR> <VALUE> [AND <COLUMN> <OPERATOR> <VALUE>]*"`
+
+**Supported columns for prompts:**
+
+| Column       | Type   | Operators                                                                   |
+| ------------ | ------ | --------------------------------------------------------------------------- |
+| `id`         | String | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `name`       | String | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `created_by` | String | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `tags`       | List   | `contains`                                                                  |
+
+**Examples:**
+
+- `tags contains "production"` - Filter by tag
+- `name contains "summary"` - Filter by name substring
+- `created_by = "user@example.com"` - Filter by creator
+- `tags contains "alpha" AND tags contains "beta"` - Multiple tag filtering
+
 `search_prompts` returns the **latest** version for each matching prompt.
 
 ## Linking prompts to Experiments
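
Note: the prompt column table added above lends itself to a client-side check before a filter is sent. The following is a rough sketch that mirrors the table only; `PROMPT_COLUMN_OPS` and `check_prompt_condition` are hypothetical names, and the backend remains the authority on what is accepted.

```python
# Operators listed for String columns in the prompts table.
STRING_OPS = frozenset(
    {"=", "!=", "contains", "not_contains", "starts_with", "ends_with", ">", "<"}
)

# Column -> allowed operators, copied from the documented table.
PROMPT_COLUMN_OPS = {
    "id": STRING_OPS,
    "name": STRING_OPS,
    "created_by": STRING_OPS,
    "tags": frozenset({"contains"}),
}


def check_prompt_condition(column, operator):
    """Return True if the pair appears in the table, otherwise raise ValueError."""
    allowed = PROMPT_COLUMN_OPS.get(column)
    if allowed is None:
        raise ValueError(f"unsupported column: {column!r}")
    if operator not in allowed:
        raise ValueError(f"operator {operator!r} is not listed for column {column!r}")
    return True


check_prompt_condition("name", "starts_with")
check_prompt_condition("tags", "contains")
```

Failing early on something like `tags = "alpha"` gives a clearer error than a rejected request, at the cost of keeping the mapping in sync with the docs.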
diff --git a/apps/opik-documentation/documentation/fern/docs/tracing/export_data.mdx b/apps/opik-documentation/documentation/fern/docs/tracing/export_data.mdx
@@ -54,36 +54,44 @@ traces = client.search_traces(
 traces = [trace.dict() for trace in traces]
 ```
 
-The `filter_string` parameter should be a string in the following format:
+The `filter_string` parameter is an Opik Query Language (OQL) string in the following format:
 
 ```
-"<COLUMN> <OPERATOR> <VALUE> [and <COLUMN> <OPERATOR> <VALUE>]*"
+"<COLUMN> <OPERATOR> <VALUE> [AND <COLUMN> <OPERATOR> <VALUE>]*"
 ```
 
-where:
-
-1. `<COLUMN>`: The column name to filter on, these can be:
-   - `name`
-   - `input`
-   - `output`
-   - `start_time`
-   - `end_time`
-   - `metadata`
-   - `feedback_scores`
-   - `tags`
-   - `usage.total_tokens`
-   - `usage.prompt_tokens`
-   - `usage.completion_tokens`.
-2. `<OPERATOR>`: The operator to use for the filter, this can be `=`, `!=`, `>`, `>=`, `<`, `<=`, `contains`, `not_contains`. Not that not all operators are supported for all columns.
-3. `<VALUE>`: The value to use in the comparison to `<COLUMN>`. If the value is a string, you will need to wrap it in double quotes.
-
-You can add as many `and` clauses as required.
-
-If a `<COLUMN>` item refers to a nested object, then you can use the
-dot notation to access contained values by using its key. For example,
-you could use:
-
-`"feedback_scores.accuracy > 0.5"`
+**Supported columns and operators:**
+
+| Column                    | Type       | Operators                                                                   |
+| ------------------------- | ---------- | --------------------------------------------------------------------------- |
+| `id`                      | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `name`                    | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `created_by`              | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `thread_id`               | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `type`                    | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `model`                   | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `provider`                | String     | `=`, `!=`, `contains`, `not_contains`, `starts_with`, `ends_with`, `>`, `<` |
+| `status`                  | String     | `=`, `contains`, `not_contains`                                             |
+| `start_time`              | DateTime   | `=`, `>`, `<`, `>=`, `<=`                                                   |
+| `end_time`                | DateTime   | `=`, `>`, `<`, `>=`, `<=`                                                   |
+| `input`                   | String     | `=`, `contains`, `not_contains`                                             |
+| `output`                  | String     | `=`, `contains`, `not_contains`                                             |
+| `metadata`                | Dictionary | `=`, `contains`, `>`, `<`                                                   |
+| `feedback_scores`         | Numeric    | `=`, `>`, `<`, `>=`, `<=`                                                   |
+| `tags`                    | List       | `contains`                                                                  |
+| `usage.total_tokens`      | Numeric    | `=`, `!=`, `>`, `<`, `>=`, `<=`                                             |
+| `usage.prompt_tokens`     | Numeric    | `=`, `!=`, `>`, `<`, `>=`, `<=`                                             |
+| `usage.completion_tokens` | Numeric    | `=`, `!=`, `>`, `<`, `>=`, `<=`                                             |
+| `duration`                | Numeric    | `=`, `!=`, `>`, `<`, `>=`, `<=`                                             |
+| `number_of_messages`      | Numeric    | `=`, `!=`, `>`, `<`, `>=`, `<=`                                             |
+| `total_estimated_cost`    | Numeric    | `=`, `!=`, `>`, `<`, `>=`, `<=`                                             |
+
+**Rules:**
+
+- String values must be wrapped in double quotes.
+- Multiple conditions can be combined with `AND`; `OR` is not supported.
+- DateTime fields require ISO 8601 format (e.g., `"2024-01-01T00:00:00Z"`).
+- Use dot notation to address a key inside a nested field: `metadata.model`, `feedback_scores.accuracy`.
 
 Here are some full examples of using `filter_string` values in searches:
 
diff --git a/sdks/python/src/opik/api_objects/opik_client.py b/sdks/python/src/opik/api_objects/opik_client.py
@@ -977,7 +977,41 @@ def search_traces(
 
         Args:
             project_name: The name of the project to search traces in. If not provided, will search across the project name configured when the Client was created which defaults to the `Default Project`.
-            filter_string: A filter string to narrow down the search. If not provided, all traces in the project will be returned up to the limit.
+            filter_string: A filter string to narrow down the search using Opik Query Language (OQL).
+                The format is: "<COLUMN> <OPERATOR> <VALUE> [AND <COLUMN> <OPERATOR> <VALUE>]*"
+
+                Supported columns include:
+                - `id`, `name`, `created_by`, `thread_id`, `type`, `model`, `provider`: String fields with full operator support
+                - `status`: String field (=, contains, not_contains only)
+                - `start_time`, `end_time`: DateTime fields (use ISO 8601 format, e.g., "2024-01-01T00:00:00Z")
+                - `input`, `output`: String fields for content (=, contains, not_contains only)
+                - `metadata`: Dictionary field (use dot notation, e.g., "metadata.model")
+                - `feedback_scores`: Numeric field (use dot notation, e.g., "feedback_scores.accuracy")
+                - `tags`: List field (use "contains" operator only)
+                - `usage.total_tokens`, `usage.prompt_tokens`, `usage.completion_tokens`: Numeric usage fields
+                - `duration`, `number_of_messages`, `total_estimated_cost`: Numeric fields
+
+                Supported operators by column:
+                - `id`, `name`, `created_by`, `thread_id`, `type`, `model`, `provider`: =, !=, contains, not_contains, starts_with, ends_with, >, <
+                - `status`: =, contains, not_contains
+                - `start_time`, `end_time`: =, >, <, >=, <=
+                - `input`, `output`: =, contains, not_contains
+                - `metadata`: =, contains, >, <
+                - `feedback_scores`: =, >, <, >=, <=
+                - `tags`: contains (only)
+                - `usage.total_tokens`, `usage.prompt_tokens`, `usage.completion_tokens`, `duration`, `number_of_messages`, `total_estimated_cost`: =, !=, >, <, >=, <=
+
+                Examples:
+                - `start_time >= "2024-01-01T00:00:00Z"` - Filter by start date
+                - `start_time > "2024-01-01T00:00:00Z" AND start_time < "2024-02-01T00:00:00Z"` - Date range
+                - `input contains "question"` - Filter by input content
+                - `usage.total_tokens > 1000` - Filter by token usage
+                - `feedback_scores.accuracy > 0.8` - Filter by feedback score
+                - `tags contains "production"` - Filter by tag
+                - `metadata.model = "gpt-4"` - Filter by metadata field
+                - `thread_id = "thread_123"` - Filter by thread ID
+
+                If not provided, all traces in the project will be returned up to the limit.
             max_results: The maximum number of traces to return.
             truncate: Whether to truncate image data stored in input, output, or metadata
         """
@@ -1015,7 +1049,41 @@ def search_spans(
         Args:
             project_name: The name of the project to search spans in. If not provided, will search across the project name configured when the Client was created which defaults to the `Default Project`.
             trace_id: The ID of the trace to search spans in. If provided, the search will be limited to the spans in the given trace.
-            filter_string: A filter string to narrow down the search.
+            filter_string: A filter string to narrow down the search using Opik Query Language (OQL).
+                The format is: "<COLUMN> <OPERATOR> <VALUE> [AND <COLUMN> <OPERATOR> <VALUE>]*"
+
+                Supported columns include:
+                - `id`, `name`, `created_by`, `thread_id`, `type`, `model`, `provider`: String fields with full operator support
+                - `status`: String field (=, contains, not_contains only)
+                - `start_time`, `end_time`: DateTime fields (use ISO 8601 format, e.g., "2024-01-01T00:00:00Z")
+                - `input`, `output`: String fields for content (=, contains, not_contains only)
+                - `metadata`: Dictionary field (use dot notation, e.g., "metadata.model")
+                - `feedback_scores`: Numeric field (use dot notation, e.g., "feedback_scores.accuracy")
+                - `tags`: List field (use "contains" operator only)
+                - `usage.total_tokens`, `usage.prompt_tokens`, `usage.completion_tokens`: Numeric usage fields
+                - `duration`, `number_of_messages`, `total_estimated_cost`: Numeric fields
+
+                Supported operators by column:
+                - `id`, `name`, `created_by`, `thread_id`, `type`, `model`, `provider`: =, !=, contains, not_contains, starts_with, ends_with, >, <
+                - `status`: =, contains, not_contains
+                - `start_time`, `end_time`: =, >, <, >=, <=
+                - `input`, `output`: =, contains, not_contains
+                - `metadata`: =, contains, >, <
+                - `feedback_scores`: =, >, <, >=, <=
+                - `tags`: contains (only)
+                - `usage.total_tokens`, `usage.prompt_tokens`, `usage.completion_tokens`, `duration`, `number_of_messages`, `total_estimated_cost`: =, !=, >, <, >=, <=
+
+                Examples:
+                - `start_time >= "2024-01-01T00:00:00Z"` - Filter by start date
+                - `start_time > "2024-01-01T00:00:00Z" AND start_time < "2024-02-01T00:00:00Z"` - Date range
+                - `input contains "question"` - Filter by input content
+                - `usage.total_tokens > 1000` - Filter by token usage
+                - `feedback_scores.accuracy > 0.8` - Filter by feedback score
+                - `tags contains "production"` - Filter by tag
+                - `metadata.model = "gpt-4"` - Filter by metadata field
+                - `thread_id = "thread_123"` - Filter by thread ID
+
+                If not provided, all spans in the project/trace will be returned up to the limit.
             max_results: The maximum number of spans to return.
             truncate: Whether to truncate image data stored in input, output, or metadata
         """
@@ -1211,13 +1279,36 @@ def get_all_prompts(self, name: str) -> List[Prompt]:
         )
         return self.get_prompt_history(name)
 
-    def search_prompts(self, filter_string: Optional[str] = None) -> List[Prompt]:
+    def search_prompts(
+        self, name: Optional[str] = None, filter_string: Optional[str] = None
+    ) -> List[Prompt]:
         """
         Retrieve the latest prompt versions for the given search parameters.
 
         Parameters:
-            filter_string: A filter string using Opik Query Language. It will be parsed and
-                converted into a stringified list of filters expected by the backend.
+            name: A substring of the prompt name to search for. If you know the exact name, use `get_prompt` instead, since the name is a unique identifier.
+            filter_string: A filter string to narrow down the search using Opik Query Language (OQL).
+                The format is: "<COLUMN> <OPERATOR> <VALUE> [AND <COLUMN> <OPERATOR> <VALUE>]*"
+
+                Supported columns include:
+                - `id`, `name`: String fields
+                - `tags`: List field (use "contains" operator only)
+                - `created_by`: String field
+
+                Supported operators by column:
+                - `id`: =, !=, contains, not_contains, starts_with, ends_with, >, <
+                - `name`: =, !=, contains, not_contains, starts_with, ends_with, >, <
+                - `created_by`: =, !=, contains, not_contains, starts_with, ends_with, >, <
+                - `tags`: contains (only)
+
+                Examples:
+                - `tags contains "alpha"` - Filter by tag
+                - `tags contains "alpha" AND tags contains "beta"` - Filter by multiple tags
+                - `name contains "summary"` - Filter by name substring
+                - `created_by = "user@example.com"` - Filter by creator
+                - `id starts_with "prompt_"` - Filter by ID prefix
+
+                If not provided, all prompts matching the name filter will be returned.
 
         Returns:
             List[Prompt]: A list of Prompt instances found.
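
Note: the new `search_prompts(name=..., filter_string=...)` signature combines a name substring with an OQL filter. The sketch below shows one way a caller might use it. `find_prompts_by_tags` is a hypothetical wrapper, and `RecordingClient` is a stand-in so the example runs without a backend; only the `search_prompts` keyword arguments come from this diff.

```python
from typing import Any, List, Optional, Sequence


def find_prompts_by_tags(
    client: Any, tags: Sequence[str], name_part: Optional[str] = None
) -> List[Any]:
    """Hypothetical wrapper: require every tag, optionally narrow by name substring."""
    # One `tags contains "..."` clause per tag, joined with AND.
    # With no tags, pass None so that only the name filter applies.
    filter_string = " AND ".join(f'tags contains "{tag}"' for tag in tags) or None
    return client.search_prompts(name=name_part, filter_string=filter_string)


class RecordingClient:
    """Stand-in for the SDK client; it records the arguments it receives."""

    def __init__(self) -> None:
        self.calls: List[dict] = []

    def search_prompts(self, name=None, filter_string=None):
        self.calls.append({"name": name, "filter_string": filter_string})
        return []


stub = RecordingClient()
find_prompts_by_tags(stub, ["alpha", "beta"], name_part="summary")
print(stub.calls[0]["filter_string"])
```

With a real client in place of the stub, the call would return the latest version of each matching prompt, as the docstring states.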
diff --git a/sdks/python/src/opik/api_objects/threads/threads_client.py b/sdks/python/src/opik/api_objects/threads/threads_client.py
diff --git a/sdks/python/src/opik/evaluation/threads/evaluator.py b/sdks/python/src/opik/evaluation/threads/evaluator.py