Feature: tlm batch api #149
Conversation
```
@@ -46,21 +63,147 @@ def __init__(self, api_key: str, quality_preset: QualityPreset) -> None:
        self._quality_preset = quality_preset

        self._event_loop = asyncio.get_event_loop()
        self._query_semaphore = asyncio.Semaphore(max_concurrent_requests)
```
Do we want to limit concurrency with the asyncio semaphore? If I set it to a high value, e.g. 1000, I get the generic API error. Note: it doesn't hit the RateLimitError, which seems to be the expected one based on the error message suggesting to lower max_concurrent_requests. I'm not sure how likely it is that the typical user will set this value rather than just use the default.
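For reference, a minimal sketch of how an `asyncio.Semaphore` caps the number of in-flight requests at `max_concurrent_requests`; the `_make_request` helper and its sleep-based work are placeholders for illustration, not the actual client code:

```python
import asyncio

async def _make_request(prompt: str) -> str:
    # Hypothetical stand-in for the real TLM API call.
    await asyncio.sleep(0.1)
    return f"response to {prompt!r}"

async def query(semaphore: asyncio.Semaphore, prompt: str) -> str:
    # At most max_concurrent_requests coroutines get past this point at
    # once; the rest wait here until a slot frees up.
    async with semaphore:
        return await _make_request(prompt)

async def main(max_concurrent_requests: int = 16) -> None:
    semaphore = asyncio.Semaphore(max_concurrent_requests)
    prompts = [f"prompt {i}" for i in range(100)]
    results = await asyncio.gather(*(query(semaphore, p) for p in prompts))
    print(f"{len(results)} responses")

asyncio.run(main())
```

Even with the semaphore in place, a very large cap (like the 1000 above) can still overrun server-side limits, which would explain seeing a generic API error rather than RateLimitError.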
Running async code in a Jupyter notebook will require the following: https://github.com/erdewit/nest_asyncio
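A minimal sketch of that notebook workaround, assuming nest_asyncio is installed; the `demo` coroutine is a placeholder:

```python
import asyncio

import nest_asyncio

# Jupyter already runs an event loop, so a plain run_until_complete call
# raises "This event loop is already running". nest_asyncio patches the
# loop to allow re-entrant runs from notebook cells.
nest_asyncio.apply()

async def demo() -> str:
    await asyncio.sleep(0.1)
    return "done"

# Safe inside a notebook cell once nest_asyncio.apply() has been called.
result = asyncio.get_event_loop().run_until_complete(demo())
print(result)
```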
Adds `batch_prompt` and `batch_get_confidence_score` APIs to enable use of TLM at scale. Includes a refactor to async requests (enabling request concurrency).
Both batch methods are intended to gracefully handle query exceptions with retries. For rate-limit errors, the retry occurs after a wait specified by the backend; for other errors, an exponential backoff is applied.
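A rough sketch of that retry policy; the `RateLimitError` class with a `retry_after` attribute and the `_make_request` helper are illustrative assumptions, not the actual implementation:

```python
import asyncio
import random

class RateLimitError(Exception):
    # Hypothetical error type carrying the backend-specified wait.
    def __init__(self, retry_after: float) -> None:
        super().__init__(f"rate limited; retry after {retry_after}s")
        self.retry_after = retry_after

async def _make_request(prompt: str) -> str:
    # Hypothetical stand-in for the real TLM API call; fails randomly
    # here just to exercise both retry paths.
    if random.random() < 0.3:
        raise RateLimitError(retry_after=0.5)
    if random.random() < 0.1:
        raise RuntimeError("transient API error")
    return f"response to {prompt!r}"

async def query_with_retries(prompt: str, max_retries: int = 5) -> str:
    for attempt in range(max_retries):
        try:
            return await _make_request(prompt)
        except RateLimitError as e:
            # Rate-limit errors: wait exactly as long as the backend asks.
            await asyncio.sleep(e.retry_after)
        except Exception:
            # Other errors: exponential backoff with a little jitter.
            await asyncio.sleep(2 ** attempt + random.random())
    raise RuntimeError(f"query failed after {max_retries} retries")

print(asyncio.run(query_with_retries("hello")))
```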
See testing instructions: https://github.com/cleanlab/cleanlab-studio-backend/pull/988