Handle ratelimit and retry #87
Hey @ccorcos. Good suggestion. I'd have to think through the ideal response object here. It would need to return some sort of error code as well, to let you know it was giving back a partial response.
You're facing a rate-limit error with OpenAI because you exceeded the token limit. This differs from the Gemini free tier, which enforces a per-request rate limit. I've worked around this on the Gemini model by controlling the number of pages processed per minute. Below is a brief overview of how I manage the rate limit with Gemini; you could apply a similar approach for the OpenAI model by estimating the average number of tokens per page:
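The snippet itself didn't survive in this thread, so here is a minimal sketch of the pages-per-minute throttling approach described above. The names (`PageRateLimiter`, `PAGES_PER_MINUTE`) are my own illustrations, not part of the library's API; tune the limit to your provider's quota.

```python
import time

# Assumed limit -- adjust to match your provider's free-tier quota.
PAGES_PER_MINUTE = 15


class PageRateLimiter:
    """Allow at most `per_minute` page submissions in any 60-second window."""

    def __init__(self, per_minute: int):
        self.per_minute = per_minute
        self.timestamps: list[float] = []

    def acquire(self) -> None:
        now = time.monotonic()
        # Drop timestamps that have aged out of the 60-second window.
        self.timestamps = [t for t in self.timestamps if now - t < 60]
        if len(self.timestamps) >= self.per_minute:
            # Sleep until the oldest request falls out of the window.
            sleep_for = 60 - (now - self.timestamps[0])
            time.sleep(max(sleep_for, 0))
        self.timestamps.append(time.monotonic())
```

Before submitting each page to the model, call `limiter.acquire()`; it blocks just long enough to stay under the per-minute cap.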
If this solution looks okay to you, we might consider adding a utility function to the library to help manage rate-limit issues at the application level. I hope this helps!
I'm trying with a big PDF and getting a rate-limit error:
It would be nice if (1) this library handled those rate limits and (2) it returned intermediate results, so that I can save the results of past pages and keep going when things crash or error out like this.
Thanks for the help!
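The two requests above (retrying on rate limits, and checkpointing per-page results so a crash doesn't lose finished work) could be sketched at the application level like this. Everything here is hypothetical glue code, not the library's API: `process_page` stands in for whatever call submits one page, and you should narrow the `except` clause to the rate-limit exception your SDK actually raises.

```python
import json
import time
from pathlib import Path


def with_retries(fn, max_attempts=5, base_delay=2.0):
    """Retry fn() with exponential backoff.

    Catching Exception is a placeholder -- replace it with your SDK's
    rate-limit exception type (an assumption, check your client library).
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))


def process_pages(pages, process_page, checkpoint="results.json"):
    """Process pages one at a time, saving each result to disk so a
    crashed run can resume where it left off."""
    path = Path(checkpoint)
    results = json.loads(path.read_text()) if path.exists() else {}
    for i, page in enumerate(pages):
        key = str(i)
        if key in results:
            continue  # already completed in an earlier run
        results[key] = with_retries(lambda: process_page(page))
        path.write_text(json.dumps(results))  # checkpoint after each page
    return results
```

Rerunning `process_pages` after a crash skips the pages already present in the checkpoint file, so only the remaining pages hit the API again.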