Prompt caching for Claude #234
Conversation
@crmne As I don't have an Anthropic key, I'll need you to generate the VCR cartridges for that provider. Hoping everything just works, but let me know if not.
@tpaulshippy this would be great to have! Would you be willing to enable it on all providers? I'll do a proper review when I can.
My five minutes of research indicates that at least OpenAI and Gemini take the approach of automatically caching for you based on the size and structure of your request. So the only support I think we'd really need for those two is to populate the cached token counts on the response messages. Unless we want to try to support explicit caching on the Gemini API, but that looks complex and not as commonly needed. Do you know of other providers that require payload changes for prompt caching?
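For providers that cache automatically, "populating the cached token counts" could look something like the sketch below, which pulls the count out of an OpenAI-style usage block. The helper name is hypothetical, not RubyLLM's API; the field names follow OpenAI's chat completions usage shape.

```ruby
require 'json'

# Hypothetical helper: read the cached-token count from a parsed
# OpenAI-style response body, defaulting to 0 when absent.
def cached_tokens_from(response_body)
  usage = response_body['usage'] || {}
  details = usage['prompt_tokens_details'] || {}
  details['cached_tokens'] || 0
end

response = JSON.parse(<<~JSON)
  {
    "usage": {
      "prompt_tokens": 2048,
      "prompt_tokens_details": { "cached_tokens": 1536 },
      "completion_tokens": 120
    }
  }
JSON

puts cached_tokens_from(response) # => 1536
```

The same pattern would apply to Gemini responses, just with that API's usage field names.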
def with_cache_control(hash, cache: false)
  return hash unless cache

  hash.merge(cache_control: { type: 'ephemeral' })
end
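A standalone usage sketch of this helper (the example block is an assumption; the helper is the one from the diff):

```ruby
def with_cache_control(hash, cache: false)
  return hash unless cache

  hash.merge(cache_control: { type: 'ephemeral' })
end

# Hypothetical content block for a system prompt.
system_block = { type: 'text', text: 'You are a helpful assistant.' }

with_cache_control(system_block, cache: true)
# => { type: 'text', text: '...', cache_control: { type: 'ephemeral' } }

with_cache_control(system_block)
# => block returned unchanged, no cache_control key
```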
Realizing this might cause errors on older models that do not support caching. If it does, we could raise here, or just let the API validation handle it. I'm torn on whether the capabilities check complexity is worth it as these models are probably so rarely used.
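If a guard were wanted, it could be as small as the sketch below. The method name and the unsupported-model list are assumptions for illustration only, not a real capabilities table.

```ruby
# Hypothetical list of Claude models without prompt caching support.
UNSUPPORTED_CACHE_MODELS = ['claude-2.0', 'claude-2.1'].freeze

# Raise early instead of letting the API reject the cache_control field.
def assert_cache_support!(model_id)
  return unless UNSUPPORTED_CACHE_MODELS.include?(model_id)

  raise ArgumentError, "#{model_id} does not support prompt caching"
end
```

Letting the API validation handle it avoids maintaining this list, at the cost of a less direct error message.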
Scratch that. I decided to stop being a cheapskate and just pay Anthropic their $5.
What this does
Adds support for prompt caching in the Anthropic and Bedrock providers, for the Claude models that support it.
Caching system prompts:
Caching user prompts:
Caching tool definitions:
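The three cases above can be illustrated with a request payload following Anthropic's documented `cache_control` placement. This is a sketch of the target shape, not RubyLLM's internal representation; the model ID, tool, and text are placeholders.

```ruby
require 'json'

# cache_control: { type: 'ephemeral' } can be attached to a system
# content block, a tool definition, or a user content block.
payload = {
  model: 'claude-3-5-sonnet-20241022',
  system: [
    { type: 'text', text: 'Long system prompt...',
      cache_control: { type: 'ephemeral' } }
  ],
  tools: [
    { name: 'get_weather', description: 'Look up the weather',
      input_schema: { type: 'object', properties: {} },
      cache_control: { type: 'ephemeral' } }
  ],
  messages: [
    { role: 'user',
      content: [
        { type: 'text', text: 'Long user context...',
          cache_control: { type: 'ephemeral' } }
      ] }
  ]
}

puts JSON.pretty_generate(payload)
```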
Type of change
Scope check
Quality check
I ran overcommit --install and all hooks pass
I did not manually edit auto-generated files (models.json, aliases.json)
API changes
Related issues
Resolves #13