
Prompt caching for Claude #234


Open · wants to merge 13 commits into main

Conversation


@tpaulshippy (Contributor) commented Jun 9, 2025

What this does

Adds prompt caching support to both the Anthropic and Bedrock providers, for Claude models that support it.

Caching system prompts:

chat = RubyLLM.chat
chat.with_instructions("You are a helpful assistant.")
chat.cache_prompts(system: true)

Caching user prompts:

chat = RubyLLM.chat
chat.with_instructions("You are a helpful assistant.")
chat.cache_prompts(system: true, user: true)
chat.ask("What is the capital of France?")

Caching tool definitions:

chat = RubyLLM.chat
chat.with_instructions("You are a helpful assistant.")
chat.with_tool(MyTool)
chat.cache_prompts(tools: true)
chat.ask("What is the capital of France?")
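For Anthropic, the calls above amount to attaching a `cache_control` marker to the relevant content blocks. The `{ type: 'ephemeral' }` marker is Anthropic's documented opt-in for prompt caching; the exact request body RubyLLM builds may differ, and the tool definition below is illustrative only:

```ruby
# Sketch of the Anthropic Messages API body that cached system and tool
# blocks translate into. The cache_control marker is Anthropic's documented
# opt-in; everything else here (model, tool definition) is illustrative.
request_body = {
  model: 'claude-3-5-sonnet-20241022',
  system: [
    { type: 'text',
      text: 'You are a helpful assistant.',
      cache_control: { type: 'ephemeral' } }
  ],
  tools: [
    { name: 'my_tool',
      description: 'Illustrative tool definition',
      input_schema: { type: 'object', properties: {} },
      cache_control: { type: 'ephemeral' } }
  ],
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ]
}
```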

Type of change

  • New feature

Scope check

  • I read the Contributing Guide
  • This aligns with RubyLLM's focus on LLM communication
  • This isn't application-specific logic that belongs in user code
  • This benefits most users, not just my specific use case

Quality check

  • I ran overcommit --install and all hooks pass
  • I tested my changes thoroughly
  • I updated documentation if needed
  • I didn't modify auto-generated files manually (models.json, aliases.json)

API changes

  • New public methods/classes

Related issues

Resolves #13

@tpaulshippy tpaulshippy changed the title Prompt caching Prompt caching for Claude Jun 9, 2025
@tpaulshippy tpaulshippy marked this pull request as ready for review June 9, 2025 21:44
@tpaulshippy (Contributor Author):

@crmne As I don't have an Anthropic key, I'll need you to generate the VCR cassettes for that provider. Hoping everything just works, but let me know if not.

@crmne (Owner) commented Jun 11, 2025

@tpaulshippy this would be great to have! Would you be willing to enable it on all providers?

I'll do a proper review when I can.

@tpaulshippy (Contributor Author):

My five minutes of research indicates that at least OpenAI and Gemini cache automatically based on the size and structure of the request. So for those two, the only support we'd really need is to populate the cached token counts on the response messages. The exception would be explicit caching via the Gemini API, but that looks complex and not as commonly needed.

Do you know of other providers that require payload changes for prompt caching?
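Populating those counts would mean reading each provider's usage metadata. The field names below follow each provider's documented response format (Anthropic's `cache_read_input_tokens`, OpenAI's `prompt_tokens_details.cached_tokens`, Gemini's `cachedContentTokenCount`), but the normalizing helper itself is a sketch, not RubyLLM code:

```ruby
# Illustrative helper normalizing cached-token counts across providers.
# Field names follow each provider's documented usage metadata; the
# helper and its signature are assumptions for this sketch.
def cached_tokens(provider, usage)
  case provider
  when :anthropic
    usage.fetch('cache_read_input_tokens', 0)
  when :openai
    usage.dig('prompt_tokens_details', 'cached_tokens') || 0
  when :gemini
    usage.fetch('cachedContentTokenCount', 0)
  else
    0
  end
end
```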

def with_cache_control(hash, cache: false)
  return hash unless cache

  hash.merge(cache_control: { type: 'ephemeral' })
end

@tpaulshippy (Contributor Author):
Realizing this might cause errors on older models that do not support caching. If it does, we could raise here, or just let the API validation handle it. I'm torn on whether the complexity of a capabilities check is worth it, since these models are probably so rarely used.
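A sketch of what the raise-early option could look like. The error class and the model pattern are hypothetical, purely for illustration; the alternative is to skip the check and let the API's own validation reject `cache_control` on unsupported models:

```ruby
# Hypothetical capability guard: fail fast with a clear error instead of
# letting the API reject cache_control on older models. The error class
# and CACHING_CAPABLE pattern are assumptions, not RubyLLM internals.
class UnsupportedFeatureError < StandardError; end

CACHING_CAPABLE = /claude-3/ # illustrative pattern only

def with_cache_control(hash, model:, cache: false)
  return hash unless cache

  unless model.match?(CACHING_CAPABLE)
    raise UnsupportedFeatureError, "#{model} does not support prompt caching"
  end

  hash.merge(cache_control: { type: 'ephemeral' })
end
```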

@tpaulshippy (Contributor Author):

@crmne As I don't have an Anthropic key, I'll need you to generate the VCR cassettes for that provider. Hoping everything just works, but let me know if not.

Scratch that. I decided to stop being a cheapskate and just pay Anthropic their $5.
