
Commit c99277c

Merge pull request #13694 from BerriAI/litellm_dev_08_16_2025_p3
Litellm dev 08 16 2025 p3
2 parents: 6c4ced2 + 21549a3

File tree: 4 files changed (+51 −3 lines changed)

docs/my-website/docs/response_api.md

Lines changed: 9 additions & 2 deletions
````diff
@@ -803,11 +803,18 @@ LiteLLM Proxy supports session management for non-OpenAI models. This allows you
 
 1. Enable storing request / response content in the database
 
-Set `store_prompts_in_spend_logs: true` in your proxy config.yaml. When this is enabled, LiteLLM will store the request and response content in the database.
+Set `store_prompts_in_cold_storage: true` in your proxy config.yaml. When this is enabled, LiteLLM will store the request and response content in the s3 bucket you specify.
 
 ```yaml
+litellm_settings:
+  callbacks: ["s3_v2"]
+  s3_callback_params: # learn more https://docs.litellm.ai/docs/proxy/logging#s3-buckets
+    s3_bucket_name: litellm-logs # AWS Bucket Name for S3
+    s3_region_name: us-west-2
+
 general_settings:
-  store_prompts_in_spend_logs: true
+  cold_storage_custom_logger: s3_v2
+  store_prompts_in_cold_storage: true
 ```
 
 2. Make request 1 with no `previous_response_id` (new session)
````
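For context, here is a minimal sketch of the session flow this section describes, assuming you call the proxy with the OpenAI Python SDK. The proxy URL, API key, and the `anthropic-claude` model alias are illustrative placeholders, not part of this commit:

```python
# Minimal sketch: Responses API session management through the LiteLLM proxy.
# The base_url, api_key, and model alias below are placeholders -- adjust to
# your own proxy config.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# Request 1: no previous_response_id, so a new session is started.
first = client.responses.create(
    model="anthropic-claude",
    input="My favorite color is blue.",
)

# Request 2: pass the first response's id. LiteLLM reloads the stored
# request/response content (from the S3 bucket configured above) and sends
# the full conversation to the model.
second = client.responses.create(
    model="anthropic-claude",
    input="What is my favorite color?",
    previous_response_id=first.id,
)
print(second.output_text)
```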
Two binary image files added (270 KB and 472 KB).

docs/my-website/release_notes/v1.75.5-stable/index.md

Lines changed: 42 additions & 1 deletion
```diff
 ---
-title: "[PRE-RELEASE]v1.75.5-stable"
+title: "v1.75.5-stable - Redis latency improvements"
 slug: "v1-75-5"
 date: 2025-08-10T10:00:00
 authors:
```

Everything below is newly added, between the install instructions (`pip install litellm==1.75.5.post1`) and the "## New Models / Updated Models" section:

## Key Highlights

- **Redis - Latency Improvements** - Reduces P99 latency by 50% with Redis enabled.
- **Responses API Session Management** - Support for managing Responses API sessions with images.
- **Oracle Cloud Infrastructure** - New LLM provider for calling models on Oracle Cloud Infrastructure.
- **Digital Ocean's Gradient AI** - New LLM provider for calling models on Digital Ocean's Gradient AI platform.

### Risk of Upgrade

If you build the proxy from the pip package, you should hold off on upgrading. This version makes `prisma migrate deploy` our default for managing the DB. This is safer, as it doesn't reset the DB, but it requires a manual `prisma generate` step.

Users of our Docker image are **not** affected by this change.

---
## Redis Latency Improvements

<Image
  img={require('../../img/release_notes/faster_caching_calls.png')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

<br/>
This release adds in-memory caching for Redis requests, enabling faster response times under high traffic. LiteLLM instances now check their in-memory cache for a hit before checking Redis, reducing caching-related latency for LLM API calls from 100ms to sub-1ms on cache hits.
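The pattern described here is a two-tier read-through cache: a process-local TTL cache in front of Redis. As a rough sketch of the idea only (not LiteLLM's actual implementation; the class name and TTL are illustrative):

```python
# Sketch of a two-tier lookup: check a small in-memory TTL cache before
# Redis, so repeated hits skip the network round trip entirely.
# Illustrative only -- not LiteLLM's actual implementation.
import time

import redis


class TwoTierCache:
    def __init__(self, redis_client: redis.Redis, ttl_seconds: float = 5.0):
        self.redis = redis_client
        self.ttl = ttl_seconds
        self.local: dict[str, tuple[float, bytes]] = {}  # key -> (expiry, value)

    def get(self, key: str):
        entry = self.local.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]  # in-memory hit: sub-millisecond, no network call
        value = self.redis.get(key)  # local miss: pay the Redis round trip
        if value is not None:
            self.local[key] = (time.monotonic() + self.ttl, value)
        return value

    def set(self, key: str, value: bytes) -> None:
        self.local[key] = (time.monotonic() + self.ttl, value)
        self.redis.set(key, value)


# Usage: cache = TwoTierCache(redis.Redis(host="localhost", port=6379))
```

The trade-off is staleness: an entry may be up to the TTL out of date across instances, which is why a short TTL fits this kind of read cache.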
---
## Responses API Session Management w/ Images
<Image
  img={require('../../img/release_notes/responses_api_session_mgt_images.jpg')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

<br/>
LiteLLM now supports session management for Responses API requests with images. This is great for use cases like chatbots that use the Responses API to track the state of a conversation. LiteLLM session management works across **ALL** LLM APIs (including Anthropic, Bedrock, OpenAI, etc.), and it works by storing the request and response content in an S3 bucket you specify.
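A minimal sketch of such an image session from the client side, again assuming the OpenAI SDK against the proxy (the URL, API key, model alias, and image URL are placeholders):

```python
# Minimal sketch: a Responses API session that includes an image.
# All names/URLs below are placeholders -- adjust to your proxy config.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# Request 1: send an image. LiteLLM stores the request/response content,
# including the image, so the session can be resumed later.
first = client.responses.create(
    model="bedrock-claude",
    input=[{
        "role": "user",
        "content": [
            {"type": "input_text", "text": "Describe this diagram."},
            {"type": "input_image", "image_url": "https://example.com/diagram.png"},
        ],
    }],
)

# Request 2: continue the session. The earlier image is part of the
# conversation state LiteLLM reloads and replays to the model.
followup = client.responses.create(
    model="bedrock-claude",
    input="What was the main component in that diagram?",
    previous_response_id=first.id,
)
print(followup.output_text)
```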
---
## New Models / Updated Models

#### New Model Support
