
Commit c99277c

Merge pull request #13694 from BerriAI/litellm_dev_08_16_2025_p3
Litellm dev 08 16 2025 p3
2 parents: 6c4ced2 + 21549a3

File tree: 4 files changed (+51 −3 lines changed)

docs/my-website/docs/response_api.md

Lines changed: 9 additions & 2 deletions
````diff
@@ -803,11 +803,18 @@ LiteLLM Proxy supports session management for non-OpenAI models. This allows you
 
 1. Enable storing request / response content in the database
 
-Set `store_prompts_in_spend_logs: true` in your proxy config.yaml. When this is enabled, LiteLLM will store the request and response content in the database.
+Set `store_prompts_in_cold_storage: true` in your proxy config.yaml. When this is enabled, LiteLLM will store the request and response content in the s3 bucket you specify.
 
 ```yaml
+litellm_settings:
+  callbacks: ["s3_v2"]
+  s3_callback_params: # learn more https://docs.litellm.ai/docs/proxy/logging#s3-buckets
+    s3_bucket_name: litellm-logs # AWS Bucket Name for S3
+    s3_region_name: us-west-2
+
 general_settings:
-  store_prompts_in_spend_logs: true
+  cold_storage_custom_logger: s3_v2
+  store_prompts_in_cold_storage: true
 ```
 
 2. Make request 1 with no `previous_response_id` (new session)
````
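For context, here is a minimal sketch of the session flow this section describes, assuming you call the proxy with the OpenAI Python SDK. The proxy URL, API key, and the `anthropic-claude` model alias are illustrative placeholders, not part of this commit:

```python
# Minimal sketch: Responses API session management through the LiteLLM proxy.
# The base_url, api_key, and model alias below are placeholders -- adjust to
# your own proxy config.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# Request 1: no previous_response_id, so a new session is started.
first = client.responses.create(
    model="anthropic-claude",
    input="My favorite color is blue.",
)

# Request 2: pass the first response's id. LiteLLM reloads the stored
# request/response content (from the S3 bucket configured above) and sends
# the full conversation to the model.
second = client.responses.create(
    model="anthropic-claude",
    input="What is my favorite color?",
    previous_response_id=first.id,
)
print(second.output_text)
```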
Two binary image files added (270 KB and 472 KB).

docs/my-website/release_notes/v1.75.5-stable/index.md

Lines changed: 42 additions & 1 deletion
```diff
 ---
-title: "[PRE-RELEASE]v1.75.5-stable"
+title: "v1.75.5-stable - Redis latency improvements"
 slug: "v1-75-5"
 date: 2025-08-10T10:00:00
 authors:
```

Everything below is newly added, between the install instructions (`pip install litellm==1.75.5.post1`) and the "## New Models / Updated Models" section:

## Key Highlights

- **Redis - Latency Improvements** - Reduces P99 latency by 50% with Redis enabled.
- **Responses API Session Management** - Support for managing Responses API sessions with images.
- **Oracle Cloud Infrastructure** - New LLM provider for calling models on Oracle Cloud Infrastructure.
- **Digital Ocean's Gradient AI** - New LLM provider for calling models on Digital Ocean's Gradient AI platform.

### Risk of Upgrade

If you build the proxy from the pip package, you should hold off on upgrading. This version makes `prisma migrate deploy` our default for managing the DB. This is safer, as it doesn't reset the DB, but it requires a manual `prisma generate` step.

Users of our Docker image are **not** affected by this change.

---
## Redis Latency Improvements

<Image
  img={require('../../img/release_notes/faster_caching_calls.png')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

<br/>
This release adds in-memory caching for Redis requests, enabling faster response times under high traffic. LiteLLM instances now check their in-memory cache for a hit before checking Redis, reducing caching-related latency for LLM API calls from 100ms to sub-1ms on cache hits.
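The pattern described here is a two-tier read-through cache: a process-local TTL cache in front of Redis. As a rough sketch of the idea only (not LiteLLM's actual implementation; the class name and TTL are illustrative):

```python
# Sketch of a two-tier lookup: check a small in-memory TTL cache before
# Redis, so repeated hits skip the network round trip entirely.
# Illustrative only -- not LiteLLM's actual implementation.
import time

import redis


class TwoTierCache:
    def __init__(self, redis_client: redis.Redis, ttl_seconds: float = 5.0):
        self.redis = redis_client
        self.ttl = ttl_seconds
        self.local: dict[str, tuple[float, bytes]] = {}  # key -> (expiry, value)

    def get(self, key: str):
        entry = self.local.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]  # in-memory hit: sub-millisecond, no network call
        value = self.redis.get(key)  # local miss: pay the Redis round trip
        if value is not None:
            self.local[key] = (time.monotonic() + self.ttl, value)
        return value

    def set(self, key: str, value: bytes) -> None:
        self.local[key] = (time.monotonic() + self.ttl, value)
        self.redis.set(key, value)


# Usage: cache = TwoTierCache(redis.Redis(host="localhost", port=6379))
```

The trade-off is staleness: an entry may be up to the TTL out of date across instances, which is why a short TTL fits this kind of read cache.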
---
## Responses API Session Management w/ Images
<Image
  img={require('../../img/release_notes/responses_api_session_mgt_images.jpg')}
  style={{width: '100%', display: 'block', margin: '2rem auto'}}
/>

<br/>
LiteLLM now supports session management for Responses API requests with images. This is great for use cases like chatbots that use the Responses API to track the state of a conversation. LiteLLM session management works across **ALL** LLM APIs (including Anthropic, Bedrock, OpenAI, etc.), and it works by storing the request and response content in an S3 bucket you specify.
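A minimal sketch of such an image session from the client side, again assuming the OpenAI SDK against the proxy (the URL, API key, model alias, and image URL are placeholders):

```python
# Minimal sketch: a Responses API session that includes an image.
# All names/URLs below are placeholders -- adjust to your proxy config.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# Request 1: send an image. LiteLLM stores the request/response content,
# including the image, so the session can be resumed later.
first = client.responses.create(
    model="bedrock-claude",
    input=[{
        "role": "user",
        "content": [
            {"type": "input_text", "text": "Describe this diagram."},
            {"type": "input_image", "image_url": "https://example.com/diagram.png"},
        ],
    }],
)

# Request 2: continue the session. The earlier image is part of the
# conversation state LiteLLM reloads and replays to the model.
followup = client.responses.create(
    model="bedrock-claude",
    input="What was the main component in that diagram?",
    previous_response_id=first.id,
)
print(followup.output_text)
```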
---
## New Models / Updated Models

#### New Model Support
