🚀 Feature Request
promptulate needs a cache for LLM generation. With a cache, the model output from the first call is stored; on a later call with the same input, the previously cached output is returned directly instead of calling the model again.
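To make the idea concrete, here is a minimal sketch of how such a cache could work internally: the key is derived from the model name, the prompt, and an optional seed, so an identical request hits the cache instead of the model. All names below (`cached_generate`, `_llm_cache`, `call_llm`) are illustrative placeholders, not promptulate internals.

```python
import hashlib
import json

# Illustrative in-memory cache; a real implementation would likely persist to disk.
_llm_cache = {}


def cached_generate(model, prompt, cache_seed=None):
    """Return the cached response when the same request has been seen before."""
    key = hashlib.sha256(
        json.dumps({"model": model, "prompt": prompt, "seed": cache_seed}).encode()
    ).hexdigest()
    if key in _llm_cache:
        return _llm_cache[key]          # repeat request: cache hit, no LLM call
    response = call_llm(model, prompt)  # first request: placeholder for the real model call
    _llm_cache[key] = response
    return response
```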
Method 1
For example, the answer is generated by the gpt-4o driver during the first run and written to the cache; on the second run the cached data is used directly.
Caching is disabled by default. To enable it, use a pattern like the following:
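A minimal sketch of what this could look like, assuming promptulate's `pne.chat` entry point; `cache_seed` is the parameter proposed here, not an existing argument:

```python
import promptulate as pne

# First call: no cached entry for this prompt + seed, so gpt-4o is invoked
# and the response is written to the cache.
response = pne.chat(
    model="gpt-4o",
    messages="What is the capital of France?",
    cache_seed=111,  # proposed parameter: selects which cache bucket to use
)

# Second call with the same prompt and the same cache_seed: the cached
# response is returned directly, without calling the model again.
response = pne.chat(
    model="gpt-4o",
    messages="What is the capital of France?",
    cache_seed=111,
)
```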
When cache_seed is 111, the cache keyed by that seed is queried.
Method 2
Use an enable_cache parameter. For example:
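A rough sketch of the enable_cache variant (again, `enable_cache` is the flag proposed here and does not exist yet; the `pne.chat` call is assumed from promptulate's usual usage):

```python
import promptulate as pne

# With enable_cache=True the framework would hash the prompt (and model name),
# look that key up in the cache, and only call the LLM on a miss.
response = pne.chat(
    model="gpt-4o",
    messages="What is the capital of France?",
    enable_cache=True,  # proposed parameter
)
```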
As in Method 1, the answer is generated by the gpt-4o driver on the first run and stored in the cache, and the cached data is used directly on the second run.
Compare
The first approach is slightly more granular: different cache_seed values could, for example, keep separate caches per user id. In practice this adds little, because the cache key is ultimately the same prompt.
So Method 2 is simpler and sufficient.