
Is there any way to save context for the model? #2110

Closed · Answered by KerfuffleV2
mdrokz asked this question in Q&A

So basically I can load prompts from a DB?

Well, not directly: llama.cpp (currently) only loads the cached prompts from binary files.

My advice is to start out playing around with the commandline options I mentioned so you understand how the prompt cache works. After that, you can start thinking about how to interface with a DB.
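
If it helps, here's a rough sketch (in Python, just to keep it self-contained) of what driving those options looks like from the outside. The binary name (`./main`), model path and prompts are placeholders; the relevant flag is `--prompt-cache FNAME`, which writes the evaluated prompt state to a binary file and reloads it on later runs whose prompt starts with the same prefix.

```python
# Minimal sketch: build a prompt cache with llama.cpp's `main` example, then
# reuse it. Paths and prompts are placeholders; adjust to your setup.
import subprocess

MODEL = "models/7B/ggml-model-q4_0.bin"   # placeholder model path
CACHE = "cache/system-prompt.bin"          # the cached prompt state lands here

# First run: evaluate the prompt and save its state to CACHE.
subprocess.run([
    "./main",
    "-m", MODEL,
    "--prompt-cache", CACHE,
    "-p", "You are a helpful assistant.",
    "-n", "0",   # generate nothing; we only want the cache written
], check=True)

# Later runs: tokens covered by CACHE are restored instead of re-evaluated,
# so only the new tail of the prompt costs compute.
subprocess.run([
    "./main",
    "-m", MODEL,
    "--prompt-cache", CACHE,
    "-p", "You are a helpful assistant. Summarize the following text: ...",
], check=True)
```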

Probably the approach that makes the most sense is to leave the cached prompts as files and use the DB to index them rather than trying to actually store the data inside the DB. Note also that the files can get quite large. The size is basically proportional to the context size: with 16-bit memory (the default) I believe 2,048 tokens will end up with a 1GB file. Most of the ti…
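
For the "files on disk, DB as an index" idea, here's a minimal sketch using SQLite. The schema, file naming and helper names are made up for illustration; the only real contract is that the stored path is what you'd later pass to `--prompt-cache`.

```python
# Sketch: keep prompt-cache files on disk, keep only an index in the DB.
# Each cached prompt lives at cache/<sha256-of-prompt>.bin (a layout I made up);
# llama.cpp writes the file itself, the DB just remembers where it is.
import hashlib
import os
import sqlite3

CACHE_DIR = "cache"
DB_PATH = "prompt_cache_index.sqlite3"

def cache_path(prompt: str) -> str:
    """Derive a stable file name for a prompt's cache file."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    return os.path.join(CACHE_DIR, f"{digest}.bin")

def init_db(conn: sqlite3.Connection) -> None:
    conn.execute(
        """CREATE TABLE IF NOT EXISTS prompt_cache (
               prompt_sha256 TEXT PRIMARY KEY,
               prompt        TEXT NOT NULL,
               cache_file    TEXT NOT NULL
           )"""
    )

def register(conn: sqlite3.Connection, prompt: str) -> str:
    """Record where a prompt's cache file will live (the file itself is
    produced by llama.cpp via --prompt-cache, not by the database)."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    path = cache_path(prompt)
    conn.execute(
        "INSERT OR REPLACE INTO prompt_cache VALUES (?, ?, ?)",
        (digest, prompt, path),
    )
    conn.commit()
    return path

def lookup(conn: sqlite3.Connection, prompt: str) -> str | None:
    """Return the cache file path for a prompt if it exists on disk."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    row = conn.execute(
        "SELECT cache_file FROM prompt_cache WHERE prompt_sha256 = ?",
        (digest,),
    ).fetchone()
    if row and os.path.exists(row[0]):
        return row[0]
    return None

if __name__ == "__main__":
    os.makedirs(CACHE_DIR, exist_ok=True)
    conn = sqlite3.connect(DB_PATH)
    init_db(conn)
    path = register(conn, "You are a helpful assistant.")
    print("pass this to --prompt-cache:", path)
    print("lookup:", lookup(conn, "You are a helpful assistant."))
```

Keeping only paths in the DB also sidesteps the size problem mentioned above: a ~1GB blob per cached prompt is a lot to shove into a database row, but a file path and a hash are cheap to store and query.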

Answer selected by mdrokz