LLaMA Server

LLaMA Server combines the power of LLaMA C++ (via PyLLaMACpp) with the beauty of Chatbot UI.

🦙LLaMA C++ (via 🐍PyLLaMACpp) ➕ 🤖Chatbot UI ➕ 🔗LLaMA Server 🟰 😊

UPDATE: Greatly simplified implementation thanks to the awesome Pythonic APIs of PyLLaMACpp 2.0.0!

UPDATE: Now supports better streaming through PyLLaMACpp!

UPDATE: Now supports streaming!

Demo

Better Streaming

better_stream_demo.mov

Streaming

stream_demo.mov

Non-streaming

demo.mov

Setup

Get your favorite LLaMA models by:
- Downloading them from 🤗Hugging Face;
- Or following the instructions at LLaMA C++.

Make sure the models are converted and quantized.

Create a models.yml file to provide your model_home directory and add your favorite South American camelids, e.g.:

model_home: /path/to/your/models
models:
  llama-7b:
    name: LLAMA-7B
    path: 7B/ggml-model-q4_0.bin  # relative to `model_home` or an absolute path

See models.yml for an example.
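
To illustrate the "relative to `model_home` or an absolute path" rule, here is a minimal Python sketch. It is not LLaMA Server's actual code: the `resolve_model_path` helper is hypothetical, and the config is written as a plain dict that mirrors the YAML above so the snippet needs only the standard library.

```python
from pathlib import Path

# Hypothetical in-memory mirror of the models.yml example above.
config = {
    "model_home": "/path/to/your/models",
    "models": {
        "llama-7b": {"name": "LLAMA-7B", "path": "7B/ggml-model-q4_0.bin"},
    },
}


def resolve_model_path(config: dict, model_id: str) -> Path:
    """Return the model file path, joining relative paths onto model_home."""
    path = Path(config["models"][model_id]["path"])
    # Absolute paths are used as-is; relative ones are taken from model_home.
    return path if path.is_absolute() else Path(config["model_home"]) / path


print(resolve_model_path(config, "llama-7b"))
```

If the printed path does not point at an existing file, the `model_home` or `path` entry in your models.yml is the first thing to check.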

Set up a Python environment:

conda create -n llama python=3.9
conda activate llama

Install LLaMA Server:

From PyPI:

python -m pip install llama-server

Or from source:

python -m pip install git+https://github.com/nuance1979/llama-server.git

Start LLaMA Server with your models.yml file:

llama-server --models-yml models.yml --model-id llama-7b

Check out my fork of Chatbot UI and start the app:

git clone https://github.com/nuance1979/chatbot-ui
cd chatbot-ui
git checkout llama
npm i
npm run dev

Open http://localhost:3000 in your browser, then either:
- Click "OpenAI API Key" at the bottom left corner and enter your OpenAI API Key;
- Or follow the instructions at Chatbot UI to put your key into a .env.local file and restart:
```
cp .env.local.example .env.local
# edit .env.local to add your OPENAI_API_KEY
```
Enjoy!

More

Try a larger model if you have it:

llama-server --models-yml models.yml --model-id llama-13b  # or any `model_id` defined in `models.yml`
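
Since `--model-id` must name an entry defined in models.yml, it can help to check the id before launching. The sketch below is an illustration, not part of LLaMA Server: `build_command` is a hypothetical helper, the config is a plain dict standing in for a parsed models.yml, and the `llama-13b` entry and its path are assumed for the example. Only the `llama-server`, `--models-yml` and `--model-id` names come from the command above.

```python
import shlex

# Stand-in for a parsed models.yml; the llama-13b entry is an assumed example.
config = {
    "model_home": "/path/to/your/models",
    "models": {
        "llama-7b": {"name": "LLAMA-7B", "path": "7B/ggml-model-q4_0.bin"},
        "llama-13b": {"name": "LLAMA-13B", "path": "13B/ggml-model-q4_0.bin"},
    },
}


def build_command(config: dict, model_id: str, models_yml: str = "models.yml") -> list:
    """Build the llama-server argument list, rejecting undefined model ids."""
    if model_id not in config["models"]:
        known = ", ".join(sorted(config["models"]))
        raise KeyError(f"unknown model id {model_id!r}; defined ids: {known}")
    return ["llama-server", "--models-yml", models_yml, "--model-id", model_id]


# Print the command as a shell-quoted string; pass the list to subprocess.run to launch it.
print(shlex.join(build_command(config, "llama-13b")))
```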

Try non-streaming mode by restarting Chatbot UI:

export LLAMA_STREAM_MODE=0  # 1 to enable streaming
npm run dev

Fun facts

I am not fluent in JavaScript at all, but I was able to make the changes in Chatbot UI by chatting with ChatGPT; no more Stack Overflow.
