A chat application using the Llama model via FastAPI for a REST API and a CLI for direct interaction.
- REST API with CORS enabled for web integration.
- CLI for interactive chatting in the terminal.
- Customizable chat settings (temperature, max tokens).
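A rough sketch of how the CORS-enabled API and the chat settings listed above could fit together. The `/chat` endpoint, `ChatRequest` schema, and default values here are illustrative assumptions, not the project's actual code; only `MODEL_PATH`, `api.py`, and the temperature/max-tokens settings come from this README.

```python
# api.py (sketch) -- the endpoint shape below is an assumption, not the
# project's actual code. MODEL_PATH is the constant named in this README.
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
from llama_cpp import Llama

MODEL_PATH = "/path/to/llama-model.gguf"  # set to your model location

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],   # open CORS for web integration; restrict in production
    allow_methods=["*"],
    allow_headers=["*"],
)

llm = Llama(model_path=MODEL_PATH)  # loaded once at startup

class ChatRequest(BaseModel):
    prompt: str
    temperature: float = 0.7  # customizable chat settings
    max_tokens: int = 256

@app.post("/chat")
def chat(req: ChatRequest):
    out = llm(req.prompt, temperature=req.temperature, max_tokens=req.max_tokens)
    return {"response": out["choices"][0]["text"]}
```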
- Requirements: Python 3.8+, FastAPI, Uvicorn, and the `llama_cpp` package.
- Installation: `pip install fastapi uvicorn llama-cpp-python` (the `llama_cpp` module is provided by the `llama-cpp-python` package).
- Configuration: Set `MODEL_PATH` in `api.py` and `main.py` to your Llama model location.
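The configuration is a single constant shared by both entry points; the constant name comes from this README, while the file path below is only an example:

```python
# Point MODEL_PATH at your local GGUF model file, in both api.py and main.py.
# The filename here is an example, not a bundled model.
MODEL_PATH = "/path/to/llama-2-7b-chat.Q4_K_M.gguf"
```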
- API Server: Run `uvicorn api:app --host 0.0.0.0 --port 8000`, then visit `http://localhost:8000/docs` for the interactive API docs.
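Once the server is up, a quick smoke test from Python might look like the following. It assumes the illustrative `/chat` endpoint from the sketch above, and `requests` is an extra dependency (`pip install requests`):

```python
# Client-side smoke test (sketch): POST a prompt to the assumed /chat endpoint.
import requests

resp = requests.post(
    "http://localhost:8000/chat",
    json={"prompt": "Hello!", "temperature": 0.7, "max_tokens": 128},
)
print(resp.json()["response"])
```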
- CLI: Run `python main.py` and follow the interactive prompts.
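A minimal version of the CLI loop, assuming `main.py` follows the obvious read-generate-print pattern; the actual script may differ:

```python
# main.py (sketch) -- a minimal interactive chat loop; the real CLI may differ.
from llama_cpp import Llama

MODEL_PATH = "/path/to/llama-model.gguf"  # set to your model location

llm = Llama(model_path=MODEL_PATH)

while True:
    prompt = input("You: ")
    if prompt.strip().lower() in {"exit", "quit"}:
        break
    out = llm(prompt, temperature=0.7, max_tokens=256)
    print("Llama:", out["choices"][0]["text"].strip())
```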