A chat application using the Llama model via FastAPI for a REST API and a CLI for direct interaction.
- REST API with CORS enabled for web integration.
- CLI for interactive chatting in the terminal.
- Customizable chat settings (temperature, max tokens).
- Requirements: Python 3.8+, FastAPI, Uvicorn, and the `llama_cpp` package.
- Installation: `pip install fastapi uvicorn`.
- Configuration: Set `MODEL_PATH` in `api.py` and `main.py` to your Llama model location.
- API Server: `uvicorn api:app --host 0.0.0.0 --port 8000`.
- Visit `http://localhost:8000/docs` for the API docs.
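A client call to the running server might look like the following, using only the standard library. The `/chat` endpoint name and JSON fields are assumptions, since the source does not document the request schema.

```python
import json
import urllib.request

def build_chat_request(message, temperature=0.7, max_tokens=256):
    """Build the JSON payload for the (assumed) /chat endpoint."""
    return {"message": message, "temperature": temperature, "max_tokens": max_tokens}

def send_chat(message, url="http://localhost:8000/chat"):
    """POST a chat message to the API server and return the decoded JSON reply."""
    payload = json.dumps(build_chat_request(message)).encode("utf-8")
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    # Requires the API server to be running on localhost:8000
    print(send_chat("Hello, Llama!"))
```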
- CLI: Run `python main.py` and interact with the prompts.
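The CLI in `main.py` could be structured roughly as below. This is a sketch under stated assumptions: the prompt format, chat-history handling, and default settings are illustrative, not the project's actual implementation.

```python
# main.py -- minimal sketch of the interactive CLI (structure assumed)
MODEL_PATH = "/path/to/llama-model.gguf"  # set this to your Llama model location

def format_prompt(history):
    """Flatten (role, text) turns into a single prompt string."""
    lines = [f"{role}: {text}" for role, text in history]
    lines.append("Assistant:")
    return "\n".join(lines)

def main():
    # Import here so the helper above works even without llama_cpp installed
    from llama_cpp import Llama
    llm = Llama(model_path=MODEL_PATH)
    history = []
    print("Type 'quit' to exit.")
    while True:
        user = input("You: ").strip()
        if user.lower() == "quit":
            break
        history.append(("User", user))
        out = llm(format_prompt(history), temperature=0.7, max_tokens=256)
        reply = out["choices"][0]["text"].strip()
        history.append(("Assistant", reply))
        print(f"Assistant: {reply}")

if __name__ == "__main__":
    main()
```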