In this workshop we will walk through three progressively more advanced examples of building apps with Ollama (self-hosted LLMs) and Streamlit:
- Simple Chatbot — plain chat with a local model.
- Few-Shot Date Math — use prompt engineering to improve consistency (multi-shot learning).
- Tool-Calling Date Math — delegate deterministic work (date calculations) to Python tools.
Ollama is the runtime for local LLMs.
Follow the instructions for your OS:
- macOS: `brew install ollama`
- Linux: `curl -fsSL https://ollama.com/install.sh | sh`
- Windows: download the installer from https://ollama.com
Once installed, make sure the Ollama server is running:

```bash
ollama serve
```
For this workshop we’ll use Mistral-7B. Pull it once:

```bash
ollama pull mistral:7b
```
We’ll use Python 3.9+. Install dependencies:

```bash
pip install streamlit ollama
```
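To confirm the setup end to end (server running, model pulled, packages installed), you can run a quick one-off chat from Python. This is a minimal sketch assuming the `ollama` package's default connection to the local server:

```python
import ollama

# Single chat call against the local Ollama server (default endpoint http://localhost:11434).
response = ollama.chat(
    model="mistral:7b",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
)
print(response["message"]["content"])
```

If this prints a greeting, everything is in place.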
Each example is a separate Streamlit app. Start Ollama in one terminal (`ollama serve`), then run the app in another terminal. For Example 1:

```bash
streamlit run 1_chatbot/main.py
```
- Demonstrates a basic chat interface with a local LLM (a minimal sketch of such an app follows this list).
- Try small-talk and simple questions.
- Then ask: "What is the difference in days between 2024-12-28 and 2025-01-03?"
- ⚠️ The answer will likely be unreliable: plain LLMs are weak at exact date arithmetic.
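For reference, a minimal chat app in this style could look like the sketch below. It is an illustrative outline, not necessarily identical to `1_chatbot/main.py`:

```python
import ollama
import streamlit as st

st.title("Local Chatbot (Mistral-7B via Ollama)")

# Streamlit reruns the script on every interaction, so persist the history.
if "messages" not in st.session_state:
    st.session_state.messages = []

# Replay the conversation so far.
for msg in st.session_state.messages:
    with st.chat_message(msg["role"]):
        st.markdown(msg["content"])

if prompt := st.chat_input("Ask me anything"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    # Send the full history so the model keeps context.
    response = ollama.chat(model="mistral:7b", messages=st.session_state.messages)
    answer = response["message"]["content"]

    st.session_state.messages.append({"role": "assistant", "content": answer})
    with st.chat_message("assistant"):
        st.markdown(answer)
```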
```bash
streamlit run 2_multishot/main.py
```

- Adds multi-shot learning (few-shot exemplars) to the prompt (see the sketch after this list).
- The LLM learns a consistent format for calculating date differences.
- Try again with: "How many days are there between 24.07.2025 and 24.07.1990?"
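The core trick is to prepend worked examples to the message list before the user's real question, so the model imitates their format. A sketch with illustrative exemplars (the ones in `2_multishot/main.py` may differ):

```python
import ollama

# Worked examples demonstrating the expected parsing steps and answer format.
FEW_SHOT = [
    {"role": "user", "content": "How many days are there between 01.01.2020 and 11.01.2020?"},
    {"role": "assistant", "content": "Dates: 2020-01-01 and 2020-01-11. Difference: 10 days."},
    {"role": "user", "content": "How many days are there between 28.02.2021 and 01.03.2021?"},
    {"role": "assistant", "content": "Dates: 2021-02-28 and 2021-03-01. Difference: 1 day."},
]

def ask(question: str) -> str:
    # Exemplars first, then the actual question.
    messages = FEW_SHOT + [{"role": "user", "content": question}]
    response = ollama.chat(model="mistral:7b", messages=messages)
    return response["message"]["content"]

print(ask("How many days are there between 24.07.2025 and 24.07.1990?"))
```

Note that the arithmetic is still done by the model; the exemplars only make the output format more consistent, which is why Example 3 moves the calculation into Python.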
```bash
streamlit run 3_toolcalling/main.py
```

- Introduces tool calling: the LLM decides when to call a Python function for deterministic results (see the sketch after this list).
- Uses Python's `datetime` to guarantee correct day differences.
- Example: "What is the difference in days between 24.07.2025 and 24.07.1990?"
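A sketch of the round trip, assuming the `tools` parameter of the `ollama` Python client and a Mistral build with tool support; the tool name and exemplar wiring here are illustrative, not necessarily what `3_toolcalling/main.py` does:

```python
from datetime import date

import ollama

def days_between(start: str, end: str) -> str:
    """Deterministic date math: absolute day difference between two ISO dates."""
    delta = abs(date.fromisoformat(end) - date.fromisoformat(start))
    return str(delta.days)

# JSON-schema description so the model knows when and how to call the tool.
# The model is expected to normalize dates like 24.07.2025 into YYYY-MM-DD.
tools = [{
    "type": "function",
    "function": {
        "name": "days_between",
        "description": "Return the number of days between two dates (YYYY-MM-DD).",
        "parameters": {
            "type": "object",
            "properties": {
                "start": {"type": "string", "description": "Start date, YYYY-MM-DD"},
                "end": {"type": "string", "description": "End date, YYYY-MM-DD"},
            },
            "required": ["start", "end"],
        },
    },
}]

messages = [{"role": "user", "content":
             "What is the difference in days between 24.07.2025 and 24.07.1990?"}]
response = ollama.chat(model="mistral:7b", messages=messages, tools=tools)

# If the model requested the tool, execute it in Python and hand the result back.
if response["message"].get("tool_calls"):
    messages.append(response["message"])
    for call in response["message"]["tool_calls"]:
        args = call["function"]["arguments"]  # the ollama client parses these into a dict
        messages.append({"role": "tool", "content": days_between(args["start"], args["end"])})
    response = ollama.chat(model="mistral:7b", messages=messages)

print(response["message"]["content"])
```

The split of labor is the point: the model handles language (parsing the question, choosing the tool, phrasing the answer) while Python does the arithmetic, so the day count is always exact.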
- Example 1 — See the limits of a plain LLM (inconsistent date math).
- Example 2 — Use prompt engineering (few-shot) to make the model more consistent.
- Example 3 — Introduce tool use to make the app reliable and production-ready.
- Ollama lets you self-host open LLMs on your laptop or server.
- Streamlit makes it easy to wrap them in an interactive UI.
- Few-shot prompting can guide the model but is not always reliable.
- Tool-calling bridges the gap: let the model handle reasoning while Python (or other tools) handle deterministic logic.
This workshop focused on prompting and tool use. The next step in advancing self-hosted LLMs is fine-tuning:
Fine-tuning adapts a model to your domain, data, or task — improving accuracy and reliability. For example, customer support FAQs, specialized terminology, or structured outputs.
Resources:
- mistral-finetune — lightweight LoRA-based fine-tuning code and tutorials: https://github.com/mistralai/mistral-finetune
- Tutorial notebook (7B), in the repo's tutorials folder: https://github.com/mistralai/mistral-finetune/blob/main/tutorials/mistral_finetune_7b.ipynb
- How-to guide, Mistral Docs: https://docs.mistral.ai/guides/finetuning/