A simple yet powerful tool to compare responses from different language models running on Ollama.
## Features

- Compare responses from any two Ollama models side by side
- Track generation time and token count for performance comparison (see the sketch after this list)
- Support for custom system prompts
- Ability to add custom models not listed in Ollama
- Adjustable temperature and token settings
- Simple, intuitive interface
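
The timing and token figures come from counters Ollama returns with each completion. The app's own bookkeeping may differ, but assuming the standard `/api/generate` response fields (`eval_count`, `eval_duration`, `total_duration`), deriving the displayed metrics could look like this minimal sketch:

```python
def summarize_metrics(data: dict) -> str:
    """Turn one non-streamed /api/generate JSON response into a metrics line.

    Assumes Ollama's standard counters: eval_count (tokens generated),
    plus eval_duration and total_duration, both in nanoseconds.
    """
    tokens = data.get("eval_count", 0)
    gen_seconds = data.get("eval_duration", 0) / 1e9
    total_seconds = data.get("total_duration", 0) / 1e9
    rate = tokens / gen_seconds if gen_seconds else 0.0
    return f"{tokens} tokens | {total_seconds:.2f}s total | {rate:.1f} tok/s"
```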
## Requirements

- Python 3.10+
- Ollama installed and running locally
- Required Python packages:
  - gradio
  - requests
## Installation

1. Clone this repository:

   ```bash
   git clone https://github.com/your-username/ollama-model-comparator.git
   cd ollama-model-comparator
   ```
2. Create and activate a virtual environment (optional but recommended):

   ```bash
   python -m venv comp-app
   source comp-app/bin/activate  # On Windows, use: comp-app\Scripts\activate
   ```
3. Install the required dependencies:

   ```bash
   pip install gradio requests
   ```
## Usage

1. Make sure Ollama is running:

   ```bash
   ollama serve
   ```
2. Launch the Ollama Model Comparator:

   ```bash
   python app.py
   ```
3. Open your browser and navigate to http://127.0.0.1:7860.
4. Select two models to compare from the dropdown menus. If your model isn't listed, add it in the "Custom Models" field.
5. Enter a prompt and (optionally) a system prompt.
6. Adjust the temperature and max tokens if needed.
7. Click "Compare Models" to generate responses from both models.
8. View the results side by side along with performance metrics (the layout pattern is sketched below).
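
The side-by-side view maps naturally onto Gradio's row container. The app's actual layout will differ, but the underlying pattern is roughly this sketch (the `compare` stub stands in for the real Ollama calls):

```python
import gradio as gr

def compare(prompt: str) -> tuple[str, str]:
    # Placeholder: the real app sends the prompt to two Ollama models
    # and returns each response plus its metrics.
    return f"Model A would answer: {prompt}", f"Model B would answer: {prompt}"

with gr.Blocks() as demo:
    prompt = gr.Textbox(label="Prompt")
    go = gr.Button("Compare Models")
    with gr.Row():  # renders its children side by side
        left = gr.Textbox(label="Model A")
        right = gr.Textbox(label="Model B")
    go.click(compare, inputs=prompt, outputs=[left, right])

demo.launch()  # serves on http://127.0.0.1:7860 by default
```

Wiring the button's click event to one function that returns both outputs is what keeps the two panels updating together.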
## Options

- Models: Select from models available in your Ollama installation
- Custom Models: Add models not listed by entering comma-separated model names
- System Prompt: Optional instructions to guide the model's behavior
- Temperature: Controls randomness (0.0 to 1.0)
- Max Tokens: Maximum length of the generated response
- Timeout: Maximum time (in seconds) to wait for a response (see the sketch below)
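
Temperature and max tokens travel to Ollama inside the request's `options` object (Ollama calls the token cap `num_predict`), while the timeout bounds the HTTP call itself. The app's exact request code may differ; a minimal sketch with illustrative values:

```python
import requests

payload = {
    "model": "llama3",             # illustrative model name
    "prompt": "Explain RAG in two sentences.",
    "system": "You are concise.",  # the optional system prompt
    "stream": False,               # this app uses non-streaming calls
    "options": {
        "temperature": 0.7,        # Temperature setting (0.0 to 1.0)
        "num_predict": 1024,       # Max Tokens setting
    },
}

# The Timeout setting caps the whole HTTP round trip, not just model startup.
resp = requests.post("http://localhost:11434/api/generate",
                     json=payload, timeout=300)
print(resp.json()["response"])
```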
## Notes

- This version uses non-streaming API calls to Ollama (`"stream": False`, as in the sketch above) for compatibility reasons.
- For very large responses, you may need to increase the max tokens and timeout settings.
- The app runs locally and requires Ollama to be running on the same machine.
## Troubleshooting

- No models shown: Make sure Ollama is running with `ollama serve` (a quick connectivity check is sketched after this list)
- Models not responding: Check that the models are downloaded with `ollama list`
- Slow responses: Large models may take longer to generate responses; consider increasing the timeout
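
For the first two issues, a quick connectivity check from Python is Ollama's `/api/tags` endpoint, which returns the same model list as `ollama list`. A small sketch, assuming the default port 11434:

```python
import requests

try:
    # /api/tags lists locally pulled models, like `ollama list` does.
    resp = requests.get("http://localhost:11434/api/tags", timeout=5)
    resp.raise_for_status()
    names = [m["name"] for m in resp.json().get("models", [])]
    print("Ollama is up; models:", ", ".join(names) or "(none pulled yet)")
except requests.ConnectionError:
    print("Ollama is not reachable; start it with `ollama serve`.")
```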
## Roadmap

- Add support for response streaming (see the sketch after this list)
- Implement more detailed comparison metrics
- Add export functionality for responses
- Support for comparing more than two models
- Chat history visualization
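
For the first item, Ollama already supports streamed generation: with `"stream": true`, `/api/generate` returns one JSON object per line as tokens arrive. One possible direction, sketched with `requests`:

```python
import json
import requests

# With stream=True the endpoint yields newline-delimited JSON chunks,
# each carrying a slice of the response, until one arrives with done=true.
with requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Tell me a joke.", "stream": True},
    stream=True,
    timeout=300,
) as resp:
    for line in resp.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        print(chunk.get("response", ""), end="", flush=True)
        if chunk.get("done"):
            print()
            break
```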
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
## License

This project is licensed under the MIT License - see the LICENSE file for details.