Ollama Model Comparator

A simple yet powerful tool to compare responses from different language models running on Ollama.

[Screenshot: the Ollama Model Comparator interface]

Features

  • Compare responses from any two Ollama models side by side
  • Track generation time and token count for performance comparison
  • Support for custom system prompts
  • Add custom models that don't appear in the dropdowns
  • Adjustable temperature and token settings
  • Simple, intuitive interface

Requirements

  • Python 3.10+
  • Ollama installed and running locally
  • Required Python packages:
    • gradio
    • requests

Installation

  1. Clone this repository:

    git clone https://github.com/MadhuNimmo/ollama-model-comparator.git
    cd ollama-model-comparator
  2. Create and activate a virtual environment (optional but recommended):

    python -m venv comp-app
    source comp-app/bin/activate  # On Windows, use: comp-app\Scripts\activate
  3. Install the required dependencies:

    pip install gradio requests

Usage

  1. Make sure Ollama is running:

    ollama serve
  2. Launch the Ollama Model Comparator:

    python app.py
  3. Open your browser and navigate to http://127.0.0.1:7860

  4. Select two models to compare from the dropdown menus. If your model isn't listed, add it in the "Custom Models" field.

  5. Enter a prompt and (optionally) a system prompt.

  6. Adjust the temperature and max tokens if needed.

  7. Click "Compare Models" to generate responses from both models.

  8. View the results side by side along with performance metrics (the underlying API call is sketched below).
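
For the curious, each comparison boils down to one non-streaming request per model against Ollama's /api/generate endpoint. The sketch below is illustrative rather than the app's actual code: the helper name, model names, and defaults are assumptions, while the endpoint, payload fields, and the eval_count metric come from Ollama's documented API.

    import time
    import requests

    OLLAMA_URL = "http://localhost:11434"  # Ollama's default local endpoint

    def generate(model, prompt, system="", temperature=0.7, max_tokens=1024, timeout=120):
        """Send one non-streaming generate request; return the text plus simple metrics."""
        start = time.time()
        resp = requests.post(
            f"{OLLAMA_URL}/api/generate",
            json={
                "model": model,
                "prompt": prompt,
                "system": system,
                "stream": False,  # non-streaming, as noted under Limitations
                "options": {"temperature": temperature, "num_predict": max_tokens},
            },
            timeout=timeout,
        )
        resp.raise_for_status()
        data = resp.json()
        return {
            "text": data.get("response", ""),
            "seconds": round(time.time() - start, 2),
            "tokens": data.get("eval_count", 0),  # generated-token count reported by Ollama
        }

    # Compare two (hypothetical) models on the same prompt:
    for model in ("llama3", "mistral"):
        result = generate(model, "Explain recursion in one paragraph.")
        print(f"{model}: {result['seconds']}s, {result['tokens']} tokens")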

Configuration Options

  • Models: Select from models available in your Ollama installation
  • Custom Models: Add models not listed by entering comma-separated model names
  • System Prompt: Optional instructions to guide the model's behavior
  • Temperature: Controls randomness (0.0 to 1.0)
  • Max Tokens: Maximum length of the generated response
  • Timeout: Maximum time (in seconds) to wait for a response; see below for how these settings map onto an API request
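
As a rough guide (the app's internal field names may differ), the options above map onto an Ollama request as follows. Note that Ollama calls the max-token limit num_predict, and the timeout is enforced client-side on the HTTP call:

    # Illustrative mapping of UI settings to an Ollama /api/generate payload
    payload = {
        "model": "llama3",             # Models / Custom Models
        "prompt": "Your prompt here",
        "system": "You are concise.",  # System Prompt (optional)
        "stream": False,
        "options": {
            "temperature": 0.2,        # Temperature (0.0 to 1.0)
            "num_predict": 512,        # Max Tokens
        },
    }
    # requests.post("http://localhost:11434/api/generate", json=payload, timeout=60)  # Timeout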

Limitations

  • This version uses non-streaming API calls to Ollama for compatibility reasons.
  • For very large responses, you may need to increase the max tokens and timeout settings.
  • The app runs locally and requires Ollama to be running on the same machine.

Troubleshooting

  • No models shown: Make sure Ollama is running with ollama serve (a quick connectivity check is sketched below)
  • Models not responding: Check that the models are downloaded with ollama list
  • Slow responses: Large models may take longer to generate responses; consider increasing the timeout
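
To verify connectivity outside the app, Ollama's /api/tags endpoint lists the models it has downloaded, so a few lines of Python double as a health check (this assumes the default local endpoint):

    import requests

    try:
        resp = requests.get("http://localhost:11434/api/tags", timeout=5)
        resp.raise_for_status()
        models = [m["name"] for m in resp.json().get("models", [])]
        print("Ollama is up; models:", ", ".join(models) or "none downloaded yet")
    except requests.exceptions.ConnectionError:
        print("Cannot reach Ollama - start it with: ollama serve")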

Future Improvements

  • Add support for response streaming (a rough sketch follows this list)
  • Implement more detailed comparison metrics
  • Add export functionality for responses
  • Support for comparing more than two models
  • Chat history visualization
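
On the streaming item: Ollama already supports it at the API level. With "stream": True, /api/generate returns one JSON object per line, each carrying a chunk of text. A minimal sketch of what that feature could build on (not part of the current app; the model name is illustrative):

    import json
    import requests

    with requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3", "prompt": "Write a haiku.", "stream": True},
        stream=True,
        timeout=120,
    ) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if not line:
                continue
            chunk = json.loads(line)
            print(chunk.get("response", ""), end="", flush=True)  # print each chunk as it arrives
            if chunk.get("done"):
                break
        print()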

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Ollama for making local LLMs accessible
  • Gradio for the web interface framework
