Feature Request: Add --upload to llama-bench #14791

@red-co

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Add a new command-line flag --upload to llama-bench to enable optional submission of benchmarking results to a public repository (e.g., github.com/ggerganov/llamacpp-bench).

When invoked with --upload, llama-bench should package the benchmark result (model info, system and hardware info, test parameters, parameter selection, GPU driver version, inference backend, and performance output) and upload it to a designated GitHub Pages-based static dataset, using either direct GitHub API integration or a user-assisted PR generation workflow.
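As a rough illustration of the packaging step, here is a minimal Python sketch. It is not a proposed implementation for llama-bench itself, which is written in C++. It assumes the benchmark results have already been parsed from llama-bench's JSON output into a list of dictionaries. The function names `build_submission` and `submission_filename`, the record layout, and the `results/` path are all hypothetical and chosen only for this example.

```python
import hashlib
import json
import platform


def build_submission(bench_results, extra_info=None):
    """Wrap parsed benchmark results with basic host metadata.

    bench_results: list of dicts parsed from the benchmark's JSON output
                   (treated as opaque here; no field names are assumed).
    extra_info:    optional dict for details the script cannot detect,
                   e.g. GPU driver version or inference backend.
    """
    return {
        "schema_version": 1,  # hypothetical versioning field
        "system": {
            "os": platform.system(),
            "machine": platform.machine(),
        },
        "extra": dict(extra_info or {}),
        "results": bench_results,
    }


def submission_filename(record):
    """Derive a deterministic file name from the record's content.

    Serializing with sorted keys makes the hash independent of dict
    ordering, so an identical submission maps to the same path.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"results/{digest[:16]}.json"
```

The input could come from running llama-bench with its JSON output option (`-o json`) and loading the result with `json.load`. A content-derived file name is one possible way to let a static dataset accept many submissions without a central ID allocator, whether the file is pushed through the GitHub API or added in a user-assisted PR.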

This will allow the community to build and maintain a transparent, decentralized hardware-throughput leaderboard for llama.cpp.

Motivation

Currently, there is no centralized or reliable platform for collecting and comparing llama.cpp benchmark results across diverse hardware configurations.

Existing performance metrics, such as memory bandwidth or FLOPs, do not translate well to real-world llama.cpp inference performance, which is affected by implementation-specific factors (e.g., parameter selection, GPU driver version, optimization level, context window size).

llama-bench already functions as a consistent and reproducible benchmarking tool (similar to 7z b in compression), which makes it a natural basis for collecting results. Many people are willing to share their optimization tips, such as using a specific version of the NVIDIA, AMD, or Intel GPU drivers instead of the latest ones, or applying a specific patch that can double the speed of llama.cpp.

Enabling upload to a community leaderboard would also promote fair comparison across systems and make it easier to evaluate optimizations, regressions, or vendor-specific tradeoffs.

Possible Implementation

No response

Metadata

Labels: enhancement
