Feature Request: Add --upload to llama-bench

### Prerequisites

- [x] I am running the latest code. Mention the version if possible as well.
- [x] I carefully followed the [README.md](https://github.com/ggml-org/llama.cpp/blob/master/README.md).
- [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- [x] I reviewed the [Discussions](https://github.com/ggml-org/llama.cpp/discussions), and have a new and useful enhancement to share.

### Feature Description

Add a new command-line flag `--upload` to `llama-bench` to enable optional submission of benchmarking results to a public repository (e.g., [`github.com/ggerganov/llamacpp-bench`](https://github.com/ggerganov/llamacpp-bench)).

When invoked with `--upload`, `llama-bench` should package the benchmark result (model info, system info, hardware info, test parameters, parameter selection, GPU driver version, inference backend, and performance output) and then upload it to a designated GitHub Pages-based static dataset, using either direct GitHub API integration or a user-assisted PR generation workflow.
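To make the packaging step concrete, here is a minimal Python sketch of what a submission payload could look like. It assumes the benchmark was run with `llama-bench -o json` and that the output is a JSON array of records. The function name `package_results` and the envelope fields (`schema_version`, `submitted_at`, `system`, `gpu_driver`, `results`) are hypothetical and not an existing format; the sample record is made up for illustration and its field names may differ from real output.

```python
import json
import platform
from datetime import datetime, timezone


def package_results(bench_json: str, gpu_driver: str = "unknown") -> dict:
    """Wrap llama-bench JSON output in a hypothetical submission envelope."""
    # Assumption: `llama-bench -o json` printed a JSON array of test records.
    results = json.loads(bench_json)
    if not isinstance(results, list):
        raise ValueError("expected a JSON array of benchmark records")
    return {
        "schema_version": 1,  # hypothetical versioning field
        "submitted_at": datetime.now(timezone.utc).isoformat(),
        "system": {
            "os": platform.system(),
            "machine": platform.machine(),
            # The driver version is passed in by the caller; detecting it
            # portably is outside the scope of this sketch.
            "gpu_driver": gpu_driver,
        },
        "results": results,
    }


if __name__ == "__main__":
    # Illustrative record only, not captured from a real run.
    sample = '[{"model_type": "example 7B Q4_0", "n_prompt": 512, "avg_ts": 100.0}]'
    print(json.dumps(package_results(sample), indent=2))
```

Keeping the raw `llama-bench` records untouched inside `results` means the leaderboard side can re-parse them even if the envelope changes later.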

This will allow the community to build and maintain a transparent, decentralized hardware-throughput leaderboard for `llama.cpp`.


### Motivation

Currently, there is no centralized or reliable platform for collecting and comparing `llama.cpp` benchmark results across diverse hardware configurations.

Existing hardware metrics such as memory bandwidth or FLOPS do not translate well to real-world `llama.cpp` inference performance, which is also affected by implementation-specific factors (e.g., parameter selection, GPU driver version, optimization level, context window size).

`llama-bench` already functions as a consistent and reproducible benchmarking tool (similar to `7z b` for compression), which makes it a natural base for shared results. Many people are willing to share their optimization tips, such as using a specific version of the NVIDIA, AMD, or Intel GPU drivers instead of the latest one, or applying a specific patch that doubles the speed of `llama.cpp`.

Enabling upload to a community leaderboard would also promote fair comparison across systems and make it easier to evaluate optimizations, regressions, or vendor-specific tradeoffs.


### Possible Implementation

_No response_

Feature Request: Add --upload to llama-bench #14791
