
Feature Request: Direct FP8 conversion from convert_hf_to_gguf.py #14762

@bmtwl

Description

Prerequisites

  • I am running the latest code. Mention the version if possible as well.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new and useful enhancement to share.

Feature Description

Hello, I've long wanted a way to go straight from FP8 models to a Q8 GGUF, and took a swing at it by modifying prepare_tensors to track the FP8 tensors, pair each one with its scale tensor, and multiply the two into a float32 tensor: https://github.com/bmtwl/llama.cpp/blob/convert-hf-from-fp8/convert_hf_to_gguf.py
The conversion starts correctly, but fails at write time due to a size/shape mismatch, which is where I'm losing the plot:
RuntimeError: The size of tensor a (18432) must match the size of tensor b (144) at non-singleton dimension 1
I'm hoping the remaining issue is something small. Anyone want to take a look and figure out where I'm going wrong?
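For what it's worth, the mismatch is a clean factor of 128 (18432 = 144 × 128), which points at block-wise scales: the checkpoint stores one scale value per 128×128 block, so the scale tensor has to be expanded to the weight's full shape before the multiply. Here is a minimal dequantization sketch under that assumption (the block size of 128 and the e4m3 storage format are guesses about the checkpoint layout, not confirmed):

```python
import torch

def dequant_fp8_blockwise(weight: torch.Tensor, scale: torch.Tensor,
                          block_size: int = 128) -> torch.Tensor:
    # weight: (M, N), stored as float8_e4m3fn (assumed)
    # scale:  (ceil(M / block_size), ceil(N / block_size)), one value per block
    # Expand each per-block scale over its whole block; multiplying the raw
    # (M // 128, N // 128) scale into an (M, N) weight is exactly what raises
    # "size of tensor a (18432) must match the size of tensor b (144)".
    scale_full = scale.repeat_interleave(block_size, dim=0) \
                      .repeat_interleave(block_size, dim=1)
    # Trim in case M or N is not an exact multiple of block_size.
    scale_full = scale_full[: weight.shape[0], : weight.shape[1]]
    return weight.to(torch.float32) * scale_full.to(torch.float32)
```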

Motivation

Right now, FP8 safetensors need to be converted to BF16 before being quantized into anything else, a step that is theoretically unnecessary.

Possible Implementation

Deal with the scale tensors during conversion, and force the FP8 tensors into another format (e.g. float32) before quantizing.
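One way that could slot into the conversion loop (a hedged sketch, not the author's branch: the `.weight_scale_inv` suffix, the tensor ordering, and the pairing logic are all assumptions about the checkpoint layout):

```python
import torch

# Relies on dequant_fp8_blockwise() from the sketch above.
pending_scales: dict[str, torch.Tensor] = {}

def handle_tensor(name: str, data: torch.Tensor) -> torch.Tensor | None:
    """Pair each FP8 weight with its scale tensor; emit float32 or nothing."""
    if name.endswith(".weight_scale_inv"):
        # Stash the scale under the name of the weight it belongs to
        # (hypothetical naming convention).
        pending_scales[name.removesuffix("_scale_inv")] = data
        return None  # consumed; nothing to write for the scale itself
    if data.dtype == torch.float8_e4m3fn:
        # Assumes the scale tensor was seen before its matching weight.
        scale = pending_scales.pop(name)
        return dequant_fp8_blockwise(data, scale)
    return data  # non-FP8 tensors pass through unchanged
```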
