@ngxson ngxson commented Nov 7, 2025

This PR is a demo. It will definitely break models other than kimi-k2.

IMPORTANT: This requires deleting the "quantization_config" section in config.json. Instead of deleting it, you can also rename it:

image
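
For anyone who prefers to script the config change, here is a minimal sketch of the rename. It assumes config.json is a plain JSON object with a top-level "quantization_config" key; the replacement key name `_quantization_config` and both function names are hypothetical, picked only for illustration, and are not taken from this PR.

```python
import json
from pathlib import Path

OLD_KEY = "quantization_config"


def rename_quantization_config(config, new_key="_quantization_config"):
    """Return a copy of `config` with the quantization section renamed.

    `new_key` is a hypothetical name; any key other than
    "quantization_config" should have the same effect. Key order is kept
    and the input dict is not modified.
    """
    return {(new_key if key == OLD_KEY else key): value
            for key, value in config.items()}


def patch_config_file(path):
    """Rewrite config.json in place; returns True if the key was present."""
    path = Path(path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if OLD_KEY not in config:
        return False
    patched = rename_quantization_config(config)
    path.write_text(json.dumps(patched, indent=2) + "\n", encoding="utf-8")
    return True
```

Renaming rather than deleting keeps the original quantization metadata around in case it is needed later.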

How it works: we map int4 --> GGML's Q4_0. The original scale is BF16 and is converted to F16, since Q4_0 only supports F16 scales.
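
A rough sketch of that mapping, written with only the standard library so it can be run as is. It is not the code from this PR. It assumes the source weights are already unpacked to signed int4 values in [-8, 7], grouped 32 per scale to match the Q4_0 block size, and that the BF16 scale is available as its raw 16-bit pattern. The Q4_0 side follows ggml's reference layout: a 2-byte F16 scale followed by 16 bytes, where byte j holds element j in its low nibble and element j + 16 in its high nibble. All function names here are hypothetical.

```python
import struct

QK4_0 = 32  # elements per Q4_0 block


def bf16_bits_to_float(bits):
    """Widen a raw BF16 bit pattern to a Python float via float32."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def pack_q4_0_block(values, scale):
    """Pack 32 signed int4 values and one scale into an 18-byte Q4_0 block.

    Assumption: `values` are symmetric int4 in [-8, 7]. Q4_0 stores
    q = value + 8 and dequantizes as (q - 8) * d.
    """
    if len(values) != QK4_0:
        raise ValueError("a Q4_0 block holds exactly 32 values")
    if any(v < -8 or v > 7 for v in values):
        raise ValueError("values must fit in signed int4")
    half = QK4_0 // 2
    qs = bytearray(half)
    for j in range(half):
        lo = values[j] + 8         # element j -> low nibble
        hi = values[j + half] + 8  # element j + 16 -> high nibble
        qs[j] = lo | (hi << 4)
    # "<e" is IEEE half precision; it raises OverflowError if the scale
    # is outside the F16 range, which BF16 scales can be in principle.
    return struct.pack("<e", scale) + bytes(qs)


def unpack_q4_0_block(block):
    """Dequantize one Q4_0 block back to 32 floats."""
    d = struct.unpack("<e", block[:2])[0]
    qs = block[2:]
    half = QK4_0 // 2
    out = [0.0] * QK4_0
    for j in range(half):
        out[j] = ((qs[j] & 0x0F) - 8) * d
        out[j + half] = ((qs[j] >> 4) - 8) * d
    return out
```

If the source checkpoint orders its nibbles differently from this layout, the values have to be reordered before packing; a plain bit-for-bit copy of the packed bytes would yield permuted weights.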

TODO: correct the nibble layout; it seems to be in reversed order.

cc @jukofyork

@github-actions github-actions bot added the python python script changes label Nov 7, 2025