Manual model merges #555
Comments
This should also work for EXL2 models, assuming you duplicate/rename all the sub-keys for each layer as well. The changes to config.json should be the same as for an HF model. Do note that the quantization of each layer is calibrated to the expected output of the previous layer, not to a copy of the same layer, so it's hard to predict how well this works if you're not starting from the original model and quantizing afterwards. But then, I guess merges and self-merges were never really predictable to begin with.
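As a quick way to see what "all the sub-keys for each layer" means in practice, the sketch below lists every tensor stored under one layer's prefix in a safetensors file. This is only illustrative; the file name and the `model.layers.N.` key prefix are assumptions, not details taken from this thread.

```python
# Hypothetical inspection snippet: list every tensor stored under one layer's
# prefix in a safetensors shard, to see which sub-keys a duplicate would need.
# The file name and key prefix are assumptions, not taken from this thread.
from safetensors import safe_open

with safe_open("output.safetensors", framework="pt") as f:
    layer_keys = [k for k in f.keys() if k.startswith("model.layers.3.")]

for key in sorted(layer_keys):
    print(key)
```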
I used dynamic relayering and it worked well, but duplicating the model layers in the safetensors didn't work. By dynamic, I mean I load the weights into memory, copy.copy the weights, and finally rebuild the cache. In fact, my best results are from dynamic exl2 experiments; I can't get the same great results even with the original BFloat16 weights!
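For context, a minimal sketch of what in-memory layer duplication can look like is below. The commenter's exl2 code (and its cache rebuild) isn't shown in the thread, so this uses a Hugging Face Llama-style model instead, and the attribute names here are assumptions about the transformers case only.

```python
# Hedged sketch of in-memory "dynamic relayering" on an HF Llama-style model.
# The exl2 equivalent is not shown in this thread; the attributes below are
# assumptions about the transformers case only.
import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.bfloat16
)

layers = model.model.layers              # nn.ModuleList of decoder blocks
layers.insert(4, copy.copy(layers[3]))   # shallow copy: shares layer 3's weights
model.config.num_hidden_layers = len(layers)

# A shallow copy also shares sub-modules, so anything keyed by a per-layer
# index (such as the KV cache) has to be rebuilt or handled after this step.
```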
@turboderp I have some questions about caching. Do you have time for an online chat via Gmeet or Zoom?
Hi Turbo,
I am interested in doing some model self-merges. Currently, I do this with a script for Hugging Face models.
Basically, I calculate a layer mapping, e.g. to duplicate layer 3:
{1: 1, 2: 2, 3: 3, 4: 3, 5: 4}
Then I go through the safetensors files, duplicate the tensors based on these layer numbers, and generate new keys with the right layer name (e.g. model.layer.3.up.mlp -> model.layer.6.up.mlp). Finally, I update the model's config.json with the new number of layers. This works for transformers models, but not for exl2 models. What else would I need to do?
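For illustration, here is a rough sketch of that HF-side procedure under a few assumptions: a single (non-sharded) safetensors file, standard `model.layers.N.` key prefixes, and a 0-indexed version of the mapping. None of these details come from the thread itself; a sharded checkpoint would also need its index file rewritten.

```python
# Hypothetical sketch of the self-merge script described above. File names,
# the 0-indexed mapping and the key prefix are placeholders for illustration.
import json
import re
from safetensors.torch import load_file, save_file

# new layer index -> source layer index (layer 3 is duplicated here)
layer_map = {0: 0, 1: 1, 2: 2, 3: 3, 4: 3, 5: 4}

src = load_file("model.safetensors")
dst = {}

# duplicate/rename per-layer tensors according to the mapping
for new_idx, old_idx in layer_map.items():
    prefix = f"model.layers.{old_idx}."
    for key, tensor in src.items():
        if key.startswith(prefix):
            dst[f"model.layers.{new_idx}." + key[len(prefix):]] = tensor.clone()

# keep everything that is not a per-layer tensor (embeddings, final norm, head)
for key, tensor in src.items():
    if not re.match(r"^model\.layers\.\d+\.", key):
        dst[key] = tensor

save_file(dst, "model-selfmerge.safetensors")

# finally, bump the layer count in config.json
with open("config.json") as f:
    config = json.load(f)
config["num_hidden_layers"] = len(layer_map)
with open("config.json", "w") as f:
    json.dump(config, f, indent=2)
```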