
Error in quant #587

Open
Orion-zhen opened this issue Aug 8, 2024 · 2 comments

@Orion-zhen
Contributor

When converting nemolita-21b, which is a merged model, convert.py fails with this error:

Traceback (most recent call last):
  File "/home/orion/repo/exllamav2/convert.py", line 1, in <module>
    import exllamav2.conversion.convert_exl2
  File "/home/orion/repo/exllamav2/exllamav2/conversion/convert_exl2.py", line 283, in <module>
    optimize(job, save_job, model)
  File "/home/orion/repo/exllamav2/exllamav2/conversion/optimize.py", line 167, in optimize
    logerr += math.log(err)
              ^^^^^^^^^^^^^
ValueError: math domain error
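`math.log` is only defined for strictly positive arguments, so the `ValueError: math domain error` at `optimize.py:167` means at least one measured per-layer `err` value fed into the accumulator was zero or negative. A minimal reproduction of the failure mode:

```python
import math

# math.log raises ValueError for zero and negative inputs;
# a measured error of exactly 0.0 is enough to crash the accumulator.
for err in (1e-6, 0.0, -1e-6):
    try:
        print(f"log({err}) = {math.log(err)}")
    except ValueError as e:
        print(f"log({err}) -> ValueError: {e}")
```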

System info:

  • exllamav2: built from the latest repo
  • pytorch: 2.4.0
  • cuda: 12.5

quant command:

python convert.py -i /path/to/nemolita-21b -o ./6.5 -cf /path/to/nemolita-21b-6.5  -r 256 -b 6.5 -hb 8
@turboderp
Owner

If I'm reading the model cards correctly, this model is made by mixing four other models in bfloat16, so it may or may not be normalized properly after that, and then that merged model was fused back together with the original instruct model...

It's really hard to speculate as to what might be wrong when a merged model fails to convert. It's not really a well-defined quantity to begin with. From the error message I would guess maybe an overflow during measurement? Do you have some output from the quantization to help give a clue?

@Async0x42

Async0x42 commented Sep 10, 2024

I hit this same error trying to quantize https://huggingface.co/ArliAI/Llama-3.1-70B-ArliAI-RPMax-v1.1

Is there anything I can do on my end to help troubleshoot this? I was able to create measurements without any issue, but quantizing it fails on this step.

Here is the measurement.json for this model:

measurement.json

 !! Note: Overriding options with settings from existing job
 !! Job is already finished
 -- Beginning new job
 !! Warning: Output directory is not empty: Z:/Raw_Models/Llama-3.1-70B-ArliAI-RPMax-v1.1_TEMP2
 !! Cleaning output directory: Z:/Raw_Models/Llama-3.1-70B-ArliAI-RPMax-v1.1_TEMP2
 -- Input: Z:/Raw_Models/Llama-3.1-70B-ArliAI-RPMax-v1.1
 -- Output: Z:/Raw_Models/Llama-3.1-70B-ArliAI-RPMax-v1.1_TEMP2
 -- Using default calibration dataset
 -- Target bits per weight: 3.0 (decoder), 6 (head)
 -- Max shard size: 8192 MB
 -- Full model will be compiled to: P:/Models/async0x42_Llama-3.1-70B-ArliAI-RPMax-v1.1-exl2_3.50bpw/
 -- Reusing measurement: Z:/Raw_Models/Llama-3.1-70B-ArliAI-RPMax-v1.1/measurement.json
 -- Optimizing...
 -- Optimizing:    1/ 240
 -- Optimizing:   19/ 240
 -- Optimizing:   37/ 240
 -- Optimizing:   55/ 240
 -- Optimizing:   73/ 240
 -- Optimizing:   80/ 240
 -- Optimizing:   98/ 240
 -- Optimizing:  116/ 240
 -- Optimizing:  134/ 240
 -- Optimizing:  152/ 240
 -- Optimizing:  170/ 240
 -- Optimizing:  188/ 240
 -- Optimizing:  206/ 240
 -- Optimizing:  224/ 240
 -- max(err): 0.045496
 -- error_norm: 1.897152
 -- Quantization strategy:
 --   model.layers.0.self_attn                           4.1747 bpw - exp. error: 0.01745901
 --   model.layers.0.mlp                                 2.2361 bpw - exp. error: 0.02964070
 --   model.layers.1.self_attn                           4.1747 bpw - exp. error: 0.02239446
 --   model.layers.1.mlp                                 2.2361 bpw - exp. error: 0.03393357
 --   model.layers.2.self_attn                           6.2434 bpw - exp. error: 0.00387378
 --   model.layers.2.mlp                                 3.3615 bpw - exp. error: 0.01970356
 --   model.layers.3.self_attn                           4.1747 bpw - exp. error: 0.02133299
 --   model.layers.3.mlp                                 4.2559 bpw - exp. error: 0.00623711
 --   model.layers.4.self_attn                           2.1243 bpw - exp. error: 0.00331855
 --   model.layers.4.mlp                                 2.2361 bpw - exp. error: 0.00501283
 --   model.layers.5.self_attn                           2.1243 bpw - exp. error: 0.00383393
 --   model.layers.5.mlp                                 2.2361 bpw - exp. error: 0.00575347
 --   model.layers.6.self_attn                           2.2254 bpw - exp. error: 0.00298922
 --   model.layers.6.mlp                                 2.2361 bpw - exp. error: 0.00677872
 --   model.layers.7.self_attn                           2.1794 bpw - exp. error: 0.00327658
 --   model.layers.7.mlp                                 2.2361 bpw - exp. error: 0.00757924
 --   model.layers.8.self_attn                           2.2254 bpw - exp. error: 0.00407865
 --   model.layers.8.mlp                                 2.2361 bpw - exp. error: 0.00815845
 --   model.layers.9.self_attn                           2.1243 bpw - exp. error: 0.00681557
 --   model.layers.9.mlp                                 2.2361 bpw - exp. error: 0.00932074
 --   model.layers.10.self_attn                          3.1477 bpw - exp. error: 0.00530455
 --   model.layers.10.mlp                                2.2361 bpw - exp. error: 0.01058968
 --   model.layers.11.self_attn                          2.1243 bpw - exp. error: 0.00753149
 --   model.layers.11.mlp                                2.2361 bpw - exp. error: 0.01250959
 --   model.layers.12.self_attn                          2.6594 bpw - exp. error: 0.00754494
 --   model.layers.12.mlp                                2.2361 bpw - exp. error: 0.01361098
 --   model.layers.13.self_attn                          2.2254 bpw - exp. error: 0.01268869
 --   model.layers.13.mlp                                2.3168 bpw - exp. error: 0.01427771
 --   model.layers.14.self_attn                          3.1477 bpw - exp. error: 0.01084663
 --   model.layers.14.mlp                                2.2361 bpw - exp. error: 0.01732151
 --   model.layers.15.self_attn                          2.1243 bpw - exp. error: 0.00000000
Traceback (most recent call last):
  File "D:\AI\exllamav2\convert.py", line 1, in <module>
    import exllamav2.conversion.convert_exl2
  File "D:\AI\exllamav2\exllamav2\conversion\convert_exl2.py", line 283, in <module>
    optimize(job, save_job, model)
  File "D:\AI\exllamav2\exllamav2\conversion\optimize.py", line 167, in optimize
    logerr += math.log(err)
              ^^^^^^^^^^^^^
ValueError: math domain error
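Note that the last strategy line before the crash, `model.layers.15.self_attn`, reports `exp. error: 0.00000000`, and `math.log(0)` is exactly the domain error raised at `optimize.py:167`. A minimal sketch of a workaround, clamping each measured error to a small positive floor before taking the log (hypothetical, not the actual exllamav2 fix; the `ERR_FLOOR` value is an assumption):

```python
import math

# Hypothetical guard: clamp each measured error to a small positive
# floor so math.log never sees zero. The 1e-10 floor is an assumption.
ERR_FLOOR = 1e-10

def accumulate_log_error(errors):
    logerr = 0.0
    for err in errors:
        logerr += math.log(max(err, ERR_FLOOR))
    return logerr

# Values taken from the strategy dump above, including the
# zero that crashes the unguarded accumulator.
errors = [0.01732151, 0.00000000, 0.02964070]
print(accumulate_log_error(errors))  # finite, no ValueError
```

Whether clamping is the right behavior (versus skipping the layer or treating a zero error as a measurement bug) is a judgment call for the maintainers.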
