
Quantized model conversion #4

Open
PlanetesDDH opened this issue Oct 29, 2024 · 3 comments

Comments


PlanetesDDH commented Oct 29, 2024

I tried converting 7B quantized models and all attempts failed: AWQ reports that only GPU is supported, and GPTQ reports that .float() is not supported for quantized models. Have you ever converted a quantized model, or is converting quantized models simply not possible? I saw a loss.float() in test, but I have no idea where to start. Thanks!

PlanetesDDH (Author) commented:

I see "Support gguf model conversion (currently only support q4_0 and fp16)." Does that mean only gguf quantization formats are supported?

wudingjian (Owner) commented:

> I tried converting 7B quantized models and all attempts failed: AWQ reports that only GPU is supported, and GPTQ reports that .float() is not supported for quantized models. Have you ever converted a quantized model, or is converting quantized models simply not possible? I saw a loss.float() in test, but I have no idea where to start. Thanks!

I have converted quantized models, but I don't recommend it.

rkllm-toolkit/model/test.py
ret = llm.build(do_quantization=True, optimization_level=1, quantized_dtype='w8a8', target_platform='rk3588')

For details, see:
https://github.com/airockchip/rknn-llm/blob/main/doc/Rockchip_RKLLM_SDK_CN_1.1.0.pdf

rk3588 supports four quantization types: w8a8, w8a8_g128, w8a8_g256, and w8a8_g512.
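
For context, here is a minimal end-to-end conversion sketch modeled on the toolkit's test.py example. The model path is a placeholder, and the call names follow the RKLLM SDK 1.1.0 examples; exact parameters may differ in other SDK versions:

from rkllm.api import RKLLM

# Placeholder path to a local full-precision Hugging Face model directory.
modelpath = './Qwen2-7B-Instruct'

llm = RKLLM()

# Load the original (non-quantized) Hugging Face model.
ret = llm.load_huggingface(model=modelpath)
if ret != 0:
    raise SystemExit('Load model failed!')

# Quantize during the build step; on rk3588, quantized_dtype can be
# w8a8, w8a8_g128, w8a8_g256, or w8a8_g512.
ret = llm.build(do_quantization=True, optimization_level=1,
                quantized_dtype='w8a8', target_platform='rk3588')
if ret != 0:
    raise SystemExit('Build model failed!')

# Export the converted model for deployment on the board.
ret = llm.export_rkllm('./model.rkllm')
if ret != 0:
    raise SystemExit('Export model failed!')

Note that in this flow the toolkit performs the quantization itself during build, starting from a full-precision checkpoint rather than an already-quantized AWQ/GPTQ model.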

wudingjian (Owner) commented:

> I see "Support gguf model conversion (currently only support q4_0 and fp16)." Does that mean only gguf quantization formats are supported?

★ You need more than 32 GB of RAM or the conversion will fail. Be sure to close other applications before converting, so that the process does not run out of resources and fail.
