I tried converting 7B quantized models and every attempt failed: AWQ reports that only GPU is supported, and GPTQ reports that .float() is not supported for quantized models. Have you ever converted a quantized model, or is converting quantized models simply not possible? I noticed a loss.float() in test, but I can't see where to start from there. Thanks!
I see "Support gguf model conversion (currently only support q4_0 and fp16)." Does that mean only gguf quantization formats are supported?
I have converted quantized models before, but I don't recommend it.

From rkllm-toolkit/model/test.py:

```python
ret = llm.build(do_quantization=True, optimization_level=1, quantized_dtype='w8a8', target_platform='rk3588')
```

For details, see: https://github.com/airockchip/rknn-llm/blob/main/doc/Rockchip_RKLLM_SDK_CN_1.1.0.pdf

rk3588 supports four quantization types: w8a8, w8a8_g128, w8a8_g256, and w8a8_g512.

★ You need more than 32 GB of RAM, or the conversion will fail. Be sure to close all other applications before converting, so the conversion doesn't fail from insufficient resources.
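For reference, here is a minimal end-to-end conversion sketch built around the build() call quoted above. The load_huggingface and export_rkllm calls, their arguments, and the model path are assumptions modeled on the toolkit's example script (rkllm-toolkit/model/test.py); verify the exact names and signatures against the SDK PDF linked above.

```python
# Minimal RKLLM conversion sketch (assumed API, modeled on rkllm-toolkit/model/test.py).
# Only the build() arguments are taken verbatim from the reply above; the other
# calls and the model path are assumptions -- check the SDK documentation.
from rkllm.api import RKLLM

llm = RKLLM()

# Load the original (non-quantized) Hugging Face checkpoint. The toolkit performs
# its own quantization during build(), so start from fp16/fp32 weights rather than
# an AWQ/GPTQ checkpoint.
ret = llm.load_huggingface(model='./Qwen-7B-Chat')  # hypothetical local path
if ret != 0:
    raise RuntimeError('load_huggingface failed')

# Quantize to w8a8 for rk3588; w8a8_g128 / w8a8_g256 / w8a8_g512 are the
# grouped variants listed above.
ret = llm.build(do_quantization=True, optimization_level=1,
                quantized_dtype='w8a8', target_platform='rk3588')
if ret != 0:
    raise RuntimeError('build failed')

# Export the converted model for deployment on the board.
ret = llm.export_rkllm('./qwen-7b-w8a8.rkllm')
if ret != 0:
    raise RuntimeError('export_rkllm failed')
```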