
Implementation details: GEMM_W4A4::quantize behavior differs from the paper #33

Closed
xmfbit opened this issue Nov 22, 2024 · 1 comment


xmfbit commented Nov 22, 2024

Hello author, after delving into your implementation code, I found that the quantize method in GEMM_W4A4 does not align with what is presented in the paper. I computed both the smoothed version smoothed(x) @ lora_down and the un-smoothed version x @ lora_down, and neither matches qact.lora_act. Could you please explain this?

Thank you.

@lmxyy lmxyy added the question Further information is requested label Jan 14, 2025
synxlin (Collaborator) commented Feb 14, 2025

Hi @xmfbit ,

Your observation is correct. In Nunchaku, the implementation is actually $XW = (X / s)(s * W) = (X / s)[L_1 L_2 + R] \approx (X / s) L_1 L_2 + Q(X / s)Q(R) = X L_1' L_2 + Q(X / s)Q(R)$, where $s$ is the smoothing factor, $s * W = L_1 L_2 + R$, and $L_1' = L_1 / s$. We have to un-smooth the low-rank branch during the conversion from deepcompressor to Nunchaku.

The two implementations $(X / s) L_1 L_2 + Q(X / s)Q(R)$ and $X L_1' L_2 + Q(X / s)Q(R)$ are mathematically equivalent, since $L_1' = L_1 / s$. In the paper and the deepcompressor package, the goal is to find a better combination of $L_1 L_2$ and $R$. In the Nunchaku package, the goal is to compute the final result with a faster and simpler implementation.
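The equivalence of the two low-rank branches can be checked numerically. The sketch below is illustrative only and does not use Nunchaku's actual kernels: the shapes, the truncated-SVD decomposition of $s * W$, and the stand-in round-to-nearest quantizer `Q` are all assumptions, not the library's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy shapes (hypothetical): batch n, in/out features, low rank r.
n, d_in, d_out, r = 4, 16, 16, 2

X = rng.standard_normal((n, d_in))
W = rng.standard_normal((d_in, d_out))
s = rng.uniform(0.5, 2.0, size=d_in)           # per-channel smoothing factor

# Decompose the smoothed weight s * W into a low-rank part L1 @ L2 plus a
# residual R. A truncated SVD stands in for the paper's decomposition.
U, S, Vt = np.linalg.svd(s[:, None] * W)
L1 = U[:, :r] * S[:r]                          # (d_in, r)
L2 = Vt[:r]                                    # (r, d_out)
R = s[:, None] * W - L1 @ L2                   # residual, handled by the 4-bit path

# Paper / deepcompressor form: smooth the activation, then apply L1 @ L2.
lora_paper = (X / s) @ L1 @ L2
# Nunchaku form: fold 1/s into L1 once at conversion time, keep X un-smoothed.
L1_prime = L1 / s[:, None]
lora_nunchaku = X @ L1_prime @ L2

assert np.allclose(lora_paper, lora_nunchaku)  # the low-rank branches agree

# Full output with a crude symmetric round-to-nearest quantizer as a stand-in
# for Q (NOT Nunchaku's real W4A4 quantizer).
def Q(t, bits=4):
    scale = np.abs(t).max() / (2 ** (bits - 1) - 1)
    return np.round(t / scale) * scale

out = lora_nunchaku + Q(X / s) @ Q(R)
```

Folding $1/s$ into $L_1$ once during conversion means the runtime kernel never needs the separate smoothed activation for the low-rank branch, which is the "faster and simpler" point above.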

xmfbit closed this as completed Feb 14, 2025