Hello,

I am currently working on implementing tensor parallelism and need some guidance on how to split AWQ weights properly. Here's the current state of the AWQ weights I'm working with:
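The shapes below are illustrative stand-ins for the real checkpoint tensors, assuming the AWQ GEMM layout (`qweight`, `qzeros`, `scales`) for a 4-bit 4096 → 14336 projection with group size 128, i.e. 8 int4 values packed into each int32 along the output dimension:

```python
import torch

# Illustrative stand-ins for the checkpoint tensors, assuming the AWQ GEMM
# layout for a 4-bit 4096 -> 14336 projection with group_size 128
# (8 int4 values packed into each int32 along the output dimension).
in_features, out_features, group_size, pack_factor = 4096, 14336, 128, 8

qweight = torch.randint(0, torch.iinfo(torch.int32).max,
                        (in_features, out_features // pack_factor), dtype=torch.int32)
qzeros = torch.randint(0, torch.iinfo(torch.int32).max,
                       (in_features // group_size, out_features // pack_factor), dtype=torch.int32)
scales = torch.rand(in_features // group_size, out_features).half()

print(qweight.shape, qzeros.shape, scales.shape)
# torch.Size([4096, 1792]) torch.Size([32, 1792]) torch.Size([32, 14336])
```

To split the weights, I used the following approach: slice every tensor along the output dimension, taking whole int32 columns from the packed tensors. The slice points below assume the illustrative shapes above; the real code slices the checkpoint tensors the same way.

```python
# Naive two-way column split: take the first / second half of the output
# dimension from every tensor (whole int32 columns for the packed tensors,
# real columns for the scales).
half_packed = qweight.shape[1] // 2   # 896 packed int32 columns per shard
half_cols = scales.shape[1] // 2      # 7168 output columns per shard

qweight_left, qweight_right = qweight[:, :half_packed], qweight[:, half_packed:]
qzeros_left, qzeros_right = qzeros[:, :half_packed], qzeros[:, half_packed:]
scales_left, scales_right = scales[:, :half_cols], scales[:, half_cols:]
```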
I also created a random input of shape (1, 2048, 4096) and performed a matrix multiplication with both the original and the split weights. However, the results do not match:
```python
>>> torch.allclose(out_left, out[:, :, :7168])
False
```
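For reference, the same slice-then-compare check passes for a plain unquantized weight, so I believe the mismatch comes from how the packed AWQ tensors need to be sliced rather than from the comparison itself. A minimal illustration:

```python
import torch

# Same slice-then-compare check on a plain float weight: multiplying by the
# first half of the output columns reproduces the corresponding slice of the
# full output, so the comparison itself is sound for an unquantized matrix.
w = torch.randn(4096, 14336)
x = torch.randn(1, 2048, 4096)

out = x @ w                   # (1, 2048, 14336)
out_left = x @ w[:, :7168]    # (1, 2048, 7168)

print(torch.allclose(out_left, out[:, :, :7168]))   # True
```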
Could someone advise on how to correctly split the AWQ weights to achieve effective tensor parallelism? Any help or suggestions would be greatly appreciated!
Thank you!