Mixed Precision Grouped Gemm with zero points and GPT-Q semantics closes #2261 #2457
base: main
Conversation
Sorry, running a bit behind. We will get to it soon.
@ankutalev Thanks for submitting this feature MR. Have you checked the functionality of this feature? Could you post the result of running this feature (example 69) here?
Yes, I checked - it shows "Disposition Passed" for all scenarios ({shuffled/unshuffled} X {direct convert, no zeros, zeros, gptq}). I can provide unit tests if you like. Also, I don't like the way I implemented the gptq mode switch, but runtime parameters seem "not cutlass style"; I will appreciate any advice and suggestions here =) We are interested in having this functionality in the main branch, because nobody likes to maintain patched forks =)
@Junkai-Wu Hi! Any updates here?
@ankutalev We are reviewing the changes internally. We will merge this PR once it is approved and merged in our internal repo.
Hi! Any news here? |
This PR has been labeled |
Hello!
This MR provides two things:
Closes #2261