How did you quantize the model? #5

Open
Ph0rk0z opened this issue Sep 28, 2024 · 0 comments

Comments

Ph0rk0z commented Sep 28, 2024

I have been trying to use other kernels with this implementation, but none of them load the state dict without a mismatch. I don't know whether Marlin AWQ is special, but in any case it would be nice to know how to quantize models. Even with this implementation, we don't have schnell.

Please post a quantization script or give some hints.
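
For anyone hitting the same state dict mismatch, a small diagnostic along these lines may help narrow down which tensors differ between kernels. This is only a sketch: `diff_state_dicts` is a hypothetical helper, not part of this repository, and the parameter names and shapes in the example are made up for illustration. It works on plain name-to-shape mappings, so it makes no assumption about how any particular kernel packs its weights.

```python
from typing import Dict, List, Mapping, Tuple

Shape = Tuple[int, ...]


def diff_state_dicts(
    expected: Mapping[str, Shape], found: Mapping[str, Shape]
) -> Dict[str, List]:
    """Compare two name -> shape mappings and report where they disagree.

    `expected` is what the model definition wants; `found` is what the
    checkpoint provides. Both are hypothetical inputs, built for example
    from a PyTorch state dict with:
        {name: tuple(t.shape) for name, t in state_dict.items()}
    """
    # Keys the model expects but the checkpoint does not contain.
    missing = sorted(k for k in expected if k not in found)
    # Keys present in the checkpoint that the model does not know about.
    unexpected = sorted(k for k in found if k not in expected)
    # Keys present on both sides whose shapes differ.
    shape_mismatch = sorted(
        (k, tuple(expected[k]), tuple(found[k]))
        for k in expected
        if k in found and tuple(expected[k]) != tuple(found[k])
    )
    return {
        "missing": missing,
        "unexpected": unexpected,
        "shape_mismatch": shape_mismatch,
    }


if __name__ == "__main__":
    # Made-up names and shapes, purely for illustration.
    model_shapes = {"layer.qweight": (64, 16), "layer.scales": (1, 128)}
    ckpt_shapes = {"layer.qweight": (16, 128), "layer.qzeros": (1, 16)}
    for kind, entries in diff_state_dicts(model_shapes, ckpt_shapes).items():
        print(kind, entries)
```

In PyTorch, `load_state_dict(..., strict=False)` also returns the missing and unexpected keys, but it still raises on shape mismatches, which is why the sketch compares shapes separately.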
