Just curious about AWQ's decision to implement its own custom int3/int4 GEMM kernels rather than using NVIDIA's CUTLASS library (such as using int4 CUTLASS GEMMs developed by FasterTransformer). Was this choice driven by performance limitations with CUTLASS, or did they perhaps encounter compatibility issues? Thanks in advance for the insights!