[Feature Request] W4A4 Quantization Support in torchao

Dear team,

I would like to inquire about the possibility of W4A4 quantization support in torchao.

Torchao has proven to be an excellent quantization inference tool, particularly with its comprehensive support for W8A8. However, regarding 4-bit operations, I've only noticed W4A8 implementation (which currently utilizes INT8 GEMM operators under the hood). Given that many modern GPUs now support INT4 GEMM operators with promising results, I was wondering if there are any plans to implement W4A4 in torchao?

Thank you for your attention to this matter.

Best regards

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Feature Request] W4A4 Quantization Support in torchao #1406

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[Feature Request] W4A4 Quantization Support in torchao #1406

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions