Always quantize attention math; drop the --quant-attention flag

dff1aa1
Draft

[Example] Add 2:4 sparsity -> INT8 SmoothQuant PTQ -> ONNX -> TensorRT pipeline #1664
