v1.13.0: 4-bit quantization, stateful models, Whisper

@echarlaix echarlaix released this 25 Jan 16:48
· 413 commits to main since this release

OpenVINO

Weight-only 4-bit quantization

optimum-cli export openvino --model gpt2 --weight-format int4_sym_g128 ov_model
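
The `int4_sym_g128` weight format requests symmetric 4-bit weights with a group size of 128, meaning each block of 128 weights shares one scale. The snippet below is only a conceptual sketch of that idea in plain Python; it is not the algorithm the exporter actually runs. The function names and the choice of the symmetric code range [-7, 7] are assumptions made for illustration.

```python
def quantize_int4_sym(weights, group_size=128):
    """Quantize a flat list of floats to signed 4-bit codes, one scale per group.

    Illustrative only: the [-7, 7] range is an assumption, not the
    exporter's exact scheme.
    """
    if len(weights) % group_size != 0:
        raise ValueError("number of weights must be a multiple of group_size")
    codes, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # Symmetric scheme: no zero point, the largest magnitude maps to +/-7.
        scale = max_abs / 7 if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest code and clamp to the symmetric range.
        codes.extend(max(-7, min(7, round(w / scale))) for w in group)
    return codes, scales


def dequantize_int4_sym(codes, scales, group_size=128):
    """Recover approximate float weights from the codes and per-group scales."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]
```

Under these assumptions, 256 weights yield two scales, and the reconstruction error of each weight is at most half of its group's scale. Smaller groups track local weight magnitudes more closely at the cost of storing more scales.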

Stateful

New architectures

Whisper

  • Add export and inference support for Whisper models by @eaidova in #470