
Intel® Extension for Transformers v1.0b Release

  • Highlights
  • Features
  • Productivity
  • Examples
  • Bug Fixing
  • Documentation
  • Validated Configurations

Highlights

  • Intel® Extension for Transformers provides more compression examples for popular applications such as Stable Diffusion: INT8 quantization with PyTorch and BF16 fine-tuning with Intel® Extension for PyTorch; a sketch of the BF16 setup follows.
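
As a minimal sketch of the BF16 fine-tuning side, assuming a toy model and random data in place of the Stable Diffusion UNet (this is not the release's exact recipe), the Intel® Extension for PyTorch setup looks like this:

```python
import torch
import torch.nn as nn
import intel_extension_for_pytorch as ipex

# Placeholder model and data; the release's example would use the Stable
# Diffusion UNet from diffusers instead (assumption: any nn.Module is
# handled the same way by ipex.optimize).
model = nn.Sequential(nn.Linear(64, 64), nn.GELU(), nn.Linear(64, 1))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
criterion = nn.MSELoss()

model.train()
# ipex.optimize casts weights to BF16 and applies CPU operator fusions;
# with an optimizer passed in, it returns a matching optimized pair.
model, optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)

for _ in range(3):  # placeholder training loop on random data
    x, y = torch.randn(8, 64), torch.randn(8, 1)
    with torch.cpu.amp.autocast(dtype=torch.bfloat16):
        loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Passing the optimizer to ipex.optimize lets the extension manage master weights alongside the BF16 cast, rather than casting the model alone.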

Features

  • Pruning/Sparsity
    • Support structured sparsity patterns N:M and NxM on PyTorch (25d5e4b); a masking sketch follows this list
  • Transformers-accelerated Neural Engine
    • Support inference on Windows (fc580d5)
  • Transformers-accelerated Libraries
    • Support INT8 Softmax operator (fece837); a reference sketch follows this list
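
To make the N:M pattern concrete: an N:M-sparse weight keeps at most N nonzeros in every contiguous group of M elements along the input dimension (2:4 keeps 2 of every 4), while an NxM pattern zeroes whole NxM blocks of the matrix. The helper below is a hypothetical magnitude-based illustration of N:M masking, not the extension's pruning implementation:

```python
import torch

def apply_n_m_sparsity(weight: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    """Keep the n largest-magnitude weights in every group of m along the
    input dimension and zero the rest (e.g. the 2:4 pattern)."""
    out_features, in_features = weight.shape
    assert in_features % m == 0, "input dim must be divisible by m"
    groups = weight.abs().reshape(out_features, in_features // m, m)
    keep = groups.topk(n, dim=-1).indices          # survivors in each group of m
    mask = torch.zeros_like(groups, dtype=torch.bool)
    mask.scatter_(-1, keep, True)
    return weight * mask.reshape(out_features, in_features)

w_sparse = apply_n_m_sparsity(torch.randn(8, 16), n=2, m=4)  # 50% structured sparsity
```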

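For the INT8 Softmax, the sketch below models only the quantization contract (dequantize, softmax in float, requantize); a production kernel would typically use integer-only approximations instead, and the helper name, scales, and zero points here are assumptions for illustration:

```python
import numpy as np

def int8_softmax_reference(x_q: np.ndarray, in_scale: float,
                           out_scale: float = 1 / 255,
                           out_zero_point: int = -128) -> np.ndarray:
    """Reference INT8 softmax: dequantize, softmax in float, requantize.
    Softmax outputs lie in [0, 1], so scale 1/255 with zero point -128
    maps that range onto signed int8."""
    x = x_q.astype(np.float32) * in_scale          # dequantize (zero point 0 assumed)
    x = x - x.max(axis=-1, keepdims=True)          # stabilize exp
    p = np.exp(x)
    p /= p.sum(axis=-1, keepdims=True)
    q = np.round(p / out_scale) + out_zero_point   # requantize onto the int8 grid
    return np.clip(q, -128, 127).astype(np.int8)

logits_q = np.random.randint(-128, 128, size=(2, 8), dtype=np.int8)
probs_q = int8_softmax_reference(logits_q, in_scale=0.1)
```
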
Productivity

  • Simplify the integration with Alibaba BladeDISC

Examples

Bug Fixing

  • Fix the Protobuf and ONNX version dependency issue
  • Fix memory leak in Neural Engine

Documentation

  • Create notebooks for Pruning, Compression Orchestration, and IPEX Quantization
  • Refine the user guide and compression examples

Validated Configurations

  • CentOS 8.4 & Ubuntu 20.04 & Windows 10
  • Python 3.7, 3.8, 3.9
  • TensorFlow 2.9.1, 2.10.0
  • PyTorch 1.11.0+cpu, 1.12.0+cpu, 1.13.0+cpu; Intel® Extension for PyTorch 1.12.0+cpu, 1.13.0+cpu