Accelerate Inference of NLP models with OpenVINO™ Post-Training Optimization Tool

This tutorial demonstrates how to apply INT8 quantization to the Natural Language Processing model BERT, using the Post-Training Optimization Tool API (part of OpenVINO). It uses the HuggingFace BERT PyTorch model, fine-tuned for the Microsoft Research Paraphrase Corpus (MRPC) task. The code in this tutorial is designed to be extendable to custom models and datasets.
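
To give a feel for what INT8 quantization does to a tensor, here is a minimal pure-Python sketch of symmetric per-tensor quantization. It is an illustration only and is not how the Post-Training Optimization Tool is implemented; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] with one shared scale (symmetric scheme)."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the INT8 representation."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)

# The rounding error of each value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Small values such as `0.003` collapse to zero under this scheme, which is one reason a quantized model's accuracy should be checked against the original.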

Notebook Contents

The tutorial consists of the following steps:

  • Downloading and preparing the MRPC model and a dataset.
  • Defining data loading and accuracy validation functionality.
  • Preparing the model for quantization.
  • Running the optimization pipeline.
  • Comparing the performance of the original and quantized models.
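
The accuracy validation step can be sketched as follows. MRPC is a binary paraphrase-classification task, so plain accuracy is a natural check when comparing the original and quantized models. This is a hedged sketch, not the notebook's code: the `accuracy` helper is hypothetical, and the labels and predictions are made-up placeholders rather than real results.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the reference labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(1 for p, l in zip(predictions, labels) if p == l)
    return correct / len(labels)


# Placeholder data for illustration only (1 = paraphrase, 0 = not a paraphrase).
labels = [1, 0, 1, 1, 0]
original_predictions = [1, 0, 1, 0, 0]
quantized_predictions = [1, 0, 0, 0, 0]

original_accuracy = accuracy(original_predictions, labels)
quantized_accuracy = accuracy(quantized_predictions, labels)

# A positive drop means the quantized model is less accurate than the original.
accuracy_drop = original_accuracy - quantized_accuracy
```

In practice the two prediction lists would come from running the same validation set through the original and the quantized model, so that the drop reflects quantization alone.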

Installation Instructions

If you have not installed all required dependencies, follow the Installation Guide.