forked from huggingface/optimum-intel
Add IPEX documentation (huggingface#828)
* change readme, source/index, source/installation
* add ipex doc 1st step
* update readme for command line usage
* fix bug for ipex readme
* add export doc
* update all ipex docs
* rm diffusers
* change register
* Update README.md Co-authored-by: Ella Charlaix <[email protected]>
* Update docs/source/installation.mdx Co-authored-by: Ella Charlaix <[email protected]>
* fix readme
* fix ipex exporter args comments
* extend ipex export explain
* fix ipex reference.mdx
* add comments for auto doc
* rm cli export
* Update optimum/commands/export/ipex.py Co-authored-by: Ella Charlaix <[email protected]>
* rm commit hash in export command
* rm export
* rm jit
* add ipex on doc's docker file
* indicate that ipex model only supports for cpu and the export format will be changed to compile in the future
* Update docs/source/ipex/inference.mdx Co-authored-by: Ella Charlaix <[email protected]>
* explain patching
* rm ipex reference
* Update docs/source/ipex/inference.mdx
* Update docs/source/ipex/inference.mdx
* Update docs/source/ipex/inference.mdx
* Update docs/source/index.mdx
* Update docs/source/ipex/inference.mdx
* Update docs/source/ipex/models.mdx
* Update docs/Dockerfile

--------- Co-authored-by: Ella Charlaix <[email protected]>
1 parent 1f3d0c2, commit 403c696
Showing 9 changed files with 162 additions and 4 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
<!--Copyright 2024 The HuggingFace Team. All rights reserved. | ||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with | ||
the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on | ||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the | ||
specific language governing permissions and limitations under the License. | ||
--> | ||
|
||
# Inference | ||
|
||
Optimum Intel can be used to load models from the [Hub](https://huggingface.co/models) and create pipelines to run inference with IPEX optimizations (including patching with custom operators, weight prepacking and graph mode) on a variety of Intel processors. For now, support is only enabled for CPUs. | ||
|
||
|
||
## Loading | ||
|
||
You can load your model and apply IPEX optimizations (including weight prepacking and graph mode). For supported architectures like LLaMA, BERT and ViT, further optimizations will be applied by patching the model to use custom operators. | ||
For now, support is only enabled for CPUs, and the original model is exported via TorchScript. In the future, `torch.compile` will be used instead, and models exported via TorchScript will be deprecated. | ||
|
||
```diff | ||
import torch | ||
from transformers import AutoTokenizer, pipeline | ||
- from transformers import AutoModelForCausalLM | ||
+ from optimum.intel import IPEXModelForCausalLM | ||
|
||
model_id = "gpt2" | ||
- model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16) | ||
+ model = IPEXModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, export=True) | ||
tokenizer = AutoTokenizer.from_pretrained(model_id) | ||
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer) | ||
results = pipe("He's a dreadful magician and") | ||
``` | ||
|
||
As shown in the table below, each task is associated with a class that automatically loads your model. | ||
|
||
| Auto Class | Task | | ||
|--------------------------------------|--------------------------------------| | ||
| `IPEXModelForSequenceClassification` | `text-classification` | | ||
| `IPEXModelForTokenClassification` | `token-classification` | | ||
| `IPEXModelForQuestionAnswering` | `question-answering` | | ||
| `IPEXModelForImageClassification` | `image-classification` | | ||
| `IPEXModel` | `feature-extraction` | | ||
| `IPEXModelForMaskedLM` | `fill-mask` | | ||
| `IPEXModelForAudioClassification` | `audio-classification` | | ||
| `IPEXModelForCausalLM` | `text-generation` | |
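
The task-to-class table above can also be written as a lookup, which may be convenient when the task is only known at runtime. The following is a minimal sketch, not part of the library: `TASK_TO_IPEX_CLASS`, `ipex_class_name_for_task` and `load_ipex_model` are hypothetical names introduced here for illustration. The sketch assumes the classes are importable from `optimum.intel` (as `IPEXModelForCausalLM` is in the snippet above) and imports that package lazily, so the mapping itself can be used without it installed.

```python
import importlib

# Task -> class name, copied from the table above.
TASK_TO_IPEX_CLASS = {
    "text-classification": "IPEXModelForSequenceClassification",
    "token-classification": "IPEXModelForTokenClassification",
    "question-answering": "IPEXModelForQuestionAnswering",
    "image-classification": "IPEXModelForImageClassification",
    "feature-extraction": "IPEXModel",
    "fill-mask": "IPEXModelForMaskedLM",
    "audio-classification": "IPEXModelForAudioClassification",
    "text-generation": "IPEXModelForCausalLM",
}


def ipex_class_name_for_task(task: str) -> str:
    """Return the IPEX class name listed for `task` (hypothetical helper)."""
    try:
        return TASK_TO_IPEX_CLASS[task]
    except KeyError:
        supported = ", ".join(sorted(TASK_TO_IPEX_CLASS))
        raise ValueError(
            f"Unsupported task {task!r}; expected one of: {supported}"
        ) from None


def load_ipex_model(task: str, model_id: str, **kwargs):
    """Hypothetical helper: resolve the class for `task` and load `model_id`.

    Assumes the class is exposed by `optimum.intel`; the import happens
    here so the lookup above works without the package installed.
    """
    module = importlib.import_module("optimum.intel")
    model_cls = getattr(module, ipex_class_name_for_task(task))
    return model_cls.from_pretrained(model_id, **kwargs)
```

With Optimum Intel and IPEX installed, `load_ipex_model("text-generation", "gpt2", export=True)` would be expected to mirror the `IPEXModelForCausalLM.from_pretrained(...)` call shown earlier.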
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
<!--Copyright 2024 The HuggingFace Team. All rights reserved. | ||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with | ||
the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on | ||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the | ||
specific language governing permissions and limitations under the License. | ||
--> | ||
|
||
# Supported models | ||
|
||
🤗 Optimum provides IPEX optimizations for both eager mode and graph mode, along with classes and functions that make applying these optimizations easy. | ||
Here is the list of supported architectures: | ||
|
||
## [Transformers](https://huggingface.co/docs/transformers/index) | ||
|
||
- Albert | ||
- Bart | ||
- Beit | ||
- Bert | ||
- BlenderBot | ||
- BlenderBotSmall | ||
- Bloom | ||
- CodeGen | ||
- DistilBert | ||
- Electra | ||
- Flaubert | ||
- GPT-2 | ||
- GPT-BigCode | ||
- GPT-Neo | ||
- GPT-NeoX | ||
- Llama | ||
- MPT | ||
- Mistral | ||
- MobileNet v1 | ||
- MobileNet v2 | ||
- MobileVit | ||
- OPT | ||
- ResNet | ||
- Roberta | ||
- Roformer | ||
- SqueezeBert | ||
- UniSpeech | ||
- Vit | ||
- Wav2Vec2 | ||
- XLM |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
<!--Copyright 2024 The HuggingFace Team. All rights reserved. | ||
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with | ||
the License. You may obtain a copy of the License at | ||
http://www.apache.org/licenses/LICENSE-2.0 | ||
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on | ||
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the | ||
specific language governing permissions and limitations under the License. | ||
--> | ||
|
||
# Notebooks | ||
|
||
## Inference | ||
|
||
| Notebook | Description | | | | ||
|:---------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------- |:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|------:| | ||
| [How to run inference with IPEX](https://github.com/huggingface/optimum-intel/tree/main/notebooks/ipex) | Explains how to export your model to IPEX and run inference with an IPEX model on a text-generation task | [![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/optimum-intel/blob/main/notebooks/ipex/text_generation.ipynb) | [![Open in AWS Studio](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/huggingface/optimum-intel/blob/main/notebooks/ipex/text_generation.ipynb) | |