As one of the advanced parameter-efficient fine-tuning (PEFT) techniques, QLoRA enables lightweight infusion of specialty knowledge into a large language model with minimal overhead. BigDL-LLM also supports fine-tuning large language models (LLMs) using QLoRA with 4-bit optimizations on Intel GPUs.
Note
Currently, BigDL-LLM only supports QLoRA fine-tuning on any Hugging Face transformers model.
In Chapter 7, you will go through how to fine-tune a large language model for a text generation task using BigDL-LLM optimizations. BigDL-LLM has a comprehensive toolset to help you fine-tune the model, merge the LoRA weights, and run inference with the fine-tuned model.
We will use the popular open-source model Llama-2-7b-hf as an example.
You can follow the detailed instructions in Chapter 6 to set up your environment on Intel GPUs. The steps needed to configure your environment properly are summarized here.
⚠️ Hardware
- Intel Arc™ A-Series Graphics
- Intel Data Center GPU Flex Series
- Intel Data Center GPU Max Series
⚠️ Operating System
- Linux system, Ubuntu 22.04 is preferred
Before benefiting from BigDL-LLM on Intel GPUs, you need to install several tools:
- First, install the Intel GPU driver. Please refer to our driver installation for general-purpose GPU capabilities.
- You also need to download and install the Intel® oneAPI Base Toolkit. oneMKL and the DPC++ compiler are required; the other components are optional.
Assuming you have already installed Conda (recommended) as your Python environment management tool, the following commands create and activate your Python environment:
# Python 3.9 is recommended for running BigDL-LLM
conda create -n llm-finetune python=3.9
conda activate llm-finetune
You need to set the oneAPI environment variables for BigDL-LLM on Intel GPUs.
# configure oneAPI environment variables
source /opt/intel/oneapi/setvars.sh
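The command above assumes oneAPI is installed under its default prefix. As a minimal sketch for setups where that may not hold, the helper below sources setvars.sh only if the file exists. The `ONEAPI_DIR` variable and the `load_oneapi` function are illustrative names assumed here; they are not part of oneAPI or BigDL-LLM.

```shell
# Hypothetical helper, assuming a bash shell.
# ONEAPI_DIR is an illustrative variable; it defaults to the path used above.
ONEAPI_DIR="${ONEAPI_DIR:-/opt/intel/oneapi}"

load_oneapi() {
    if [ -f "$ONEAPI_DIR/setvars.sh" ]; then
        # Load the oneAPI environment into the current shell.
        source "$ONEAPI_DIR/setvars.sh"
    else
        # Report the problem on stderr and signal failure to the caller.
        echo "setvars.sh not found under $ONEAPI_DIR" >&2
        return 1
    fi
}
```

Calling `load_oneapi` in the shell where you plan to run fine-tuning either loads the environment or reports the missing path, so a wrong install location is noticed before training starts.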
If you want to use Intel GPUs to do inference on the fine-tuned model, it is recommended to set more environment variables to reach optimal performance:
export USE_XETLA=OFF
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
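Exported variables only affect the shell in which they were set, so it can help to confirm them before launching a job. The following is a minimal sketch of such a check, assuming a bash shell; `check_env` is a hypothetical helper written for this illustration and not a BigDL-LLM command.

```shell
# Hypothetical helper: compare an exported variable against an expected value.
check_env() {
    local name="$1" expected="$2" actual
    # printenv only sees exported variables, which is what child processes inherit.
    actual="$(printenv "$name" || true)"
    if [ "$actual" = "$expected" ]; then
        echo "$name=$actual (ok)"
    else
        echo "$name is '${actual:-unset}', expected '$expected'"
        return 1
    fi
}
```

After the two exports above, `check_env USE_XETLA OFF` prints `USE_XETLA=OFF (ok)`, and `check_env SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS 1` reports the second variable in the same way.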