ImportError: cannot import name 'WeightOnlyQuantizedLinear' from 'intel_extension_for_pytorch.nn.utils._quantize_convert' #1630

Open
junruizh2021 opened this issue Jun 21, 2024 · 2 comments
@junruizh2021

I'm trying to run the TTS example (English and Multi-Language Text-to-Speech) on my PC:

https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat/pipeline/plugins/audio/README.md

It fails with a cannot import name 'WeightOnlyQuantizedLinear' error, shown below.

~/WorkSpace/TTS$ python eng-tts.py 
Traceback (most recent call last):
  File "/home/anna/WorkSpace/TTS/eng-tts.py", line 1, in <module>
    from intel_extension_for_transformers.neural_chat.pipeline.plugins.audio.tts import TextToSpeech
  File "/home/anna/.local/lib/python3.10/site-packages/intel_extension_for_transformers/neural_chat/__init__.py", line 26, in <module>
    from .chatbot import build_chatbot
  File "/home/anna/.local/lib/python3.10/site-packages/intel_extension_for_transformers/neural_chat/chatbot.py", line 19, in <module>
    from intel_extension_for_transformers.transformers.llm.quantization.optimization import Optimization
  File "/home/anna/.local/lib/python3.10/site-packages/intel_extension_for_transformers/transformers/__init__.py", line 59, in <module>
    from .modeling import (
  File "/home/anna/.local/lib/python3.10/site-packages/intel_extension_for_transformers/transformers/modeling/__init__.py", line 21, in <module>
    from .modeling_auto import (AutoModel, AutoModelForCausalLM,
  File "/home/anna/.local/lib/python3.10/site-packages/intel_extension_for_transformers/transformers/modeling/modeling_auto.py", line 94, in <module>
    from intel_extension_for_pytorch.nn.utils._quantize_convert import (
ImportError: cannot import name 'WeightOnlyQuantizedLinear' from 'intel_extension_for_pytorch.nn.utils._quantize_convert' (/opt/python-3.10.13/lib/python3.10/site-packages/intel_extension_for_pytorch/nn/utils/_quantize_convert.py)
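
For reference, a quick way to check whether the installed IPEX build actually exposes the missing symbol (a minimal diagnostic sketch; the distribution names are the ones published on PyPI):

# Print the installed IPEX/ITREX versions, then check whether the private
# module that ITREX imports from defines WeightOnlyQuantizedLinear.
from importlib.metadata import version

print("IPEX:", version("intel-extension-for-pytorch"))
print("ITREX:", version("intel-extension-for-transformers"))

import intel_extension_for_pytorch.nn.utils._quantize_convert as qc
print("has WeightOnlyQuantizedLinear:", hasattr(qc, "WeightOnlyQuantizedLinear"))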
a32543254 assigned a32543254 and PenghuiCheng and unassigned a32543254 Jul 2, 2024

jketreno commented Jul 3, 2024

I am seeing a similar problem when using intel/intel-extension-for-pytorch:2.1.20-xpu-pip-jupyter. After installing the needed modules via:

!pip install intel_extension_for_transformers accelerate uvicorn yacs fastapi datasets

And then running the following neural_chat example code:

from intel_extension_for_transformers.neural_chat import build_chatbot
from intel_extension_for_transformers.neural_chat import PipelineConfig
hf_access_token = "<put your huggingface access token here to download models>"
config = PipelineConfig(device='xpu', hf_access_token=hf_access_token)

I see the following:

File /usr/local/lib/python3.10/dist-packages/intel_extension_for_transformers/transformers/modeling/modeling_auto.py:94
     90 from typing import Union
     92 if is_ipex_available() and is_intel_gpu_available():
     93     # pylint: disable=E0401
---> 94     from intel_extension_for_pytorch.nn.utils._quantize_convert import (
     95         WeightOnlyQuantizedLinear,
     96     )
     98 torch = LazyImport("torch")
    101 def recover_export_model(model, current_key_name=None):

ImportError: cannot import name 'WeightOnlyQuantizedLinear' from 'intel_extension_for_pytorch.nn.utils._quantize_convert' (/usr/local/lib/python3.10/dist-packages/intel_extension_for_pytorch/nn/utils/_quantize_convert.py)


@jketreno

I don't know if this will help the OP, but I was able to get things to work. I'm trying to use IPEX (intel-extension-for-pytorch) and ITREX (intel-extension-for-transformers) on an Intel Arc A770M, which means I'm using the +xpu build of IPEX, which is older than the +cpu build.

I started looking into the WOQ (WeightOnlyQuantizedLinear) implementation in IPEX and noticed that there had been several code changes to it, so I suspected an API conflict between the more recent ITREX and the older version of IPEX required for xpu.
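
Concretely, the traceback shows that once an Intel GPU is detected, ITREX's modeling_auto.py imports the symbol unconditionally (line 94 above). A tolerant guard would look like the sketch below; this is an illustration of the mismatch, not ITREX's actual code:

# Illustration only: older IPEX +xpu builds do not define
# WeightOnlyQuantizedLinear, so the bare import raises ImportError.
try:
    from intel_extension_for_pytorch.nn.utils._quantize_convert import (
        WeightOnlyQuantizedLinear,
    )
except ImportError:
    WeightOnlyQuantizedLinear = None  # symbol absent in this IPEX build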

This is what I'm using in my Dockerfile to build an image that seems to work:

FROM ubuntu:jammy

# First, setup Python and install other required packages (pip, venv, git, etc.)
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y \
    git \
    less \
    nano \
    gpg-agent \
    python3 \
    python3-pip \
    python3-venv \
    python3-dev \
    wget \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# Install Intel graphics driver for Linux
RUN wget -qO - https://repositories.intel.com/gpu/intel-graphics.key | \
    gpg --yes --dearmor --output /usr/share/keyrings/intel-graphics.gpg \
    && echo "deb [arch=amd64 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/gpu/ubuntu jammy/lts/2350 unified" \
    > /etc/apt/sources.list.d/intel-gpu-jammy.list

RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y \
    intel-level-zero-gpu \
    intel-opencl-icd \
    clinfo \
    level-zero \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

# For IPEX v2.1.10+xpu
# https://intel.github.io/intel-extension-for-pytorch/#installation?platform=gpu&version=v2.1.10%2bxpu&os=linux%2fwsl2&package=pip
# * oneAPI 2024.0
ENV oneapi_pkgs="intel-oneapi-dpcpp-cpp-2024.0 intel-oneapi-mkl-devel=2024.0.0-49656"
ENV python_modules="torch==2.1.0a0 torchvision==0.16.0a0 torchaudio==2.1.0a0 intel-extension-for-pytorch==2.1.10+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/"

RUN wget -qO- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor > /usr/share/keyrings/oneapi-archive-keyring.gpg \
    && echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" > /etc/apt/sources.list.d/oneAPI.list \
    && apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y \
    ${oneapi_pkgs} \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

RUN pip3 install \
    ${python_modules}

# 1.4.2 has the WOQ bug
# 1.4.1 has the WOQ bug
# 1.4 has the WOQ bug
# 1.3.2 works!
ENV itrex_version=1.3.2
RUN pip install intel-extension-for-transformers==${itrex_version}

# Install system package dependencies that must be met for 1.3.2:
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y \
    libgl1-mesa-glx \
    libglib2.0-0 \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
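
With those pins in place, a quick smoke test is to exercise the same import chain that originally failed (run inside the built container; the print is just a hypothetical marker):

# The original ImportError was raised while importing neural_chat, so a
# bare import of build_chatbot exercises the same path through modeling_auto.
from intel_extension_for_transformers.neural_chat import build_chatbot
print("neural_chat import OK")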
