Some errors #176

Open
ahe168 opened this issue Apr 26, 2024 · 2 comments

Comments

@ahe168

ahe168 commented Apr 26, 2024

The argument trust_remote_code is to be used with Auto classes. It has no effect here and is ignored.
The load_in_4bit and load_in_8bit arguments are deprecated and will be removed in the future versions. Please, pass a BitsAndBytesConfig object in quantization_config argument instead.

ImportError Traceback (most recent call last)
Cell In[21], line 6
4 tokenizer = LlamaTokenizerFast.from_pretrained(base_model, trust_remote_code=True)
5 tokenizer.pad_token = tokenizer.eos_token
----> 6 model = LlamaForCausalLM.from_pretrained(base_model, trust_remote_code=True, device_map = "cuda:0", load_in_8bit = True,)
7 model = PeftModel.from_pretrained(model, peft_model)
8 model = model.eval()

File /opt/conda/lib/python3.10/site-packages/transformers/modeling_utils.py:3049, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, config, cache_dir, ignore_mismatched_sizes, force_download, local_files_only, token, revision, use_safetensors, *model_args, **kwargs)
3046 hf_quantizer = None
3048 if hf_quantizer is not None:
-> 3049 hf_quantizer.validate_environment(
3050 torch_dtype=torch_dtype, from_tf=from_tf, from_flax=from_flax, device_map=device_map
3051 )
3052 torch_dtype = hf_quantizer.update_torch_dtype(torch_dtype)
3053 device_map = hf_quantizer.update_device_map(device_map)

File /opt/conda/lib/python3.10/site-packages/transformers/quantizers/quantizer_bnb_8bit.py:62, in Bnb8BitHfQuantizer.validate_environment(self, *args, **kwargs)
60 def validate_environment(self, *args, **kwargs):
61 if not (is_accelerate_available() and is_bitsandbytes_available()):
---> 62 raise ImportError(
63 "Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate "
64 "and the latest version of bitsandbytes: pip install -i https://pypi.org/simple/ bitsandbytes"
65 )
67 if kwargs.get("from_tf", False) or kwargs.get("from_flax", False):
68 raise ValueError(
69 "Converting into 4-bit or 8-bit weights from tf/flax weights is currently not supported, please make"
70 " sure the weights are in PyTorch format."
71 )

ImportError: Using bitsandbytes 8-bit quantization requires Accelerate: pip install accelerate and the latest version of bitsandbytes: pip install -i https://pypi.org/simple/ bitsandbytes

@liaorihu

liaorihu commented May 29, 2024

The solution can be easily found on Google.

My proposed solution to this issue is as follows:

  • Update 'bitsandbytes'. You can find a suitable version in this project.
  • Carefully review the Hugging Face 'peft' documentation and adjust the relevant parameters.
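
Before updating anything, it may help to see which versions are actually installed in the environment that raised the error. The following is a minimal sketch using only the standard library; the package list and the helper name `installed_versions` are assumptions chosen for illustration, not something this project provides.

```python
from importlib import metadata

# Packages named in the ImportError, plus the libraries used in the failing cell.
PACKAGES = ("accelerate", "bitsandbytes", "transformers", "peft")


def installed_versions(packages=PACKAGES):
    """Return {package name: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If a package shows as not installed, the pip commands quoted in the error message are the place to start. In a notebook, restarting the kernel after installing is usually needed before the import check passes.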

I hope my answer will work for you.

@Siddharth-Latthe-07

The error message indicates that bitsandbytes 8-bit quantization requires the Accelerate library and the latest version of bitsandbytes. Additionally, the load_in_4bit and load_in_8bit arguments are deprecated; pass a BitsAndBytesConfig object through the quantization_config argument instead.
Try these steps and let me know if they work:

  1. Install the required libraries (accelerate and bitsandbytes, as named in the error message).
  2. Update your code to use BitsAndBytesConfig.
    Sample snippet:
from peft import PeftModel
from transformers import BitsAndBytesConfig, LlamaForCausalLM, LlamaTokenizerFast

base_model = 'your_model_path_or_name'
peft_model = 'your_peft_model_path_or_name'

# trust_remote_code is omitted: per the warning above, it only applies to Auto classes.
tokenizer = LlamaTokenizerFast.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token

# Replaces the deprecated load_in_8bit=True keyword argument.
quant_config = BitsAndBytesConfig(load_in_8bit=True)

model = LlamaForCausalLM.from_pretrained(
    base_model,
    device_map="cuda:0",
    quantization_config=quant_config,
)
# Attach the PEFT adapter to the quantized base model.
model = PeftModel.from_pretrained(model, peft_model)
model = model.eval()

Lastly, check whether CUDA is available in your environment:

import torch
print(torch.cuda.is_available())
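
If that check prints False, a slightly fuller report can help narrow down why. This is a rough sketch rather than a definitive diagnostic: the helper name `cuda_report` and the dictionary keys are made up for illustration, and it only uses `torch.version.cuda`, `torch.cuda.is_available()`, `torch.cuda.device_count()`, and `torch.cuda.get_device_name()`.

```python
def cuda_report():
    """Collect basic facts about the PyTorch/CUDA setup without raising."""
    report = {
        "torch_installed": False,
        "torch_version": None,
        "cuda_build": None,  # CUDA version PyTorch was built against
        "cuda_available": False,
        "devices": [],
    }
    try:
        import torch
    except ImportError:
        # PyTorch itself is missing, so nothing else can be checked.
        return report

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    report["cuda_build"] = torch.version.cuda  # None on CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        report["devices"] = [
            torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())
        ]
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

A `cuda_build` of None suggests a CPU-only PyTorch build, in which case `device_map="cuda:0"` cannot work regardless of the bitsandbytes version.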

Hope this helps.
Thanks
