FineTuning AutoModelForSequenceClassification.from_pretrained(meta-llama/Llama-3.2-1B) Bug:RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:1 and cuda:0! (when checking argument for argument target in method wrapper_CUDA_nll_loss_forward) and awq importing #35365

alestrami · 2024-12-20T13:51:50Z

What does this PR do?

Fixes # (issue)

- [x] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the [contributor guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#create-a-pull-request), Pull Request section?
- [x] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- [ ] Did you write any new necessary tests?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

I added two contributions.

The first one addresses the device mismatch within a batch when training a model loaded with `device_map='auto'`. The issue is discussed for `AutoModelForSequenceClassification.from_pretrained` at https://discuss.huggingface.co/t/fine-tune-meta-llama-llama-2-7b-hf-bug-expected-all-tensors-to-be-on-the-same-device-but-found-at-least-two-devices-cuda-1-and-cuda-0-when-checking-argument-for-argument-target-in-method-wrapper-cuda-nll-loss-forward/129341/1. Thanks to [hust], who provided the solution, which also works in my case with Llama 3.2 1B.
The second one concerns the availability check for the awq package when loading quantized models. `importlib.metadata.version` does not work with `awq` (line 144 of src/transformers/utils/import_utils.py): `importlib.metadata.version(pkg_name)` works with `'autoawq'` but not `'awq'`, while `importlib.util.find_spec("awq")` works only with `awq` and not `autoawq`. Both names need to be taken into account, since they refer to the same package.
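To illustrate the second point, here is a minimal sketch of a check that handles both names. It is not the code in this PR: `find_package_version` is a hypothetical helper, and it assumes the module is imported as `awq` while its metadata may be registered under either `awq` or `autoawq`.

```python
import importlib.metadata
import importlib.util
from typing import Optional, Sequence


def find_package_version(import_name: str, dist_names: Sequence[str]) -> Optional[str]:
    """Hypothetical helper: return the installed version, or None if unavailable.

    import_name is the name used in `import ...` (e.g. "awq").
    dist_names are candidate distribution names to try for metadata
    (e.g. ["awq", "autoawq"]); the first one that has metadata wins.
    """
    # find_spec looks up the importable module, so it needs the import name.
    if importlib.util.find_spec(import_name) is None:
        return None

    # metadata.version looks up the installed distribution, whose name
    # can differ from the import name, so each candidate is tried in turn.
    for dist_name in dist_names:
        try:
            return importlib.metadata.version(dist_name)
        except importlib.metadata.PackageNotFoundError:
            continue

    # The module is importable but none of the candidate names has metadata.
    return None
```

With this helper, the awq check would be `find_package_version("awq", ["awq", "autoawq"])`, which returns `None` when the package is not installed.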

fix import CompileConfig

alestrami added 4 commits December 19, 2024 11:32

fix argument and target on different device cuda

5c050e4

fixed metadata.version with awq

3c4d751

Merge branch 'huggingface:main' into main

893d6a4

Add files via upload

b71093c

fix import CompileConfig