Model loaded with `PreTrainedModel.from_pretrained` under the `with torch.device("cuda"):` context manager leads to unexpected errors compared to `.to("cuda")` #35371
System Info
transformers commit 4567ee8
Who can help?
@mht-sharma maybe you know
Reproduction
When using `with torch.device("cuda"):` to load a model on device, I am getting various unexpected errors such as `HIPBLAS_STATUS_INTERNAL_ERROR when calling hipblasLtMatmul` or `RuntimeError: HIP error: no kernel image is available for execution on the device`.
However, when loading a (dummy) model with …, everything is fine at runtime, no error. Similarly, when loading with …, there is no error at runtime.
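The two loading styles being compared can be sketched as follows. This is a minimal illustration of the difference, not the exact reproduction script from this report: `nn.Linear` stands in for the actual model, and `"cpu"` is used so the snippet runs without a GPU (the issue uses `"cuda"`). The key distinction is that under the device context manager, parameters are materialized directly on the target device, whereas `.to(...)` creates them on the default device first and copies them over afterwards.

```python
import torch
import torch.nn as nn

# Style 1 (the failing path in this issue): parameters are created
# directly on the device active inside the context manager.
with torch.device("cpu"):
    model_ctx = nn.Linear(4, 4)

# Style 2 (the working path): parameters are created on the default
# device, then moved with .to().
model_to = nn.Linear(4, 4).to("cpu")

print(model_ctx.weight.device.type)  # cpu
print(model_to.weight.device.type)   # cpu
```

Both end up with parameters on the same device, but the allocation path differs, which may matter for backend-specific kernel selection.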
I do not have access to an Nvidia GPU at the moment, so I could not reproduce there to check whether this issue also exists with the Nvidia distribution of PyTorch. So I am not sure whether this is a ROCm, PyTorch, or Transformers bug; this might need some investigation.
I could reproduce the issue on a few previous Transformers versions (4.45, 4.46, 4.47).
Filing for awareness; this might need some more investigation and/or extended testing in the Transformers CI.
Interestingly, I could not reproduce this issue with `peft-internal-testing/tiny-random-qwen-1.5-MoE`, but only with `Qwen/Qwen1.5-MoE-A2.7B-Chat`.

Expected behavior
No error