ROCm containers fail on multi-gpu AMD systems #525
Comments
Open to ideas, but by default I propose that RamaLama should just try to use the GPU with the most VRAM (and just a sole GPU). I think heuristics more complex than that are not worth it. For multi-GPU setups or any other non-default way of running models, there should be a way to set that up, either via a flag, an env var, etc. We should support the various flags people are already using with llama.cpp in the AI community, like HIP_VISIBLE_DEVICES, HSA_OVERRIDE_GFX_VERSION, etc. No point in reinventing the wheel.
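A minimal sketch of that "GPU with the most VRAM" default, assuming the amdgpu sysfs layout (`mem_info_vram_total` per card) and a 1:1 mapping from DRM card index to HIP device ordinal (a simplification); this is not RamaLama's actual detection code:

```python
# Rough sketch of the proposed default (not RamaLama's actual detection code):
# pick the amdgpu device exposing the most VRAM and make only that device
# visible to the ROCm runtime via HIP_VISIBLE_DEVICES.
import glob
import os

def pick_largest_vram_gpu():
    """Return (card_index, vram_bytes) for the amdgpu card with the most VRAM, or None."""
    best = None
    for path in sorted(glob.glob("/sys/class/drm/card*/device/mem_info_vram_total")):
        card_index = int(path.split("/")[4].removeprefix("card"))
        with open(path) as f:
            vram_bytes = int(f.read().strip())
        if best is None or vram_bytes > best[1]:
            best = (card_index, vram_bytes)
    return best

if __name__ == "__main__":
    best = pick_largest_vram_gpu()
    if best is not None:
        # Simplifying assumption: the DRM card index matches the HIP device ordinal.
        os.environ.setdefault("HIP_VISIBLE_DEVICES", str(best[0]))
```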
From my perspective, I agree that defaulting to the GPU with the larger VRAM is great, as it is likely what most users would expect anyway. And yes, it would be great if RamaLama simply passed in all of these environment variables when they are already set on the host.
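As a rough sketch of that passthrough, assuming podman's `-e NAME=value` option (illustrative only, not RamaLama's actual code):

```python
# Forward the common ROCm/llama.cpp environment variables into the container,
# but only when the user has already set them on the host.
import os

PASSTHROUGH = ("HIP_VISIBLE_DEVICES", "HSA_OVERRIDE_GFX_VERSION")

def gpu_env_args():
    """Build ["-e", "NAME=value", ...] arguments for each passthrough variable set on the host."""
    args = []
    for name in PASSTHROUGH:
        value = os.environ.get(name)
        if value is not None:
            args.extend(["-e", f"{name}={value}"])
    return args
```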
Relates-to: containers#525 Signed-off-by: Arun Babu Neelicattu <[email protected]>
When attempting to run a model, the command fails.
Running the `podman` command gives the following. The error encountered in this case seems to be similar to that seen in #497. However, in my case the desired outcome was either that the right GPU be selected or that the `HSA_OVERRIDE_GFX_VERSION` env var be set, rather than forcing the model to run on the CPU. The root cause in my setup seems to be somehow related to the existence of multiple GPUs on the machine, although I am not certain.
Expected Outcome
- A way to select the GPU explicitly (e.g. `--gpu <num>`, sketched below) or respecting existing environment variables would be sufficient (Fixed gpu detection for cuda rocm etc using env vars #490 might resolve this).
- Setting `HSA_OVERRIDE_GFX_VERSION` when executing the container.

I am happy to contribute code if a direction for the fix is provided.
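A rough sketch of the requested precedence, assuming a hypothetical `--gpu` option (neither the option nor this helper exists in RamaLama today): an explicit flag wins, then env vars the user already set, then auto-detection.

```python
import os

def resolve_gpu(cli_gpu=None):
    """Explicit flag wins, then user-set env vars, then auto-detection (illustrative only)."""
    if cli_gpu is not None:                      # value of a hypothetical --gpu <num> option
        return str(cli_gpu)
    for var in ("HIP_VISIBLE_DEVICES", "CUDA_VISIBLE_DEVICES"):
        if var in os.environ:                    # respect what the user already configured
            return os.environ[var]
    return None                                  # None -> fall back to auto-detection
```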
Workarounds
In my local environment, I had to do one of the following to work around the issue. Both had to be done in the `podman` command (roughly as shown below), as I could not figure out how to configure it via RamaLama.

- Set `HIP_VISIBLE_DEVICES=0` so that the `AMD Radeon RX 7600M XT` is detected.
- Set `HSA_OVERRIDE_GFX_VERSION=11.0.2` so that it works with `HIP_VISIBLE_DEVICES=1`, which selects the iGPU and is what RamaLama chooses to pass to `podman`.
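For illustration, roughly how those two workarounds map onto a container invocation; the image, command, and device flags below are placeholders, not the exact command RamaLama generates:

```python
import subprocess

def run_rocm_container(image, command, extra_env):
    """Run a ROCm container with the given environment overrides (illustrative only)."""
    argv = ["podman", "run", "--rm",
            "--device", "/dev/kfd", "--device", "/dev/dri"]
    for name, value in extra_env.items():
        argv += ["-e", f"{name}={value}"]
    subprocess.run(argv + [image] + command, check=True)

# Workaround 1: make HIP see the discrete RX 7600M XT.
# run_rocm_container("quay.io/example/rocm-image", ["llama-server", "-m", "model.gguf"],
#                    {"HIP_VISIBLE_DEVICES": "0"})

# Workaround 2: keep the iGPU that RamaLama selects, but override its GFX version.
# run_rocm_container("quay.io/example/rocm-image", ["llama-server", "-m", "model.gguf"],
#                    {"HIP_VISIBLE_DEVICES": "1", "HSA_OVERRIDE_GFX_VERSION": "11.0.2"})
```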