Can't pull predefined version of NVIDIA GPU driver (nvidia_gpu role) #1384

Open
SalaryTheft opened this issue Jul 12, 2024 · 3 comments
Labels: Bug Report, Critical - Tracked in Jira

@SalaryTheft

rhel8_driver_version: 525.105.17 # default for Rhel 8 and corresponding rhcos

[Screenshot 2024-07-12 163442]

I updated gpu-cluster-policy (kind: ClusterPolicy) to use the latest driver, and that fixed it.
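
For anyone wanting to try the same change from the command line, here is a minimal sketch. It assumes the driver version lives at `spec.driver.version` on the ClusterPolicy (as in the NVIDIA GPU Operator CRD) and that the resource is named `gpu-cluster-policy` as above. `driver_patch` is a hypothetical helper, and the version is only an example value. The script prints the `oc patch` command instead of running it, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: build a JSON merge patch that sets spec.driver.version.
driver_patch() {
  printf '{"spec":{"driver":{"version":"%s"}}}' "$1"
}

# Example value only; choose a driver published for your RHCOS release.
DRIVER_VERSION="550.54.14"
PATCH="$(driver_patch "$DRIVER_VERSION")"

# Print the command rather than executing it against the cluster.
echo "oc patch clusterpolicy gpu-cluster-policy --type merge -p '$PATCH'"
```

Copy the printed command into a shell that is logged in to the cluster once the version looks right.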

@SalaryTheft
Author

SalaryTheft commented Jul 12, 2024

Just noticed that RHCOS 4.14 is based on RHEL 9.2:
https://access.redhat.com/articles/6907891


Edit: As a workaround, set the GPU_DRIVER_VERSION environment variable to define the driver version manually.

export GPU_DRIVER_VERSION=550.54.14
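
A slightly more defensive variant of the export above, as a sketch only. It assumes the role falls back to its own default when GPU_DRIVER_VERSION is unset. `is_driver_version` and `REQUESTED_VERSION` are hypothetical names, and the check only accepts dotted numeric versions with at least three parts.

```shell
#!/bin/sh
# Hypothetical check: digits and dots only, at least three parts (e.g. 550.54.14).
is_driver_version() {
  case "$1" in
    *[!0-9.]*|''|.*|*.|*..*) return 1 ;;  # reject stray characters and empty parts
    *.*.*) return 0 ;;
    *) return 1 ;;
  esac
}

REQUESTED_VERSION="550.54.14"

# Export the override only when the value looks like a driver version;
# otherwise leave the variable unset so the role default applies (assumed behaviour).
if is_driver_version "$REQUESTED_VERSION"; then
  export GPU_DRIVER_VERSION="$REQUESTED_VERSION"
else
  unset GPU_DRIVER_VERSION
fi

echo "GPU_DRIVER_VERSION=${GPU_DRIVER_VERSION:-<role default>}"
```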

@JonahLuckett
Contributor

Looking into this now for you

@durera
Contributor

durera commented Nov 1, 2024

durera added the "Critical - Tracked in Jira" label on Nov 1, 2024
durera added this to the November 2024 milestone on Nov 1, 2024
dclain added a commit that referenced this issue Nov 15, 2024
Only use GPU_DRIVER_VERSION env when we need a specific driver version

#1384
#1511