Added CI for ROCm 7.0 #174
Thanks for the PR Manpreet! |
Looks like the ROCm 7.0 Apptainer is broken. I suggest trying to build that locally on any system. |
@mawad-amd Can you look into the new checks? The old failing check now passes, but the 1-rank test is failing. That's interesting: 6.3 worked, but 7.0 failed because pip tried writing to a read-only path. Maybe something is different in the 7.0 Docker image? Should I force a user install? |
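For reference, the "force a user install" option can be made conditional rather than hard-coded. The following is only a sketch: the helper name `pip_install_command` and the package name `example-package` are hypothetical and not part of this PR. It assumes the failure comes from a non-writable site-packages directory, and it relies only on pip's standard `--user` flag.

```python
import os
import sys
import sysconfig


def pip_install_command(package, site_dir=None):
    """Build a pip command, adding --user when site-packages is not writable.

    `package` and `site_dir` are illustrative parameters (hypothetical helper).
    """
    # Default to the interpreter's pure-Python site-packages directory.
    if site_dir is None:
        site_dir = sysconfig.get_paths()["purelib"]

    cmd = [sys.executable, "-m", "pip", "install"]

    # os.access reports whether the current process may write to the path;
    # a read-only (or missing) directory falls back to a per-user install.
    if not os.access(site_dir, os.W_OK):
        cmd.append("--user")

    cmd.append(package)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(pip_install_command("example-package"))
```

Note that pip rejects `--user` inside a virtual environment that does not expose system site-packages, so this fallback would only help if the container installs into the system interpreter.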
@mawad-amd Hi, I am getting [ Which method would you suggest? |
There seems to be something wrong with the Apptainer image itself. See log here. I think we will have to fix that first, then see if the other problem you are pointing out still persists. Do you still have problems with getting AMD GPU access (single GPU is fine)? |
@mawad-amd Thanks. Oh, the build-apptainer-7.0 test case passed, so I didn’t check the logs; I’ll look into it. Yes, I do have access to an AMD GPU now, but just building the Apptainer image fails with:
for all Apptainer images, so I will have to figure some things out. Thanks for your time! |
Pull Request Overview
This PR adds ROCm 7.0 support to the Iris framework by extending the CI matrix to test against both ROCm 6.3.1 and 7.0. The changes address issue #170 by creating separate Apptainer definitions for each ROCm version and updating the GitHub workflow to build and test images for both versions.
- Added Apptainer definition files for ROCm 6.3.1 and 7.0 with version-specific configurations
- Extended CI workflow matrix to test both ROCm versions in parallel
- Updated build and test job names to include ROCm version identification
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
File | Description |
---|---|
apptainer/iris-rocm7.0.def | New Apptainer definition for ROCm 7.0 environment with PyTorch 2.8.0 |
apptainer/iris-rocm6.3.1.def | New Apptainer definition for ROCm 6.3.1 environment |
.github/workflows/iris-tests-apptainer.yml | Updated CI workflow to support matrix builds for multiple ROCm versions |
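The relationship between the matrix entries and the definition files listed above can be sketched as follows. This is an illustration only, not the workflow's actual contents: the helper names and the dictionary keys `rocm_version` and `definition` are hypothetical, and the sketch assumes the `apptainer/iris-rocm<version>.def` naming seen in the file list holds for every version in the matrix.

```python
# ROCm versions named in this PR.
ROCM_VERSIONS = ["6.3.1", "7.0"]


def definition_file(rocm_version):
    """Return the Apptainer definition path for a ROCm version.

    Assumes the apptainer/iris-rocm<version>.def naming from the file list.
    """
    return f"apptainer/iris-rocm{rocm_version}.def"


def build_matrix(versions):
    """Pair each ROCm version with its definition file, one entry per CI job."""
    return [
        {"rocm_version": version, "definition": definition_file(version)}
        for version in versions
    ]


if __name__ == "__main__":
    for entry in build_matrix(ROCM_VERSIONS):
        print(entry["rocm_version"], entry["definition"])
```

Keeping one definition file per version means a new ROCm release would need one new `.def` file and one new matrix entry, without touching the existing ones.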
Motivation
Extend CI matrix to test against ROCm 7.0.
Closes #170
Technical Details
- .github/workflows/iris-tests-apptainer.yml: Extended CI matrix to test both ROCm 6.3.1 and 7.0
- apptainer/iris-rocm6.3.1.def: Apptainer definition for ROCm 6.3.1
- apptainer/iris-rocm7.0.def: Apptainer definition for ROCm 7.0
Test Plan
Could not perform local testing since all AMD cloud droplets are currently out of stock.
Submitting as a Draft PR to validate changes via CI.
Test Result
N/A (pending)
Submission Checklist