Frontier Benchmarking (#453) #881
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files

@@           Coverage Diff           @@
##           master     #881   +/-   ##
=======================================
  Coverage   45.98%   45.98%
=======================================
  Files          68       68
  Lines       18629    18629
  Branches     2239     2239
=======================================
  Hits         8566     8566
  Misses       8711     8711
  Partials     1352     1352
Reduced the job duration to 3 hours to see whether it would yield the same error regardless of duration.
Force-pushed from 5e25a92 to c7360eb.
Force-pushed from 8193a5f to 5c8b925.
This reverts commit 8dcf100.
Force-pushed from bb6d642 to 3317bdb.
Force-pushed from 527ca12 to 27aa35b.
I fixed it. You
I did
This benchmark test will never pass in its current state because the Frontier benchmarking files do not exist on the master branch, hence this error:
    (cd pr && bash .github/workflows/frontier/submit-bench.sh .github/workflows/frontier/bench.sh gpu) &
    (cd master && bash .github/workflows/frontier/submit-bench.sh .github/workflows/frontier/bench.sh gpu) &
    wait %1 && wait %2
    shell: /usr/bin/bash -e {0}
    env:
      ACTIONS_RUNNER_FORCE_ACTIONS_NODE_VERSION: node16
      ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION: true
    bash: .github/workflows/frontier/submit-bench.sh: No such file or directory
    Submitted batch job 3531713
Once it looks like everything is working as well as one can expect, we can merge in the minimal files (
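The log shows the master checkout failing only because the script path does not exist there. A minimal sketch of how a step like this could check for the script before launching, and still fail if either checkout fails, is below. The helper names `run_in_checkout` and `run_both` are hypothetical and not part of the MFC workflow; only the `pr` and `master` directory names come from the log.

```shell
#!/usr/bin/env bash
# Sketch only: run_in_checkout and run_both are hypothetical helpers.

# run_in_checkout DIR SCRIPT [ARGS...]
# Runs SCRIPT from inside DIR, or reports clearly that DIR lacks it.
run_in_checkout() {
    local dir="$1" script="$2"
    shift 2
    if [ ! -f "$dir/$script" ]; then
        echo "missing: $dir/$script" >&2
        return 1
    fi
    (cd "$dir" && bash "$script" "$@")
}

# run_both SCRIPT [ARGS...]
# Starts one background job per checkout and waits on each PID, so a
# failure in either checkout makes the whole step fail.
run_both() {
    local script="$1"
    shift
    local pid_pr pid_master status=0
    run_in_checkout pr "$script" "$@" &
    pid_pr=$!
    run_in_checkout master "$script" "$@" &
    pid_master=$!
    wait "$pid_pr" || status=1
    wait "$pid_master" || status=1
    return "$status"
}
```

Called as `run_both .github/workflows/frontier/submit-bench.sh .github/workflows/frontier/bench.sh gpu`, this would print `missing: master/.github/workflows/frontier/submit-bench.sh` and return non-zero until the files land on master.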
Aight, either I or someone else has to test it out manually by cloning master and pr, adding the bash files to each, then benchmarking on Frontier as a SLURM/interactive job to make sure nothing gets corrupted in the process.
I verified that this works on my end. The IBM case still gives NaNs though...
Thanks much. I wonder what the deal is with the IBM case ngl. Any specific error messages or such? If the issue persists, we can just exclude that case somehow. Also, I guess NaNs won't fail the test, as can be seen on my recent PR where I assigned null to the IBM grind/exec values (#895 (comment)). Edit: lmk if you suspect anything that might have caused that.
Well, the NaN issue was supposed to be fixed by #892, but it appears that's not the case.
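Since NaNs apparently do not fail the benchmark test on their own, one way to catch them is to scan the results file explicitly. The sketch below assumes a plain-text results file with one `key: value` pair per line; that layout and the helper names `has_nan` and `list_nan` are assumptions for illustration, not MFC's actual output format or tooling.

```shell
#!/usr/bin/env bash
# Sketch only: the results layout and helper names are hypothetical.

# has_nan FILE
# Succeeds if FILE contains a standalone "nan" token in any letter case.
# The -w flag keeps words such as "banana" from matching.
has_nan() {
    grep -qiw 'nan' "$1"
}

# list_nan FILE
# Prints every offending line prefixed with its line number.
# Prints nothing (and still succeeds) when the file is clean.
list_nan() {
    grep -niw 'nan' "$1" || true
}
```

A CI step could then run `if has_nan results.txt; then list_nan results.txt; exit 1; fi`, where `results.txt` stands in for whatever file the benchmark actually writes.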
status?
@sbryngelson done on my end tbh and nothing to add
what's going on here?
Any ideas, @anandrdbz?
Description
Added one GPU benchmarking case that submits SLURM jobs on Frontier, duplicating the existing Phoenix implementation (#453).
Manual Benchmarking
Cloning
Copying Bash Scripts into master
Submitting Benchmark Jobs
Processing Benchmark Results (once the SLURM jobs are done)
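The commands that originally accompanied these steps are not preserved here, so the following is only a rough sketch of the first three steps under stated assumptions. The function names are hypothetical; the repository URL, the `master` and `pr` directory names, and the script paths are taken from the CI log in this thread. Fetching the PR head through `pull/881/head` relies on GitHub's standard pull-request refs. Processing the results is left out because the thread does not show that command.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not MFC tooling.

REPO_URL="https://github.com/MFlowCode/MFC"
FRONTIER_DIR=".github/workflows/frontier"

# Cloning: check out master and the PR head side by side (needs network).
clone_checkouts() {
    git clone --branch master "$REPO_URL" master
    git clone "$REPO_URL" pr
    (cd pr && git fetch origin pull/881/head:pr-881 && git checkout pr-881)
}

# Copying: master lacks the Frontier scripts, so copy them from the PR.
# Usage: copy_bench_scripts SRC_CHECKOUT DST_CHECKOUT
copy_bench_scripts() {
    local src="$1" dst="$2"
    mkdir -p "$dst/$FRONTIER_DIR"
    cp "$src/$FRONTIER_DIR"/*.sh "$dst/$FRONTIER_DIR"/
}

# Submitting: run the same command as the CI step from inside one checkout.
# Usage: submit_bench CHECKOUT
submit_bench() {
    (cd "$1" && bash "$FRONTIER_DIR/submit-bench.sh" "$FRONTIER_DIR/bench.sh" gpu)
}
```

On Frontier this would be used as `clone_checkouts`, then `copy_bench_scripts pr master`, then `submit_bench pr` and `submit_bench master`.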