Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

some fixes to benchmark framework including weak scaling of wave equation #577

Merged
merged 6 commits into from
Oct 14, 2023

Conversation

lslusarczyk
Copy link
Contributor

No description provided.

@lslusarczyk
Copy link
Contributor Author

lslusarczyk commented Oct 10, 2023

dr-bench-WaveEquation-GPU
weak scaling looks OK (not as fast as we want, but at least correct) with this PR and analysis runs fast (~ 1sec each mhp-bench invocation on devcloud)

Copy link
Member

@rscohn2 rscohn2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing the weak scaling issue.

"--different-devices",
is_flag=True,
default=False,
help="Ensures there are no multiple ranks on one SYCL device",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
help="Ensures there are no multiple ranks on one SYCL device",
help="Ensures there are not multiple ranks on one SYCL device",

@@ -129,7 +129,8 @@ inline std::string hostname() {
inline sycl::queue &sycl_queue() { return __detail::gcontext()->sycl_queue_; }
inline auto dpl_policy() { return __detail::gcontext()->dpl_policy_; }

inline sycl::queue select_queue(MPI_Comm comm = MPI_COMM_WORLD) {
inline sycl::queue select_queue(MPI_Comm comm,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
inline sycl::queue select_queue(MPI_Comm comm,
inline sycl::queue select_queue(

It is OK to delete the comm argument so the recommended usage can remain: select_queue(). We don't support or test different comms anyway.

@@ -666,7 +673,8 @@ int main(int argc, char *argv[]) {

static void WaveEquation_DR(benchmark::State &state) {

int n = 4000;
int n = ::sqrtl(default_vector_size);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems strange that the running time is only 1 second because 4000 * 4000 = 16M and default vector size is 2B, but I see you did an actual benchmarking run.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems strange that the running time is only 1 second because 4000 * 4000 = 16M and default vector size is 2B, but I see you did an actual benchmarking run.

I need to rerun. I've checked this again and I saw that I had manually specified a size when creating my plot. It was smaller than default. Hence 1sec :( I need to wait with checking this until they devcloud will resolve Invalid account or account/partition combination specified problem.

Ugly hack is to add n/=4 line to change default just in this test. Nice way is to change size of this test in suite in python.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous plot was made with problem size 10M, now I've put temporarily an ugly hack and problem size is 500M, it runs in 10sec and gives the following plot
dr-bench-WaveEquation-GPU

@@ -447,6 +461,7 @@ def multi_node(base):
base.vec_size = [vec_size]
base.reps = reps
base.weak_scaling = weak_scaling
base.different_devices = different_devices
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not happy with the code I wrote here. The redundancy in adding arguments is just 1 problem. Now that we understand better what the runner and plotter has to do, I expect something much simpler could work. Something to think about in the future....

@lslusarczyk lslusarczyk linked an issue Oct 12, 2023 that may be closed by this pull request
@lslusarczyk lslusarczyk marked this pull request as ready for review October 12, 2023 07:36
@rscohn2
Copy link
Member

rscohn2 commented Oct 12, 2023

The appropriate problem size depends on the benchmark. We need a size that will work for performance measurement and a size that is good for a quick test. Using --default-vector-size to control everything will not work. Something like:

  • most benchmarks rely on default-vector-size as before
  • default-vector-size is controlled by some switches (--test (short run), --bench (long run), --weak-scaling)
  • benchmarks like wave equation ignore default-vector-size and look at --test, --bench, --weak-scaling and pick problem size
  • similar for matrix multiply, fft, ...

@lslusarczyk
Copy link
Contributor Author

The appropriate problem size depends on the benchmark...

I have an idea of writing suites in just flat way - no if-s, more hardcoded params in suites, including problem sizes passed to separate benchmarks. Once I will have ready conception details I will discuss with you.

@lslusarczyk lslusarczyk enabled auto-merge (squash) October 13, 2023 07:59
@lslusarczyk lslusarczyk merged commit 3002448 into oneapi-src:main Oct 14, 2023
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fix weak scaling of wave equation
2 participants