mhp exclusive scan much slower than inclusive (exclusive scan perf analyse) #588

Open
haichangsi opened this issue Oct 19, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@haichangsi
Contributor

No description provided.

@haichangsi haichangsi converted this from a draft issue Oct 19, 2023
@haichangsi haichangsi self-assigned this Oct 19, 2023
@lslusarczyk lslusarczyk changed the title exclusive scan perf analyse mhp exclusive scan much slower than inclusive (exclusive scan perf analyse) Nov 2, 2023
@lslusarczyk lslusarczyk added the bug Something isn't working label Nov 2, 2023
@lslusarczyk
Contributor

Looking at the benchmark results on borealis: https://github.com/intel-sandbox/libraries.runtimes.hpc.dds.dr-ci/actions/runs/6717195674

For inclusive scan the times are as follows (columns: wall-clock time, CPU time, iterations, then counters):

Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       4083 ms         4081 ms            1 bytes_per_second=182.481G/s footprint=14.9012G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       2501 ms         2498 ms            1 bytes_per_second=297.951G/s footprint=7.45058G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       1654 ms         1651 ms            1 bytes_per_second=450.547G/s footprint=4.96705G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       1272 ms         1270 ms            1 bytes_per_second=585.591G/s footprint=3.72529G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       1045 ms         1043 ms            1 bytes_per_second=712.966G/s footprint=2.98023G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time        863 ms          862 ms            1 bytes_per_second=862.869G/s footprint=2.48353G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time        737 ms          735 ms            1 bytes_per_second=1011.18G/s footprint=2.12874G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time        645 ms          643 ms            1 bytes_per_second=1.1285T/s footprint=1.86265G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time        573 ms          571 ms            1 bytes_per_second=1.26983T/s footprint=1.65568G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time        518 ms          516 ms            1 bytes_per_second=1.40465T/s footprint=1.49012G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time        472 ms          470 ms            1 bytes_per_second=1.54308T/s footprint=1.35465G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time        426 ms          425 ms            1 bytes_per_second=1.70636T/s footprint=1.24176G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       4066 ms         4064 ms            1 bytes_per_second=183.241G/s footprint=14.9012G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       5099 ms         5093 ms            1 bytes_per_second=292.251G/s footprint=14.9012G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       5105 ms         5100 ms            1 bytes_per_second=437.868G/s footprint=14.9012G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       5251 ms         5246 ms            1 bytes_per_second=567.564G/s footprint=14.9012G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       5256 ms         5251 ms            1 bytes_per_second=708.757G/s footprint=14.9012G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       5360 ms         5355 ms            1 bytes_per_second=833.972G/s footprint=14.9012G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       5362 ms         5357 ms            1 bytes_per_second=972.734G/s footprint=14.9012G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       5358 ms         5354 ms            1 bytes_per_second=1112.36G/s footprint=14.9012G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       5412 ms         4908 ms            1 bytes_per_second=1.20993T/s footprint=14.9012G
Inclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       5340 ms         5335 ms            1 bytes_per_second=1.36253T/s footprint=14.9012G

For exclusive scan, on the other hand, the times are:

Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       7160 ms         7156 ms            1 bytes_per_second=104.056G/s footprint=14.9012G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      25702 ms        25685 ms            1 bytes_per_second=28.9883G/s footprint=7.45058G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      23682 ms        23662 ms            1 bytes_per_second=31.4607G/s footprint=4.96705G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      31002 ms        30978 ms            1 bytes_per_second=24.0323G/s footprint=3.72529G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      27332 ms        27309 ms            1 bytes_per_second=27.2598G/s footprint=2.98023G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      24186 ms        24164 ms            1 bytes_per_second=30.8055G/s footprint=2.48353G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      25131 ms        25108 ms            1 bytes_per_second=29.6468G/s footprint=2.12874G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      25281 ms        25250 ms            1 bytes_per_second=29.4713G/s footprint=1.86265G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      27502 ms        27455 ms            1 bytes_per_second=27.0913G/s footprint=1.65568G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      27991 ms        27933 ms            1 bytes_per_second=26.6181G/s footprint=1.49012G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      29720 ms        29667 ms            1 bytes_per_second=25.0693G/s footprint=1.35465G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      30183 ms        30103 ms            1 bytes_per_second=24.6849G/s footprint=1.24176G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time       7112 ms         7108 ms            1 bytes_per_second=104.766G/s footprint=14.9012G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      54683 ms        54654 ms            1 bytes_per_second=27.2502G/s footprint=14.9012G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time      87625 ms        87567 ms            1 bytes_per_second=25.5085G/s footprint=14.9012G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time     148550 ms       148468 ms            1 bytes_per_second=20.0622G/s footprint=14.9012G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time     191582 ms       191448 ms            1 bytes_per_second=19.4449G/s footprint=14.9012G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time     220317 ms       220133 ms            1 bytes_per_second=20.2905G/s footprint=14.9012G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time     265100 ms       264817 ms            1 bytes_per_second=19.6733G/s footprint=14.9012G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time     341283 ms       324785 ms            1 bytes_per_second=17.4649G/s footprint=14.9012G
Exclusive_Scan_DR/min_time:0.100/min_warmup_time:0.100/real_time     346785 ms       346147 ms            1 bytes_per_second=19.3362G/s footprint=14.9012G

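E.g. in the last benchmark of each list, inclusive scan takes about 5 s (5340 ms) while exclusive scan takes about 346 s (346785 ms), i.e. roughly 65 times longer.

We need to find out why. For now I am also disabling the mhp benchmarks.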
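
For reference while investigating: with the standard sequential semantics, an exclusive scan is an inclusive scan shifted right by one position, so the two do essentially the same amount of arithmetic. That suggests the gap above comes from the implementation rather than from the operation itself. Below is a minimal sketch using only the C++17 standard library. It is not the mhp implementation; the helper names `exclusive_from_inclusive` and `exclusive_reference` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper (illustration only, not the mhp code):
// builds an exclusive scan out of an inclusive scan, so that
// out[i] = init + in[0] + ... + in[i-1].
std::vector<long> exclusive_from_inclusive(const std::vector<long>& in, long init) {
    std::vector<long> out(in.size());
    if (in.empty()) {
        return out;
    }
    // Inclusive scan over all elements except the last one,
    // written one slot to the right and seeded with init.
    std::inclusive_scan(in.begin(), in.end() - 1, out.begin() + 1,
                        std::plus<>{}, init);
    // The first output of an exclusive scan is always the initial value.
    out[0] = init;
    return out;
}

// Hypothetical helper: the same result computed directly with std::exclusive_scan.
std::vector<long> exclusive_reference(const std::vector<long>& in, long init) {
    std::vector<long> out(in.size());
    std::exclusive_scan(in.begin(), in.end(), out.begin(), init);
    return out;
}
```

For example, calling `exclusive_from_inclusive({1, 2, 3, 4}, 0)` gives `{0, 1, 3, 6}`, the same as `exclusive_reference({1, 2, 3, 4}, 0)`. In a distributed setting the shift also has to cross segment boundaries, so that step may be a reasonable place to start looking.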

@haichangsi haichangsi self-assigned this Nov 8, 2023
@haichangsi haichangsi moved this to 👀 In review in Distributed-Ranges Project Nov 8, 2023
@haichangsi haichangsi moved this from 👀 In review to 🏗 In progress in Distributed-Ranges Project Nov 8, 2023