Optimize std::includes by using ranges::includes approach #5543

Merged
merged 11 commits into from
Jun 14, 2025

AlexGuteniev
Contributor

@AlexGuteniev AlexGuteniev commented May 24, 2025

❔ Why the ranges approach

Comparing a pair of elements has three possible outcomes:

  1. Haystack element is less. Advance only the haystack.
  2. Needle element is less. We found a mismatch; return false.
  3. Elements are equal. Advance both the haystack and the needle.

Of these outcomes, one can be detected with a single branch; the other two take two branches.
Since we only have a "less" predicate, the equal case cannot be the single-branch one.

Between "haystack less" and "needle less", the current std:: algorithm favors needle less, and the ranges algorithm favors haystack less.
Favoring haystack less optimizes a long algorithm run, where the haystack is much bigger than the needle.
Favoring needle less optimizes the early return, which is less useful, as it happens at most once per algorithm run.
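
To make the difference in branch order concrete, here is a minimal sketch of the two loop shapes. The function names `includes_needle_first`, `includes_haystack_first`, and `variants_agree` are hypothetical and are not the identifiers used in the STL sources; the bodies only illustrate which comparison is tested first and are not the actual implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Variant A (hypothetical name): tests "needle element less" first,
// so the early-return check runs on every iteration.
template <class HayIt, class NeedleIt, class Less = std::less<>>
bool includes_needle_first(HayIt hay, HayIt hay_end, NeedleIt needle, NeedleIt needle_end, Less less = {}) {
    for (; needle != needle_end; ++hay) {
        if (hay == hay_end || less(*needle, *hay)) {
            return false; // outcome 2: needle element cannot be in the haystack
        }
        if (!less(*hay, *needle)) {
            ++needle; // outcome 3: equal, the needle advances too
        }
        // outcome 1 falls through: only the haystack advances
    }
    return true;
}

// Variant B (hypothetical name): tests "haystack element less" first,
// so skipping over haystack elements costs a single comparison.
template <class HayIt, class NeedleIt, class Less = std::less<>>
bool includes_haystack_first(HayIt hay, HayIt hay_end, NeedleIt needle, NeedleIt needle_end, Less less = {}) {
    while (needle != needle_end) {
        if (hay == hay_end) {
            return false; // haystack exhausted with needle elements left
        }
        if (less(*hay, *needle)) {
            ++hay; // outcome 1: advance only the haystack
            continue;
        }
        if (less(*needle, *hay)) {
            return false; // outcome 2: mismatch
        }
        ++hay; // outcome 3: equal, advance both
        ++needle;
    }
    return true;
}

// Cross-checks both variants against std::includes on sorted inputs.
bool variants_agree(const std::vector<int>& hay, const std::vector<int>& needle) {
    assert(std::is_sorted(hay.begin(), hay.end()));
    assert(std::is_sorted(needle.begin(), needle.end()));
    const bool expected = std::includes(hay.begin(), hay.end(), needle.begin(), needle.end());
    return includes_needle_first(hay.begin(), hay.end(), needle.begin(), needle.end()) == expected
        && includes_haystack_first(hay.begin(), hay.end(), needle.begin(), needle.end()) == expected;
}
```

Both variants return the same answer; they differ only in which outcome is reached after one comparison.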

🏁 Benchmarks

Not sure what the common case for includes is, so trying these four:
(0) Needle is seen contiguously in haystack
(1) Needle is seen not completely contiguously in haystack, but still compact (stride follows a geometric distribution)
(2) Needle is spread in haystack with almost the same stride (plus-minus one)
(3) Needle is sampled randomly from haystack

Each of these is combined with:
(1) The needle is actually found
(0) The middle element of the needle does not match anything in haystack

I think these combinations are enough to explore the algorithm's performance properties.
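
As an illustration of the input shapes described above, here is a hedged sketch of how such needles could be generated. It is not the benchmark code from this PR: the names `spread`, `pick_positions`, `make_haystack`, and `make_needle`, the geometric parameter 0.5, and the use of even values in the haystack are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical enumeration of the four needle layouts (0)..(3).
enum class spread { contiguous, geometric, near_uniform, random_sample };

// Chooses needle_size strictly increasing positions in a haystack of hay_size elements.
std::vector<std::size_t> pick_positions(
    std::size_t hay_size, std::size_t needle_size, spread how, std::mt19937& gen) {
    assert(needle_size > 0 && needle_size <= hay_size);
    std::vector<std::size_t> pos(needle_size);
    if (how == spread::random_sample) {
        std::vector<std::size_t> all(hay_size);
        std::iota(all.begin(), all.end(), std::size_t{0});
        std::sample(all.begin(), all.end(), pos.begin(), static_cast<std::ptrdiff_t>(needle_size), gen);
        std::sort(pos.begin(), pos.end());
        return pos;
    }
    std::geometric_distribution<std::size_t> geo(0.5); // assumed parameter
    std::uniform_int_distribution<int> jitter(-1, 1);  // "plus-minus one"
    const std::size_t base = hay_size / needle_size;   // at least 1
    std::size_t next = 0;
    for (std::size_t i = 0; i < needle_size; ++i) {
        // Clamp so that the remaining needle elements still fit in the haystack.
        pos[i] = std::min(next, hay_size - (needle_size - i));
        std::size_t gap = 1; // contiguous layout
        if (how == spread::geometric) {
            gap += geo(gen);
        } else if (how == spread::near_uniform) {
            const long long g = static_cast<long long>(base) + jitter(gen);
            gap = static_cast<std::size_t>(std::max<long long>(g, 1));
        }
        next = pos[i] + gap;
    }
    return pos;
}

// Distinct even values, so that value + 1 is never present in the haystack.
std::vector<int> make_haystack(std::size_t size) {
    std::vector<int> hay(size);
    for (std::size_t i = 0; i < size; ++i) {
        hay[i] = static_cast<int>(2 * i);
    }
    return hay;
}

// Copies the chosen elements; if !found, makes the middle element match nothing.
std::vector<int> make_needle(const std::vector<int>& hay, const std::vector<std::size_t>& pos, bool found) {
    std::vector<int> needle;
    needle.reserve(pos.size());
    for (const std::size_t p : pos) {
        needle.push_back(hay[p]);
    }
    if (!found) {
        needle[needle.size() / 2] += 1; // odd value: absent from haystack, order preserved
    }
    return needle;
}
```

With a haystack of distinct even values, bumping the middle needle element by one keeps the needle sorted while guaranteeing that `std::includes` returns false.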

The benchmark results appear to be affected by codegen gremlins, and, as expected, the 300/290 cases (where the needle is almost as large as the haystack) gain less and regress more.
But overall it looks like an improvement.

📜 Benchmark results

⚠️ Mixed results: speedups above 2.0 in some cases, but slowdowns to below 0.3 in others ⚠️

⚠️ The whole benchmark takes a long time to run ⚠️
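
For reading the tables that follow: the Speedup column is consistent with Before divided by After, rounded to two decimals, so values above 1 mean the new code is faster. A small sketch of that arithmetic, with hypothetical helper names `speedup` and `round2`:

```cpp
#include <cassert>
#include <cmath>

// Ratio of timings: above 1 means faster after the change, below 1 means slower.
double speedup(double before_ns, double after_ns) {
    assert(after_ns > 0.0);
    return before_ns / after_ns;
}

// Rounds to two decimal places, matching the precision shown in the tables.
double round2(double value) {
    return std::round(value * 100.0) / 100.0;
}
```

For example, 771 ns before and 374 ns after gives about 2.06, while 2771 ns before and 10143 ns after gives about 0.27.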

Mixed results with overall improvement, but some severe degradation
Benchmark Before After Speedup
bm_includes<int8_t, alg_type::std_fn>/3000/3/0/1 771 ns 374 ns 2.06
bm_includes<int8_t, alg_type::std_fn>/3000/22/0/1 786 ns 453 ns 1.74
bm_includes<int8_t, alg_type::std_fn>/3000/105/0/1 809 ns 432 ns 1.87
bm_includes<int8_t, alg_type::std_fn>/3000/1504/0/1 1169 ns 924 ns 1.27
bm_includes<int8_t, alg_type::std_fn>/3000/2750/0/1 1363 ns 1362 ns 1.00
bm_includes<int8_t, alg_type::std_fn>/300/3/0/1 85.9 ns 57.7 ns 1.49
bm_includes<int8_t, alg_type::std_fn>/300/22/0/1 91.0 ns 76.9 ns 1.18
bm_includes<int8_t, alg_type::std_fn>/300/105/0/1 123 ns 112 ns 1.10
bm_includes<int8_t, alg_type::std_fn>/300/290/0/1 162 ns 155 ns 1.05
bm_includes<int8_t, alg_type::std_fn>/3000/3/0/0 733 ns 375 ns 1.95
bm_includes<int8_t, alg_type::std_fn>/3000/22/0/0 726 ns 392 ns 1.85
bm_includes<int8_t, alg_type::std_fn>/3000/105/0/0 720 ns 409 ns 1.76
bm_includes<int8_t, alg_type::std_fn>/3000/1504/0/0 729 ns 572 ns 1.27
bm_includes<int8_t, alg_type::std_fn>/3000/2750/0/0 732 ns 710 ns 1.03
bm_includes<int8_t, alg_type::std_fn>/300/3/0/0 86.1 ns 58.0 ns 1.48
bm_includes<int8_t, alg_type::std_fn>/300/22/0/0 86.1 ns 63.5 ns 1.36
bm_includes<int8_t, alg_type::std_fn>/300/105/0/0 84.6 ns 85.7 ns 0.99
bm_includes<int8_t, alg_type::std_fn>/300/290/0/0 83.0 ns 86.9 ns 0.96
bm_includes<int8_t, alg_type::std_fn>/3000/3/1/1 727 ns 376 ns 1.93
bm_includes<int8_t, alg_type::std_fn>/3000/22/1/1 826 ns 696 ns 1.19
bm_includes<int8_t, alg_type::std_fn>/3000/105/1/1 817 ns 780 ns 1.05
bm_includes<int8_t, alg_type::std_fn>/3000/1504/1/1 1098 ns 1042 ns 1.05
bm_includes<int8_t, alg_type::std_fn>/3000/2750/1/1 1401 ns 1355 ns 1.03
bm_includes<int8_t, alg_type::std_fn>/300/3/1/1 90.0 ns 73.0 ns 1.23
bm_includes<int8_t, alg_type::std_fn>/300/22/1/1 110 ns 114 ns 0.96
bm_includes<int8_t, alg_type::std_fn>/300/105/1/1 130 ns 153 ns 0.85
bm_includes<int8_t, alg_type::std_fn>/300/290/1/1 153 ns 154 ns 0.99
bm_includes<int8_t, alg_type::std_fn>/3000/3/1/0 729 ns 374 ns 1.95
bm_includes<int8_t, alg_type::std_fn>/3000/22/1/0 733 ns 385 ns 1.90
bm_includes<int8_t, alg_type::std_fn>/3000/105/1/0 756 ns 784 ns 0.96
bm_includes<int8_t, alg_type::std_fn>/3000/1504/1/0 762 ns 562 ns 1.36
bm_includes<int8_t, alg_type::std_fn>/3000/2750/1/0 744 ns 707 ns 1.05
bm_includes<int8_t, alg_type::std_fn>/300/3/1/0 88.1 ns 57.2 ns 1.54
bm_includes<int8_t, alg_type::std_fn>/300/22/1/0 90.7 ns 68.9 ns 1.32
bm_includes<int8_t, alg_type::std_fn>/300/105/1/0 93.8 ns 106 ns 0.88
bm_includes<int8_t, alg_type::std_fn>/300/290/1/0 92.7 ns 86.7 ns 1.07
bm_includes<int8_t, alg_type::std_fn>/3000/3/2/1 1228 ns 635 ns 1.93
bm_includes<int8_t, alg_type::std_fn>/3000/22/2/1 1554 ns 1010 ns 1.54
bm_includes<int8_t, alg_type::std_fn>/3000/105/2/1 2208 ns 1929 ns 1.14
bm_includes<int8_t, alg_type::std_fn>/3000/1504/2/1 2007 ns 2468 ns 0.81
bm_includes<int8_t, alg_type::std_fn>/3000/2750/2/1 1416 ns 1572 ns 0.90
bm_includes<int8_t, alg_type::std_fn>/300/3/2/1 151 ns 109 ns 1.39
bm_includes<int8_t, alg_type::std_fn>/300/22/2/1 199 ns 157 ns 1.27
bm_includes<int8_t, alg_type::std_fn>/300/105/2/1 230 ns 237 ns 0.97
bm_includes<int8_t, alg_type::std_fn>/300/290/2/1 143 ns 161 ns 0.89
bm_includes<int8_t, alg_type::std_fn>/3000/3/2/0 731 ns 393 ns 1.86
bm_includes<int8_t, alg_type::std_fn>/3000/22/2/0 863 ns 553 ns 1.56
bm_includes<int8_t, alg_type::std_fn>/3000/105/2/0 1079 ns 867 ns 1.24
bm_includes<int8_t, alg_type::std_fn>/3000/1504/2/0 1008 ns 1251 ns 0.81
bm_includes<int8_t, alg_type::std_fn>/3000/2750/2/0 699 ns 775 ns 0.90
bm_includes<int8_t, alg_type::std_fn>/300/3/2/0 96.3 ns 72.4 ns 1.33
bm_includes<int8_t, alg_type::std_fn>/300/22/2/0 108 ns 88.0 ns 1.23
bm_includes<int8_t, alg_type::std_fn>/300/105/2/0 122 ns 124 ns 0.98
bm_includes<int8_t, alg_type::std_fn>/300/290/2/0 72.0 ns 84.8 ns 0.85
bm_includes<int8_t, alg_type::std_fn>/3000/3/3/1 1277 ns 668 ns 1.91
bm_includes<int8_t, alg_type::std_fn>/3000/22/3/1 1542 ns 1080 ns 1.43
bm_includes<int8_t, alg_type::std_fn>/3000/105/3/1 2167 ns 1571 ns 1.38
bm_includes<int8_t, alg_type::std_fn>/3000/1504/3/1 2042 ns 2429 ns 0.84
bm_includes<int8_t, alg_type::std_fn>/3000/2750/3/1 1416 ns 2189 ns 0.65
bm_includes<int8_t, alg_type::std_fn>/300/3/3/1 169 ns 119 ns 1.42
bm_includes<int8_t, alg_type::std_fn>/300/22/3/1 206 ns 159 ns 1.30
bm_includes<int8_t, alg_type::std_fn>/300/105/3/1 241 ns 220 ns 1.10
bm_includes<int8_t, alg_type::std_fn>/300/290/3/1 142 ns 192 ns 0.74
bm_includes<int8_t, alg_type::std_fn>/3000/3/3/0 976 ns 521 ns 1.87
bm_includes<int8_t, alg_type::std_fn>/3000/22/3/0 977 ns 643 ns 1.52
bm_includes<int8_t, alg_type::std_fn>/3000/105/3/0 1082 ns 841 ns 1.29
bm_includes<int8_t, alg_type::std_fn>/3000/1504/3/0 990 ns 1170 ns 0.85
bm_includes<int8_t, alg_type::std_fn>/3000/2750/3/0 706 ns 782 ns 0.90
bm_includes<int8_t, alg_type::std_fn>/300/3/3/0 140 ns 98.1 ns 1.43
bm_includes<int8_t, alg_type::std_fn>/300/22/3/0 97.6 ns 87.5 ns 1.12
bm_includes<int8_t, alg_type::std_fn>/300/105/3/0 121 ns 112 ns 1.08
bm_includes<int8_t, alg_type::std_fn>/300/290/3/0 72.2 ns 93.9 ns 0.77
bm_includes<int16_t, alg_type::std_fn>/3000/3/0/1 724 ns 372 ns 1.95
bm_includes<int16_t, alg_type::std_fn>/3000/22/0/1 720 ns 414 ns 1.74
bm_includes<int16_t, alg_type::std_fn>/3000/105/0/1 746 ns 426 ns 1.75
bm_includes<int16_t, alg_type::std_fn>/3000/1504/0/1 1078 ns 918 ns 1.17
bm_includes<int16_t, alg_type::std_fn>/3000/2750/0/1 1371 ns 1369 ns 1.00
bm_includes<int16_t, alg_type::std_fn>/300/3/0/1 89.3 ns 54.7 ns 1.63
bm_includes<int16_t, alg_type::std_fn>/300/22/0/1 88.6 ns 79.3 ns 1.12
bm_includes<int16_t, alg_type::std_fn>/300/105/0/1 116 ns 108 ns 1.07
bm_includes<int16_t, alg_type::std_fn>/300/290/0/1 152 ns 157 ns 0.97
bm_includes<int16_t, alg_type::std_fn>/3000/3/0/0 753 ns 367 ns 2.05
bm_includes<int16_t, alg_type::std_fn>/3000/22/0/0 717 ns 378 ns 1.90
bm_includes<int16_t, alg_type::std_fn>/3000/105/0/0 928 ns 394 ns 2.36
bm_includes<int16_t, alg_type::std_fn>/3000/1504/0/0 728 ns 562 ns 1.30
bm_includes<int16_t, alg_type::std_fn>/3000/2750/0/0 729 ns 705 ns 1.03
bm_includes<int16_t, alg_type::std_fn>/300/3/0/0 83.1 ns 52.9 ns 1.57
bm_includes<int16_t, alg_type::std_fn>/300/22/0/0 85.5 ns 64.0 ns 1.34
bm_includes<int16_t, alg_type::std_fn>/300/105/0/0 89.3 ns 79.8 ns 1.12
bm_includes<int16_t, alg_type::std_fn>/300/290/0/0 82.7 ns 90.2 ns 0.92
bm_includes<int16_t, alg_type::std_fn>/3000/3/1/1 717 ns 370 ns 1.94
bm_includes<int16_t, alg_type::std_fn>/3000/22/1/1 734 ns 404 ns 1.82
bm_includes<int16_t, alg_type::std_fn>/3000/105/1/1 792 ns 532 ns 1.49
bm_includes<int16_t, alg_type::std_fn>/3000/1504/1/1 1077 ns 917 ns 1.17
bm_includes<int16_t, alg_type::std_fn>/3000/2750/1/1 1368 ns 1372 ns 1.00
bm_includes<int16_t, alg_type::std_fn>/300/3/1/1 87.2 ns 55.4 ns 1.57
bm_includes<int16_t, alg_type::std_fn>/300/22/1/1 97.7 ns 91.2 ns 1.07
bm_includes<int16_t, alg_type::std_fn>/300/105/1/1 136 ns 201 ns 0.68
bm_includes<int16_t, alg_type::std_fn>/300/290/1/1 151 ns 156 ns 0.97
bm_includes<int16_t, alg_type::std_fn>/3000/3/1/0 720 ns 368 ns 1.96
bm_includes<int16_t, alg_type::std_fn>/3000/22/1/0 718 ns 383 ns 1.87
bm_includes<int16_t, alg_type::std_fn>/3000/105/1/0 728 ns 451 ns 1.61
bm_includes<int16_t, alg_type::std_fn>/3000/1504/1/0 737 ns 561 ns 1.31
bm_includes<int16_t, alg_type::std_fn>/3000/2750/1/0 730 ns 703 ns 1.04
bm_includes<int16_t, alg_type::std_fn>/300/3/1/0 83.6 ns 52.2 ns 1.60
bm_includes<int16_t, alg_type::std_fn>/300/22/1/0 87.2 ns 66.3 ns 1.32
bm_includes<int16_t, alg_type::std_fn>/300/105/1/0 90.2 ns 130 ns 0.69
bm_includes<int16_t, alg_type::std_fn>/300/290/1/0 84.1 ns 90.4 ns 0.93
bm_includes<int16_t, alg_type::std_fn>/3000/3/2/1 1228 ns 643 ns 1.91
bm_includes<int16_t, alg_type::std_fn>/3000/22/2/1 1585 ns 1036 ns 1.53
bm_includes<int16_t, alg_type::std_fn>/3000/105/2/1 2156 ns 2358 ns 0.91
bm_includes<int16_t, alg_type::std_fn>/3000/1504/2/1 1818 ns 1905 ns 0.95
bm_includes<int16_t, alg_type::std_fn>/3000/2750/2/1 1536 ns 1458 ns 1.05
bm_includes<int16_t, alg_type::std_fn>/300/3/2/1 154 ns 110 ns 1.40
bm_includes<int16_t, alg_type::std_fn>/300/22/2/1 170 ns 189 ns 0.90
bm_includes<int16_t, alg_type::std_fn>/300/105/2/1 190 ns 239 ns 0.79
bm_includes<int16_t, alg_type::std_fn>/300/290/2/1 159 ns 230 ns 0.69
bm_includes<int16_t, alg_type::std_fn>/3000/3/2/0 746 ns 384 ns 1.94
bm_includes<int16_t, alg_type::std_fn>/3000/22/2/0 845 ns 554 ns 1.53
bm_includes<int16_t, alg_type::std_fn>/3000/105/2/0 1076 ns 1164 ns 0.92
bm_includes<int16_t, alg_type::std_fn>/3000/1504/2/0 922 ns 1093 ns 0.84
bm_includes<int16_t, alg_type::std_fn>/3000/2750/2/0 767 ns 757 ns 1.01
bm_includes<int16_t, alg_type::std_fn>/300/3/2/0 97.8 ns 68.8 ns 1.42
bm_includes<int16_t, alg_type::std_fn>/300/22/2/0 88.5 ns 107 ns 0.83
bm_includes<int16_t, alg_type::std_fn>/300/105/2/0 104 ns 125 ns 0.83
bm_includes<int16_t, alg_type::std_fn>/300/290/2/0 74.2 ns 120 ns 0.62
bm_includes<int16_t, alg_type::std_fn>/3000/3/3/1 1304 ns 672 ns 1.94
bm_includes<int16_t, alg_type::std_fn>/3000/22/3/1 1552 ns 990 ns 1.57
bm_includes<int16_t, alg_type::std_fn>/3000/105/3/1 2143 ns 1809 ns 1.18
bm_includes<int16_t, alg_type::std_fn>/3000/1504/3/1 2082 ns 3124 ns 0.67
bm_includes<int16_t, alg_type::std_fn>/3000/2750/3/1 1439 ns 1956 ns 0.74
bm_includes<int16_t, alg_type::std_fn>/300/3/3/1 180 ns 123 ns 1.46
bm_includes<int16_t, alg_type::std_fn>/300/22/3/1 181 ns 220 ns 0.82
bm_includes<int16_t, alg_type::std_fn>/300/105/3/1 192 ns 278 ns 0.69
bm_includes<int16_t, alg_type::std_fn>/300/290/3/1 145 ns 215 ns 0.67
bm_includes<int16_t, alg_type::std_fn>/3000/3/3/0 999 ns 515 ns 1.94
bm_includes<int16_t, alg_type::std_fn>/3000/22/3/0 991 ns 623 ns 1.59
bm_includes<int16_t, alg_type::std_fn>/3000/105/3/0 1077 ns 886 ns 1.22
bm_includes<int16_t, alg_type::std_fn>/3000/1504/3/0 991 ns 1540 ns 0.64
bm_includes<int16_t, alg_type::std_fn>/3000/2750/3/0 719 ns 1058 ns 0.68
bm_includes<int16_t, alg_type::std_fn>/300/3/3/0 144 ns 91.1 ns 1.58
bm_includes<int16_t, alg_type::std_fn>/300/22/3/0 83.7 ns 118 ns 0.71
bm_includes<int16_t, alg_type::std_fn>/300/105/3/0 94.4 ns 136 ns 0.69
bm_includes<int16_t, alg_type::std_fn>/300/290/3/0 73.5 ns 106 ns 0.69
bm_includes<int32_t, alg_type::std_fn>/3000/3/0/1 728 ns 386 ns 1.89
bm_includes<int32_t, alg_type::std_fn>/3000/22/0/1 742 ns 387 ns 1.92
bm_includes<int32_t, alg_type::std_fn>/3000/105/0/1 764 ns 424 ns 1.80
bm_includes<int32_t, alg_type::std_fn>/3000/1504/0/1 1101 ns 916 ns 1.20
bm_includes<int32_t, alg_type::std_fn>/3000/2750/0/1 1401 ns 1345 ns 1.04
bm_includes<int32_t, alg_type::std_fn>/300/3/0/1 86.5 ns 58.2 ns 1.49
bm_includes<int32_t, alg_type::std_fn>/300/22/0/1 91.7 ns 72.1 ns 1.27
bm_includes<int32_t, alg_type::std_fn>/300/105/0/1 123 ns 105 ns 1.17
bm_includes<int32_t, alg_type::std_fn>/300/290/0/1 152 ns 152 ns 1.00
bm_includes<int32_t, alg_type::std_fn>/3000/3/0/0 733 ns 375 ns 1.95
bm_includes<int32_t, alg_type::std_fn>/3000/22/0/0 729 ns 385 ns 1.89
bm_includes<int32_t, alg_type::std_fn>/3000/105/0/0 746 ns 401 ns 1.86
bm_includes<int32_t, alg_type::std_fn>/3000/1504/0/0 737 ns 559 ns 1.32
bm_includes<int32_t, alg_type::std_fn>/3000/2750/0/0 742 ns 706 ns 1.05
bm_includes<int32_t, alg_type::std_fn>/300/3/0/0 88.9 ns 57.5 ns 1.55
bm_includes<int32_t, alg_type::std_fn>/300/22/0/0 86.8 ns 70.4 ns 1.23
bm_includes<int32_t, alg_type::std_fn>/300/105/0/0 94.4 ns 81.0 ns 1.17
bm_includes<int32_t, alg_type::std_fn>/300/290/0/0 83.3 ns 83.2 ns 1.00
bm_includes<int32_t, alg_type::std_fn>/3000/3/1/1 730 ns 374 ns 1.95
bm_includes<int32_t, alg_type::std_fn>/3000/22/1/1 754 ns 405 ns 1.86
bm_includes<int32_t, alg_type::std_fn>/3000/105/1/1 832 ns 512 ns 1.63
bm_includes<int32_t, alg_type::std_fn>/3000/1504/1/1 1104 ns 912 ns 1.21
bm_includes<int32_t, alg_type::std_fn>/3000/2750/1/1 1393 ns 1343 ns 1.04
bm_includes<int32_t, alg_type::std_fn>/300/3/1/1 87.0 ns 57.1 ns 1.52
bm_includes<int32_t, alg_type::std_fn>/300/22/1/1 108 ns 87.5 ns 1.23
bm_includes<int32_t, alg_type::std_fn>/300/105/1/1 129 ns 156 ns 0.83
bm_includes<int32_t, alg_type::std_fn>/300/290/1/1 153 ns 151 ns 1.01
bm_includes<int32_t, alg_type::std_fn>/3000/3/1/0 727 ns 372 ns 1.95
bm_includes<int32_t, alg_type::std_fn>/3000/22/1/0 733 ns 385 ns 1.90
bm_includes<int32_t, alg_type::std_fn>/3000/105/1/0 762 ns 442 ns 1.72
bm_includes<int32_t, alg_type::std_fn>/3000/1504/1/0 741 ns 565 ns 1.31
bm_includes<int32_t, alg_type::std_fn>/3000/2750/1/0 758 ns 708 ns 1.07
bm_includes<int32_t, alg_type::std_fn>/300/3/1/0 86.4 ns 56.6 ns 1.53
bm_includes<int32_t, alg_type::std_fn>/300/22/1/0 92.0 ns 71.2 ns 1.29
bm_includes<int32_t, alg_type::std_fn>/300/105/1/0 95.9 ns 107 ns 0.90
bm_includes<int32_t, alg_type::std_fn>/300/290/1/0 81.5 ns 83.8 ns 0.97
bm_includes<int32_t, alg_type::std_fn>/3000/3/2/1 1200 ns 635 ns 1.89
bm_includes<int32_t, alg_type::std_fn>/3000/22/2/1 1577 ns 1030 ns 1.53
bm_includes<int32_t, alg_type::std_fn>/3000/105/2/1 2212 ns 1654 ns 1.34
bm_includes<int32_t, alg_type::std_fn>/3000/1504/2/1 1499 ns 1211 ns 1.24
bm_includes<int32_t, alg_type::std_fn>/3000/2750/2/1 1413 ns 1365 ns 1.04
bm_includes<int32_t, alg_type::std_fn>/300/3/2/1 148 ns 107 ns 1.38
bm_includes<int32_t, alg_type::std_fn>/300/22/2/1 213 ns 170 ns 1.25
bm_includes<int32_t, alg_type::std_fn>/300/105/2/1 194 ns 215 ns 0.90
bm_includes<int32_t, alg_type::std_fn>/300/290/2/1 166 ns 225 ns 0.74
bm_includes<int32_t, alg_type::std_fn>/3000/3/2/0 738 ns 392 ns 1.88
bm_includes<int32_t, alg_type::std_fn>/3000/22/2/0 857 ns 584 ns 1.47
bm_includes<int32_t, alg_type::std_fn>/3000/105/2/0 1142 ns 846 ns 1.35
bm_includes<int32_t, alg_type::std_fn>/3000/1504/2/0 778 ns 606 ns 1.28
bm_includes<int32_t, alg_type::std_fn>/3000/2750/2/0 713 ns 690 ns 1.03
bm_includes<int32_t, alg_type::std_fn>/300/3/2/0 92.0 ns 71.5 ns 1.29
bm_includes<int32_t, alg_type::std_fn>/300/22/2/0 118 ns 93.1 ns 1.27
bm_includes<int32_t, alg_type::std_fn>/300/105/2/0 112 ns 124 ns 0.90
bm_includes<int32_t, alg_type::std_fn>/300/290/2/0 72.3 ns 104 ns 0.70
bm_includes<int32_t, alg_type::std_fn>/3000/3/3/1 1273 ns 666 ns 1.91
bm_includes<int32_t, alg_type::std_fn>/3000/22/3/1 1533 ns 1010 ns 1.52
bm_includes<int32_t, alg_type::std_fn>/3000/105/3/1 2260 ns 1710 ns 1.32
bm_includes<int32_t, alg_type::std_fn>/3000/1504/3/1 1981 ns 2473 ns 0.80
bm_includes<int32_t, alg_type::std_fn>/3000/2750/3/1 1415 ns 1867 ns 0.76
bm_includes<int32_t, alg_type::std_fn>/300/3/3/1 177 ns 118 ns 1.50
bm_includes<int32_t, alg_type::std_fn>/300/22/3/1 209 ns 163 ns 1.28
bm_includes<int32_t, alg_type::std_fn>/300/105/3/1 209 ns 209 ns 1.00
bm_includes<int32_t, alg_type::std_fn>/300/290/3/1 142 ns 156 ns 0.91
bm_includes<int32_t, alg_type::std_fn>/3000/3/3/0 974 ns 527 ns 1.85
bm_includes<int32_t, alg_type::std_fn>/3000/22/3/0 975 ns 619 ns 1.58
bm_includes<int32_t, alg_type::std_fn>/3000/105/3/0 1172 ns 853 ns 1.37
bm_includes<int32_t, alg_type::std_fn>/3000/1504/3/0 970 ns 1239 ns 0.78
bm_includes<int32_t, alg_type::std_fn>/3000/2750/3/0 706 ns 799 ns 0.88
bm_includes<int32_t, alg_type::std_fn>/300/3/3/0 137 ns 99.0 ns 1.38
bm_includes<int32_t, alg_type::std_fn>/300/22/3/0 96.0 ns 90.8 ns 1.06
bm_includes<int32_t, alg_type::std_fn>/300/105/3/0 122 ns 117 ns 1.04
bm_includes<int32_t, alg_type::std_fn>/300/290/3/0 72.0 ns 98.8 ns 0.73
bm_includes<int64_t, alg_type::std_fn>/3000/3/0/1 721 ns 374 ns 1.93
bm_includes<int64_t, alg_type::std_fn>/3000/22/0/1 726 ns 450 ns 1.61
bm_includes<int64_t, alg_type::std_fn>/3000/105/0/1 752 ns 428 ns 1.76
bm_includes<int64_t, alg_type::std_fn>/3000/1504/0/1 1081 ns 914 ns 1.18
bm_includes<int64_t, alg_type::std_fn>/3000/2750/0/1 1368 ns 1364 ns 1.00
bm_includes<int64_t, alg_type::std_fn>/300/3/0/1 85.7 ns 57.9 ns 1.48
bm_includes<int64_t, alg_type::std_fn>/300/22/0/1 96.3 ns 83.1 ns 1.16
bm_includes<int64_t, alg_type::std_fn>/300/105/0/1 122 ns 108 ns 1.13
bm_includes<int64_t, alg_type::std_fn>/300/290/0/1 152 ns 162 ns 0.94
bm_includes<int64_t, alg_type::std_fn>/3000/3/0/0 721 ns 372 ns 1.94
bm_includes<int64_t, alg_type::std_fn>/3000/22/0/0 720 ns 384 ns 1.88
bm_includes<int64_t, alg_type::std_fn>/3000/105/0/0 728 ns 402 ns 1.81
bm_includes<int64_t, alg_type::std_fn>/3000/1504/0/0 729 ns 565 ns 1.29
bm_includes<int64_t, alg_type::std_fn>/3000/2750/0/0 730 ns 709 ns 1.03
bm_includes<int64_t, alg_type::std_fn>/300/3/0/0 87.5 ns 56.3 ns 1.55
bm_includes<int64_t, alg_type::std_fn>/300/22/0/0 92.0 ns 67.7 ns 1.36
bm_includes<int64_t, alg_type::std_fn>/300/105/0/0 97.2 ns 80.9 ns 1.20
bm_includes<int64_t, alg_type::std_fn>/300/290/0/0 84.5 ns 89.8 ns 0.94
bm_includes<int64_t, alg_type::std_fn>/3000/3/1/1 726 ns 383 ns 1.90
bm_includes<int64_t, alg_type::std_fn>/3000/22/1/1 745 ns 416 ns 1.79
bm_includes<int64_t, alg_type::std_fn>/3000/105/1/1 856 ns 556 ns 1.54
bm_includes<int64_t, alg_type::std_fn>/3000/1504/1/1 1079 ns 912 ns 1.18
bm_includes<int64_t, alg_type::std_fn>/3000/2750/1/1 1367 ns 1348 ns 1.01
bm_includes<int64_t, alg_type::std_fn>/300/3/1/1 89.6 ns 65.4 ns 1.37
bm_includes<int64_t, alg_type::std_fn>/300/22/1/1 116 ns 96.9 ns 1.20
bm_includes<int64_t, alg_type::std_fn>/300/105/1/1 163 ns 203 ns 0.80
bm_includes<int64_t, alg_type::std_fn>/300/290/1/1 151 ns 158 ns 0.96
bm_includes<int64_t, alg_type::std_fn>/3000/3/1/0 719 ns 373 ns 1.93
bm_includes<int64_t, alg_type::std_fn>/3000/22/1/0 719 ns 385 ns 1.87
bm_includes<int64_t, alg_type::std_fn>/3000/105/1/0 812 ns 466 ns 1.74
bm_includes<int64_t, alg_type::std_fn>/3000/1504/1/0 726 ns 561 ns 1.29
bm_includes<int64_t, alg_type::std_fn>/3000/2750/1/0 754 ns 707 ns 1.07
bm_includes<int64_t, alg_type::std_fn>/300/3/1/0 86.8 ns 55.4 ns 1.57
bm_includes<int64_t, alg_type::std_fn>/300/22/1/0 92.2 ns 70.1 ns 1.32
bm_includes<int64_t, alg_type::std_fn>/300/105/1/0 114 ns 136 ns 0.84
bm_includes<int64_t, alg_type::std_fn>/300/290/1/0 84.2 ns 89.2 ns 0.94
bm_includes<int64_t, alg_type::std_fn>/3000/3/2/1 1217 ns 644 ns 1.89
bm_includes<int64_t, alg_type::std_fn>/3000/22/2/1 1621 ns 1064 ns 1.52
bm_includes<int64_t, alg_type::std_fn>/3000/105/2/1 2709 ns 2479 ns 1.09
bm_includes<int64_t, alg_type::std_fn>/3000/1504/2/1 1493 ns 1256 ns 1.19
bm_includes<int64_t, alg_type::std_fn>/3000/2750/2/1 1425 ns 1371 ns 1.04
bm_includes<int64_t, alg_type::std_fn>/300/3/2/1 163 ns 116 ns 1.41
bm_includes<int64_t, alg_type::std_fn>/300/22/2/1 250 ns 229 ns 1.09
bm_includes<int64_t, alg_type::std_fn>/300/105/2/1 307 ns 274 ns 1.12
bm_includes<int64_t, alg_type::std_fn>/300/290/2/1 218 ns 247 ns 0.88
bm_includes<int64_t, alg_type::std_fn>/3000/3/2/0 724 ns 390 ns 1.86
bm_includes<int64_t, alg_type::std_fn>/3000/22/2/0 855 ns 577 ns 1.48
bm_includes<int64_t, alg_type::std_fn>/3000/105/2/0 1349 ns 1243 ns 1.09
bm_includes<int64_t, alg_type::std_fn>/3000/1504/2/0 884 ns 623 ns 1.42
bm_includes<int64_t, alg_type::std_fn>/3000/2750/2/0 757 ns 691 ns 1.10
bm_includes<int64_t, alg_type::std_fn>/300/3/2/0 96.1 ns 74.1 ns 1.30
bm_includes<int64_t, alg_type::std_fn>/300/22/2/0 133 ns 126 ns 1.06
bm_includes<int64_t, alg_type::std_fn>/300/105/2/0 161 ns 144 ns 1.12
bm_includes<int64_t, alg_type::std_fn>/300/290/2/0 81.6 ns 126 ns 0.65
bm_includes<int64_t, alg_type::std_fn>/3000/3/3/1 1278 ns 676 ns 1.89
bm_includes<int64_t, alg_type::std_fn>/3000/22/3/1 1598 ns 1047 ns 1.53
bm_includes<int64_t, alg_type::std_fn>/3000/105/3/1 2324 ns 2056 ns 1.13
bm_includes<int64_t, alg_type::std_fn>/3000/1504/3/1 2771 ns 10143 ns 0.27
bm_includes<int64_t, alg_type::std_fn>/3000/2750/3/1 1587 ns 3351 ns 0.47
bm_includes<int64_t, alg_type::std_fn>/300/3/3/1 180 ns 125 ns 1.44
bm_includes<int64_t, alg_type::std_fn>/300/22/3/1 235 ns 239 ns 0.98
bm_includes<int64_t, alg_type::std_fn>/300/105/3/1 264 ns 286 ns 0.92
bm_includes<int64_t, alg_type::std_fn>/300/290/3/1 150 ns 235 ns 0.64
bm_includes<int64_t, alg_type::std_fn>/3000/3/3/0 972 ns 517 ns 1.88
bm_includes<int64_t, alg_type::std_fn>/3000/22/3/0 995 ns 649 ns 1.53
bm_includes<int64_t, alg_type::std_fn>/3000/105/3/0 1234 ns 1130 ns 1.09
bm_includes<int64_t, alg_type::std_fn>/3000/1504/3/0 1351 ns 4722 ns 0.29
bm_includes<int64_t, alg_type::std_fn>/3000/2750/3/0 801 ns 1608 ns 0.50
bm_includes<int64_t, alg_type::std_fn>/300/3/3/0 140 ns 95.7 ns 1.46
bm_includes<int64_t, alg_type::std_fn>/300/22/3/0 123 ns 125 ns 0.98
bm_includes<int64_t, alg_type::std_fn>/300/105/3/0 145 ns 144 ns 1.01
bm_includes<int64_t, alg_type::std_fn>/300/290/3/0 77.8 ns 121 ns 0.64
And random variation for the ranges part
Benchmark Before After Speedup
bm_includes<int8_t, alg_type::rng>/3000/3/0/1 372 ns 370 ns 1.01
bm_includes<int8_t, alg_type::rng>/3000/22/0/1 440 ns 437 ns 1.01
bm_includes<int8_t, alg_type::rng>/3000/105/0/1 455 ns 459 ns 0.99
bm_includes<int8_t, alg_type::rng>/3000/1504/0/1 965 ns 975 ns 0.99
bm_includes<int8_t, alg_type::rng>/3000/2750/0/1 1384 ns 1409 ns 0.98
bm_includes<int8_t, alg_type::rng>/300/3/0/1 57.4 ns 56.1 ns 1.02
bm_includes<int8_t, alg_type::rng>/300/22/0/1 72.8 ns 72.5 ns 1.00
bm_includes<int8_t, alg_type::rng>/300/105/0/1 131 ns 132 ns 0.99
bm_includes<int8_t, alg_type::rng>/300/290/0/1 154 ns 154 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/3/0/0 370 ns 371 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/22/0/0 376 ns 378 ns 0.99
bm_includes<int8_t, alg_type::rng>/3000/105/0/0 413 ns 438 ns 0.94
bm_includes<int8_t, alg_type::rng>/3000/1504/0/0 724 ns 744 ns 0.97
bm_includes<int8_t, alg_type::rng>/3000/2750/0/0 839 ns 837 ns 1.00
bm_includes<int8_t, alg_type::rng>/300/3/0/0 57.5 ns 56.8 ns 1.01
bm_includes<int8_t, alg_type::rng>/300/22/0/0 72.5 ns 72.3 ns 1.00
bm_includes<int8_t, alg_type::rng>/300/105/0/0 83.1 ns 81.3 ns 1.02
bm_includes<int8_t, alg_type::rng>/300/290/0/0 86.3 ns 87.1 ns 0.99
bm_includes<int8_t, alg_type::rng>/3000/3/1/1 370 ns 370 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/22/1/1 515 ns 513 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/105/1/1 976 ns 978 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/1504/1/1 1053 ns 1063 ns 0.99
bm_includes<int8_t, alg_type::rng>/3000/2750/1/1 1468 ns 1429 ns 1.03
bm_includes<int8_t, alg_type::rng>/300/3/1/1 58.1 ns 57.2 ns 1.02
bm_includes<int8_t, alg_type::rng>/300/22/1/1 180 ns 181 ns 0.99
bm_includes<int8_t, alg_type::rng>/300/105/1/1 524 ns 527 ns 0.99
bm_includes<int8_t, alg_type::rng>/300/290/1/1 154 ns 154 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/3/1/0 380 ns 373 ns 1.02
bm_includes<int8_t, alg_type::rng>/3000/22/1/0 379 ns 379 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/105/1/0 675 ns 677 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/1504/1/0 706 ns 720 ns 0.98
bm_includes<int8_t, alg_type::rng>/3000/2750/1/0 924 ns 870 ns 1.06
bm_includes<int8_t, alg_type::rng>/300/3/1/0 60.0 ns 56.6 ns 1.06
bm_includes<int8_t, alg_type::rng>/300/22/1/0 71.5 ns 67.9 ns 1.05
bm_includes<int8_t, alg_type::rng>/300/105/1/0 270 ns 271 ns 1.00
bm_includes<int8_t, alg_type::rng>/300/290/1/0 86.6 ns 87.6 ns 0.99
bm_includes<int8_t, alg_type::rng>/3000/3/2/1 631 ns 634 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/22/2/1 1018 ns 1002 ns 1.02
bm_includes<int8_t, alg_type::rng>/3000/105/2/1 2016 ns 2002 ns 1.01
bm_includes<int8_t, alg_type::rng>/3000/1504/2/1 11429 ns 11401 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/2750/2/1 3722 ns 3691 ns 1.01
bm_includes<int8_t, alg_type::rng>/300/3/2/1 110 ns 108 ns 1.02
bm_includes<int8_t, alg_type::rng>/300/22/2/1 292 ns 279 ns 1.05
bm_includes<int8_t, alg_type::rng>/300/105/2/1 922 ns 904 ns 1.02
bm_includes<int8_t, alg_type::rng>/300/290/2/1 222 ns 220 ns 1.01
bm_includes<int8_t, alg_type::rng>/3000/3/2/0 388 ns 386 ns 1.01
bm_includes<int8_t, alg_type::rng>/3000/22/2/0 550 ns 548 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/105/2/0 1003 ns 996 ns 1.01
bm_includes<int8_t, alg_type::rng>/3000/1504/2/0 5625 ns 5643 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/2750/2/0 1832 ns 1863 ns 0.98
bm_includes<int8_t, alg_type::rng>/300/3/2/0 72.5 ns 71.1 ns 1.02
bm_includes<int8_t, alg_type::rng>/300/22/2/0 135 ns 133 ns 1.02
bm_includes<int8_t, alg_type::rng>/300/105/2/0 371 ns 375 ns 0.99
bm_includes<int8_t, alg_type::rng>/300/290/2/0 116 ns 116 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/3/3/1 666 ns 667 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/22/3/1 986 ns 991 ns 0.99
bm_includes<int8_t, alg_type::rng>/3000/105/3/1 1810 ns 1822 ns 0.99
bm_includes<int8_t, alg_type::rng>/3000/1504/3/1 10890 ns 10860 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/2750/3/1 3708 ns 3740 ns 0.99
bm_includes<int8_t, alg_type::rng>/300/3/3/1 118 ns 118 ns 1.00
bm_includes<int8_t, alg_type::rng>/300/22/3/1 286 ns 288 ns 0.99
bm_includes<int8_t, alg_type::rng>/300/105/3/1 804 ns 813 ns 0.99
bm_includes<int8_t, alg_type::rng>/300/290/3/1 296 ns 291 ns 1.02
bm_includes<int8_t, alg_type::rng>/3000/3/3/0 517 ns 517 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/22/3/0 615 ns 616 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/105/3/0 954 ns 957 ns 1.00
bm_includes<int8_t, alg_type::rng>/3000/1504/3/0 5281 ns 5348 ns 0.99
bm_includes<int8_t, alg_type::rng>/3000/2750/3/0 1869 ns 1885 ns 0.99
bm_includes<int8_t, alg_type::rng>/300/3/3/0 93.1 ns 94.8 ns 0.98
bm_includes<int8_t, alg_type::rng>/300/22/3/0 154 ns 154 ns 1.00
bm_includes<int8_t, alg_type::rng>/300/105/3/0 356 ns 366 ns 0.97
bm_includes<int8_t, alg_type::rng>/300/290/3/0 105 ns 106 ns 0.99
bm_includes<int16_t, alg_type::rng>/3000/3/0/1 369 ns 375 ns 0.98
bm_includes<int16_t, alg_type::rng>/3000/22/0/1 383 ns 418 ns 0.92
bm_includes<int16_t, alg_type::rng>/3000/105/0/1 417 ns 437 ns 0.95
bm_includes<int16_t, alg_type::rng>/3000/1504/0/1 910 ns 1318 ns 0.69
bm_includes<int16_t, alg_type::rng>/3000/2750/0/1 1353 ns 1357 ns 1.00
bm_includes<int16_t, alg_type::rng>/300/3/0/1 53.7 ns 57.2 ns 0.94
bm_includes<int16_t, alg_type::rng>/300/22/0/1 66.9 ns 70.1 ns 0.95
bm_includes<int16_t, alg_type::rng>/300/105/0/1 101 ns 107 ns 0.94
bm_includes<int16_t, alg_type::rng>/300/290/0/1 152 ns 153 ns 0.99
bm_includes<int16_t, alg_type::rng>/3000/3/0/0 368 ns 374 ns 0.98
bm_includes<int16_t, alg_type::rng>/3000/22/0/0 378 ns 397 ns 0.95
bm_includes<int16_t, alg_type::rng>/3000/105/0/0 456 ns 412 ns 1.11
bm_includes<int16_t, alg_type::rng>/3000/1504/0/0 581 ns 866 ns 0.67
bm_includes<int16_t, alg_type::rng>/3000/2750/0/0 710 ns 707 ns 1.00
bm_includes<int16_t, alg_type::rng>/300/3/0/0 53.5 ns 56.5 ns 0.95
bm_includes<int16_t, alg_type::rng>/300/22/0/0 62.7 ns 66.8 ns 0.94
bm_includes<int16_t, alg_type::rng>/300/105/0/0 80.3 ns 81.8 ns 0.98
bm_includes<int16_t, alg_type::rng>/300/290/0/0 84.1 ns 84.8 ns 0.99
bm_includes<int16_t, alg_type::rng>/3000/3/1/1 370 ns 375 ns 0.99
bm_includes<int16_t, alg_type::rng>/3000/22/1/1 397 ns 434 ns 0.91
bm_includes<int16_t, alg_type::rng>/3000/105/1/1 506 ns 707 ns 0.72
bm_includes<int16_t, alg_type::rng>/3000/1504/1/1 906 ns 1225 ns 0.74
bm_includes<int16_t, alg_type::rng>/3000/2750/1/1 1343 ns 1350 ns 0.99
bm_includes<int16_t, alg_type::rng>/300/3/1/1 54.2 ns 57.2 ns 0.95
bm_includes<int16_t, alg_type::rng>/300/22/1/1 80.0 ns 87.2 ns 0.92
bm_includes<int16_t, alg_type::rng>/300/105/1/1 151 ns 174 ns 0.87
bm_includes<int16_t, alg_type::rng>/300/290/1/1 152 ns 153 ns 0.99
bm_includes<int16_t, alg_type::rng>/3000/3/1/0 369 ns 372 ns 0.99
bm_includes<int16_t, alg_type::rng>/3000/22/1/0 439 ns 391 ns 1.12
bm_includes<int16_t, alg_type::rng>/3000/105/1/0 526 ns 441 ns 1.19
bm_includes<int16_t, alg_type::rng>/3000/1504/1/0 614 ns 876 ns 0.70
bm_includes<int16_t, alg_type::rng>/3000/2750/1/0 704 ns 709 ns 0.99
bm_includes<int16_t, alg_type::rng>/300/3/1/0 53.2 ns 56.4 ns 0.94
bm_includes<int16_t, alg_type::rng>/300/22/1/0 65.8 ns 68.7 ns 0.96
bm_includes<int16_t, alg_type::rng>/300/105/1/0 104 ns 115 ns 0.90
bm_includes<int16_t, alg_type::rng>/300/290/1/0 84.8 ns 84.9 ns 1.00
bm_includes<int16_t, alg_type::rng>/3000/3/2/1 631 ns 637 ns 0.99
bm_includes<int16_t, alg_type::rng>/3000/22/2/1 1023 ns 1022 ns 1.00
bm_includes<int16_t, alg_type::rng>/3000/105/2/1 1671 ns 1751 ns 0.95
bm_includes<int16_t, alg_type::rng>/3000/1504/2/1 1719 ns 1669 ns 1.03
bm_includes<int16_t, alg_type::rng>/3000/2750/2/1 1504 ns 1571 ns 0.96
bm_includes<int16_t, alg_type::rng>/300/3/2/1 106 ns 107 ns 0.99
bm_includes<int16_t, alg_type::rng>/300/22/2/1 171 ns 177 ns 0.97
bm_includes<int16_t, alg_type::rng>/300/105/2/1 217 ns 255 ns 0.85
bm_includes<int16_t, alg_type::rng>/300/290/2/1 219 ns 220 ns 1.00
bm_includes<int16_t, alg_type::rng>/3000/3/2/0 384 ns 388 ns 0.99
bm_includes<int16_t, alg_type::rng>/3000/22/2/0 555 ns 554 ns 1.00
bm_includes<int16_t, alg_type::rng>/3000/105/2/0 884 ns 885 ns 1.00
bm_includes<int16_t, alg_type::rng>/3000/1504/2/0 901 ns 945 ns 0.95
bm_includes<int16_t, alg_type::rng>/3000/2750/2/0 775 ns 797 ns 0.97
bm_includes<int16_t, alg_type::rng>/300/3/2/0 67.3 ns 71.3 ns 0.94
bm_includes<int16_t, alg_type::rng>/300/22/2/0 89.8 ns 94.6 ns 0.95
bm_includes<int16_t, alg_type::rng>/300/105/2/0 109 ns 143 ns 0.76
bm_includes<int16_t, alg_type::rng>/300/290/2/0 97.6 ns 105 ns 0.93
bm_includes<int16_t, alg_type::rng>/3000/3/3/1 673 ns 671 ns 1.00
bm_includes<int16_t, alg_type::rng>/3000/22/3/1 998 ns 1009 ns 0.99
bm_includes<int16_t, alg_type::rng>/3000/105/3/1 1411 ns 1559 ns 0.91
bm_includes<int16_t, alg_type::rng>/3000/1504/3/1 2446 ns 2565 ns 0.95
bm_includes<int16_t, alg_type::rng>/3000/2750/3/1 1584 ns 1726 ns 0.92
bm_includes<int16_t, alg_type::rng>/300/3/3/1 116 ns 120 ns 0.97
bm_includes<int16_t, alg_type::rng>/300/22/3/1 156 ns 161 ns 0.97
bm_includes<int16_t, alg_type::rng>/300/105/3/1 222 ns 226 ns 0.98
bm_includes<int16_t, alg_type::rng>/300/290/3/1 151 ns 155 ns 0.97
bm_includes<int16_t, alg_type::rng>/3000/3/3/0 505 ns 516 ns 0.98
bm_includes<int16_t, alg_type::rng>/3000/22/3/0 613 ns 613 ns 1.00
bm_includes<int16_t, alg_type::rng>/3000/105/3/0 742 ns 840 ns 0.88
bm_includes<int16_t, alg_type::rng>/3000/1504/3/0 1229 ns 1242 ns 0.99
bm_includes<int16_t, alg_type::rng>/3000/2750/3/0 780 ns 782 ns 1.00
bm_includes<int16_t, alg_type::rng>/300/3/3/0 89.2 ns 96.6 ns 0.92
bm_includes<int16_t, alg_type::rng>/300/22/3/0 84.0 ns 89.7 ns 0.94
bm_includes<int16_t, alg_type::rng>/300/105/3/0 113 ns 127 ns 0.89
bm_includes<int16_t, alg_type::rng>/300/290/3/0 78.3 ns 112 ns 0.70
bm_includes<int32_t, alg_type::rng>/3000/3/0/1 371 ns 390 ns 0.95
bm_includes<int32_t, alg_type::rng>/3000/22/0/1 398 ns 394 ns 1.01
bm_includes<int32_t, alg_type::rng>/3000/105/0/1 461 ns 430 ns 1.07
bm_includes<int32_t, alg_type::rng>/3000/1504/0/1 911 ns 926 ns 0.98
bm_includes<int32_t, alg_type::rng>/3000/2750/0/1 1343 ns 1383 ns 0.97
bm_includes<int32_t, alg_type::rng>/300/3/0/1 55.8 ns 59.7 ns 0.93
bm_includes<int32_t, alg_type::rng>/300/22/0/1 76.0 ns 72.5 ns 1.05
bm_includes<int32_t, alg_type::rng>/300/105/0/1 105 ns 105 ns 1.00
bm_includes<int32_t, alg_type::rng>/300/290/0/1 152 ns 154 ns 0.99
bm_includes<int32_t, alg_type::rng>/3000/3/0/0 368 ns 385 ns 0.96
bm_includes<int32_t, alg_type::rng>/3000/22/0/0 384 ns 393 ns 0.98
bm_includes<int32_t, alg_type::rng>/3000/105/0/0 396 ns 404 ns 0.98
bm_includes<int32_t, alg_type::rng>/3000/1504/0/0 558 ns 570 ns 0.98
bm_includes<int32_t, alg_type::rng>/3000/2750/0/0 702 ns 722 ns 0.97
bm_includes<int32_t, alg_type::rng>/300/3/0/0 52.0 ns 59.9 ns 0.87
bm_includes<int32_t, alg_type::rng>/300/22/0/0 65.5 ns 70.1 ns 0.93
bm_includes<int32_t, alg_type::rng>/300/105/0/0 80.7 ns 80.9 ns 1.00
bm_includes<int32_t, alg_type::rng>/300/290/0/0 84.6 ns 85.3 ns 0.99
bm_includes<int32_t, alg_type::rng>/3000/3/1/1 373 ns 382 ns 0.98
bm_includes<int32_t, alg_type::rng>/3000/22/1/1 406 ns 417 ns 0.97
bm_includes<int32_t, alg_type::rng>/3000/105/1/1 655 ns 521 ns 1.26
bm_includes<int32_t, alg_type::rng>/3000/1504/1/1 915 ns 922 ns 0.99
bm_includes<int32_t, alg_type::rng>/3000/2750/1/1 1347 ns 1384 ns 0.97
bm_includes<int32_t, alg_type::rng>/300/3/1/1 57.0 ns 60.8 ns 0.94
bm_includes<int32_t, alg_type::rng>/300/22/1/1 90.1 ns 92.3 ns 0.98
bm_includes<int32_t, alg_type::rng>/300/105/1/1 171 ns 166 ns 1.03
bm_includes<int32_t, alg_type::rng>/300/290/1/1 153 ns 153 ns 1.00
bm_includes<int32_t, alg_type::rng>/3000/3/1/0 368 ns 381 ns 0.97
bm_includes<int32_t, alg_type::rng>/3000/22/1/0 383 ns 396 ns 0.97
bm_includes<int32_t, alg_type::rng>/3000/105/1/0 440 ns 457 ns 0.96
bm_includes<int32_t, alg_type::rng>/3000/1504/1/0 559 ns 565 ns 0.99
bm_includes<int32_t, alg_type::rng>/3000/2750/1/0 705 ns 719 ns 0.98
bm_includes<int32_t, alg_type::rng>/300/3/1/0 52.1 ns 57.8 ns 0.90
bm_includes<int32_t, alg_type::rng>/300/22/1/0 67.1 ns 72.3 ns 0.93
bm_includes<int32_t, alg_type::rng>/300/105/1/0 106 ns 116 ns 0.91
bm_includes<int32_t, alg_type::rng>/300/290/1/0 84.8 ns 84.6 ns 1.00
bm_includes<int32_t, alg_type::rng>/3000/3/2/1 635 ns 645 ns 0.98
bm_includes<int32_t, alg_type::rng>/3000/22/2/1 1002 ns 1050 ns 0.95
bm_includes<int32_t, alg_type::rng>/3000/105/2/1 1598 ns 1692 ns 0.94
bm_includes<int32_t, alg_type::rng>/3000/1504/2/1 1231 ns 1240 ns 0.99
bm_includes<int32_t, alg_type::rng>/3000/2750/2/1 1384 ns 1388 ns 1.00
bm_includes<int32_t, alg_type::rng>/300/3/2/1 109 ns 110 ns 0.99
bm_includes<int32_t, alg_type::rng>/300/22/2/1 138 ns 174 ns 0.79
bm_includes<int32_t, alg_type::rng>/300/105/2/1 268 ns 244 ns 1.10
bm_includes<int32_t, alg_type::rng>/300/290/2/1 249 ns 231 ns 1.08
bm_includes<int32_t, alg_type::rng>/3000/3/2/0 381 ns 402 ns 0.95
bm_includes<int32_t, alg_type::rng>/3000/22/2/0 525 ns 565 ns 0.93
bm_includes<int32_t, alg_type::rng>/3000/105/2/0 690 ns 866 ns 0.80
bm_includes<int32_t, alg_type::rng>/3000/1504/2/0 601 ns 617 ns 0.97
bm_includes<int32_t, alg_type::rng>/3000/2750/2/0 691 ns 709 ns 0.97
bm_includes<int32_t, alg_type::rng>/300/3/2/0 65.3 ns 76.2 ns 0.86
bm_includes<int32_t, alg_type::rng>/300/22/2/0 81.1 ns 96.4 ns 0.84
bm_includes<int32_t, alg_type::rng>/300/105/2/0 134 ns 137 ns 0.98
bm_includes<int32_t, alg_type::rng>/300/290/2/0 126 ns 108 ns 1.17
bm_includes<int32_t, alg_type::rng>/3000/3/3/1 666 ns 676 ns 0.99
bm_includes<int32_t, alg_type::rng>/3000/22/3/1 960 ns 1033 ns 0.93
bm_includes<int32_t, alg_type::rng>/3000/105/3/1 1394 ns 1698 ns 0.82
bm_includes<int32_t, alg_type::rng>/3000/1504/3/1 2781 ns 2518 ns 1.10
bm_includes<int32_t, alg_type::rng>/3000/2750/3/1 2174 ns 1842 ns 1.18
bm_includes<int32_t, alg_type::rng>/300/3/3/1 117 ns 119 ns 0.98
bm_includes<int32_t, alg_type::rng>/300/22/3/1 138 ns 166 ns 0.83
bm_includes<int32_t, alg_type::rng>/300/105/3/1 227 ns 216 ns 1.05
bm_includes<int32_t, alg_type::rng>/300/290/3/1 217 ns 159 ns 1.36
bm_includes<int32_t, alg_type::rng>/3000/3/3/0 508 ns 528 ns 0.96
bm_includes<int32_t, alg_type::rng>/3000/22/3/0 601 ns 624 ns 0.96
bm_includes<int32_t, alg_type::rng>/3000/105/3/0 775 ns 857 ns 0.90
bm_includes<int32_t, alg_type::rng>/3000/1504/3/0 1384 ns 1255 ns 1.10
bm_includes<int32_t, alg_type::rng>/3000/2750/3/0 1096 ns 788 ns 1.39
bm_includes<int32_t, alg_type::rng>/300/3/3/0 87.9 ns 97.1 ns 0.91
bm_includes<int32_t, alg_type::rng>/300/22/3/0 73.9 ns 89.6 ns 0.82
bm_includes<int32_t, alg_type::rng>/300/105/3/0 116 ns 116 ns 1.00
bm_includes<int32_t, alg_type::rng>/300/290/3/0 111 ns 96.8 ns 1.15
bm_includes<int64_t, alg_type::rng>/3000/3/0/1 375 ns 371 ns 1.01
bm_includes<int64_t, alg_type::rng>/3000/22/0/1 399 ns 466 ns 0.86
bm_includes<int64_t, alg_type::rng>/3000/105/0/1 424 ns 425 ns 1.00
bm_includes<int64_t, alg_type::rng>/3000/1504/0/1 936 ns 908 ns 1.03
bm_includes<int64_t, alg_type::rng>/3000/2750/0/1 1351 ns 1347 ns 1.00
bm_includes<int64_t, alg_type::rng>/300/3/0/1 58.1 ns 57.5 ns 1.01
bm_includes<int64_t, alg_type::rng>/300/22/0/1 70.3 ns 82.5 ns 0.85
bm_includes<int64_t, alg_type::rng>/300/105/0/1 102 ns 108 ns 0.94
bm_includes<int64_t, alg_type::rng>/300/290/0/1 150 ns 158 ns 0.95
bm_includes<int64_t, alg_type::rng>/3000/3/0/0 374 ns 370 ns 1.01
bm_includes<int64_t, alg_type::rng>/3000/22/0/0 382 ns 384 ns 0.99
bm_includes<int64_t, alg_type::rng>/3000/105/0/0 395 ns 398 ns 0.99
bm_includes<int64_t, alg_type::rng>/3000/1504/0/0 671 ns 563 ns 1.19
bm_includes<int64_t, alg_type::rng>/3000/2750/0/0 703 ns 704 ns 1.00
bm_includes<int64_t, alg_type::rng>/300/3/0/0 57.8 ns 55.6 ns 1.04
bm_includes<int64_t, alg_type::rng>/300/22/0/0 59.4 ns 68.1 ns 0.87
bm_includes<int64_t, alg_type::rng>/300/105/0/0 78.3 ns 81.1 ns 0.97
bm_includes<int64_t, alg_type::rng>/300/290/0/0 82.8 ns 90.1 ns 0.92
bm_includes<int64_t, alg_type::rng>/3000/3/1/1 372 ns 379 ns 0.98
bm_includes<int64_t, alg_type::rng>/3000/22/1/1 408 ns 413 ns 0.99
bm_includes<int64_t, alg_type::rng>/3000/105/1/1 541 ns 575 ns 0.94
bm_includes<int64_t, alg_type::rng>/3000/1504/1/1 945 ns 918 ns 1.03
bm_includes<int64_t, alg_type::rng>/3000/2750/1/1 1359 ns 1351 ns 1.01
bm_includes<int64_t, alg_type::rng>/300/3/1/1 56.9 ns 66.0 ns 0.86
bm_includes<int64_t, alg_type::rng>/300/22/1/1 91.1 ns 96.8 ns 0.94
bm_includes<int64_t, alg_type::rng>/300/105/1/1 169 ns 202 ns 0.84
bm_includes<int64_t, alg_type::rng>/300/290/1/1 153 ns 159 ns 0.96
bm_includes<int64_t, alg_type::rng>/3000/3/1/0 377 ns 370 ns 1.02
bm_includes<int64_t, alg_type::rng>/3000/22/1/0 393 ns 386 ns 1.02
bm_includes<int64_t, alg_type::rng>/3000/105/1/0 445 ns 463 ns 0.96
bm_includes<int64_t, alg_type::rng>/3000/1504/1/0 663 ns 562 ns 1.18
bm_includes<int64_t, alg_type::rng>/3000/2750/1/0 703 ns 705 ns 1.00
bm_includes<int64_t, alg_type::rng>/300/3/1/0 56.0 ns 55.3 ns 1.01
bm_includes<int64_t, alg_type::rng>/300/22/1/0 70.0 ns 71.2 ns 0.98
bm_includes<int64_t, alg_type::rng>/300/105/1/0 113 ns 136 ns 0.83
bm_includes<int64_t, alg_type::rng>/300/290/1/0 82.9 ns 89.4 ns 0.93
bm_includes<int64_t, alg_type::rng>/3000/3/2/1 634 ns 643 ns 0.99
bm_includes<int64_t, alg_type::rng>/3000/22/2/1 1017 ns 1062 ns 0.96
bm_includes<int64_t, alg_type::rng>/3000/105/2/1 1930 ns 2500 ns 0.77
bm_includes<int64_t, alg_type::rng>/3000/1504/2/1 1196 ns 1250 ns 0.96
bm_includes<int64_t, alg_type::rng>/3000/2750/2/1 1365 ns 1369 ns 1.00
bm_includes<int64_t, alg_type::rng>/300/3/2/1 107 ns 116 ns 0.92
bm_includes<int64_t, alg_type::rng>/300/22/2/1 170 ns 229 ns 0.74
bm_includes<int64_t, alg_type::rng>/300/105/2/1 254 ns 275 ns 0.92
bm_includes<int64_t, alg_type::rng>/300/290/2/1 225 ns 253 ns 0.89
bm_includes<int64_t, alg_type::rng>/3000/3/2/0 390 ns 390 ns 1.00
bm_includes<int64_t, alg_type::rng>/3000/22/2/0 549 ns 575 ns 0.95
bm_includes<int64_t, alg_type::rng>/3000/105/2/0 966 ns 1246 ns 0.78
bm_includes<int64_t, alg_type::rng>/3000/1504/2/0 601 ns 628 ns 0.96
bm_includes<int64_t, alg_type::rng>/3000/2750/2/0 690 ns 702 ns 0.98
bm_includes<int64_t, alg_type::rng>/300/3/2/0 72.0 ns 76.5 ns 0.94
bm_includes<int64_t, alg_type::rng>/300/22/2/0 96.5 ns 126 ns 0.77
bm_includes<int64_t, alg_type::rng>/300/105/2/0 142 ns 145 ns 0.98
bm_includes<int64_t, alg_type::rng>/300/290/2/0 119 ns 125 ns 0.95
bm_includes<int64_t, alg_type::rng>/3000/3/3/1 668 ns 678 ns 0.99
bm_includes<int64_t, alg_type::rng>/3000/22/3/1 989 ns 1120 ns 0.88
bm_includes<int64_t, alg_type::rng>/3000/105/3/1 1811 ns 2013 ns 0.90
bm_includes<int64_t, alg_type::rng>/3000/1504/3/1 9556 ns 10083 ns 0.95
bm_includes<int64_t, alg_type::rng>/3000/2750/3/1 3942 ns 2989 ns 1.32
bm_includes<int64_t, alg_type::rng>/300/3/3/1 117 ns 125 ns 0.94
bm_includes<int64_t, alg_type::rng>/300/22/3/1 195 ns 239 ns 0.82
bm_includes<int64_t, alg_type::rng>/300/105/3/1 228 ns 286 ns 0.80
bm_includes<int64_t, alg_type::rng>/300/290/3/1 216 ns 235 ns 0.92
bm_includes<int64_t, alg_type::rng>/3000/3/3/0 514 ns 515 ns 1.00
bm_includes<int64_t, alg_type::rng>/3000/22/3/0 617 ns 653 ns 0.94
bm_includes<int64_t, alg_type::rng>/3000/105/3/0 914 ns 1010 ns 0.90
bm_includes<int64_t, alg_type::rng>/3000/1504/3/0 4295 ns 4619 ns 0.93
bm_includes<int64_t, alg_type::rng>/3000/2750/3/0 1858 ns 1400 ns 1.33
bm_includes<int64_t, alg_type::rng>/300/3/3/0 95.0 ns 95.1 ns 1.00
bm_includes<int64_t, alg_type::rng>/300/22/3/0 92.3 ns 127 ns 0.73
bm_includes<int64_t, alg_type::rng>/300/105/3/0 123 ns 147 ns 0.84
bm_includes<int64_t, alg_type::rng>/300/290/3/0 95.1 ns 123 ns 0.77
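
Judging only by the rows above, the last column equals the first timing divided by the second, rounded to two decimals, so a value below 1.00 means the first timing is the smaller one. A minimal sketch of that arithmetic, where `timing_ratio` is a hypothetical helper and not part of the PR or its benchmark:

```cpp
#include <cmath>

// Hypothetical helper (an assumption for illustration, not code from the PR):
// ratio of the first timing to the second, rounded to two decimal places,
// which is how the last column of the rows above appears to be derived.
inline double timing_ratio(double first_ns, double second_ns) {
    return std::round(first_ns / second_ns * 100.0) / 100.0;
}
```

For example, the row with 631 ns and 637 ns gives 631 / 637 ≈ 0.9906, which rounds to the listed 0.99.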

@AlexGuteniev AlexGuteniev requested a review from a team as a code owner May 24, 2025 21:22
@github-project-automation github-project-automation bot moved this to Initial Review in STL Code Reviews May 24, 2025
@StephanTLavavej StephanTLavavej self-assigned this May 27, 2025
@StephanTLavavej StephanTLavavej added the performance Must go faster label May 27, 2025
@StephanTLavavej
Member

Thanks! 😻 I pushed medium-small changes, please double-check.

@StephanTLavavej StephanTLavavej removed their assignment Jun 3, 2025
@StephanTLavavej StephanTLavavej moved this from Initial Review to Ready To Merge in STL Code Reviews Jun 3, 2025
@AlexGuteniev
Contributor Author

I think the chance of UB was lower than your commit message says. We pick the middle needle element to break the match, not an arbitrary random number.

Otherwise fine.

Please confirm that my results, where some cases became worse, don't concern you too much.
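
The match-breaking idea mentioned in this comment (using the needle's middle element rather than a random value) can be sketched as follows. This is only an illustration under stated assumptions: `break_match_at_middle` is a hypothetical name, and the PR's actual benchmark may break the match differently, for example by changing the needle instead of the haystack.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the PR's benchmark code: makes `needle` stop being
// a subset of `haystack` by erasing every haystack element equivalent to the
// needle's middle value. Both vectors must be sorted and `needle` non-empty.
template <class T>
void break_match_at_middle(std::vector<T>& haystack, const std::vector<T>& needle) {
    const T& middle = needle[needle.size() / 2];
    // equal_range returns the (possibly empty) run of elements equivalent to `middle`.
    const auto [first, last] = std::equal_range(haystack.begin(), haystack.end(), middle);
    // Erasing a contiguous run keeps the haystack sorted.
    haystack.erase(first, last);
}
```

After the call, `std::includes` over the two sorted ranges returns `false`, because the needle's middle value is certain to be missing from the haystack, whereas a randomly chosen replacement value could by chance still be present.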

@StephanTLavavej
Member

> I think the chance of UB was lower than your commit message says.

Good point, yeah. 🎲

> Please confirm that my results, where some cases became worse, don't concern you too much.

Yeah, that's fine. Thanks for the analysis! 😸

@StephanTLavavej StephanTLavavej moved this from Ready To Merge to Merging in STL Code Reviews Jun 11, 2025
@StephanTLavavej
Member

I'm mirroring this to the MSVC-internal repo - please notify me if any further changes are pushed.

@StephanTLavavej StephanTLavavej merged commit 343e96e into microsoft:main Jun 14, 2025
48 checks passed
@github-project-automation github-project-automation bot moved this from Merging to Done in STL Code Reviews Jun 14, 2025
@StephanTLavavej
Member

Must go faster! 🚗 🦕 🦖
