-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
char to_digit: avoid unnecessary casts to u64 #134969
Conversation
A quick look shows that the assembly is different as expected, so I'm going to kick off a perf run to verify things. If anything it should be slightly faster, but I know better than to predict perf blindly. @bors try @rust-timer queue |
This comment has been minimized.
This comment has been minimized.
char to_digit: avoid unnecessary casts to u64 Hello, in the `char::to_digit` method there are a few `as u64` casts that are not strictly necessary. I assume that the reason behind these casts is to avoid possible overflows in the `+ 10` add. This PR removes the aforementioned casts, avoiding the overflow issue by slightly modifying the ASCII letter to int mapping. Thanks, Happy new year.
⌛ Trying commit aa685bc with merge 3757ef29634d5acf6b6e0cbf46e69cd7de2481cd... |
☀️ Try build successful - checks-actions |
This comment has been minimized.
This comment has been minimized.
Finished benchmarking commit (3757ef2): comparison URL. Overall result: ❌ regressions - no action neededBenchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Instruction countThis is the most reliable metric that we have; it was used to determine the overall result at the top of this comment. However, even this metric can sometimes exhibit noise.
Max RSS (memory usage)Results (primary 0.0%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (secondary -4.6%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResults (primary 0.0%, secondary -0.0%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 762.584s -> 761.845s (-0.10%) |
Regressions seem to be noise, with there being minor improvements in other (less reliable) metrics. Even if it were neutral that would be sufficient for me. Thanks! @bors r+ |
does perf actually run the |
@bors r- Let's discuss on Zulip. |
perf runs only on crates listed on page, smaller benchmarks from library not benched that way. |
I manually ran the for b in master pull-34969; do git switch "$b"; for i in {1..3}; do RUST_BACKTRACE=1 ./x.py bench library/core -- to_digit > "$b".txt; done; done
diff -U3 master.txt pull-34969.txt master is at d117b7f, pull-34969 is at aa685bc --- master.txt 2024-12-31 16:52:47.377514215 -0800
+++ pull-34969.txt 2024-12-31 16:54:43.787387531 -0800
@@ -11,15 +11,15 @@
bbbbb
benchmarks:
- char::methods::bench_to_digit_radix_10 2856.63ns/iter +/- 20.97
- char::methods::bench_to_digit_radix_16 2845.83ns/iter +/- 13.06
- char::methods::bench_to_digit_radix_2 2843.90ns/iter +/- 9.80
- char::methods::bench_to_digit_radix_36 9652.53ns/iter +/- 54.87
- char::methods::bench_to_digit_radix_var 15246.25ns/iter +/- 283.04
+ char::methods::bench_to_digit_radix_10 2882.53ns/iter +/- 9.13
+ char::methods::bench_to_digit_radix_16 2885.77ns/iter +/- 5.28
+ char::methods::bench_to_digit_radix_2 2882.67ns/iter +/- 3.06
+ char::methods::bench_to_digit_radix_36 9654.94ns/iter +/- 38.77
+ char::methods::bench_to_digit_radix_var 14723.90ns/iter +/- 41.44
-test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured; 514 filtered out; finished in 1.24s
+test result: ok. 0 passed; 0 failed; 0 ignored; 5 measured; 514 filtered out; finished in 5.39s
- finished in 1.255 seconds
+ finished in 5.413 seconds |
ran it another time and the results reversed for radix 2, 10, and 16: - char::methods::bench_to_digit_radix_10 2897.59ns/iter +/- 10.50
- char::methods::bench_to_digit_radix_16 2884.83ns/iter +/- 6.94
- char::methods::bench_to_digit_radix_2 2883.97ns/iter +/- 11.17
- char::methods::bench_to_digit_radix_36 9758.82ns/iter +/- 5.09
- char::methods::bench_to_digit_radix_var 15222.17ns/iter +/- 27.43
+ char::methods::bench_to_digit_radix_10 2843.65ns/iter +/- 13.94
+ char::methods::bench_to_digit_radix_16 2853.59ns/iter +/- 17.59
+ char::methods::bench_to_digit_radix_2 2850.38ns/iter +/- 15.10
+ char::methods::bench_to_digit_radix_36 9544.87ns/iter +/- 58.91
+ char::methods::bench_to_digit_radix_var 14527.29ns/iter +/- 38.01 so looks like just measurement noise to me for those and a distinct improvement for the |
Looks neutral for the common case and an improvement for the variable case, then. A 1%ish difference in either direction is quite small and difficult to detect. @bors r=jhpratt,programmerjake |
☀️ Test successful - checks-actions |
Finished benchmarking commit (eeeff9a): comparison URL. Overall result: no relevant changes - no action needed@rustbot label: -perf-regression Instruction countThis benchmark run did not return any relevant results for this metric. Max RSS (memory usage)Results (primary -0.6%, secondary -2.6%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResults (secondary -2.0%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Binary sizeResults (primary 0.0%, secondary -0.0%)This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Bootstrap: 762.453s -> 760.671s (-0.23%) |
Hello,
in the
char::to_digit
method there are a fewas u64
casts that are not strictly necessary.I assume that the reason behind these casts is to avoid possible overflows in the
+ 10
add.This PR removes the aforementioned casts, avoiding the overflow issue by slightly modifying the ASCII letter to int mapping.
Thanks,
Happy new year.