perf: reduce binary operations for Chi() & Maj() #104

Closed
kewde wants to merge 2 commits


@kewde kewde commented Feb 4, 2025

I noticed some interesting, albeit insignificant, optimizations in some SHA-256 implementations.

https://github.com/bitcoin-core/secp256k1/blob/00774d0723af1974e2a113db4adc479bfc47e20f/src/hash_impl.h#L18C14-L18C55

Each removes one bitwise operation, but that is perhaps not worth the extra mental load for auditors, since the expressions then deviate from the forms written in FIPS 180-4.

I've also seen variants where the last operator in Maj is ^ instead; I assume it performs the same as |.
https://github.com/digitalbazaar/forge/blob/2bb97afb5058285ef09bcf1d04d6bd6b87cffd58/lib/sha256.js#L297
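
For reference, a minimal TypeScript sketch of the kind of rewrite being discussed. The function names (chiFips, chiShort, majFips, majOr, majXor, agreesEverywhere) are made up for illustration and are not the library's API, and the shortened expressions are a reading of the linked secp256k1 and forge lines rather than a copy of this PR's diff. Because these functions act on each bit independently, comparing them on all-zero and all-one words covers every input combination.

```typescript
type Tri = (a: number, b: number, c: number) => number;

// Forms as written in FIPS 180-4 (Ch and Maj), on 32-bit words.
const chiFips: Tri = (a, b, c) => ((a & b) ^ (~a & c)) >>> 0; // 4 bitwise ops
const majFips: Tri = (a, b, c) => ((a & b) ^ (a & c) ^ (b & c)) >>> 0; // 5 bitwise ops

// Shortened forms (assumed to match the linked implementations).
const chiShort: Tri = (a, b, c) => (c ^ (a & (b ^ c))) >>> 0; // 3 bitwise ops
const majOr: Tri = (a, b, c) => ((a & b) | (c & (a | b))) >>> 0; // 4 bitwise ops
const majXor: Tri = (a, b, c) => ((a & b) | (c & (a ^ b))) >>> 0; // 4 bitwise ops

// Each output bit depends only on the matching input bits, so the
// 8 combinations of all-zero / all-one words exercise the full truth table.
function agreesEverywhere(f: Tri, g: Tri): boolean {
  const words = [0x00000000, 0xffffffff];
  for (const a of words)
    for (const b of words)
      for (const c of words)
        if (f(a, b, c) !== g(a, b, c)) return false;
  return true;
}

console.log('Chi  short == FIPS:', agreesEverywhere(chiFips, chiShort));
console.log('Maj  |     == FIPS:', agreesEverywhere(majFips, majOr));
console.log('Maj  ^     == FIPS:', agreesEverywhere(majFips, majXor));
```

The | and ^ variants of Maj agree because the two terms being combined can never both be 1 in the same bit position when the inner operator is ^, and the | form simply absorbs the overlap.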

@kewde kewde force-pushed the kewde/sha-optimizations branch from 06734e0 to a657337 Compare February 4, 2025 16:09
@paulmillr
Owner

Could you run the benchmarks (npm run bench) and report the difference?

@kewde
Author

kewde commented Feb 4, 2025

I haven't seen a statistically notable difference on my machine.

I changed the benchmark to run only sha256, with 10M iterations for the 32B case.
After running each branch about 10 times, I picked the best result for each:

master

# 32B
sha256 x 1,381,215 ops/sec @ 724ns/op

PR

# 32B
sha256 x 1,396,648 ops/sec @ 716ns/op

Slightly better: about a 1% improvement when comparing the best runs.

All master runs

user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,315,789 ops/sec @ 760ns/op

# 1MB
sha256 x 243 ops/sec @ 4ms/op

✨  Done in 11.91s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,371,742 ops/sec @ 729ns/op

# 1MB
sha256 x 230 ops/sec @ 4ms/op

✨  Done in 11.37s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,381,215 ops/sec @ 724ns/op

# 1MB
sha256 x 235 ops/sec @ 4ms/op ± 5.02% (4ms..12ms)

✨  Done in 11.25s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,375,515 ops/sec @ 727ns/op

# 1MB
sha256 x 233 ops/sec @ 4ms/op

✨  Done in 11.23s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,335,113 ops/sec @ 749ns/op

# 1MB
sha256 x 242 ops/sec @ 4ms/op

✨  Done in 11.74s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,355,013 ops/sec @ 738ns/op

# 1MB
sha256 x 242 ops/sec @ 4ms/op

✨  Done in 11.48s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,342,281 ops/sec @ 745ns/op

# 1MB
sha256 x 228 ops/sec @ 4ms/op

✨  Done in 11.83s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,355,013 ops/sec @ 738ns/op

# 1MB
sha256 x 241 ops/sec @ 4ms/op

✨  Done in 11.56s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,356,852 ops/sec @ 737ns/op

# 1MB
sha256 x 243 ops/sec @ 4ms/op

✨  Done in 11.62s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,356,852 ops/sec @ 737ns/op

# 1MB
sha256 x 242 ops/sec @ 4ms/op

✨  Done in 11.31s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,344,086 ops/sec @ 744ns/op

# 1MB
sha256 x 245 ops/sec @ 4ms/op

✨  Done in 11.54s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,366,120 ops/sec @ 732ns/op

# 1MB
sha256 x 232 ops/sec @ 4ms/op

✨  Done in 11.58s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,366,120 ops/sec @ 732ns/op

# 1MB
sha256 x 233 ops/sec @ 4ms/op

✨  Done in 11.53s.

All PR runs

user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,373,626 ops/sec @ 728ns/op

# 1MB
sha256 x 243 ops/sec @ 4ms/op

✨  Done in 11.35s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,396,648 ops/sec @ 716ns/op

# 1MB
sha256 x 243 ops/sec @ 4ms/op

✨  Done in 11.22s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,392,757 ops/sec @ 718ns/op

# 1MB
sha256 x 244 ops/sec @ 4ms/op

✨  Done in 11.12s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,375,515 ops/sec @ 727ns/op

# 1MB
sha256 x 219 ops/sec @ 4ms/op ± 5.35% (4ms..15ms)

✨  Done in 11.69s.
user@Mac noble-hashes % yarn bench
yarn run v1.19.1
$ node benchmark/noble.js
# 32B
sha256 x 1,351,351 ops/sec @ 740ns/op

# 1MB
sha256 x 241 ops/sec @ 4ms/op
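
To put the 32B numbers above in one place, here is a small TypeScript sketch that compares the best and median runs. The arrays are copied from the logs in this comment (13 master runs, 5 PR runs); the helper names median and gainPercent are illustrative only. Treat it as a rough sanity check and not as a statistical test, since the run counts differ and the samples are small.

```typescript
// 32B ops/sec figures copied from the logs above.
const masterRuns: number[] = [
  1315789, 1371742, 1381215, 1375515, 1335113, 1355013, 1342281,
  1355013, 1356852, 1356852, 1344086, 1366120, 1366120,
];
const prRuns: number[] = [1373626, 1396648, 1392757, 1375515, 1351351];

// Median of a list (average of the two middle values for even lengths).
function median(xs: number[]): number {
  const s = [...xs].sort((x, y) => x - y);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 === 1 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}

// Relative change of candidate over base, in percent.
function gainPercent(base: number, candidate: number): number {
  return ((candidate - base) / base) * 100;
}

const bestGain = gainPercent(Math.max(...masterRuns), Math.max(...prRuns));
const medianGain = gainPercent(median(masterRuns), median(prRuns));
// Spread of master alone, best run relative to worst run.
const masterSpread = gainPercent(Math.min(...masterRuns), Math.max(...masterRuns));

console.log(`best-run gain:   ${bestGain.toFixed(2)}%`);
console.log(`median gain:     ${medianGain.toFixed(2)}%`);
console.log(`master spread:   ${masterSpread.toFixed(2)}%`);
```

The best-run gain comes out a little above 1%, and the median-based figure is similarly small. Both are smaller than the spread between the fastest and slowest master runs, which fits the observation that the difference is not really notable.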

@paulmillr
Owner

Thanks for the pull request! We will keep it as-is for now. There are different ways of doing that.

@paulmillr paulmillr closed this Feb 5, 2025