#1073: Caching hash method pointers globally to improve performances #1077

Open: wants to merge 1 commit into base: main
@haxelion haxelion commented Apr 25, 2025

This PR fixes the performance issue with TCrypto hash functions we've observed as explained in issue #1073.

Summary of the changes

  • common/inc/internal/se_tcrypto_common.h: add a global structure containing IppsHashMethod pointers when TCrypto is built with IPP.
  • sdk/tlibcrypto/ipp/init_tcrypto_lib.cpp: populate the global hash methods structure after IPP has been initialized with CPUID values.
  • sdk/tlibcrypto/Makefile: avoid SHA1 deprecation warning when building init_tcrypto_lib.cpp.
  • other sdk/tlibcrypto/ipp/*.cpp files: switch to use the global hash method structure.
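The caching pattern described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `HashMethod`, `select_sha256_method`, `HashMethods`, `init_hash_methods` and `sha256_method` are hypothetical stand-ins, and the real IPP getter is replaced by a function that only counts how often it is called.

```cpp
#include <cassert>

// Hypothetical stand-in for an opaque hash method descriptor (not the IPP type).
struct HashMethod {
    const char* name;
};

// Counts how many times the simulated per-call selection runs.
static int g_dispatch_calls = 0;

// Hypothetical stand-in for a getter that picks an implementation on every call.
static const HashMethod* select_sha256_method() {
    static const HashMethod generic{"sha256-generic"};
    ++g_dispatch_calls;
    return &generic;
}

// Global table of cached method pointers, filled once at library init.
struct HashMethods {
    const HashMethod* sha256 = nullptr;
};
static HashMethods g_hash_methods;

// Meant to be called once, after CPU feature detection has completed.
void init_hash_methods() {
    g_hash_methods.sha256 = select_sha256_method();
}

// Hot path: returns the cached pointer without repeating the selection.
const HashMethod* sha256_method() {
    assert(g_hash_methods.sha256 != nullptr && "init_hash_methods() must run first");
    return g_hash_methods.sha256;
}
```

In this sketch, after one call to `init_hash_methods()`, any number of calls to `sha256_method()` from any thread only read a pointer that is never written again, so the hot path shares no mutable state.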

I've also made the same change in the RSA and ECC code but haven't benchmarked it. I don't expect a significant performance uplift in those cases; I simply did it for consistency.

I'm also unsure which code conventions I should follow, so please don't hesitate to ask for changes.

Performance data

The following benchmarks were run on a dual Intel(R) Xeon(R) Silver 4310 system. SHA-NI support amplifies the magnitude of the issue, so systems without SHA-NI support will see a smaller uplift.

Benchmarks can be replicated using the enclave code here: https://github.com/haxelion/sgx_tcrypto_bench

The benchmark code performs 1,000,000 iterations of the selected algorithm inside an ECALL. Several of those ECALLs are made in parallel from multiple threads. The expectation is that, up to the core count, there shouldn't be any noticeable slowdown as threads are added.
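The shape of such a measurement can be sketched outside an enclave as shown here. This is only an illustration under stated assumptions, not the code from the linked repository: `work_once` is a hypothetical stand-in for one hash call (an FNV-1a-style mixing loop), and `run_parallel` is a hypothetical helper that uses plain `std::thread` instead of ECALLs.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for one hash invocation (FNV-1a-style mixing).
static std::uint64_t work_once(std::uint64_t seed) {
    std::uint64_t h = 14695981039346656037ULL ^ seed;
    for (int i = 0; i < 64; ++i) {
        h ^= static_cast<std::uint64_t>(i);
        h *= 1099511628211ULL;
    }
    return h;
}

// Runs `iterations` calls on each of `threads` threads and returns the
// wall-clock time in milliseconds for all threads to finish.
double run_parallel(unsigned threads, std::uint64_t iterations,
                    std::vector<std::uint64_t>& sinks) {
    assert(threads > 0);
    sinks.assign(threads, 0);
    const auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&sinks, t, iterations] {
            std::uint64_t acc = 0;
            for (std::uint64_t i = 0; i < iterations; ++i) {
                acc ^= work_once(i + t);
            }
            sinks[t] = acc;  // store the result so the loop is not optimized away
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

If the per-call work shares no state between threads, the time returned for 1, 2, 6, 12 and 24 threads should stay roughly flat up to the core count. That is the property the tables below test.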

I've also attached the detailed console output to this PR: Intel TCrypto benchmark data.md.

SHA1

| Thread count | 1 | 2 | 6 | 12 | 24 |
|---|---|---|---|---|---|
| Without fix | 87 ms | 727 ms | 1093 ms | 1977 ms | 3684 ms |
| With fix | 83 ms | 83 ms | 83 ms | 83 ms | 84 ms |

24 threads speed-up: 44x

SHA256

| Thread count | 1 | 2 | 6 | 12 | 24 |
|---|---|---|---|---|---|
| Without fix | 114 ms | 884 ms | 1671 ms | 2883 ms | 4732 ms |
| With fix | 107 ms | 108 ms | 107 ms | 110 ms | 109 ms |

24 threads speed-up: 43x

HMAC-SHA256

| Thread count | 1 | 2 | 6 | 12 | 24 |
|---|---|---|---|---|---|
| Without fix | 640 ms | 766 ms | 3404 ms | 6183 ms | 8875 ms |
| With fix | 636 ms | 636 ms | 634 ms | 640 ms | 637 ms |

24 threads speed-up: 14x

SHA384

| Thread count | 1 | 2 | 6 | 12 | 24 |
|---|---|---|---|---|---|
| Without fix | 332 ms | 1018 ms | 1451 ms | 2263 ms | 4074 ms |
| With fix | 333 ms | 333 ms | 333 ms | 333 ms | 333 ms |

24 threads speed-up: 12x
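
For reference, the reported 24-thread speed-ups are the ratio of the "Without fix" time to the "With fix" time at 24 threads, rounded to the nearest integer. The short sketch below recomputes them from the table values. The `Row` type and `speedup` helper are names made up for this illustration.

```cpp
#include <cassert>
#include <cmath>

// One benchmark row at 24 threads: times in milliseconds, copied from the tables.
struct Row {
    const char* name;
    double without_ms;
    double with_ms;
};

static const Row kRows[] = {
    {"SHA1", 3684.0, 84.0},
    {"SHA256", 4732.0, 109.0},
    {"HMAC-SHA256", 8875.0, 637.0},
    {"SHA384", 4074.0, 333.0},
};

// Speed-up = time without the fix / time with the fix, rounded to nearest.
long speedup(const Row& r) {
    assert(r.with_ms > 0.0);
    return std::lround(r.without_ms / r.with_ms);
}
```

Applied to the four rows in order, `speedup` gives 44, 43, 14 and 12, matching the figures above.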
