-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
performance vs GMP? #1
Comments
The library is still vaporware at the moment. That said I plan to bring all the lessons learned from https://github.com/mratsim/constantine on writing high performance big int code. In particular the inline assembler I developed for example for multiplication using MULX/ADCX/ADOX: https://github.com/mratsim/constantine/blob/e89429e/constantine/arithmetic/assembly/limbs_asm_mul_x86_adx_bmi2.nim#L72-L110 Constantine is significantly faster than GMP even though its constant-time but it's mostly because of specialization:
In particular, no library that uses pure C can reach GMP or Constantine speed simply due to the compiler being the biggest optimization barrier for big integer code, even with intrinsics: https://gcc.godbolt.org/z/2h768y #include <stdint.h>
#include <x86intrin.h>
void add256(uint64_t a[4], uint64_t b[4]){
uint8_t carry = 0;
for (int i = 0; i < 4; ++i)
carry = _addcarry_u64(carry, a[i], b[i], &a[i]);
} GCC: add256:
movq (%rsi), %rax
addq (%rdi), %rax
setc %dl
movq %rax, (%rdi)
movq 8(%rdi), %rax
addb $-1, %dl
adcq 8(%rsi), %rax
setc %dl
movq %rax, 8(%rdi)
movq 16(%rdi), %rax
addb $-1, %dl
adcq 16(%rsi), %rax
setc %dl
movq %rax, 16(%rdi)
movq 24(%rsi), %rax
addb $-1, %dl
adcq %rax, 24(%rdi)
ret Clang: add256:
movq (%rsi), %rax
addq %rax, (%rdi)
movq 8(%rsi), %rax
adcq %rax, 8(%rdi)
movq 16(%rsi), %rax
adcq %rax, 16(%rdi)
movq 24(%rsi), %rax
adcq %rax, 24(%rdi)
retq |
Curious how it compares to GMP, and, less importantly, to these:
links
The text was updated successfully, but these errors were encountered: