Add spec and proof for polyvec_matrix_expand
#232
4b193f4 to b729a2d
Signed-off-by: manastasova <[email protected]>
b729a2d to 4642e80
Mac Mini (M1, 2020) benchmarks (opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 123219 cycles | 122576 cycles | 1.01 |
ML-DSA-44 sign | 278554 cycles | 277654 cycles | 1.00 |
ML-DSA-44 verify | 124131 cycles | 123314 cycles | 1.01 |
ML-DSA-65 keypair | 222802 cycles | 220403 cycles | 1.01 |
ML-DSA-65 sign | 477931 cycles | 475379 cycles | 1.01 |
ML-DSA-65 verify | 209864 cycles | 207524 cycles | 1.01 |
ML-DSA-87 keypair | 376802 cycles | 373246 cycles | 1.01 |
ML-DSA-87 sign | 660875 cycles | 660211 cycles | 1.00 |
ML-DSA-87 verify | 372335 cycles | 368672 cycles | 1.01 |
This comment was automatically generated by workflow using github-action-benchmark.
Arm Cortex-A55 (Snapdragon 888) benchmarks (opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 479505 cycles | 464078 cycles | 1.03 |
ML-DSA-44 sign | 1173525 cycles | 1156451 cycles | 1.01 |
ML-DSA-44 verify | 491013 cycles | 476282 cycles | 1.03 |
ML-DSA-65 keypair | 838581 cycles | 818208 cycles | 1.02 |
ML-DSA-65 sign | 1958744 cycles | 1940565 cycles | 1.01 |
ML-DSA-65 verify | 803149 cycles | 786332 cycles | 1.02 |
ML-DSA-87 keypair | 1414318 cycles | 1378671 cycles | 1.03 |
ML-DSA-87 sign | 2672454 cycles | 2617778 cycles | 1.02 |
ML-DSA-87 verify | 1390985 cycles | 1360781 cycles | 1.02 |
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Arm Cortex-A55 (Snapdragon 888) benchmarks (opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 479505 cycles | 464078 cycles | 1.03 |
ML-DSA-44 verify | 491013 cycles | 476282 cycles | 1.03 |
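For readers wondering why a displayed ratio of 1.03 can still exceed a threshold of 1.03: the table shows the ratio rounded to two decimals. A minimal sketch of the check, assuming the alert fires when the unrounded ratio of current to previous cycles is strictly greater than the threshold (the helper names `bench_ratio` and `is_regression` are hypothetical and not part of github-action-benchmark):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Ratio of the current cycle count to the previous one.
 * A value above 1.0 means the current commit is slower. */
double bench_ratio(uint64_t current, uint64_t previous)
{
  assert(previous > 0); /* a previous measurement must exist */
  return (double)current / (double)previous;
}

/* Hypothetical helper: report a regression when the unrounded ratio
 * is strictly greater than the configured threshold (assumption). */
bool is_regression(uint64_t current, uint64_t previous, double threshold)
{
  return bench_ratio(current, previous) > threshold;
}
```

With the numbers above, ML-DSA-44 keypair gives 479505 / 464078 ≈ 1.033, which is over 1.03, whereas ML-DSA-87 keypair on the same platform gives 1414318 / 1378671 ≈ 1.026, which also rounds to 1.03 in the full table but stays under the threshold.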
Mac Mini (M1, 2020) benchmarks (no-opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 139847 cycles | 139095 cycles | 1.01 |
ML-DSA-44 sign | 422235 cycles | 421316 cycles | 1.00 |
ML-DSA-44 verify | 149170 cycles | 148298 cycles | 1.01 |
ML-DSA-65 keypair | 245719 cycles | 243880 cycles | 1.01 |
ML-DSA-65 sign | 698481 cycles | 697316 cycles | 1.00 |
ML-DSA-65 verify | 244489 cycles | 242812 cycles | 1.01 |
ML-DSA-87 keypair | 408311 cycles | 403888 cycles | 1.01 |
ML-DSA-87 sign | 909697 cycles | 906357 cycles | 1.00 |
ML-DSA-87 verify | 416564 cycles | 412387 cycles | 1.01 |
Intel Xeon 4th gen (c7i)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 103302 cycles | 104068 cycles | 0.99 |
ML-DSA-44 sign | 292727 cycles | 291238 cycles | 1.01 |
ML-DSA-44 verify | 109078 cycles | 108970 cycles | 1.00 |
ML-DSA-65 keypair | 185885 cycles | 183677 cycles | 1.01 |
ML-DSA-65 sign | 467793 cycles | 469547 cycles | 1.00 |
ML-DSA-65 verify | 175577 cycles | 174213 cycles | 1.01 |
ML-DSA-87 keypair | 292589 cycles | 293908 cycles | 1.00 |
ML-DSA-87 sign | 603901 cycles | 606411 cycles | 1.00 |
ML-DSA-87 verify | 291969 cycles | 291296 cycles | 1.00 |
Intel Xeon 3rd gen (c6i)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 177556 cycles | 175462 cycles | 1.01 |
ML-DSA-44 sign | 489520 cycles | 489436 cycles | 1.00 |
ML-DSA-44 verify | 186297 cycles | 183646 cycles | 1.01 |
ML-DSA-65 keypair | 300760 cycles | 298780 cycles | 1.01 |
ML-DSA-65 sign | 777805 cycles | 774354 cycles | 1.00 |
ML-DSA-65 verify | 299504 cycles | 297524 cycles | 1.01 |
ML-DSA-87 keypair | 505472 cycles | 501651 cycles | 1.01 |
ML-DSA-87 sign | 1026919 cycles | 1021227 cycles | 1.01 |
ML-DSA-87 verify | 510673 cycles | 506310 cycles | 1.01 |
AMD EPYC 3rd gen (c6a)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 161130 cycles | 159895 cycles | 1.01 |
ML-DSA-44 sign | 481085 cycles | 480571 cycles | 1.00 |
ML-DSA-44 verify | 171055 cycles | 169828 cycles | 1.01 |
ML-DSA-65 keypair | 274710 cycles | 271950 cycles | 1.01 |
ML-DSA-65 sign | 774315 cycles | 772756 cycles | 1.00 |
ML-DSA-65 verify | 276172 cycles | 274866 cycles | 1.00 |
ML-DSA-87 keypair | 461045 cycles | 457363 cycles | 1.01 |
ML-DSA-87 sign | 1014239 cycles | 1010753 cycles | 1.00 |
ML-DSA-87 verify | 467004 cycles | 463411 cycles | 1.01 |
Intel Xeon 4th gen (c7i) (no-opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 103448 cycles | 103861 cycles | 1.00 |
ML-DSA-44 sign | 294746 cycles | 292828 cycles | 1.01 |
ML-DSA-44 verify | 109190 cycles | 108536 cycles | 1.01 |
ML-DSA-65 keypair | 186516 cycles | 183621 cycles | 1.02 |
ML-DSA-65 sign | 473220 cycles | 470258 cycles | 1.01 |
ML-DSA-65 verify | 175835 cycles | 173809 cycles | 1.01 |
ML-DSA-87 keypair | 292137 cycles | 294110 cycles | 0.99 |
ML-DSA-87 sign | 599914 cycles | 603640 cycles | 0.99 |
ML-DSA-87 verify | 291974 cycles | 290781 cycles | 1.00 |
AMD EPYC 4th gen (c7a)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 137704 cycles | 136462 cycles | 1.01 |
ML-DSA-44 sign | 397592 cycles | 398091 cycles | 1.00 |
ML-DSA-44 verify | 145003 cycles | 144238 cycles | 1.01 |
ML-DSA-65 keypair | 235766 cycles | 232903 cycles | 1.01 |
ML-DSA-65 sign | 618926 cycles | 621434 cycles | 1.00 |
ML-DSA-65 verify | 232380 cycles | 231882 cycles | 1.00 |
ML-DSA-87 keypair | 391837 cycles | 385368 cycles | 1.02 |
ML-DSA-87 sign | 809945 cycles | 807806 cycles | 1.00 |
ML-DSA-87 verify | 394136 cycles | 389322 cycles | 1.01 |
Intel Xeon 3rd gen (c6i) (no-opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 177552 cycles | 174898 cycles | 1.02 |
ML-DSA-44 sign | 489532 cycles | 486543 cycles | 1.01 |
ML-DSA-44 verify | 186313 cycles | 183631 cycles | 1.01 |
ML-DSA-65 keypair | 300910 cycles | 298892 cycles | 1.01 |
ML-DSA-65 sign | 777075 cycles | 775754 cycles | 1.00 |
ML-DSA-65 verify | 299545 cycles | 297441 cycles | 1.01 |
ML-DSA-87 keypair | 505623 cycles | 501292 cycles | 1.01 |
ML-DSA-87 sign | 1027304 cycles | 1022004 cycles | 1.01 |
ML-DSA-87 verify | 510824 cycles | 506144 cycles | 1.01 |
Graviton4
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 144416 cycles | 142985 cycles | 1.01 |
ML-DSA-44 sign | 311589 cycles | 309980 cycles | 1.01 |
ML-DSA-44 verify | 144608 cycles | 142888 cycles | 1.01 |
ML-DSA-65 keypair | 256633 cycles | 252438 cycles | 1.02 |
ML-DSA-65 sign | 515513 cycles | 513050 cycles | 1.00 |
ML-DSA-65 verify | 243550 cycles | 240177 cycles | 1.01 |
ML-DSA-87 keypair | 437257 cycles | 430194 cycles | 1.02 |
ML-DSA-87 sign | 710283 cycles | 704629 cycles | 1.01 |
ML-DSA-87 verify | 427702 cycles | 418929 cycles | 1.02 |
AMD EPYC 3rd gen (c6a) (no-opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 160733 cycles | 159988 cycles | 1.00 |
ML-DSA-44 sign | 480923 cycles | 480528 cycles | 1.00 |
ML-DSA-44 verify | 170854 cycles | 169874 cycles | 1.01 |
ML-DSA-65 keypair | 273476 cycles | 271927 cycles | 1.01 |
ML-DSA-65 sign | 774378 cycles | 772026 cycles | 1.00 |
ML-DSA-65 verify | 276123 cycles | 274184 cycles | 1.01 |
ML-DSA-87 keypair | 461091 cycles | 457466 cycles | 1.01 |
ML-DSA-87 sign | 1014261 cycles | 1008998 cycles | 1.01 |
ML-DSA-87 verify | 466988 cycles | 463072 cycles | 1.01 |
AMD EPYC 4th gen (c7a) (no-opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 137918 cycles | 135876 cycles | 1.02 |
ML-DSA-44 sign | 398505 cycles | 396380 cycles | 1.01 |
ML-DSA-44 verify | 145222 cycles | 143556 cycles | 1.01 |
ML-DSA-65 keypair | 236133 cycles | 232982 cycles | 1.01 |
ML-DSA-65 sign | 617132 cycles | 622646 cycles | 0.99 |
ML-DSA-65 verify | 232358 cycles | 232224 cycles | 1.00 |
ML-DSA-87 keypair | 391852 cycles | 385400 cycles | 1.02 |
ML-DSA-87 sign | 812425 cycles | 808433 cycles | 1.00 |
ML-DSA-87 verify | 394900 cycles | 389522 cycles | 1.01 |
Graviton4 (no-opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 157171 cycles | 155860 cycles | 1.01 |
ML-DSA-44 sign | 429450 cycles | 428124 cycles | 1.00 |
ML-DSA-44 verify | 165299 cycles | 163711 cycles | 1.01 |
ML-DSA-65 keypair | 276040 cycles | 271650 cycles | 1.02 |
ML-DSA-65 sign | 712600 cycles | 709595 cycles | 1.00 |
ML-DSA-65 verify | 273953 cycles | 271014 cycles | 1.01 |
ML-DSA-87 keypair | 460629 cycles | 454421 cycles | 1.01 |
ML-DSA-87 sign | 924454 cycles | 918993 cycles | 1.01 |
ML-DSA-87 verify | 464867 cycles | 456015 cycles | 1.02 |
Arm Cortex-A55 (Snapdragon 888) benchmarks (no-opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 570965 cycles | 555271 cycles | 1.03 |
ML-DSA-44 sign | 1945379 cycles | 1928895 cycles | 1.01 |
ML-DSA-44 verify | 632121 cycles | 618036 cycles | 1.02 |
ML-DSA-65 keypair | 965556 cycles | 943705 cycles | 1.02 |
ML-DSA-65 sign | 3156311 cycles | 3133685 cycles | 1.01 |
ML-DSA-65 verify | 1000989 cycles | 982820 cycles | 1.02 |
ML-DSA-87 keypair | 1587964 cycles | 1551893 cycles | 1.02 |
ML-DSA-87 sign | 4031248 cycles | 3995110 cycles | 1.01 |
ML-DSA-87 verify | 1656852 cycles | 1622473 cycles | 1.02 |
Graviton2
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 240271 cycles | 238259 cycles | 1.01 |
ML-DSA-44 sign | 544068 cycles | 541559 cycles | 1.00 |
ML-DSA-44 verify | 240812 cycles | 238982 cycles | 1.01 |
ML-DSA-65 keypair | 436633 cycles | 432552 cycles | 1.01 |
ML-DSA-65 sign | 912087 cycles | 908818 cycles | 1.00 |
ML-DSA-65 verify | 408709 cycles | 404022 cycles | 1.01 |
ML-DSA-87 keypair | 724471 cycles | 718760 cycles | 1.01 |
ML-DSA-87 sign | 1245715 cycles | 1241888 cycles | 1.00 |
ML-DSA-87 verify | 710316 cycles | 702505 cycles | 1.01 |
Graviton3
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 153702 cycles | 154274 cycles | 1.00 |
ML-DSA-44 sign | 332668 cycles | 333787 cycles | 1.00 |
ML-DSA-44 verify | 153973 cycles | 154477 cycles | 1.00 |
ML-DSA-65 keypair | 273295 cycles | 275850 cycles | 0.99 |
ML-DSA-65 sign | 556712 cycles | 557966 cycles | 1.00 |
ML-DSA-65 verify | 259693 cycles | 260716 cycles | 1.00 |
ML-DSA-87 keypair | 463876 cycles | 466703 cycles | 0.99 |
ML-DSA-87 sign | 770097 cycles | 773992 cycles | 0.99 |
ML-DSA-87 verify | 452328 cycles | 454741 cycles | 0.99 |
Graviton2 (no-opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 259056 cycles | 257465 cycles | 1.01 |
ML-DSA-44 sign | 705358 cycles | 704879 cycles | 1.00 |
ML-DSA-44 verify | 271243 cycles | 269324 cycles | 1.01 |
ML-DSA-65 keypair | 463268 cycles | 459838 cycles | 1.01 |
ML-DSA-65 sign | 1161620 cycles | 1160553 cycles | 1.00 |
ML-DSA-65 verify | 451145 cycles | 448246 cycles | 1.01 |
ML-DSA-87 keypair | 760351 cycles | 755312 cycles | 1.01 |
ML-DSA-87 sign | 1534745 cycles | 1530320 cycles | 1.00 |
ML-DSA-87 verify | 770721 cycles | 760760 cycles | 1.01 |
Graviton3 (no-opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 165924 cycles | 166662 cycles | 1.00 |
ML-DSA-44 sign | 439047 cycles | 440512 cycles | 1.00 |
ML-DSA-44 verify | 172785 cycles | 173614 cycles | 1.00 |
ML-DSA-65 keypair | 291486 cycles | 293210 cycles | 0.99 |
ML-DSA-65 sign | 719368 cycles | 720651 cycles | 1.00 |
ML-DSA-65 verify | 286537 cycles | 287411 cycles | 1.00 |
ML-DSA-87 keypair | 488307 cycles | 491567 cycles | 0.99 |
ML-DSA-87 sign | 957756 cycles | 961622 cycles | 1.00 |
ML-DSA-87 verify | 489888 cycles | 492214 cycles | 1.00 |
Arm Cortex-A76 (Raspberry Pi 5) benchmarks (opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 240147 cycles | 238149 cycles | 1.01 |
ML-DSA-44 sign | 542538 cycles | 541039 cycles | 1.00 |
ML-DSA-44 verify | 240082 cycles | 238413 cycles | 1.01 |
ML-DSA-65 keypair | 435752 cycles | 431498 cycles | 1.01 |
ML-DSA-65 sign | 910954 cycles | 906895 cycles | 1.00 |
ML-DSA-65 verify | 406801 cycles | 402775 cycles | 1.01 |
ML-DSA-87 keypair | 721774 cycles | 717656 cycles | 1.01 |
ML-DSA-87 sign | 1242382 cycles | 1240366 cycles | 1.00 |
ML-DSA-87 verify | 706815 cycles | 700727 cycles | 1.01 |
Arm Cortex-A76 (Raspberry Pi 5) benchmarks (no-opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 259199 cycles | 257046 cycles | 1.01 |
ML-DSA-44 sign | 704949 cycles | 705718 cycles | 1.00 |
ML-DSA-44 verify | 270923 cycles | 268710 cycles | 1.01 |
ML-DSA-65 keypair | 462699 cycles | 458714 cycles | 1.01 |
ML-DSA-65 sign | 1159301 cycles | 1159791 cycles | 1.00 |
ML-DSA-65 verify | 449608 cycles | 447250 cycles | 1.01 |
ML-DSA-87 keypair | 759847 cycles | 754238 cycles | 1.01 |
ML-DSA-87 sign | 1531906 cycles | 1526092 cycles | 1.00 |
ML-DSA-87 verify | 765426 cycles | 758532 cycles | 1.01 |
SpacemiT K1 8 (Banana Pi F3) benchmarks (no-opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 1104920 cycles | 1099647 cycles | 1.00 |
ML-DSA-44 sign | 4007589 cycles | 4001780 cycles | 1.00 |
ML-DSA-44 verify | 1230769 cycles | 1225498 cycles | 1.00 |
ML-DSA-65 keypair | 1885025 cycles | 1864250 cycles | 1.01 |
ML-DSA-65 sign | 6567120 cycles | 6547708 cycles | 1.00 |
ML-DSA-65 verify | 1997422 cycles | 1981330 cycles | 1.01 |
ML-DSA-87 keypair | 3089332 cycles | 3084924 cycles | 1.00 |
ML-DSA-87 sign | 8295338 cycles | 8269471 cycles | 1.00 |
ML-DSA-87 verify | 3276216 cycles | 3260711 cycles | 1.00 |
Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 320619 cycles | 316629 cycles | 1.01 |
ML-DSA-44 sign | 846687 cycles | 843864 cycles | 1.00 |
ML-DSA-44 verify | 319730 cycles | 316599 cycles | 1.01 |
ML-DSA-65 keypair | 603641 cycles | 595356 cycles | 1.01 |
ML-DSA-65 sign | 1255200 cycles | 1247634 cycles | 1.01 |
ML-DSA-65 verify | 545136 cycles | 535921 cycles | 1.02 |
ML-DSA-87 keypair | 958297 cycles | 941844 cycles | 1.02 |
ML-DSA-87 sign | 1719544 cycles | 1708271 cycles | 1.01 |
ML-DSA-87 verify | 945666 cycles | 927618 cycles | 1.02 |
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Arm Cortex-A72 (Raspberry Pi 4) benchmarks (opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-65 sign | 1285562 cycles | 1247634 cycles | 1.03 |
Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-44 keypair | 353820 cycles | 349834 cycles | 1.01 |
ML-DSA-44 sign | 1043370 cycles | 1043530 cycles | 1.00 |
ML-DSA-44 verify | 373122 cycles | 368216 cycles | 1.01 |
ML-DSA-65 keypair | 650839 cycles | 640876 cycles | 1.02 |
ML-DSA-65 sign | 1695863 cycles | 1683356 cycles | 1.01 |
ML-DSA-65 verify | 617702 cycles | 608139 cycles | 1.02 |
ML-DSA-87 keypair | 1022816 cycles | 1007488 cycles | 1.02 |
ML-DSA-87 sign | 2221994 cycles | 2206799 cycles | 1.01 |
ML-DSA-87 verify | 1041803 cycles | 1024829 cycles | 1.02 |
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Arm Cortex-A72 (Raspberry Pi 4) benchmarks (no-opt)'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.03.
Benchmark suite | Current: 4642e80 | Previous: 858772b | Ratio |
---|---|---|---|
ML-DSA-65 sign | 1744505 cycles | 1683356 cycles | 1.04 |
Thanks @manastasova.
The performance penalty for the copying is noticeable (3% overall on some platforms).
Is this a temporary workaround for diffblue/cbmc#8617, or is this unrelated? If it is a temporary workaround, we can accept it now and revisit it later. Otherwise, I think we need to think about alternatives.
Another problem is that matrix_expand should really be using a 4-way batched poly_uniform (which then uses a 4-way batched Keccak). See #210. So this function will have to be rewritten anyway. Do you maybe want to do that already?
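To make the batching idea concrete, here is a rough sketch of how the matrix entries could be grouped into batches of four nonces, each batch then being handed to a 4-way poly_uniform. This is only an illustration under assumptions: the helper names `matrix_nonce` and `next_nonce_batch` are hypothetical, the row-major traversal is one possible order, and the batched sampler itself is not shown. The nonce formula `(i << 8) + j` is taken from the existing `poly_uniform` call in this function.

```c
#include <stddef.h>
#include <stdint.h>

/* Nonce for matrix entry (i, j), as in the current scalar call
 * poly_uniform(&mat[i].vec[j], rho, (i << 8) + j). */
uint16_t matrix_nonce(unsigned i, unsigned j)
{
  return (uint16_t)((i << 8) + j);
}

/* Hypothetical helper: collect the nonces of up to four consecutive
 * entries of a k x l matrix, walking it in row-major order from the
 * flat index `start`. Returns how many nonces were written (0..4);
 * fewer than four means a tail that needs separate handling. */
size_t next_nonce_batch(size_t start, unsigned k, unsigned l,
                        uint16_t nonces[4])
{
  size_t total = (size_t)k * l;
  size_t n = 0;
  while (n < 4 && start + n < total)
  {
    size_t idx = start + n;
    nonces[n] = matrix_nonce((unsigned)(idx / l), (unsigned)(idx % l));
    n++;
  }
  return n;
}
```

For ML-DSA-44 (k = l = 4) the 16 entries split evenly into four batches. For ML-DSA-65 (k = 6, l = 5) the 30 entries leave a tail of two, so a rewrite would need either a scalar fallback or a padded final batch.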
for (i = 0; i < MLDSA_K; ++i)
  __loop__(
    assigns(i, memory_slice(mat, MLDSA_K * sizeof(polyvecl)))
If I remember correctly, this should be whole_object(mat).
for (j = 0; j < MLDSA_L; ++j)
  __loop__(
    assigns(j, memory_slice(&tmp_polyvecl, sizeof(polyvecl)))
Same here? Not sure.
{
  unsigned int j;
  polyvecl tmp_polyvecl;
Is this a temporary workaround for diffblue/cbmc#8617? If so, please add an appropriate comment with a TODO.
{
  poly_uniform(&mat[i].vec[j], rho, (i << 8) + j);
  poly temp_poly;
Same here.
Thanks a lot @manastasova for tackling this!
I think @mkannwischer raises an important point here. This is one of the main areas of code that we know will need to be rewritten, so we may want to do this first. What do you think, @manastasova?
I will give
Thank you @mkannwischer and @hanno-becker for the comments and suggestions!
Solves #137.