@patrick-ogrady patrick-ogrady commented Nov 10, 2025

Related: #2184

Changes (Summarized by LLM)

BTE Implementation

  • BatchRequest now caches affine headers, rho transcript seeds, and the desired concurrency level during the one-time ciphertext filtering pass; optional Rayon parallelism accelerates Chaum–Pedersen validation, and each header’s rho seed is stored as a Transcript::Summary for later reuse (cryptography/src/bls12381/bte.rs:124-224).
  • Added BatchVerifyScratch (rho buffer + shared MsmScratch) plus a canonical InvalidCiphertextSet error so verifiers can enforce that servers prove over exactly the caller’s filtered indices without repeated allocations (cryptography/src/bls12381/bte.rs:255-294).
  • respond_to_batch now uses cached raw scalars (Share::private.to_raw()) and optional Rayon parallelism to exponentiate headers, derives rho scalars once, and performs MSMs over cached affine headers using the new scratch space before forming the aggregated proof (cryptography/src/bls12381/bte.rs:400-498).
  • Verification reuses caller-provided scratch, checks the server-reported indices against the canonical set, recomputes rho scalars from stored seeds, batch-converts partials to affine form, and runs MSMs over both the bases and the partials before validating the aggregated Chaum–Pedersen proof (cryptography/src/bls12381/bte.rs:511-589).
  • combine_partials accepts a concurrency argument so column-wise MSMs over verified partials can run serially or via a Rayon pool; helper functions perform MSMs with reusable scratch to avoid per-column allocations (cryptography/src/bls12381/bte.rs:591-716).
  • derive_rhos now resumes the per-header transcripts captured in the request, appends the responder index plus claimed partial, and feeds that into the transcript RNG, which avoids repeated domain initialization during verification (cryptography/src/bls12381/bte.rs:784-812).
  • The TDH keystream migrated from SHA-256 to CoreBlake3 and now commits to the label length and output length, while the XOR helper applies unaligned u64 chunks for faster masking (cryptography/src/bls12381/bte.rs:815-856).
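
To make the chunked XOR masking in the last bullet concrete, here is a minimal sketch of the general technique. It assumes a helper named `xor_in_place` (hypothetical, not taken from bte.rs) and uses only the standard library; the real helper may load the chunks differently, for example with explicit unaligned reads.

```rust
/// XOR `mask` into `data` in place, 8 bytes at a time.
/// Hypothetical helper for illustration; not the function from bte.rs.
fn xor_in_place(data: &mut [u8], mask: &[u8]) {
    assert_eq!(data.len(), mask.len(), "mask must match data length");

    let mut data_chunks = data.chunks_exact_mut(8);
    let mut mask_chunks = mask.chunks_exact(8);

    // Bulk path: copy each 8-byte chunk into a byte array, reinterpret it as a
    // u64, XOR, and write the result back. Working through byte arrays means
    // the input slices do not need to be 8-byte aligned.
    for (d, m) in (&mut data_chunks).zip(&mut mask_chunks) {
        let mut a = [0u8; 8];
        let mut b = [0u8; 8];
        a.copy_from_slice(d);
        b.copy_from_slice(m);
        let masked = u64::from_ne_bytes(a) ^ u64::from_ne_bytes(b);
        d.copy_from_slice(&masked.to_ne_bytes());
    }

    // Tail path: fewer than 8 bytes remain, so fall back to byte-wise XOR.
    let data_tail = data_chunks.into_remainder();
    let mask_tail = mask_chunks.remainder();
    for (d, m) in data_tail.iter_mut().zip(mask_tail) {
        *d ^= *m;
    }
}
```

Because XOR is its own inverse, applying the same mask twice restores the original bytes, which is a convenient property to check when comparing a chunked version against a plain byte-wise loop.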

Group Primitives Infrastructure

  • Introduced reusable MSM infrastructure: MsmScratch for cached Pippenger scratch buffers, RawScalar/Scalar::to_raw for pre-serialized scalars, and extended the Point trait with affine conversions, scratch-aware MSMs, and raw-scalar multiplication hooks that both G1 and G2 implement (cryptography/src/bls12381/primitives/group.rs:51-369).
  • G1::msm/G2::msm now convert inputs to affine form (using the new blst_*s_to_affine helpers), drop zero points/scalars up front, and delegate to msm_affine_with_scratch, which reduces temporary allocations during large MSMs (cryptography/src/bls12381/primitives/group.rs:607-742 and cryptography/src/bls12381/primitives/group.rs:972-1113).
  • Both groups expose batch_to_affine, mul_raw, and msm_scratch_len, which the revamped BTE code uses for faster header exponentiation and aggregated-proof verification (cryptography/src/bls12381/primitives/group.rs:637-742 and cryptography/src/bls12381/primitives/group.rs:1004-1113).
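
The scratch-reuse and zero-filtering ideas above can be sketched with a toy stand-in. The example below is an assumption-heavy illustration: it replaces the BLS12-381 groups with integers modulo a prime under addition, and `ToyScratch` and `toy_msm_with_scratch` are hypothetical names, not the `MsmScratch` or `msm_affine_with_scratch` APIs from group.rs. It does not implement Pippenger's algorithm; it only shows the buffer-reuse pattern.

```rust
/// Toy modulus standing in for a group order (2^61 - 1, a Mersenne prime).
const P: u64 = (1 << 61) - 1;

/// Reusable buffers, loosely analogous to the scratch idea described above.
/// Hypothetical type for illustration only.
#[derive(Default)]
struct ToyScratch {
    points: Vec<u64>,
    scalars: Vec<u64>,
}

/// Toy "MSM" over the additive group of integers mod P: sum(s_i * p_i) mod P.
fn toy_msm_with_scratch(points: &[u64], scalars: &[u64], scratch: &mut ToyScratch) -> u64 {
    assert_eq!(points.len(), scalars.len(), "one scalar per point");

    // clear() keeps the allocated capacity, so repeated calls reuse the memory.
    scratch.points.clear();
    scratch.scalars.clear();

    // Drop pairs that cannot affect the result (zero point or zero scalar).
    for (&p, &s) in points.iter().zip(scalars) {
        if p % P != 0 && s % P != 0 {
            scratch.points.push(p % P);
            scratch.scalars.push(s % P);
        }
    }

    // Accumulate in u128 so the product of two values below 2^61 cannot overflow.
    let mut acc: u128 = 0;
    for (&p, &s) in scratch.points.iter().zip(&scratch.scalars) {
        acc = (acc + (p as u128) * (s as u128)) % (P as u128);
    }
    acc as u64
}
```

A caller would create one `ToyScratch` and pass it to every call, so only the first call (or a call with a larger input) allocates.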

Performance (decrypt == prepare + verify + recover)

bte_decrypt_prepare/n=10/threads=1/size=10
    time:   [7.5457 ms 7.6023 ms 7.6468 ms]
bte_decrypt_verify/n=10/threads=1/size=10
    time:   [19.562 ms 19.815 ms 20.303 ms]
bte_decrypt_recover/n=10/threads=1/size=10
    time:   [5.6933 ms 5.8327 ms 6.0476 ms]

=> 33.2ms

bte_decrypt_prepare/n=10/threads=1/size=100
    time:   [70.264 ms 70.964 ms 72.520 ms]
bte_decrypt_verify/n=10/threads=1/size=100
    time:   [129.21 ms 131.55 ms 134.98 ms]
bte_decrypt_recover/n=10/threads=1/size=100
    time:   [56.797 ms 56.972 ms 57.228 ms]

=> 259ms

bte_decrypt_prepare/n=10/threads=1/size=1000
    time:   [677.37 ms 684.25 ms 694.33 ms]
bte_decrypt_verify/n=10/threads=1/size=1000
    time:   [972.14 ms 976.19 ms 980.86 ms]
bte_decrypt_recover/n=10/threads=1/size=1000
    time:   [568.26 ms 569.14 ms 570.20 ms]

=> 2.23s

bte_decrypt_prepare/n=10/threads=8/size=10
    time:   [2.9143 ms 3.0526 ms 3.2288 ms]
bte_decrypt_verify/n=10/threads=8/size=10
    time:   [4.6297 ms 4.6338 ms 4.6386 ms]
bte_decrypt_recover/n=10/threads=8/size=10
    time:   [1.3080 ms 1.3270 ms 1.3654 ms]

=> 8.9ms

bte_decrypt_prepare/n=10/threads=8/size=100
    time:   [17.366 ms 18.219 ms 19.423 ms]
bte_decrypt_verify/n=10/threads=8/size=100
    time:   [25.075 ms 25.718 ms 26.592 ms]
bte_decrypt_recover/n=10/threads=8/size=100
    time:   [9.3808 ms 9.8714 ms 10.375 ms]

=> 53.7ms

bte_decrypt_prepare/n=10/threads=8/size=1000
    time:   [144.96 ms 146.63 ms 148.85 ms]
bte_decrypt_verify/n=10/threads=8/size=1000
    time:   [165.95 ms 168.70 ms 173.53 ms]
bte_decrypt_recover/n=10/threads=8/size=1000
    time:   [89.563 ms 91.839 ms 94.958 ms]

=> 0.41s

bte_decrypt_prepare/n=100/threads=1/size=10
    time:   [7.7492 ms 7.9252 ms 8.0643 ms]
bte_decrypt_verify/n=100/threads=1/size=10
    time:   [150.86 ms 151.23 ms 151.66 ms]
bte_decrypt_recover/n=100/threads=1/size=10
    time:   [19.104 ms 19.104 ms 19.166 ms]

=> 178.25ms (-17%)

bte_decrypt_prepare/n=100/threads=1/size=100
    time:   [71.399 ms 72.270 ms 73.838 ms]
bte_decrypt_verify/n=100/threads=1/size=100
    time:   [884.55 ms 897.72 ms 916.62 ms]
bte_decrypt_recover/n=100/threads=1/size=100
    time:   [189.17 ms 190.43 ms 191.73 ms]

=> 1.16s (-24%)

bte_decrypt_prepare/n=100/threads=1/size=1000
    time:   [691.80 ms 694.54 ms 697.17 ms]
bte_decrypt_verify/n=100/threads=1/size=1000
    time:   [5.8849 s 5.9193 s 5.9614 s]
bte_decrypt_recover/n=100/threads=1/size=1000
    time:   [1.8874 s 1.8967 s 1.9109 s]

=> 8.54s (-31%)

bte_decrypt_prepare/n=100/threads=8/size=10
    time:   [2.8861 ms 2.9442 ms 3.0289 ms]
bte_decrypt_verify/n=100/threads=8/size=10
    time:   [25.752 ms 26.454 ms 27.418 ms]
bte_decrypt_recover/n=100/threads=8/size=10
    time:   [4.3780 ms 4.4140 ms 4.4518 ms]

=> 33.8ms (-13%)

bte_decrypt_prepare/n=100/threads=8/size=100
    time:   [17.135 ms 17.375 ms 17.853 ms]
bte_decrypt_verify/n=100/threads=8/size=100
    time:   [142.00 ms 142.38 ms 143.35 ms]
bte_decrypt_recover/n=100/threads=8/size=100
    time:   [30.809 ms 31.409 ms 32.639 ms]

=> 191ms (-29%)

bte_decrypt_prepare/n=100/threads=8/size=1000
    time:   [142.87 ms 146.59 ms 151.91 ms]
bte_decrypt_verify/n=100/threads=8/size=1000
    time:   [947.65 ms 950.88 ms 954.25 ms]
bte_decrypt_recover/n=100/threads=8/size=1000
    time:   [299.12 ms 303.67 ms 309.80 ms]

=> 1.40s (-40%)
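
As a worked example of how the `=>` totals relate to the individual phases, the sketch below sums the median (middle) values for the n=10/size=1000 runs and compares threads=1 with threads=8. The function names are hypothetical, and using the medians is an assumption about how the totals were derived; the reported figures may be rounded differently.

```rust
/// Sum the median times (in ms) of the three decrypt phases.
fn decrypt_total_ms(prepare: f64, verify: f64, recover: f64) -> f64 {
    prepare + verify + recover
}

/// Ratio of the single-threaded total to the multi-threaded total.
fn speedup(serial_ms: f64, parallel_ms: f64) -> f64 {
    serial_ms / parallel_ms
}

/// Totals and speedup for n=10/size=1000, using the medians listed above.
fn n10_size1000_summary() -> (f64, f64, f64) {
    // threads=1: prepare 684.25 ms, verify 976.19 ms, recover 569.14 ms
    let serial = decrypt_total_ms(684.25, 976.19, 569.14);
    // threads=8: prepare 146.63 ms, verify 168.70 ms, recover 91.839 ms
    let parallel = decrypt_total_ms(146.63, 168.70, 91.839);
    (serial, parallel, speedup(serial, parallel))
}
```

With these inputs the totals come to about 2.23 s and 0.41 s, matching the figures above, for a speedup of roughly 5.5x on 8 threads.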
