feat+refactor: remove allocation from hashing #82

Daniel-Aaron-Bloom · 2024-04-14T05:15:20Z

Replace all allocation in hashing code with iterators.

This produces around a 7% gain in FMT (field Merkle tree) performance for Poseidon and around a 20% gain for Keccak.

Doing this also required replacing keccak-hash with tiny-keccak, the library that keccak-hash itself depends on.

plonky2/Cargo.toml

matthiasgoergens · 2024-04-16T02:47:27Z

plonky2/src/hash/hash_types.rs

-                .collect::<Vec<_>>()
-                .try_into()
-                .unwrap(),
+            elements: [(); NUM_HASH_OUT_ELTS].map(|()| inputs.next().unwrap()),


Is this better than .collect::<Vec<_>>().try_into().unwrap() or something like that?

"Better" is a matter of opinion. This approach avoids allocation at the cost of being pretty ugly.

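For readers comparing the options in this thread, here is a minimal sketch of the three ways to fill a fixed-size array from an iterator. The function names are hypothetical and not part of the PR; only the first variant appears in the diff above, and the `Vec` variant is an approximation of the removed code.

```rust
use std::convert::TryInto; // already in the prelude from edition 2021 onward

/// Allocation-free: map over a unit array and pull one item per slot.
/// Panics if the iterator yields fewer than `N` items.
pub fn array_via_map<T, const N: usize>(mut inputs: impl Iterator<Item = T>) -> [T; N] {
    [(); N].map(|()| inputs.next().unwrap())
}

/// Allocation-free: same idea using `std::array::from_fn`.
pub fn array_via_from_fn<T, const N: usize>(mut inputs: impl Iterator<Item = T>) -> [T; N] {
    std::array::from_fn(|_| inputs.next().unwrap())
}

/// Allocating variant: collect into a `Vec`, then convert to an array.
/// The `Debug` bound is needed because `unwrap` formats the error value.
pub fn array_via_vec<T: std::fmt::Debug, const N: usize>(
    inputs: impl Iterator<Item = T>,
) -> [T; N] {
    inputs.take(N).collect::<Vec<_>>().try_into().unwrap()
}
```

All three take the first `N` items; only the last one touches the heap.
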
matthiasgoergens · 2024-04-16T02:48:01Z

plonky2/src/hash/hash_types.rs

        }
    }

-    fn to_vec(&self) -> Vec<F> {
-        self.elements.to_vec()
+    fn into_iter(self) -> impl Iterator<Item = F> {


Shouldn't this be a trait implementation for the IntoIterator trait or so?

Not sure what you're asking. This is using "return position impl Trait" to enable unique return types between the different implementations without associated types.
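
A small sketch of what "return position impl Trait" in a trait method looks like, assuming Rust 1.75 or newer. The trait and type names (`IntoElements`, `Words`, `Bytes`) are hypothetical stand-ins, with `u64` in place of the field type `F`.

```rust
/// Each implementation may return a different concrete iterator type
/// without the trait declaring an associated type for it.
pub trait IntoElements {
    fn into_elements(self) -> impl Iterator<Item = u64>;
}

pub struct Words(pub [u64; 4]);
pub struct Bytes(pub [u8; 4]);

impl IntoElements for Words {
    // Concrete type here is the by-value array iterator.
    fn into_elements(self) -> impl Iterator<Item = u64> {
        IntoIterator::into_iter(self.0)
    }
}

impl IntoElements for Bytes {
    // Concrete type here is a `Map` adapter over the array iterator.
    fn into_elements(self) -> impl Iterator<Item = u64> {
        IntoIterator::into_iter(self.0).map(u64::from)
    }
}

/// Generic callers only see "some iterator of u64".
pub fn sum_elements<T: IntoElements>(value: T) -> u64 {
    value.into_elements().sum()
}
```

Implementing `IntoIterator` instead would require naming the iterator type as an associated type, which is awkward for adapter chains and closures.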

Co-authored-by: Matthias Görgens <[email protected]>

matthiasgoergens

I had a quick read, but this PR is too big for me to judge well enough to approve.

matthiasgoergens · 2024-04-16T02:51:14Z

plonky2/src/hash/hash_types.rs

+        // Chunks of 7 bytes since 8 bytes would allow collisions.
+        const STRIDE: usize = 7;
+
+        (0..((N + STRIDE - 1) / STRIDE)).map(move |i| {


This reads like you want something like next_multiple_of? Also available for usize.

Indeed I do. Fancy.
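
As an illustration of the suggestion, the sketch below compares the hand-written rounding with the standard-library helpers. It assumes Rust 1.73 or newer, where `usize::next_multiple_of` and `usize::div_ceil` are stable; the function names are hypothetical, and the PR's final choice is not shown in this thread.

```rust
pub const STRIDE: usize = 7;

/// The expression from the diff: number of STRIDE-sized chunks covering `n` bytes.
pub fn chunk_count_manual(n: usize) -> usize {
    (n + STRIDE - 1) / STRIDE
}

/// Round `n` up to a multiple of STRIDE first, then divide exactly.
pub fn chunk_count_next_multiple(n: usize) -> usize {
    n.next_multiple_of(STRIDE) / STRIDE
}

/// Ceiling division expresses the same count in one call.
pub fn chunk_count_div_ceil(n: usize) -> usize {
    n.div_ceil(STRIDE)
}
```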

matthiasgoergens · 2024-04-16T02:52:38Z

plonky2/src/hash/hash_types.rs

+    }
+
+    fn into_iter(self) -> impl Iterator<Item = F> {
+        // Chunks of 7 bytes since 8 bytes would allow collisions.


@dimdumon Please check with your work. Perhaps you can piggy-back on this, instead of re-implementing so much of your own packing?

matthiasgoergens · 2024-04-16T02:54:09Z

plonky2/src/hash/hash_types.rs

+
+        (0..((N + STRIDE - 1) / STRIDE)).map(move |i| {
+            let mut arr = [0; 8];
+            let i = i * STRIDE;


Perhaps use https://stackoverflow.com/questions/76783321/can-we-use-step-in-rust-slice instead?
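
One way to avoid the manual index arithmetic is `slice::chunks`, sketched below. This is an assumption-laden illustration, not the PR's code: `pack_bytes` is a hypothetical name, `u64` stands in for the field element, and little-endian packing is assumed because the visible diff does not show the byte order.

```rust
pub const STRIDE: usize = 7;

/// Pack a byte slice into 64-bit words, 7 bytes per word.
/// The top byte of every word stays zero, so each word is below 2^56.
pub fn pack_bytes<'a>(bytes: &'a [u8]) -> impl Iterator<Item = u64> + 'a {
    bytes.chunks(STRIDE).map(|chunk| {
        let mut arr = [0u8; 8];
        // The last chunk may be shorter than STRIDE; the rest stays zero.
        arr[..chunk.len()].copy_from_slice(chunk);
        u64::from_le_bytes(arr)
    })
}
```

`chunks` yields the ceiling of `len / STRIDE` slices, so the chunk count no longer needs to be computed by hand.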

matthiasgoergens · 2024-04-16T03:08:06Z

plonky2/src/hash/field_merkle_tree.rs

                    &mut digests_buf[digests_buf_pos..(digests_buf_pos + num_tmp_digests)],
                    tmp_cap_buf,
                    &new_leaves[..],
                    next_cap_height,
+                    |i, cap_hash| {
+                        H::hash_or_noop_iter(chain!(cap_hash.into_iter(), cur[i].iter().copied()))


This smells like it wants to be a zip or so?

zip? Why? The goal is to concatenate them before hashing, mirroring the previous code, which used extend.
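
To make the distinction concrete, here is a sketch showing that chaining two iterators feeds a hasher the same sequence as extending a buffer. `toy_hash` is a hypothetical FNV-style stand-in for `H::hash_or_noop_iter`, and `Iterator::chain` is used in place of the `chain!` macro from the diff.

```rust
/// Hypothetical stand-in hasher: folds the inputs in order.
pub fn toy_hash(inputs: impl IntoIterator<Item = u64>) -> u64 {
    inputs
        .into_iter()
        .fold(0xcbf29ce484222325, |acc, x| (acc ^ x).wrapping_mul(0x100000001b3))
}

/// Allocation-free: hash the cap digest followed by the leaf data.
pub fn hash_concat(cap_hash: [u64; 4], leaf: &[u64]) -> u64 {
    toy_hash(IntoIterator::into_iter(cap_hash).chain(leaf.iter().copied()))
}

/// Allocating equivalent: build the concatenation in a Vec first.
pub fn hash_concat_vec(cap_hash: [u64; 4], leaf: &[u64]) -> u64 {
    let mut buf = cap_hash.to_vec();
    buf.extend_from_slice(leaf);
    toy_hash(buf)
}
```

`zip` would instead pair elements up and stop at the shorter input, which is not a concatenation.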

matthiasgoergens · 2024-04-16T03:43:07Z

plonky2/src/hash/hashing.rs

+) -> HashOut<F> {
+    let mut elements = hash_n_to_m_no_pad_iter::<F, P, I>(inputs);
+    HashOut {
+        elements: core::array::from_fn(|_| elements.next().unwrap()),


Why from_fn here, but Self([(); N].map(|()| bytes.next().unwrap())) elsewhere?

Just being silly, I guess.

matthiasgoergens · 2024-04-16T03:44:48Z

plonky2/src/hash/keccak.rs

-            let output = keccak(state_bytes.clone()).to_fixed_bytes();
-            state_bytes = output.to_vec();
-            output
+        let hash_onion = (0..).scan(keccak(state_bytes), |state, _| {


Why do you use scan here and repeat_with elsewhere?

(This one could be repeat_with and take. Or the other version could turn into a scan.)

successors might also work, and would probably be the cleanest, if it does.
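
The three combinators mentioned here can each express an iterated hash chain. The sketch below uses a hypothetical `step` function in place of `keccak` and assumes the chain should yield `step(seed)`, `step(step(seed))`, and so on, which is how the `scan` line in the diff reads.

```rust
/// Hypothetical stand-in for one hash application.
fn step(state: u64) -> u64 {
    state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407)
}

/// `scan` over an unbounded counter, carrying the state explicitly.
pub fn onion_scan(seed: u64) -> impl Iterator<Item = u64> {
    (0..).scan(step(seed), |state, _| {
        let out = *state;
        *state = step(out);
        Some(out)
    })
}

/// `repeat_with` and a captured mutable state.
pub fn onion_repeat_with(seed: u64) -> impl Iterator<Item = u64> {
    let mut state = step(seed);
    std::iter::repeat_with(move || {
        let out = state;
        state = step(out);
        out
    })
}

/// `successors`: each item is computed from the previous one.
pub fn onion_successors(seed: u64) -> impl Iterator<Item = u64> {
    std::iter::successors(Some(step(seed)), |&prev| Some(step(prev)))
}
```

`successors` needs neither the dummy counter nor the mutable capture, which is probably why it reads as the cleanest of the three.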

matthiasgoergens · 2024-04-16T03:47:39Z

plonky2/src/hash/merkle_tree.rs

-    pub fn flatten(&self) -> Vec<F> {
-        self.0.iter().flat_map(|&h| h.to_vec()).collect()
+    pub fn flatten(&self) -> impl Iterator<Item = F> + '_ {
+        self.0.iter().flat_map(|h| h.into_iter())


You can probably remove the closure and just hand over the function you want to call directly to flat_map.

Unfortunately, I cannot. h is a reference and into_iter takes by value, which works in the closure because h is Copy. I could put a copied in here, but that seemed less readable.
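
A sketch of the two options described in this reply, using a hypothetical `Digest` type with an inherent by-value `into_iter`, as in the diff. The closure form works because the reference is auto-dereferenced and the value copied; the function-path form needs an explicit `copied()` first.

```rust
#[derive(Clone, Copy)]
pub struct Digest(pub [u64; 2]);

impl Digest {
    /// Takes `self` by value, like the method in the PR.
    pub fn into_iter(self) -> impl Iterator<Item = u64> {
        IntoIterator::into_iter(self.0)
    }
}

/// Closure form: `h` is `&Digest`; the call copies `*h` because `Digest: Copy`.
pub fn flatten_closure<'a>(digests: &'a [Digest]) -> impl Iterator<Item = u64> + 'a {
    digests.iter().flat_map(|h| h.into_iter())
}

/// Function-path form: items must already be `Digest`, hence `copied()`.
pub fn flatten_copied<'a>(digests: &'a [Digest]) -> impl Iterator<Item = u64> + 'a {
    digests.iter().copied().flat_map(Digest::into_iter)
}
```

Passing `Digest::into_iter` directly to `flat_map` over `iter()` would not type-check, since it expects a `Digest` and would be handed a `&Digest`.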

matthiasgoergens · 2024-04-16T03:50:24Z

plonky2/src/hash/merkle_tree.rs

    assert_eq!(leaves.len(), digests_buf.len() / 2 + 1);
    if digests_buf.is_empty() {
-        H::hash_or_noop(&leaves[0])
+        hash_fn(index, &leaves[0])


I wonder if we can do this without mucking around with indices.

So... we could do it by passing in some aux data which we split using the existing split code. It's a little less versatile, but should handle current use cases.

matthiasgoergens · 2024-04-16T03:56:34Z

plonky2/src/plonk/config.rs

+
+    /// Hash a message without any padding step. Note that this can enable length-extension attacks.
+    /// However, it is still collision-resistant in cases where the input has a fixed length.
+    fn hash_no_pad_iter<I: IntoIterator<Item = F>>(input: I) -> Self::Hash;

    /// Pad the message using the `pad10*1` rule, then hash it.


Btw, can we pad with pad10* rule instead? That one is one element shorter on average.

(I don't think this is something introduced in this PR. So we can solve it in a new PR, too.)

As far as I know we totally can. We could even do a rule 1(0*1)?, padding with 1, 11, or 10*1 as required.

This would break compatibility with any existing code because it changes the hash, so we probably can't upstream it.
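
To illustrate the two padding rules under discussion, here is a sketch over `u64` values with a caller-supplied rate. The function names are hypothetical and this is not the plonky2 implementation; it only shows how the padded lengths differ.

```rust
/// `pad10*1`: append a 1, then zeros, then a final 1, so that the
/// total length is a multiple of `rate`. Adds between 2 and rate + 1 elements.
pub fn pad10star1(mut msg: Vec<u64>, rate: usize) -> Vec<u64> {
    msg.push(1);
    while (msg.len() + 1) % rate != 0 {
        msg.push(0);
    }
    msg.push(1);
    msg
}

/// `pad10*`: append a 1, then zeros up to a multiple of `rate`.
/// Adds between 1 and rate elements.
pub fn pad10star(mut msg: Vec<u64>, rate: usize) -> Vec<u64> {
    msg.push(1);
    while msg.len() % rate != 0 {
        msg.push(0);
    }
    msg
}
```

With a rate of 8, a 7-element message pads to 16 elements under `pad10*1` but only 8 under `pad10*`; for most other lengths the two rules produce outputs of the same length. Either way, the padded input differs between the rules, so the resulting hashes differ.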

Daniel-Aaron-Bloom added 8 commits April 9, 2024 19:37

feat: add iter support to hashing

7774040

replace vec with iter

68e7fa9

Merge branch 'main' into dbloom/hashing

ac6c9da

field-merkle-tree

46ef6ce

fmt+clippy

4db1abc

no-std

c9399dc

fmt

96fc022

fix offset

9b455d6

Update plonky2/Cargo.toml

ad8a828

Co-authored-by: Matthias Görgens <[email protected]>

This was referenced Apr 16, 2024

test: add FMT benchmark #83

Merged

refactor: switch to tiny-keccak library #84

Merged

Daniel-Aaron-Bloom added 4 commits April 16, 2024 16:21

merge

c090dc8

Merge branch 'main' into dbloom/hashing

8b05603

fmt

785d06f

feedback

e3da146
