feat+refactor: remove allocation from hashing #82
base: main
Conversation
.collect::<Vec<_>>()
.try_into()
.unwrap(),
elements: [(); NUM_HASH_OUT_ELTS].map(|()| inputs.next().unwrap()),
Is this better than .collect::<Vec<_>>().try_into().unwrap() or something like that?
"Better" is a matter of opinion. This approach avoids allocation at the cost of being pretty ugly.
}
}

fn to_vec(&self) -> Vec<F> {
    self.elements.to_vec()
fn into_iter(self) -> impl Iterator<Item = F> {
Shouldn't this be a trait implementation for the IntoIterator trait or so?
Not sure what you're asking. This is using "return position impl Trait" to enable unique return types between the different implementations without associated types.
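A minimal sketch of the pattern being described, assuming Rust 1.75+ for impl Trait in trait method return positions; the trait and types here are hypothetical stand-ins, not the PR's:

    // Hypothetical trait and types; requires Rust 1.75+ for impl Trait in trait methods.
    trait IntoFieldIter {
        // Each implementor returns its own concrete iterator type, with no
        // `type IntoIter` associated type to declare (unlike std's IntoIterator).
        fn into_iter(self) -> impl Iterator<Item = u64>;
    }

    struct Pair([u64; 2]);
    struct Repeated(u64);

    impl IntoFieldIter for Pair {
        fn into_iter(self) -> impl Iterator<Item = u64> {
            // Concrete type: core::array::IntoIter<u64, 2>.
            self.0.into_iter()
        }
    }

    impl IntoFieldIter for Repeated {
        fn into_iter(self) -> impl Iterator<Item = u64> {
            // A completely different concrete type: Take<Repeat<u64>>.
            core::iter::repeat(self.0).take(3)
        }
    }

    fn main() {
        assert_eq!(Pair([1, 2]).into_iter().sum::<u64>(), 3);
        assert_eq!(Repeated(5).into_iter().sum::<u64>(), 15);
    }

Each impl gets its own opaque return type, so there is no associated type to name at the call sites.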
Co-authored-by: Matthias Görgens <[email protected]>
I had a quick read, but this PR is too big for me to judge well enough to approve.
plonky2/src/hash/hash_types.rs (Outdated)
// Chunks of 7 bytes since 8 bytes would allow collisions.
const STRIDE: usize = 7;

(0..((N + STRIDE - 1) / STRIDE)).map(move |i| {
This reads like you want something like next_multiple_of? Also available for usize.
Indeed I do. Fancy.
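As a reference point, here is how the two standard-library helpers relate to the hand-written expression (div_ceil is not mentioned above but is the closer match; both are stable for usize since Rust 1.73):

    fn main() {
        let n: usize = 20;
        let stride: usize = 7;

        // div_ceil matches the hand-written ceiling division exactly:
        assert_eq!((n + stride - 1) / stride, n.div_ceil(stride)); // 3 chunks

        // next_multiple_of rounds n itself up to a multiple of stride:
        assert_eq!(n.next_multiple_of(stride), 21);
        assert_eq!(n.next_multiple_of(stride) / stride, n.div_ceil(stride));
    }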
}

fn into_iter(self) -> impl Iterator<Item = F> {
    // Chunks of 7 bytes since 8 bytes would allow collisions.
@dimdumon Please check this against your work. Perhaps you can piggy-back on it, instead of re-implementing so much of your own packing?
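For anyone comparing approaches, a rough sketch of the 7-byte packing idea (u64 stands in for the field element type, and pack_bytes is a made-up helper name, not the PR's API):

    const STRIDE: usize = 7;

    fn pack_bytes(bytes: &[u8]) -> impl Iterator<Item = u64> + '_ {
        bytes.chunks(STRIDE).map(|chunk| {
            // A 56-bit chunk always sits below a ~64-bit field order, so packing
            // never wraps around the modulus and distinct chunks stay distinct.
            let mut arr = [0u8; 8];
            arr[..chunk.len()].copy_from_slice(chunk);
            u64::from_le_bytes(arr)
        })
    }

    fn main() {
        let packed: Vec<u64> = pack_bytes(&[0xff; 7]).collect();
        assert_eq!(packed, vec![0x00ff_ffff_ffff_ffff]);
    }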
plonky2/src/hash/hash_types.rs (Outdated)
(0..((N + STRIDE - 1) / STRIDE)).map(move |i| {
    let mut arr = [0; 8];
    let i = i * STRIDE;
Perhaps use https://stackoverflow.com/questions/76783321/can-we-use-step-in-rust-slice instead?
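As a sketch of that suggestion, limited to the index pattern (the real code also packs the bytes at each offset):

    const STRIDE: usize = 7;

    fn main() {
        let n: usize = 20;

        // Current shape: count the chunks, then multiply the index back up.
        let via_mul: Vec<usize> = (0..n.div_ceil(STRIDE)).map(|i| i * STRIDE).collect();

        // step_by shape: walk the offsets directly, no chunk count or multiplication.
        let via_step_by: Vec<usize> = (0..n).step_by(STRIDE).collect();

        assert_eq!(via_mul, via_step_by); // [0, 7, 14]
    }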
&mut digests_buf[digests_buf_pos..(digests_buf_pos + num_tmp_digests)],
tmp_cap_buf,
&new_leaves[..],
next_cap_height,
|i, cap_hash| {
    H::hash_or_noop_iter(chain!(cap_hash.into_iter(), cur[i].iter().copied()))
This smells like it wants to be a zip or so?
zip? Why? The goal is to concatenate them before hashing, mirroring the previous code which used extend.
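To illustrate the difference being discussed (std's Iterator::chain standing in for the chain! macro, u64 for the field type):

    fn main() {
        let cap_hash = [1u64, 2];
        let rest = [3u64, 4, 5];

        // chain concatenates the two streams end to end, which is what the hash input needs:
        let chained: Vec<u64> = cap_hash.into_iter().chain(rest).collect();
        assert_eq!(chained, vec![1, 2, 3, 4, 5]);

        // zip would instead pair elements up and stop at the shorter stream:
        let zipped: Vec<(u64, u64)> = cap_hash.into_iter().zip(rest).collect();
        assert_eq!(zipped, vec![(1, 3), (2, 4)]);
    }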
) -> HashOut<F> {
    let mut elements = hash_n_to_m_no_pad_iter::<F, P, I>(inputs);
    HashOut {
        elements: core::array::from_fn(|_| elements.next().unwrap()),
Why with from_fn here and via e.g. Self([(); N].map(|()| bytes.next().unwrap())) elsewhere?
Just being silly, I guess.
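For what it's worth, the two idioms behave the same here; a small sketch with u64 stand-ins:

    fn main() {
        const N: usize = 4;
        let mut it = 0u64..;

        // from_fn hands the closure the slot index (unused here):
        let a: [u64; N] = core::array::from_fn(|_| it.next().unwrap());

        // Mapping a unit array pulls one item per slot, no index in sight:
        let b: [u64; N] = [(); N].map(|()| it.next().unwrap());

        assert_eq!(a, [0, 1, 2, 3]);
        assert_eq!(b, [4, 5, 6, 7]);
    }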
let output = keccak(state_bytes.clone()).to_fixed_bytes();
state_bytes = output.to_vec();
output
let hash_onion = (0..).scan(keccak(state_bytes), |state, _| {
Why do you use scan here and repeat_with elsewhere?
(This one could be repeat_with and take. Or the other version could turn into a scan.)
successors might also work, and would probably be the cleanest, if it does.
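A rough sketch of the three shapes, with a stand-in step function in place of keccak (not the PR's actual code):

    // Stand-in for keccak: any function that maps a state to the next state.
    fn step(state: [u8; 32]) -> [u8; 32] {
        let mut out = state;
        out[0] = out[0].wrapping_add(1);
        out
    }

    fn main() {
        let seed = [0u8; 32];

        // scan: carry the state explicitly and emit it on each step.
        let via_scan: Vec<[u8; 32]> = (0..3)
            .scan(step(seed), |state, _| {
                let out = *state;
                *state = step(*state);
                Some(out)
            })
            .collect();

        // repeat_with: the state lives in the closure's capture instead.
        let mut state = seed;
        let via_repeat: Vec<[u8; 32]> = core::iter::repeat_with(move || {
            state = step(state);
            state
        })
        .take(3)
        .collect();

        // successors: each item is derived from the previous one; arguably the cleanest fit.
        let via_successors: Vec<[u8; 32]> =
            core::iter::successors(Some(step(seed)), |prev| Some(step(*prev)))
                .take(3)
                .collect();

        assert_eq!(via_scan, via_repeat);
        assert_eq!(via_scan, via_successors);
    }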
pub fn flatten(&self) -> Vec<F> {
    self.0.iter().flat_map(|&h| h.to_vec()).collect()
pub fn flatten(&self) -> impl Iterator<Item = F> + '_ {
    self.0.iter().flat_map(|h| h.into_iter())
You can probably remove the closure and just hand over the function you want to call directly to flat_map.
Unfortunately, I cannot. h is a reference and into_iter takes by value, which works in the closure because h is Copy. I could put a copied in here, but that seemed less readable.
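A small self-contained illustration of why the function path doesn't work directly (hypothetical Digest type standing in for the real hash output type):

    // Hypothetical Digest type standing in for the real hash output type.
    #[derive(Clone, Copy)]
    struct Digest([u64; 4]);

    impl Digest {
        fn into_iter(self) -> impl Iterator<Item = u64> {
            self.0.into_iter()
        }
    }

    fn main() {
        let digests = vec![Digest([1, 2, 3, 4]), Digest([5, 6, 7, 8])];

        // Works: `h` is `&Digest`, and the method call copies it implicitly because Digest is Copy.
        let a: Vec<u64> = digests.iter().flat_map(|h| h.into_iter()).collect();

        // Does not compile: the function path expects `Digest` by value, but `iter()` yields `&Digest`.
        // let b: Vec<u64> = digests.iter().flat_map(Digest::into_iter).collect();

        // The `copied()` alternative mentioned above:
        let c: Vec<u64> = digests.iter().copied().flat_map(Digest::into_iter).collect();

        assert_eq!(a, c);
    }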
assert_eq!(leaves.len(), digests_buf.len() / 2 + 1);
if digests_buf.is_empty() {
    H::hash_or_noop(&leaves[0])
    hash_fn(index, &leaves[0])
I wonder if we can do this without mucking around with indices.
So... we could do it by passing in some aux data which we split using the existing split code. It's a little less versatile, but should handle current use cases.
/// Hash a message without any padding step. Note that this can enable length-extension attacks.
/// However, it is still collision-resistant in cases where the input has a fixed length.
fn hash_no_pad_iter<I: IntoIterator<Item = F>>(input: I) -> Self::Hash;

/// Pad the message using the `pad10*1` rule, then hash it.
Btw, can we pad with the pad10* rule instead? That one is one element shorter on average.
(I don't think this is something introduced in this PR, so we can solve it in a new PR, too.)
As far as I know we totally can. We could even do a rule 1(0*1)?, padding with 1, 11, or 10*1 as required.
This would break compatibility with any existing code because it changes the hash, so we probably can't upstream it.
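For anyone following along, a generic sketch of what the two padding rules look like over field elements (hypothetical RATE, u64 stand-ins; this is not plonky2's actual padding code):

    const RATE: usize = 8;

    // pad10*1: append a 1, then the minimum number of 0s, then a final 1,
    // so the padded length is a multiple of RATE.
    fn pad_10_star_1(mut msg: Vec<u64>) -> Vec<u64> {
        msg.push(1);
        while (msg.len() + 1) % RATE != 0 {
            msg.push(0);
        }
        msg.push(1);
        msg
    }

    // pad10*: append a 1, then the minimum number of 0s to reach a multiple of RATE;
    // on average one element shorter than pad10*1.
    fn pad_10_star(mut msg: Vec<u64>) -> Vec<u64> {
        msg.push(1);
        while msg.len() % RATE != 0 {
            msg.push(0);
        }
        msg
    }

    fn main() {
        assert_eq!(pad_10_star_1(vec![7; 3]).len(), RATE);
        assert_eq!(pad_10_star(vec![7; 3]).len(), RATE);
    }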
Replace all allocation in hashing code with iterators.
This produces around a 7% gain in FMT performance for Poseidon and around a 20% gain for Keccak.
Doing this also required replacing keccak-hash with the library it depends on, which is tiny-keccak.