Added support for Protocol.SecretSharing key size greater than 16 bytes #593

Open · wants to merge 3 commits into master
Conversation

@miketery commented Jan 20, 2022

Created two wrapper functions, Shamir.split_large and Shamir.combine_large, that support key sizes greater than 16 bytes in 16-byte increments.

Reference issue: #402
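For orientation, here is a minimal sketch of the block-wise wrapping idea (illustrative only; the PR's actual implementation may differ, and only Shamir.split and Shamir.combine below are existing pycryptodome API):

```python
from Crypto.Protocol.SecretSharing import Shamir

def split_large(k, n, secret):
    """Split a secret whose length is a multiple of 16 bytes into n shares."""
    if len(secret) % 16:
        raise ValueError("secret length must be a multiple of 16 bytes")
    # Split each 16-byte block independently, then concatenate the
    # per-index fragments into one long share per participant.
    shares = {idx: b"" for idx in range(1, n + 1)}
    for i in range(0, len(secret), 16):
        for idx, fragment in Shamir.split(k, n, secret[i:i + 16]):
            shares[idx] += fragment
    return sorted(shares.items())

def combine_large(shares):
    """Recombine shares produced by split_large, block by block."""
    secret = b""
    for i in range(0, len(shares[0][1]), 16):
        secret += Shamir.combine([(idx, s[i:i + 16]) for idx, s in shares])
    return secret
```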

@miketery (Author)

Note: I thought about renaming the original functions (Shamir.split and Shamir.combine) to Shamir.split_block and Shamir.combine_block (since they operate on 16-byte blocks) and having the original names support the larger key sizes. However, I didn't know whether this was too risky, and I also didn't want to handle the case where the inputs are integers.

Looking forward to your feedback. This is my first contribution, so apologies if I didn't do something right; let me know and I'm happy to follow the process or any reference docs related to it.

Cheers!

@miketery miketery changed the title Added support for Protocol.SecretSharing size greater than 16 bytes Added support for Protocol.SecretSharing key size greater than 16 bytes Jan 20, 2022
@MarkusH commented Jan 20, 2022

I don't think that's a secure approach. You're reducing the effective key strength by orders of magnitude. E.g. a 256-bit key (32 bytes) no longer has a keyspace of 2**256 but of 2*2**128 = 2**129.

@miketery (Author) commented Jan 20, 2022

Hmm @MarkusH - trying to think it through, I don't think that's true.

It basically changes from 2**256 to 2**128 x 2**128, which is still 2**256, so the keyspace is preserved.

Claiming it's 2**129 implies that after breaking one 2**128 half, we would know we've broken it. But that's not the case.

How would you break one half on its own and know that you've broken it, letting you move on to the other half?

@Varbin (Contributor) commented Jan 20, 2022

> Hmm @MarkusH - trying to think it through, I don't think that's true.
>
> It basically changes from 2**256 to 2**128 x 2**128, which is still 2**256, so the keyspace is preserved.
>
> Claiming it's 2**129 implies that after breaking one 2**128 half, we would know we've broken it. But that's not the case.
>
> How would you break one half on its own and know that you've broken it, letting you move on to the other half?

If a 32 byte string is used as two 16 byte keys, each half can be attacked individually.

@miketery (Author)

> If a 32 byte string is used as two 16 byte keys, each half can be attacked individually.

@Varbin, can we prove this to be true? It's only true if we can show that one 16-byte key can be attacked individually; how would you do that?

For example, what if we took this to the extreme and made the Shamir block size 1 byte? By the attack-individually logic, a 32-byte key would have a strength of 32 * 2**8 = 2**13 = 8192, relying on attacking 1 byte at a time.

How do we crack 1 byte at a time?

@miketery (Author)

Created this question here to help resolve this: https://crypto.stackexchange.com/questions/98243/is-it-secure-to-do-shamir-key-split-on-a-key-in-blocks-and-recombine

Looking forward to learning one way or the other.

If we assume the split-key method is bad, do you recommend I try to change the underlying code for _Element and the other classes/functions to support greater sizes?

What gives me the most concern is backwards compatibility, especially as it relates to the polynomial defining the field:
https://github.com/Legrandin/pycryptodome/blob/master/lib/Crypto/Protocol/SecretSharing.py#L81

@ryancdotorg

> For example, what if we took this to the extreme and made the Shamir block size 1 byte?

This is, in fact, commonly how Shamir's Secret Sharing is implemented. For example, libgfshare uses GF(2^8). In secrets.js there are a number of options for field size, but they're all quite small for efficiency reasons. I spit out my drink when I saw that pycryptodome is using GF(2^128). Why?

It's fine to split e.g. a 512-bit secret 8 bits at a time over a scheme using GF(2^8), because the scheme is information-theoretically secure. You can't "attack" the chunks separately, because there's no way to confirm, on an individual-chunk basis, whether one's solution is correct.
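To make the information-theoretic point concrete, here is a toy demonstration (not pycryptodome's implementation; it uses the prime field GF(257) purely for illustration) showing that a single share of a 2-of-2 split is consistent with every possible secret, so a chunk cannot be confirmed in isolation:

```python
import random

P = 257  # a toy prime field; pycryptodome actually uses GF(2^128)

def split_2_of_2(secret):
    # Random degree-1 polynomial f(x) = secret + a*x; shares are f(1), f(2).
    a = random.randrange(P)
    return [(x, (secret + a * x) % P) for x in (1, 2)]

share1, _ = split_2_of_2(42)
x1, y1 = share1

# For ANY candidate secret s there is exactly one coefficient a with
# f(0) = s and f(x1) = y1, so share1 alone rules out nothing.
consistent = [s for s in range(P)
              if any((s + a * x1) % P == y1 for a in range(P))]
assert len(consistent) == P  # all 257 candidate secrets remain equally possible
```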

@Varbin (Contributor) commented Jan 22, 2022

@miketery @ryancdotorg Thank you for correcting me.

@vyznev commented Jan 22, 2022

@ryancdotorg One practical reason for preferring larger fields is that the field size limits the number of shares you can generate (since each share needs a distinct non-zero field point to evaluate the polynomial at). So for example with GF(2^8) you're limited to at most 255 shares per secret, which could be limiting for some use cases. But even so, something like GF(2^64) ought to be more than big enough for any imaginable purposes.

@gsec left a review

Some minor suggestions

(3 resolved review comments on lib/Crypto/Protocol/SecretSharing.py)
@MarkusH commented Jan 24, 2022
> It's fine to split e.g. a 512-bit secret 8 bits at a time over a scheme using GF(2^8), because the scheme is information-theoretically secure. You can't "attack" the chunks separately, because there's no way to confirm, on an individual-chunk basis, whether one's solution is correct.

Thanks for the enlightenment, @ryancdotorg. Much appreciated.

@gsec commented Jan 25, 2022

> @Varbin, can we prove this to be true? It's only true if we can show that one 16-byte key can be attacked individually; how would you do that?

I'd argue that the other way around is the case: It is only secure if we can prove it is secure ;)


Going over this in more detail, I see a problem after all. I'll try to go through it with you, and we might reach a better conclusion:

For simplicity I'll assume n = k = 2, our secret being secret = b"Hello this is a superlong secret", which is 32 bytes long. The current functions are split() and combine(), and the newly suggested ones are split_large() and combine_large().

This secret is too long, so we now use the new functions to create the shares:

```python
big_shares = split_large(2, 2, secret)
```

If the claim holds that an attacker cannot attack blocks individually, then an attacker who possesses half of each share should gain only a small advantage: they would still have to search the other half - namely the remaining 16 bytes - of the possibility space.

So now I create two smaller shares consisting of the first half of the big shares:

```python
small1 = big_shares[0][1][:16]
small2 = big_shares[1][1][:16]
small_shares = [(1, small1), (2, small2)]
```

But since the combined shares are just concatenated, this works:

```python
guess = combine(small_shares)
# guess: b'Hello this is a '
```

One can iterate over all blocks and reconstruct the whole secret.

We have now successfully broken the proposed extension of the Shamir scheme, exactly by attacking each 16-byte block individually.
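For completeness, here is the walkthrough above as a self-contained script (assuming the illustrative split_large sketch given earlier and pycryptodome's Shamir):

```python
from Crypto.Protocol.SecretSharing import Shamir

secret = b"Hello this is a superlong secret"  # 32 bytes
big_shares = split_large(2, 2, secret)        # from the sketch above

# Keep only the first 16 bytes of each big share ...
small_shares = [(idx, share[:16]) for idx, share in big_shares]

# ... and combine just those: the first block of the secret falls out.
print(Shamir.combine(small_shares))           # b'Hello this is a '
```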

@ryancdotorg

@gsec If I am reading your comment correctly, you're claiming that if an attacker has e.g. the first 16 bytes of a quorum of shares for a 32 byte key, they can construct the first 16 bytes of the key?

If that's what you're claiming, I don't see how that's actually a problem. If the first 16 bytes of a 32 byte key leaked to an attacker, they'd have the same advantage.

@Varbin (Contributor) commented Jan 25, 2022

@ryancdotorg @gsec
I just tried to verify whether a secret can be partially restored when only parts of each share are known, even when losing less than a full "block size". Well, it turns out even the current implementation does not have an "all-or-nothing" property.

```python
# Cryptodome.__version__ == '3.10.1'
from Cryptodome.Protocol import SecretSharing

(_, a), (_, b) = SecretSharing.Shamir.split(2, 2, b'1234567890abcdef')
SecretSharing.Shamir.combine(((1, b'\x00' * 8 + a[8:]), (2, b'\x00' * 8 + b[8:])))
# b'\x00\x00\x00\x00\x00\x00\x00\x0090abcdef'
# Note: sometimes one byte is not "correctly" combined,
# but the search space for this one byte is quite small.
```

Therefore I think it is safe to conclude that using "multiple blocks" does not weaken the security at all.

@ryancdotorg

Thanks for testing that, @Varbin. I'm not super familiar with how the math works for Shamir's Secret Sharing, though I know Lagrange interpolation over a finite field is used. In essence, the shares are points on a plane, the actual secret is the point at x = 0, and a polynomial ties everything together.

Anyway, if someone is concerned about partial shares being combinable into a partial key, they can wrap the actual key in a proper all-or-nothing transform, but I struggle to come up with an actual scenario in which this matters.
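As a hedged illustration of that suggestion (not part of this PR; the names aont_wrap/aont_unwrap are hypothetical), one simple all-or-nothing construction in the spirit of Rivest's package transform, built from pycryptodome primitives:

```python
from Crypto.Cipher import AES
from Crypto.Hash import SHA256
from Crypto.Random import get_random_bytes

def aont_wrap(secret):
    k = get_random_bytes(16)  # one-time inner key; fixed nonce is safe since k is single-use
    ct = AES.new(k, AES.MODE_CTR, nonce=bytes(8)).encrypt(secret)
    # Mask the inner key with a digest of the ciphertext: without the
    # ENTIRE ciphertext you cannot recover k, so nothing decrypts.
    mask = SHA256.new(ct).digest()[:16]
    return ct + bytes(a ^ b for a, b in zip(k, mask))

def aont_unwrap(blob):
    ct, masked = blob[:-16], blob[-16:]
    mask = SHA256.new(ct).digest()[:16]
    k = bytes(a ^ b for a, b in zip(masked, mask))
    return AES.new(k, AES.MODE_CTR, nonce=bytes(8)).decrypt(ct)
```

The wrapped blob (16 bytes longer than the key) would be what gets split into shares; aont_unwrap runs after combining.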

@gsec commented Jan 26, 2022

Discussing this further with @MarkusH and others has led me to the conviction that the possibility of attacking blocks or bytes individually is irrelevant after all. Much as @ryancdotorg said, you can recover any byte of the secret for which you have the k corresponding bytes of the shares (in a k-of-n sharing scheme). The previous concern about attacking blocks or bytes individually does not apply. The sharing scheme has perfect security, and it is therefore irrelevant how "easy" it is to calculate the bytes, since all values are equally probable. Some general information about sharing schemes can also be found in Chapter 8 of this guideline.

Thank you all for the fruitful contributions, and @miketery for adding this feature.

fixed bad indent.
@kposen commented Jul 24, 2023

Good morning. What is the status of this pull request?

@gsec commented Jul 24, 2023

As far as I can see, this is good to go. Thanks for bringing it up again; I thought it would have been merged by now.

@ryancdotorg

> @ryancdotorg One practical reason for preferring larger fields is that the field size limits the number of shares you can generate [...] even so, something like GF(2^64) ought to be more than big enough for any imaginable purposes.

I don't think I responded to this before, but as a practical matter, with a smaller field it's possible to precompute the logarithm and exponential of every field element and build lookup tables. This can provide a significant speedup, especially if the lookup tables are small enough to fit comfortably in the CPU's L1 cache - which they ought to be up to about GF(2^12).

Ref: https://github.com/jcushman/libgfshare/blob/master/src/gfshare_maketable.c
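For illustration, a Python sketch of that table construction (the polynomial 0x11D with generator 2 is one common choice; libgfshare's exact constants may differ):

```python
# Build log/exp tables for GF(2^8) by repeatedly multiplying by a generator.
exp = [0] * 255
log = [0] * 256  # log[0] is never used: 0 has no discrete logarithm
x = 1
for i in range(255):
    exp[i] = x
    log[x] = i
    x <<= 1              # multiply by the generator 2
    if x & 0x100:
        x ^= 0x11D       # reduce modulo the field polynomial

def gf_mul(a, b):
    """Table-based multiply: a*b = exp[(log[a] + log[b]) % 255]."""
    if a == 0 or b == 0:
        return 0
    return exp[(log[a] + log[b]) % 255]
```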

That optimization probably isn't needed here, though.
