Replies: 14 comments 45 replies
-
I would definitely like a one-way function. In principle this would be possible (unlike say, a VDF, where the 100-million-x speedup of an electronic computer would make the whole exercise pointless :P). Sha256 for example can be used for this, and there was some crazy guy who did a blockhash back in the early 2010s, without even having volvelles, and it took him only a day or two. My inclination would be to look at chacha20, which is way simpler than sha256 -- glancing at the secp codebase it looks like it's around 150 32-bit operations, which optimistically would be ~1000 gf32 operations, which optimistically is around 4 hours. I think if we worked at it and volvelle-ized enough operations we could get than under an hour, which I think would be appropriate for this kind of use case. One complication is that no checksum can survive a one-way-function, so you would need to do this multiple times and check the results against each other. If you mess this up when initially deriving your seed you are in for a world of hurt trying to exactly reproduce your mistake during recovery.. |
Beta Was this translation helpful? Give feedback.
-
This was my comment for how k independent air gapped master secrets can be backed up by codex32. Basically a 128-bit codex32 backup stores k * 128-bits of entropy. The other shares derive from this entropy. Now that you're talking about derived bip32 seeds for accounts, rather than adding a one way function, you could use the payload at the unused share indexes. If a codex32 backup has 10 shares and threshold 3, there can be a bip32 seed derived from the payload of any of the 22 shares that aren't recorded. Every unused index (potential seed) can be restored by any 3 of the 10 shares. |
Beta Was this translation helpful? Give feedback.
-
Regarding brain wallets: I thought it would be useful for recovery if one share to my codex32 backups could be derived from a memorized passphrase. This avoids the weaknesses of a pure brain wallet scheme: potential low seed entropy, no fallback for forgetten passphrases. If the passphrase is lost a threshold of shares restores the seed without it. I used the default parameters of scrypt and k+identifier+len(seed) as a salt Finally, I proposed that the This isn't paper computer friendly unlike slip39 wordlist encoding an entire share but it allows memorizing shorter passphrases
|
Beta Was this translation helpful? Give feedback.
-
Giving it a try I just completed a paper round of a Chacha20 algorithm modified to run over 16-bit words (hence a 256-bit state) using hexadecimal computation. I came up with a draft for the two required worksheets, as well as a simple table that helps me figure bitwise operations (but too tedious for production). I cross-checked my results regularly using a small program. I made many mistakes, but I was improving & can tell it'll work well with the proper worksheets & volvelles. I think we'll need hexadecimal because it makes additions so much easier. I believe a serious practitioner would end up filling them from memory by the mere virtue of repetition. Possibly the XORing table too. Rotations, maybe not because there are so few. Two of the rotations are aligned on the 4-bit grid and can then be performed by shifting the characters themselves. The two other rotations use the same 2-bit left rotation. While it's weird to have to switch to a different alphabet, I suspect it's going to be twice as fast as the same algorithm using 20bits words & the bech32 alphabet. We have to run it at least twice, and I believe it compensates for the conversion effort. I would like to hear your opinion on that, and also whether you considered something like a codex16 before, and if there's a particular reason for using 20 bits words rather than 16? Another reason we might want to use hexadecimal is that 16-bit words scale Chacha20 to the perfect state size. We can input either a secret+salt+counter, or 2*secrets+counter, which allows for both 128-bit and 240-bit setups. It's also possible to input secret+password+counter (240bits too). Output is then 256-bit, out of which we can discard half until a future derivation standard offers to make use of the whole entropy. As far as I can tell, XOR and ADD behave the same regardless of word size. Out of the four rotation constants, three are even and can then remain functionally equivalent by dividing them by 2. I tweaked the last one, which was odd, and it would require further analysis to figure out what it's worth. I'm not sure yet how to assert the robustness of this variation, but the changes are so minimal that I can't see where it'd cause a flaw. We'd need confirmation that we can do that though, do you have expertise in that domain? In terms of computations, a quarter-round takes 40 operations:
Each of these operations uses two inputs for one output, so using volvelles they'll take about the same time as codex32 additions. With some practice, I estimate that about half of the operations will feel comfortable to carry by memory. A checksum worksheet has about 250 operations, then it's possible to complete 1.5 rounds as fast as we complete the checksum, maybe over 2 rounds with experience. If we adopt this function, we'll have to complete 8 rounds twice for cross-checking. If one completes the checksum sheet in 20 minutes, then that'll be around 3 hours to derive a new BIP32 key, plus about 1 hour to recover the secret seed in the first place. A bit long, but this can be done. :D I plan to spend part of next week moving forward with this project. Maybe I'd like to draft & print volvelles & worksheets to give it a try that way. Please let me know if you see anything wrong so far, or if you have a suggestion. |
Beta Was this translation helpful? Give feedback.
-
Security of PaperChaDraft 1 is a naive adaptation of ChaCha8 for pen&paper computation. The finality of PaperCha is the deterministic derivation of a paper seed into a sequence of X < 2^16 keys, without ever entering said seed into an electronic device. Derived keys are meant to serve as BIP32 keys. Differences from Chacha8:
Questions:
Relevant cryptanalysis metrics:
Attack model: (draft)
[1]:
[2]: This models a scenario where paper seed is mainstream, and a leading wallet provider dumps/leaks the BIP32 keys of all its users. While we could imagine different scenarios, few of them will realistically offer such a beneficial setup. |
Beta Was this translation helpful? Give feedback.
-
Blinding Shares?I am wondering: is there a one-way trick to "blind" derived shares so they can't act as share anymore, and that can't easily be reverted? For example, if I take a blinding share It might not be a good example, but maybe there's a smart trick of that sort that can be used to implement a specialized one-way function faster than PaperCha, that might produce a checksumable output as well? |
Beta Was this translation helpful? Give feedback.
-
You won't derive useless garbage, you will leak information about the relation between derived shares if multiple OTP cipher-texts with key reuse are compromised.
It is likely
This doesn't allow more than k-1 cipher shares to be compromised before all of them are. Similar to my idea but encrypting original shares with OTP |
Beta Was this translation helpful? Give feedback.
-
Random Deletion PatternsConcept: Use case: Deletion Patterns: Application: Source of randomness: Expected security: Ergonomy: Limitations: Other options: Brute-forcing: Recovering the share using brute-force requires iterating over the 2^20 possible deletion patterns. Moreover, for each deleted position, there are 2^5 possible values. In practice, the attacker would iterate on 9 of the 13 deleted position (2^45 combinations), and fill the 4 remaining position with placeholders. For each iteration, the attacker would then use the BCH code to find values for the 4 placeholder. That's a total of 2^65 iterations, assuming that both header and the share index are known. When using a cipher-share, the share index will be randomized, which makes it 2^70 iterations to brute-force assuming the attackers knows the header. If the attacker wants to exploit a couple of leaked cipher shares using brute-force, they have to iterate both shares and try a derivation, which gets us to 2^140 iterations. This doesn't account for the cost of checking for success, which could require deriving a BIP39 key (2^11 sha512), plus a lookup on the bitcoin database for used addresses. |
Beta Was this translation helpful? Give feedback.
-
Is EC Mult still the bottle neck for achieving sufficiently fast hand calculations to make it feasible for production? And how likely is it to remain the bottle neck in the future? Maybe a mechanical computer specifically designed to do EC Mult calculations could be designed. Assuming the EC Mult calculations could be done quickly, how much time (ballpark) would the it take to calculate a receive address in the HD Wallet including verification, and how much to do a spend including signature calculation and verification. In the stack exchange article from last July, Andrew mentioned 36 weeks to do a full calculation with only lookup tables, is that still the accurate best guess? |
Beta Was this translation helpful? Give feedback.
-
Yep.
I would suggest "pretty-much guaranteed". In addition to ecmult we would want to be able to do scalar addition and multiplication, which are around 1/1500th the difficulty of EC mults (an ecmult is ~128 group additions which are each ~11 scalar multiplications). We maaybe also want to be able to do sha256 hashes, which empirically seem to be around 1/100th the difficulty of ecmults, but personally I don't think this is critical because only public data ever gets hashed, so you can allow computers to do this part (and cross-check with multiple computers if you're worried the result will be wrong). Except in BIP32 hardened derivation, where you are hashing secret data.
I hope so, but I've spent a little bit of time on it and I'm not sure where to start. The difficulty is that we're working with a large prime field so the multiplications can't be "decomposed" into smaller operations involving human-scale gear ratios or the like. At some point, maybe when my kids are old enough to appreciate it, I'll get a Meccano set and start experimenting.
If you want to do BIP32 hardened derivation without letting computers see your xprivs, then it's still a lot of time, because of the hashing. But if we assume that we make up some human-computable hash, say one that takes an hour, and that we only need to do one step of this, then our address derivation consists of:
You can see that the time is dominated by the hashing/hardened-derivation step which is hard to predict because we haven't seriously investigated this. My feeling has been that hashing is "probably" much much easier than ecmult, but since it's not really useful at all until we've cracked ecmult, I haven't tried to work on it.
Verification is 2 ecmults and an ec-add (1/128th of a mult). So in our world where ecmult is a 5-minute crank operation, I guess this is 10 minutes. Signing is one ecmult, a scalar mult (1/1500th of a ecmult, so a few hours by hand but presumably seconds by machine, if we have the mechanical chops to do an ecmult) and a scalar add (~20 minutes by hand, but probably we'd fold this operation into the scalar-mult machine). So again let's say "10 minutes". Verifying a generated address is as simple as signing a dummy message then verifying it. So basically, having an ecmult machine totally solves everything :).
I think so. Every so often I revisit that calculation and find that it's really sensitive to the assumptions I made about how fast primitive operations are. So maybe "36-72 weeks" is a better answer. |
Beta Was this translation helpful? Give feedback.
-
Amazing, just 1 design away from electronic-less secret keeping. :) Maybe, this would be a starting point, *note I haven't fully vetted the project yet, Open-source and open-hardware scientific RPN calculator located here: https://github.com/apoluekt/OpenRPNCalc?tab=readme-ov-file I think creating a mechanical machine to do ec mult is above my pay grade, but this project is fascinating and I'm going to put some brainstorming into it. |
Beta Was this translation helpful? Give feedback.
-
OpenRPNCalc is a scientific calculator based on STM32 microcontroller
And we can’t verify the circuit of the microcontroller without destroying it and a microscope. (That works for an autopsy before sending funds, but then you need to find another safe one next time.)
That’s why I was suggesting a hand assembled computer, like the plan is with gears but with transistors we can see and verify match the open source diagram for an ec mult circuit.
Are there open source ec mult ASICs that could be built by hand with relays or transistors? Or at least part of the circuit if construction estimates exceed the paper and pencil estimates? I would trust pallets full of 2N2222 NPN transistors over the STM32 for storing trillions. Both have side channels but a mechanical computer will too from sound and vibration. I figure ignore them since they can be jammed or faraday caged. Key is verifiable computer hardware.
A perk of building computers is the labor isn’t on secret data.
…On Sat, Feb 24, 2024 at 23:39, The Don ***@***.***(mailto:On Sat, Feb 24, 2024 at 23:39, The Don <<a href=)> wrote:
> So basically, having an ecmult machine totally solves everything :).
Amazing, just 1 design away from electronic-less secret keeping. :)
Maybe, this would be a starting point, *note I haven't fully vetted the project yet,
Open-source and open-hardware scientific RPN calculator located here: https://github.com/apoluekt/OpenRPNCalc?tab=readme-ov-file
I think creating a mechanical machine to do ec mult is above my pay grade, but this project is fascinating and I'm going to put some brainstorming into it.
—
Reply to this email directly, [view it on GitHub](#55 (comment)), or [unsubscribe](https://github.com/notifications/unsubscribe-auth/ARQZ6F4FYQY72UXLACDOS3DYVLFCBAVCNFSM6AAAAAAZ7AGTLOVHI2DSMVQWIX3LMV43SRDJONRXK43TNFXW4Q3PNVWWK3TUHM4DKOBQHA2TA).
You are receiving this because you commented.Message ID: ***@***.***>
|
Beta Was this translation helpful? Give feedback.
-
Seeing some activity reminded me that I still had to drop a little update here. I kept researching for a while and came up with something based on chacha that use more sophisticated operations instead of XOR/ADD (faster mixing). My estimate was that 128-bit of data could be hashed by hand in about 30 minutes, but the drawback is that it would be non-standard & require its own cryptanalysis which is why I started looking for another solution. So this is definitely doable, there's ample room for optimization vs. standard functions, but it'll take some serious work. |
Beta Was this translation helpful? Give feedback.
-
450 sq feet is just for the simple multiplication circuit, I expect you would need a similar size or greater for the modulo circuit, and then you would need a 256 big logic gate register to keep track of the result, assuming the result is fed back into the input when you are doing 1408 times for each ec mult. But logic registers wouldn't take that much more space, since you could just daisy chain together 256 registers together. If the result is not fed back into the input, you can discard the need for the logic gate registers, but we would still need to do the operation 1408 times. So you would need 12 levels instead of 6.
If we can use FPGA's this task would be somewhat trivial I think using MyHDL or a similar library to load in a python script to do the calculation. If we are going this route, I'd recommend imputing the data serially, so we wouldn't need to daisy chain a bunch of FPGA's together. It would just be one FPGA with an input line, where you could move the input pin to high/low voltage and press an external switch that would be tied to a different pin indicating the next bit. In this case you could load both
I'm not aware of any other non-programmable tech to do this operation other than logic gates so we would still need 450 sq feet x 2 = 900 sq feet of breadboard space unless we can use the algorithms that we were talking about earlier in this discussion. But even so, I don't think the algorithms would make it a size that is manageable. The amount of wires that the final product would require would make it quite a rats nest that would make even the most astute EE's wary of it, unless there was a big bounty or something. But I'll keep brainstorming, maybe a different idea will pop up. |
Beta Was this translation helpful? Give feedback.
-
I wonder if we can have a (paper) one-way function that lets us derive any number of BIP32 keys from the paper master seed? Or at least a few dozen keys? I tried to combine existing operations in several ways but alas I didn't find a secure solution.
Ideally, we would have:
This would be functionally similar to the BIP39 hardened parent secret -> child secret derivation. In my opinion, the use case is clear, compelling and is something that only a paper computer can fully deliver:
The second point gets increasingly important as people keep migrating/diversifying between various hardware wallets. The last point offers an elegant answer to a recurring question in the space.
Finally, something is fitting in keeping that paper seed off electronics - why do we go through all the trouble to then enter it into a less trusted device? This wonderful paper computer deserves a use case of its own. But then again: how to derive securely?
Beta Was this translation helpful? Give feedback.
All reactions