[Design]: Redesign of Distributed Key Generation #1286

djordon · 2025-01-30T05:00:17Z

Design - Redesign of Distributed Key Generation

1. Summary

We should redesign Distributed Key Generation (DKG) so that it always finishes successfully when all signers are behaving honestly, while removing the role of a coordinator.

2. Context & Purpose

Right now, Distributed Key Generation (DKG) is not something that can be done reliably. There are two main issues:

1. There is a timing issue where the arrival of a bitcoin block during DKG leads to failure.
2. There is another timing issue where some signers might not receive critical information from the coordinator (the signer leading DKG) “in time”. This also leads to DKG failure.

Issue (1) is quite bad because it could lead to a partial failure, where DKG succeeds for some signers and fails for others. It would be better to have all signers either fail or succeed at DKG. Lastly, we would like to remove any special privileges of the coordinator. There is nothing inherently wrong with having a coordinator, but removing the need for one during DKG allows for some simplification and probably enhances security.

Relevant Research Discussions

External Resources

The current DKG flow follows what is laid out in the FROST protocol RFC 9591. The FROST paper, available at https://eprint.iacr.org/2020/852, describes a coordinator algorithm, but does not actually require a coordinator participant for either DKG or signing rounds.

3. Design

3.1 Proposed Component Design

The proposal is to:

Remove the role of the coordinator during DKG.
Create a more strict state machine for progressing through DKG.
Require a configurable threshold of participants to send and receive acknowledgment messages before starting DKG.
Tie the start of DKG to the block height of the canonical bitcoin blockchain, and accept DKG related messages from any signer until DKG either fails or succeeds.

The above proposals can be refined into several “small” proposals for the new DKG design.

1. Wait for all signers to observe the same bitcoin block before beginning DKG.
2. Add bitcoin block height / block hash column(s) to the dkg_shares table. Run DKG if there is no row in the dkg_shares table with block height greater than or equal to the dkg_min_bitcoin_block_height parameter. Remove the dkg_target_rounds config parameter.
3. After observing a bitcoin block and writing blockchain data to the database, we have enough information to tell whether running DKG is allowed. Store that information in the SignerState object. Also store the current canonical bitcoin chain tip in the state object.
4. Design and implement a new DKG-only state machine that is a combination of our SignerStateMachine and CoordinatorStateMachine.
5. Allow any non-DkgBegin messages from other signers at any time, but only feed them to the state machine if the dkg_id aligns with a currently ongoing DKG round and DKG is allowed.
6. Use the chain tip height of the canonical bitcoin blockchain as the dkg_id when starting DKG rounds.
7. Allow at most one DKG-only state machine.
8. Change DKG to progress past the DkgBegin state after receiving dkg_begin_threshold distinct DkgBegin messages.
9. Only store DKG shares after receiving dkg_begin_threshold distinct DkgEnd(Success) messages.
10. Remove the DkgPrivateBegin and DkgEndBegin messages.
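
To make the "is DKG allowed" check and the dkg_id choice concrete, here is a minimal sketch in Rust. It is not the signer's actual code: `DkgSharesRow`, `ChainTip`, `dkg_allowed`, and `dkg_id_for` are hypothetical names, and the dkg_shares table is modelled as an in-memory slice rather than a database query.

```rust
/// Hypothetical in-memory stand-in for a row of the `dkg_shares` table,
/// keeping only the proposed bitcoin block height column.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DkgSharesRow {
    pub bitcoin_block_height: u64,
}

/// Hypothetical view of the canonical bitcoin chain tip stored in the
/// signer state after a block has been observed and written to the database.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ChainTip {
    pub height: u64,
    pub hash: [u8; 32],
}

/// DKG is allowed when no stored shares have a block height greater than
/// or equal to `dkg_min_bitcoin_block_height`.
pub fn dkg_allowed(rows: &[DkgSharesRow], dkg_min_bitcoin_block_height: u64) -> bool {
    !rows
        .iter()
        .any(|row| row.bitcoin_block_height >= dkg_min_bitcoin_block_height)
}

/// The chain tip height doubles as the `dkg_id`, so signers that observed
/// the same block derive the same identifier without extra coordination.
pub fn dkg_id_for(tip: &ChainTip) -> u64 {
    tip.height
}
```

With a single stored row at height 100, `dkg_allowed(&rows, 100)` is false and `dkg_allowed(&rows, 101)` is true. Computing the boolean once per observed block and caching it in the signer state would avoid a database lookup per incoming DkgBegin message.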

Some notes:

With (1), we are more likely to succeed with DKG since we know that all signers will be using the same dkg_id.
With (2-3), the signers can easily tell whether they should accept a DkgBegin message from another signer. We currently do a database lookup for each of these messages.
With (4) we get rid of the requirement of having a coordinator.
With (5-7), we allow for DKG to span multiple blocks, fixing the bitcoin block race condition.
With (8), we fix the race condition issue where some signers don’t have a state machine set up at the right time.
With (9), the signers are more confident that DKG ended in success for everyone before they write to their databases.

The above looks like a lot of work, but doing just (1), (5), (6), and (8) would be enough to fix our known bugs.
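
The threshold-driven progression described above can be sketched as a small state machine. This is an illustration under assumptions, not the proposed implementation: `DkgStateMachine`, `Phase`, `DkgMessage`, and `SignerId` are hypothetical names, the share-exchange rounds are elided, and signers are identified by a plain integer instead of a public key.

```rust
use std::collections::BTreeSet;

/// Hypothetical signer identifier; real code would likely use a public key.
pub type SignerId = u32;

/// Simplified, hypothetical subset of the DKG messages.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DkgMessage {
    Begin,
    EndSuccess,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    /// Waiting for `dkg_begin_threshold` distinct DkgBegin messages.
    AwaitingBegin,
    /// Share exchange in progress (elided in this sketch).
    Running,
    /// Enough DkgEnd(Success) messages were seen; shares may be stored.
    Done,
}

pub struct DkgStateMachine {
    dkg_id: u64,
    dkg_begin_threshold: usize,
    phase: Phase,
    begins: BTreeSet<SignerId>,
    successes: BTreeSet<SignerId>,
}

impl DkgStateMachine {
    pub fn new(dkg_id: u64, dkg_begin_threshold: usize) -> Self {
        Self {
            dkg_id,
            dkg_begin_threshold,
            phase: Phase::AwaitingBegin,
            begins: BTreeSet::new(),
            successes: BTreeSet::new(),
        }
    }

    pub fn phase(&self) -> Phase {
        self.phase
    }

    /// Feeds a message to the machine. Returns `false` when the message is
    /// ignored because its `dkg_id` does not match the ongoing round.
    pub fn process(&mut self, dkg_id: u64, from: SignerId, msg: DkgMessage) -> bool {
        if dkg_id != self.dkg_id {
            return false;
        }
        // Sets deduplicate senders, so only distinct signers are counted.
        match msg {
            DkgMessage::Begin => self.begins.insert(from),
            DkgMessage::EndSuccess => self.successes.insert(from),
        };
        if self.phase == Phase::AwaitingBegin && self.begins.len() >= self.dkg_begin_threshold {
            self.phase = Phase::Running;
        }
        if self.phase == Phase::Running && self.successes.len() >= self.dkg_begin_threshold {
            self.phase = Phase::Done;
        }
        true
    }
}
```

Because messages are keyed by dkg_id rather than by the current chain tip, a new bitcoin block arriving mid-round does not invalidate anything, and a repeated DkgBegin from the same signer does not count twice toward the threshold.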

3.1.1 Design Diagram

Below is a protocol diagram for how DKG will progress.

3.1.2 Considerations & Alternatives

There are some alternatives to solving our two main issues. Let’s consider the issues and the alternatives.

1. There is a timing issue where the arrival of a bitcoin block during DKG leads to failure.
   1. For this we only need to do (9). If a bitcoin block arrives in the middle of DKG, it fails and then we simply try DKG again.
2. There is another timing issue where some signers might not receive information from the coordinator (the signer leading DKG) “in time”. This also leads to DKG failure.
   1. One alternative for this timing issue is to embed DKG shares in some of the coordinator’s messages. That was the work done in "Embed public and private shares into DkgBegin messages" (#927) and "Embed public and private shares into DkgPrivateBegin and DkgEndBegin" (Trust-Machines/wsts#98). The downside of this proposal is that it makes the system more reliant on the coordinator, which probably decreases the security of the system since the coordinator becomes more critical.

Closing Checklist

The design proposed in this issue is clearly documented in the description of this ticket.
Everyone necessary has reviewed the resolution and agrees with the proposal.
This ticket has or links all the information necessary to familiarize a contributor with the design decision, why it was made, and how it'll be included.


matteojug · 2025-02-05T14:09:10Z

Overall it makes sense, some questions:

> Wait for all signers to observe the same bitcoin block before beginning DKG.

How do we want to do this? Like, on the sending part it’s easy, it’s the receiving part that may have some gotchas.

> Require a configurable threshold of participants

Why do we want a different param for this, rather than reusing the DKG participation threshold?

> (5-7)

To confirm, the idea is that as soon as everybody agrees on a bitcoin block where we can run DKG, we set dkg_id, instantiate the state machine, and stick with it from that moment until DKG succeeds?
How do we handle the case where a signer agrees on a block, starts DKG, but then dies? Depending on the details it may not be able to "rejoin" the process, and if the others do not have a bail-out mechanism we would be stuck.

djordon · 2025-02-06T04:19:06Z

> How do we want to do this? Like, on the sending part it’s easy, it’s the receiving part that may have some gotchas.

One idea that came to mind was to have some new event loop that is always alive and listening for "chain-tip messages". When it receives such a message it compares the bitcoin block hash in the message to the one stored in the signer's state and keeps it in some map if it's okay to run DKG. We'd have to create a new type for sending signer chain-tip messages.

The event loop would need to listen to internal block observer messages and create a state machine with the right DKG ID if it's okay to run DKG. After the state machine is created, it would send the "chain-tip message" to its peers. When the event loop receives enough "chain-tip" messages, the state machine begins DKG. With the above sequence we know that each state machine is created before the first DKG message has been broadcast.
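
A rough sketch of the bookkeeping such an event loop could do is below. Everything here is an assumption for illustration: `ChainTipMessage`, `ChainTipTracker`, and their methods are hypothetical names, a set is used where the description above says "some map", and networking and state machine creation are left out.

```rust
use std::collections::BTreeSet;

/// Hypothetical signer identifier; real code would likely use a public key.
pub type SignerId = u32;
pub type BlockHash = [u8; 32];

/// Hypothetical "chain-tip message" broadcast by a signer once its own
/// DKG state machine has been created.
#[derive(Debug, Clone)]
pub struct ChainTipMessage {
    pub sender: SignerId,
    pub block_hash: BlockHash,
}

/// Tracks which peers have announced the same chain tip as the local signer.
pub struct ChainTipTracker {
    local_tip: BlockHash,
    dkg_allowed: bool,
    threshold: usize,
    seen: BTreeSet<SignerId>,
}

impl ChainTipTracker {
    pub fn new(local_tip: BlockHash, dkg_allowed: bool, threshold: usize) -> Self {
        Self { local_tip, dkg_allowed, threshold, seen: BTreeSet::new() }
    }

    /// Called when the block observer reports a new canonical chain tip.
    /// Announcements for the previous tip no longer count.
    pub fn on_new_tip(&mut self, tip: BlockHash, dkg_allowed: bool) {
        self.local_tip = tip;
        self.dkg_allowed = dkg_allowed;
        self.seen.clear();
    }

    /// Records a peer announcement. Returns `true` once enough distinct
    /// signers agree with the local tip, meaning DKG may begin.
    pub fn on_message(&mut self, msg: &ChainTipMessage) -> bool {
        if self.dkg_allowed && msg.block_hash == self.local_tip {
            self.seen.insert(msg.sender);
        }
        self.dkg_allowed && self.seen.len() >= self.threshold
    }
}
```

One simplification to note: this sketch drops announcements for a block the local signer has not observed yet. That is exactly the kind of receiving-side gotcha raised above, so a real implementation would probably need to buffer such messages for a short time.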

> Why do we want a different param for this, and not reuse the DKG participation threshold?

Yeah we probably want to use the same DKG participation threshold, at least to start.

> To confirm, the idea is that as soon as everybody agrees on a bitcoin block where we can run DKG, we set dkg_id, instantiate the state machine, and stick with it from that moment until DKG succeeds?
> How do we handle the case where a signer agrees on a block, starts DKG, but then dies? Depending on the details it may not be able to "rejoin" the process, and if the others do not have a bail-out mechanism we would be stuck.

Yeah, I've been assuming that DKG has a wall clock timeout like we currently have. If DKG times out, we destroy the state machine. If the timeout is sufficiently long, say 10 minutes, we don't have to worry too much about the timeout expiring in the middle of an almost-complete DKG where it finishes for some signers and not for others.
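
As a sketch of that bail-out, assuming a single optional state machine slot and a deadline fixed when the round starts (`DkgDeadline` and `reap_if_expired` are hypothetical names, and the caller passes in the current time so the logic is easy to test):

```rust
use std::time::{Duration, Instant};

/// Hypothetical wall clock deadline attached to the single DKG state machine.
pub struct DkgDeadline {
    started_at: Instant,
    timeout: Duration,
}

impl DkgDeadline {
    pub fn start(now: Instant, timeout: Duration) -> Self {
        Self { started_at: now, timeout }
    }

    /// True once at least `timeout` has elapsed since the round started.
    pub fn expired(&self, now: Instant) -> bool {
        now.saturating_duration_since(self.started_at) >= self.timeout
    }
}

/// Destroys the state machine held in `slot` if its deadline has passed.
/// Returns `true` when a state machine was dropped.
pub fn reap_if_expired<T>(slot: &mut Option<(T, DkgDeadline)>, now: Instant) -> bool {
    let expired = slot
        .as_ref()
        .map_or(false, |(_, deadline)| deadline.expired(now));
    if expired {
        // Dropping the machine lets a later block start a fresh DKG round.
        *slot = None;
    }
    expired
}
```

With a 600 second timeout, a check at 599 seconds leaves the slot intact and a check at 600 seconds clears it. Keeping the slot as an `Option` also lines up with allowing at most one DKG-only state machine.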


This was referenced Jan 30, 2025

feat: consensus on successful DKG prior to rotate-keys submission #1285

Open

[Feature]: Generate recoverable private shares during DKG #1303

Open
