| 1 | +--- |
| 2 | +NEP: 514 |
| 3 | +Title: Reducing the number of Block Producer Seats in `testnet` |
| 4 | +Authors: Nikolay Kurtov <[email protected]> |
| 5 | +Status: New |
| 6 | +DiscussionsTo: https://github.com/nearprotocol/neps/pull/514 |
| 7 | +Type: Protocol |
| 8 | +Version: 1.0.0 |
| 9 | +Created: 2023-10-25 |
| 10 | +LastUpdated: 2023-10-25 |
| 11 | +--- |
| 12 | + |
| 13 | + |
| 14 | +## Summary |
| 15 | + |
| 16 | +This proposal aims to adjust the number of block producer seats on `testnet` in |
| 17 | +order to ensure a positive number of chunk-only producers present in `testnet` |
| 18 | +at all times. |
| 19 | + |
| 20 | +## Motivation |
| 21 | + |
| 22 | +The problem is that important code paths are not exercised in `testnet`. This |
| 23 | +makes `mainnet` releases more risky than they have to be, and greatly slows |
| 24 | +down development of features related to chunk-only producers, such as State |
| 25 | +Sync. |
| 26 | + |
| 27 | +That is because `testnet` has fewer validating nodes than the number of block |
| 28 | +producer seats configured. |
| 29 | + |
| 30 | +The number of validating nodes on `testnet` is somewhere in the range of |
| 31 | +[26, 46], which means that all validating nodes are block producers and none of |
| 32 | +them are chunk-only producers (source: [Grafana](https://nearinc.grafana.net/goto/7Kh81P7IR?orgId=1)). |
| 33 | + |
| 34 | +`testnet` configuration is currently the following: |
| 35 | + |
| 36 | +* `"num_block_producer_seats": 100,` |
| 37 | +* `"num_block_producer_seats_per_shard": [ 100, 100, 100, 100 ],` |
| 38 | +* `"num_chunk_only_producer_seats": 200,` |
| 39 | + |
| 40 | +It's evident that the 100 block producer seats significantly outnumber the |
| 41 | +validating nodes in `testnet`. |
| 42 | + |
| 43 | +Alternative solutions to the problem stated above include the following: |
| 44 | + |
| 45 | +1. Encourage the community to run more `testnet` validating nodes |
| 46 | +1. Release owners or developers of features start a lot of validating nodes to |
| 47 | +   ensure `testnet` gets some chunk-only producing nodes. |
| 48 | +1. Exercise the unique code paths in a separate chain, à la `localnet`. |
| 49 | + |
| 50 | +Let's consider each of these options. |
| 51 | + |
| 52 | +### More community nodes |
| 53 | + |
| 54 | +This would be the ideal situation. More nodes joining will make |
| 55 | +`testnet` more similar to `mainnet`, which will have various positive effects |
| 56 | +for protocol developers and dApp developers. |
| 57 | + |
| 58 | +However, this option is expensive, because running a validating node costs |
| 59 | +money, and most community members can't afford to spend that much for |
| 60 | +the good of the network. |
| 61 | + |
| 62 | +### More protocol developer nodes |
| 63 | + |
| 64 | +While this option may seem viable, it poses significant financial challenges for |
| 65 | +protocol development. The associated computational expenses are exorbitantly |
| 66 | +high, making it an impractical choice for sustainable development. |
| 67 | + |
| 68 | +### Test in separate chains |
| 69 | + |
| 70 | +That is the current solution, and it has significant drawbacks: |
| 71 | + |
| 72 | +* Separate chains are short-lived and may miss events critical to the unique |
| 73 | + code paths of chunk-only producers |
| 74 | +* Separate chains need special attention to be configured in a way that |
| 75 | + accommodates chunk-only producers. Most test cases are not concerned with |
| 76 | + them, and don't exercise the unique code paths. |
| 77 | +* Separate chains can't process real transaction traffic. The traffic must |
| 78 | + either be synthetic or "inspired" by real traffic. |
| 79 | +* Each such test has a significant cost of running multiple nodes, in some |
| 80 | + cases, tens of nodes. |
| 81 | + |
| 82 | +## Specification |
| 83 | + |
| 84 | +The proposal suggests altering the number of block producer seats to ensure that |
| 85 | +a portion of the `testnet` validating nodes become chunk-only producers. |
| 86 | + |
| 87 | +The desired `testnet` configuration is the following: |
| 88 | + |
| 89 | +* `"num_block_producer_seats": 20,` |
| 90 | +* `"num_block_producer_seats_per_shard": [ 20, 20, 20, 20 ],` |
| 91 | +* `"num_chunk_only_producer_seats": 100,` |
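To illustrate the effect of the new seat counts, here is a minimal sketch in Python. It assumes a deliberately simplified selection model: the validators with the most stake fill block producer seats first, and the remaining validators fill chunk-only producer seats. The names `SeatConfig` and `split_roles` are hypothetical and are not part of nearcore. The real selection algorithm also accounts for stake thresholds and per-shard seats, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SeatConfig:
    """Hypothetical container for the two seat counts discussed in this NEP."""
    num_block_producer_seats: int
    num_chunk_only_producer_seats: int


def split_roles(num_validators: int, config: SeatConfig) -> Tuple[int, int, int]:
    """Return (block_producers, chunk_only_producers, left_out).

    Simplified assumption: block producer seats are filled first,
    then chunk-only producer seats, then the rest get no seat.
    """
    block_producers = min(num_validators, config.num_block_producer_seats)
    remaining = num_validators - block_producers
    chunk_only = min(remaining, config.num_chunk_only_producer_seats)
    return block_producers, chunk_only, remaining - chunk_only


# Seat counts taken from the Motivation and Specification sections.
CURRENT = SeatConfig(num_block_producer_seats=100, num_chunk_only_producer_seats=200)
PROPOSED = SeatConfig(num_block_producer_seats=20, num_chunk_only_producer_seats=100)

if __name__ == "__main__":
    # 26 and 46 are the bounds of the observed validator range on testnet.
    for n in (26, 46):
        print(n, "current:", split_roles(n, CURRENT), "proposed:", split_roles(n, PROPOSED))
```

Under this model, the current configuration yields no chunk-only producers anywhere in the observed range of 26 to 46 validating nodes, while the proposed configuration yields between 6 and 26.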
| 92 | + |
| 93 | +I suggest implementing the change for all networks that are not `mainnet` and |
| 94 | +have `use_production_config` in the genesis file. `use_production_config` is a |
| 95 | +sneaky parameter in `GenesisConfig` that lets protocol upgrades change a |
| 96 | +network's `GenesisConfig`. |
| 97 | + |
| 98 | +I don't have a solid argument for lowering the number of chunk-only producer |
| 99 | +seats, but it reflects the reality that we don't expect many nodes to join |
| 100 | +`testnet`. It also makes it easier to test the case of too many validating nodes |
| 101 | +willing to join a network. |
| 102 | + |
| 103 | +## Reference Implementation |
| 104 | + |
| 105 | +[#9563](https://github.com/near/nearcore/pull/9563) |
| 106 | + |
| 107 | +If `use_production_config`, check whether `chain_id` is eligible, then change |
| 108 | +the configuration as specified above. |
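The check described above can be sketched as follows. This is an illustrative Python model, not the Rust code from the linked pull request. The `GenesisConfig` dataclass here is a hypothetical stand-in holding only the fields named in this NEP, and `is_eligible` and `apply_nep514` are made-up helper names. The eligibility rule is assumed to be "any chain other than `mainnet`", following the Specification. Protocol version gating is omitted.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass
class GenesisConfig:
    """Hypothetical subset of the genesis configuration fields used in this NEP."""
    chain_id: str
    use_production_config: bool
    num_block_producer_seats: int
    num_block_producer_seats_per_shard: List[int]
    num_chunk_only_producer_seats: int


def is_eligible(chain_id: str) -> bool:
    # Assumption: every network except mainnet is eligible.
    return chain_id != "mainnet"


def apply_nep514(config: GenesisConfig) -> GenesisConfig:
    """Return a copy with the seat counts from the Specification, if applicable."""
    if not config.use_production_config or not is_eligible(config.chain_id):
        return config
    num_shards = len(config.num_block_producer_seats_per_shard)
    return replace(
        config,
        num_block_producer_seats=20,
        num_block_producer_seats_per_shard=[20] * num_shards,
        num_chunk_only_producer_seats=100,
    )


if __name__ == "__main__":
    testnet = GenesisConfig("testnet", True, 100, [100, 100, 100, 100], 200)
    print(apply_nep514(testnet))
```

Returning a modified copy rather than mutating the input keeps the original genesis values available, which matters if older `EpochInfo` values need to be regenerated.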
| 109 | + |
| 110 | +## Security Implications |
| 111 | + |
| 112 | +Block production in `testnet` becomes more centralized. This is not a new |
| 113 | +concern, as 50% of stake is already owned by nodes operated by the protocol |
| 114 | +developers. |
| 115 | + |
| 116 | +## Alternatives |
| 117 | + |
| 118 | +See above. |
| 119 | + |
| 120 | +## Future possibilities |
| 121 | + |
| 122 | +Adjust the number of block producer and chunk-only producer seats as the |
| 123 | +number of `testnet` validating nodes changes. |
| 124 | + |
| 125 | +## Consequences |
| 126 | + |
| 127 | +### Positive |
| 128 | + |
| 129 | +* Chunk-only production gets tested in `testnet` |
| 130 | +* Development of State Sync and other features related to chunk-only producers accelerates |
| 131 | + |
| 132 | +### Neutral |
| 133 | + |
| 134 | +* `testnet` block production becomes more centralized |
| 135 | + |
| 136 | +### Negative |
| 137 | + |
| 138 | +* Any? |
| 139 | + |
| 140 | +### Backwards Compatibility |
| 141 | + |
| 142 | +During the protocol upgrade, some nodes will become chunk-only producers. |
| 143 | + |
| 144 | +The piece of code that updates the `testnet` configuration values will need to |
| 145 | +be kept in the codebase in case somebody wants to generate `EpochInfo` compatible |
| 146 | +with the protocol versions containing the implementation of this NEP. |
| 147 | + |
| 148 | +## Changelog |
| 149 | + |
| 150 | +### 1.0.0 - Initial Version |
| 151 | + |
| 152 | +The Protocol Working Group members approved this NEP on Oct 26, 2023. |
| 153 | + |
| 154 | +[Zulip link](https://near.zulipchat.com/#narrow/stream/297873-pagoda.2Fnode/topic/How.20to.20test.20a.20chunk-only.20producer.20node.20in.20testnet.3F/near/396090090) |
| 155 | + |
| 156 | +#### Benefits |
| 157 | + |
| 158 | +See [Consequences](#consequences). |
| 159 | + |
| 160 | +#### Concerns |
| 161 | + |
| 162 | +See [Consequences](#consequences). |
| 163 | + |
| 164 | +## Copyright |
| 165 | + |
| 166 | +Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/). |