feat: improve type.hashTreeRoot() using batch #409

Open
wants to merge 9 commits into
base: master
from

Conversation

twoeths
Contributor

@twoeths twoeths commented Oct 15, 2024

Motivation

  • improve type.hashTreeRoot() by using batch hashing

Description

  • instead of calling getRoots() and computing the root from the result, this PR implements getChunkBytes()
    • the root is then computed with merkleizeInto(), which hashes in batches
    • the chunkBytesBuffer memory is reused within the type, so there are almost no intermediate Uint8Array allocations
  • new hashTreeRootInto() API, needed when consumers want to reuse their own memory allocation
  • use allocUnsafe() from as-sha256 where it makes sense
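
To illustrate the idea of computing a root inside a reused buffer and writing it into caller-owned memory, here is a minimal sketch. It is not the PR's implementation: the function names and signatures below are hypothetical, it assumes the chunk count is a power of two (so no zero-hash padding is needed), and it hashes one pair at a time with Node's `crypto` module rather than in batches.

```typescript
import {createHash} from "node:crypto";

// Hypothetical helper: SHA-256 over one 64-byte block (two adjacent 32-byte chunks).
function hash64(block: Uint8Array): Uint8Array {
  return createHash("sha256").update(block).digest();
}

// Hypothetical sketch of a merkleizeInto()-style routine.
// `blocks` is a scratch buffer holding `chunkCount` packed 32-byte chunks.
// It is overwritten level by level, so no per-level arrays are allocated.
// The 32-byte root is written to `output` starting at `offset`.
function merkleizeInto(
  blocks: Uint8Array,
  chunkCount: number,
  output: Uint8Array,
  offset: number
): void {
  // Assumption of this sketch: power-of-two chunk count, so every level pairs up evenly.
  if (chunkCount < 1 || (chunkCount & (chunkCount - 1)) !== 0) {
    throw new Error("chunkCount must be a power of two");
  }
  if (blocks.length < chunkCount * 32) {
    throw new Error("blocks buffer is too small");
  }
  if (output.length < offset + 32) {
    throw new Error("output buffer is too small");
  }
  for (let count = chunkCount; count > 1; count /= 2) {
    for (let i = 0; i < count / 2; i++) {
      // Read pair i (64 bytes), then store its parent at slot i (32 bytes).
      // Slot i always lies in a region that has already been consumed.
      const parent = hash64(blocks.subarray(i * 64, i * 64 + 64));
      blocks.set(parent, i * 32);
    }
  }
  output.set(blocks.subarray(0, 32), offset);
}
```

A caller that already owns a larger buffer can pass it as `output` with a suitable `offset`, which is the kind of reuse a hashTreeRootInto()-style API is meant to allow.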

cherry picked from #378


github-actions bot commented Oct 15, 2024

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 958f392 Previous: 089daed Ratio
digestTwoHashObjects 50023 times 48.489 ms/op 48.350 ms/op 1.00
digest2Bytes32 50023 times 55.152 ms/op 55.248 ms/op 1.00
digest 50023 times 53.842 ms/op 53.804 ms/op 1.00
input length 32 1.1910 us/op 1.2030 us/op 0.99
input length 64 1.3580 us/op 1.3190 us/op 1.03
input length 128 2.2830 us/op 2.2560 us/op 1.01
input length 256 3.3690 us/op 3.4580 us/op 0.97
input length 512 5.5830 us/op 5.5370 us/op 1.01
input length 1024 10.762 us/op 10.624 us/op 1.01
digest 1000000 times 862.11 ms/op 855.80 ms/op 1.01
hashObjectToByteArray 50023 times 1.2301 ms/op 1.2288 ms/op 1.00
byteArrayToHashObject 50023 times 1.6322 ms/op 1.6959 ms/op 0.96
digest64 200092 times 216.98 ms/op 214.43 ms/op 1.01
hash 200092 times using batchHash4UintArray64s 249.47 ms/op 237.83 ms/op 1.05
digest64HashObjects 200092 times 193.28 ms/op 193.96 ms/op 1.00
hash 200092 times using batchHash4HashObjectInputs 201.55 ms/op 200.26 ms/op 1.01
getGindicesAtDepth 3.5700 us/op 3.5040 us/op 1.02
iterateAtDepth 6.5360 us/op 6.6550 us/op 0.98
getGindexBits 373.00 ns/op 390.00 ns/op 0.96
gindexIterator 862.00 ns/op 867.00 ns/op 0.99
HashComputationLevel.push then loop 28.812 ms/op 27.633 ms/op 1.04
HashComputation[] push then loop 49.565 ms/op 51.581 ms/op 0.96
hash 2 Uint8Array 500000 times - hashtree 233.75 ms/op 221.38 ms/op 1.06
hashTwoObjects 500000 times - hashtree 231.82 ms/op 214.67 ms/op 1.08
executeHashComputations - hashtree 10.484 ms/op 10.224 ms/op 1.03
hash 2 Uint8Array 500000 times - as-sha256 562.61 ms/op 564.44 ms/op 1.00
hashTwoObjects 500000 times - as-sha256 511.86 ms/op 508.10 ms/op 1.01
executeHashComputations - as-sha256 46.463 ms/op 49.681 ms/op 0.94
hash 2 Uint8Array 500000 times - noble 1.2745 s/op 1.2783 s/op 1.00
hashTwoObjects 500000 times - noble 1.7178 s/op 1.7951 s/op 0.96
executeHashComputations - noble 36.780 ms/op 36.921 ms/op 1.00
getHashComputations 2.3089 ms/op 2.4606 ms/op 0.94
executeHashComputations 11.208 ms/op 12.179 ms/op 0.92
get root 16.785 ms/op 15.551 ms/op 1.08
getNodeH() x7812.5 avg hindex 12.350 us/op 12.213 us/op 1.01
getNodeH() x7812.5 index 0 7.4860 us/op 7.6500 us/op 0.98
getNodeH() x7812.5 index 7 7.5690 us/op 7.4900 us/op 1.01
getNodeH() x7812.5 index 7 with key array 6.3010 us/op 6.4290 us/op 0.98
new LeafNode() x7812.5 297.52 us/op 307.31 us/op 0.97
getHashComputations 250000 nodes 14.995 ms/op 14.982 ms/op 1.00
batchHash 250000 nodes 84.542 ms/op 87.008 ms/op 0.97
get root 250000 nodes 131.20 ms/op 119.91 ms/op 1.09
getHashComputations 500000 nodes 28.069 ms/op 28.527 ms/op 0.98
batchHash 500000 nodes 165.89 ms/op 159.26 ms/op 1.04
get root 500000 nodes 253.78 ms/op 237.70 ms/op 1.07
getHashComputations 1000000 nodes 66.089 ms/op 79.780 ms/op 0.83
batchHash 1000000 nodes 371.58 ms/op 357.80 ms/op 1.04
get root 1000000 nodes 502.70 ms/op 465.57 ms/op 1.08
multiproof - depth 15, 1 requested leaves 8.7910 us/op 7.8250 us/op 1.12
tree offset multiproof - depth 15, 1 requested leaves 19.323 us/op 17.647 us/op 1.09
compact multiproof - depth 15, 1 requested leaves 3.5710 us/op 2.9650 us/op 1.20
multiproof - depth 15, 2 requested leaves 12.791 us/op 11.635 us/op 1.10
tree offset multiproof - depth 15, 2 requested leaves 24.592 us/op 21.027 us/op 1.17
compact multiproof - depth 15, 2 requested leaves 3.6870 us/op 3.0480 us/op 1.21
multiproof - depth 15, 3 requested leaves 19.076 us/op 16.346 us/op 1.17
tree offset multiproof - depth 15, 3 requested leaves 32.037 us/op 27.046 us/op 1.18
compact multiproof - depth 15, 3 requested leaves 5.3750 us/op 3.7380 us/op 1.44
multiproof - depth 15, 4 requested leaves 24.498 us/op 21.927 us/op 1.12
tree offset multiproof - depth 15, 4 requested leaves 38.226 us/op 34.661 us/op 1.10
compact multiproof - depth 15, 4 requested leaves 4.5230 us/op 4.5710 us/op 0.99
packedRootsBytesToLeafNodes bytes 4000 offset 0 5.6120 us/op 5.7620 us/op 0.97
packedRootsBytesToLeafNodes bytes 4000 offset 1 5.9540 us/op 5.9510 us/op 1.00
packedRootsBytesToLeafNodes bytes 4000 offset 2 6.5170 us/op 6.2470 us/op 1.04
packedRootsBytesToLeafNodes bytes 4000 offset 3 6.3890 us/op 6.5230 us/op 0.98
subtreeFillToContents depth 40 count 250000 48.104 ms/op 51.450 ms/op 0.93
setRoot - gindexBitstring 20.757 ms/op 27.926 ms/op 0.74
setRoot - gindex 21.381 ms/op 27.176 ms/op 0.79
getRoot - gindexBitstring 2.5464 ms/op 2.8500 ms/op 0.89
getRoot - gindex 3.0315 ms/op 3.3816 ms/op 0.90
getHashObject then setHashObject 22.478 ms/op 24.123 ms/op 0.93
setNodeWithFn 19.750 ms/op 21.779 ms/op 0.91
getNodeAtDepth depth 0 x100000 279.48 us/op 280.07 us/op 1.00
setNodeAtDepth depth 0 x100000 2.5873 ms/op 2.6229 ms/op 0.99
getNodesAtDepth depth 0 x100000 313.13 us/op 312.08 us/op 1.00
setNodesAtDepth depth 0 x100000 890.87 us/op 767.40 us/op 1.16
getNodeAtDepth depth 1 x100000 342.48 us/op 343.08 us/op 1.00
setNodeAtDepth depth 1 x100000 8.8042 ms/op 9.7988 ms/op 0.90
getNodesAtDepth depth 1 x100000 437.65 us/op 436.13 us/op 1.00
setNodesAtDepth depth 1 x100000 6.6269 ms/op 7.5708 ms/op 0.88
getNodeAtDepth depth 2 x100000 785.58 us/op 742.14 us/op 1.06
setNodeAtDepth depth 2 x100000 15.573 ms/op 18.950 ms/op 0.82
getNodesAtDepth depth 2 x100000 19.294 ms/op 19.552 ms/op 0.99
setNodesAtDepth depth 2 x100000 21.899 ms/op 24.999 ms/op 0.88
tree.getNodesAtDepth - gindexes 8.6477 ms/op 9.8121 ms/op 0.88
tree.getNodesAtDepth - push all nodes 2.2861 ms/op 1.9273 ms/op 1.19
tree.getNodesAtDepth - navigation 311.23 us/op 311.56 us/op 1.00
tree.setNodesAtDepth - indexes 729.68 us/op 711.07 us/op 1.03
set at depth 8 778.00 ns/op 795.00 ns/op 0.98
set at depth 16 1.1080 us/op 1.2040 us/op 0.92
set at depth 32 1.9020 us/op 2.1040 us/op 0.90
iterateNodesAtDepth 8 256 14.096 us/op 14.214 us/op 0.99
getNodesAtDepth 8 256 3.6960 us/op 3.7510 us/op 0.99
iterateNodesAtDepth 16 65536 5.0277 ms/op 4.4323 ms/op 1.13
getNodesAtDepth 16 65536 2.0073 ms/op 1.6374 ms/op 1.23
iterateNodesAtDepth 32 250000 17.099 ms/op 16.168 ms/op 1.06
getNodesAtDepth 32 250000 5.5713 ms/op 4.5882 ms/op 1.21
iterateNodesAtDepth 40 250000 17.599 ms/op 15.618 ms/op 1.13
getNodesAtDepth 40 250000 5.1017 ms/op 4.4420 ms/op 1.15
250000 validators root getter 127.47 ms/op 116.60 ms/op 1.09
250000 validators batchHash() 118.62 ms/op 90.307 ms/op 1.31
250000 validators hashComputations 16.850 ms/op 14.267 ms/op 1.18
bitlist bytes to struct (120,90) 942.00 ns/op 1.0320 us/op 0.91
bitlist bytes to tree (120,90) 3.5840 us/op 3.8140 us/op 0.94
bitlist bytes to struct (2048,2048) 1.1220 us/op 1.1700 us/op 0.96
bitlist bytes to tree (2048,2048) 4.3710 us/op 4.5660 us/op 0.96
ByteListType - deserialize 8.3365 ms/op 8.3780 ms/op 1.00
BasicListType - deserialize 16.753 ms/op 18.521 ms/op 0.90
ByteListType - serialize 7.8058 ms/op 8.2961 ms/op 0.94
BasicListType - serialize 10.511 ms/op 11.094 ms/op 0.95
BasicListType - tree_convertToStruct 29.916 ms/op 30.512 ms/op 0.98
List[uint8, 68719476736] len 300000 ViewDU.getAll() + iterate 4.9752 ms/op 5.1116 ms/op 0.97
List[uint8, 68719476736] len 300000 ViewDU.get(i) 4.3798 ms/op 4.3844 ms/op 1.00
Array.push len 300000 empty Array - number 7.0564 ms/op 7.2562 ms/op 0.97
Array.set len 300000 from new Array - number 2.2187 ms/op 2.2219 ms/op 1.00
Array.set len 300000 - number 7.2161 ms/op 6.8323 ms/op 1.06
Uint8Array.set len 300000 495.73 us/op 512.42 us/op 0.97
Uint32Array.set len 300000 546.15 us/op 553.43 us/op 0.99
Container({a: uint8, b: uint8}) getViewDU x300000 25.182 ms/op 44.307 ms/op 0.57
ContainerNodeStruct({a: uint8, b: uint8}) getViewDU x300000 10.513 ms/op 12.499 ms/op 0.84
List(Container) len 300000 ViewDU.getAllReadonly() + iterate 215.46 ms/op 198.52 ms/op 1.09
List(Container) len 300000 ViewDU.getAllReadonlyValues() + iterate 267.73 ms/op 244.38 ms/op 1.10
List(Container) len 300000 ViewDU.get(i) 6.8988 ms/op 6.4275 ms/op 1.07
List(Container) len 300000 ViewDU.getReadonly(i) 6.7950 ms/op 6.5868 ms/op 1.03
List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonly() + iterate 39.806 ms/op 37.865 ms/op 1.05
List(ContainerNodeStruct) len 300000 ViewDU.getAllReadonlyValues() + iterate 4.8991 ms/op 5.0735 ms/op 0.97
List(ContainerNodeStruct) len 300000 ViewDU.get(i) 6.1218 ms/op 6.2390 ms/op 0.98
List(ContainerNodeStruct) len 300000 ViewDU.getReadonly(i) 6.0023 ms/op 6.2559 ms/op 0.96
Array.push len 300000 empty Array - object 6.1365 ms/op 6.4447 ms/op 0.95
Array.set len 300000 from new Array - object 2.1278 ms/op 2.0458 ms/op 1.04
Array.set len 300000 - object 5.9011 ms/op 6.7783 ms/op 0.87
cachePermanentRootStruct no cache 3.5330 us/op 5.7670 us/op 0.61
cachePermanentRootStruct with cache 184.00 ns/op 211.00 ns/op 0.87
epochParticipation len 250000 rws 7813 2.2923 ms/op 2.2967 ms/op 1.00
Deneb BeaconBlock.hashTreeRoot(), numTransaction=200 5.5391 ms/op
BeaconState ViewDU hashTreeRoot() vc=200000 116.03 ms/op 111.76 ms/op 1.04
BeaconState ViewDU recursive hash - commit step vc=200000 4.3905 ms/op 4.5206 ms/op 0.97
BeaconState ViewDU validator tree creation vc=10000 38.403 ms/op 38.036 ms/op 1.01
BeaconState ViewDU batchHashTreeRoot vc=200000 107.97 ms/op 102.55 ms/op 1.05
BeaconState ViewDU hashTreeRoot - commit step vc=200000 93.759 ms/op 88.798 ms/op 1.06
BeaconState ViewDU hashTreeRoot - hash step vc=200000 18.929 ms/op 17.716 ms/op 1.07
deserialize Attestation - tree 3.4640 us/op 3.7050 us/op 0.93
deserialize Attestation - struct 1.7920 us/op 1.9470 us/op 0.92
deserialize SignedAggregateAndProof - tree 4.7290 us/op 4.9660 us/op 0.95
deserialize SignedAggregateAndProof - struct 2.8870 us/op 3.1100 us/op 0.93
deserialize SyncCommitteeMessage - tree 1.3240 us/op 1.4330 us/op 0.92
deserialize SyncCommitteeMessage - struct 1.0410 us/op 1.1130 us/op 0.94
deserialize SignedContributionAndProof - tree 3.0690 us/op 2.9250 us/op 1.05
deserialize SignedContributionAndProof - struct 2.7420 us/op 2.3670 us/op 1.16
deserialize SignedBeaconBlock - tree 287.73 us/op 273.24 us/op 1.05
deserialize SignedBeaconBlock - struct 120.58 us/op 119.44 us/op 1.01
BeaconState vc 300000 - deserialize tree 660.15 ms/op 662.96 ms/op 1.00
BeaconState vc 300000 - serialize tree 181.12 ms/op 163.96 ms/op 1.10
BeaconState.historicalRoots vc 300000 - deserialize tree 950.00 ns/op 926.00 ns/op 1.03
BeaconState.historicalRoots vc 300000 - serialize tree 770.00 ns/op 674.00 ns/op 1.14
BeaconState.validators vc 300000 - deserialize tree 599.46 ms/op 611.21 ms/op 0.98
BeaconState.validators vc 300000 - serialize tree 116.61 ms/op 109.11 ms/op 1.07
BeaconState.balances vc 300000 - deserialize tree 26.495 ms/op 27.243 ms/op 0.97
BeaconState.balances vc 300000 - serialize tree 4.0852 ms/op 3.9178 ms/op 1.04
BeaconState.previousEpochParticipation vc 300000 - deserialize tree 871.13 us/op 909.53 us/op 0.96
BeaconState.previousEpochParticipation vc 300000 - serialize tree 338.19 us/op 337.72 us/op 1.00
BeaconState.currentEpochParticipation vc 300000 - deserialize tree 872.90 us/op 910.20 us/op 0.96
BeaconState.currentEpochParticipation vc 300000 - serialize tree 338.49 us/op 342.43 us/op 0.99
BeaconState.inactivityScores vc 300000 - deserialize tree 21.251 ms/op 26.116 ms/op 0.81
BeaconState.inactivityScores vc 300000 - serialize tree 4.0716 ms/op 4.2558 ms/op 0.96
hashTreeRoot Attestation - struct 12.688 us/op 18.002 us/op 0.70
hashTreeRoot Attestation - tree 9.7790 us/op 9.0700 us/op 1.08
hashTreeRoot SignedAggregateAndProof - struct 17.835 us/op 27.115 us/op 0.66
hashTreeRoot SignedAggregateAndProof - tree 14.718 us/op 13.542 us/op 1.09
hashTreeRoot SyncCommitteeMessage - struct 4.4310 us/op 6.9330 us/op 0.64
hashTreeRoot SyncCommitteeMessage - tree 4.0290 us/op 3.9680 us/op 1.02
hashTreeRoot SignedContributionAndProof - struct 10.318 us/op 16.259 us/op 0.63
hashTreeRoot SignedContributionAndProof - tree 10.140 us/op 9.1960 us/op 1.10
hashTreeRoot SignedBeaconBlock - struct 851.64 us/op 1.4076 ms/op 0.61
hashTreeRoot SignedBeaconBlock - tree 885.29 us/op 800.00 us/op 1.11
hashTreeRoot Validator - struct 4.8610 us/op 8.0410 us/op 0.60
hashTreeRoot Validator - tree 7.5600 us/op 6.8640 us/op 1.10
BeaconState vc 300000 - hashTreeRoot tree 2.4226 s/op 2.3003 s/op 1.05
BeaconState vc 300000 - batchHashTreeRoot tree 4.3922 s/op 4.1698 s/op 1.05
BeaconState.historicalRoots vc 300000 - hashTreeRoot tree 1.2160 us/op 998.00 ns/op 1.22
BeaconState.validators vc 300000 - hashTreeRoot tree 2.6453 s/op 2.4694 s/op 1.07
BeaconState.balances vc 300000 - hashTreeRoot tree 41.221 ms/op 34.638 ms/op 1.19
BeaconState.previousEpochParticipation vc 300000 - hashTreeRoot tree 4.7786 ms/op 4.2524 ms/op 1.12
BeaconState.currentEpochParticipation vc 300000 - hashTreeRoot tree 4.6876 ms/op 4.2329 ms/op 1.11
BeaconState.inactivityScores vc 300000 - hashTreeRoot tree 39.865 ms/op 34.725 ms/op 1.15
hash64 x18 9.5680 us/op 9.6140 us/op 1.00
hashTwoObjects x18 8.9700 us/op 8.0670 us/op 1.11
hash64 x1740 898.31 us/op 798.95 us/op 1.12
hashTwoObjects x1740 829.06 us/op 748.46 us/op 1.11
hash64 x2700000 1.4139 s/op 1.2288 s/op 1.15
hashTwoObjects x2700000 1.2882 s/op 1.1581 s/op 1.11
get_exitEpoch - ContainerType 413.00 ns/op 244.00 ns/op 1.69
get_exitEpoch - ContainerNodeStructType 432.00 ns/op 275.00 ns/op 1.57
set_exitEpoch - ContainerType 437.00 ns/op 276.00 ns/op 1.58
set_exitEpoch - ContainerNodeStructType 427.00 ns/op 262.00 ns/op 1.63
get_pubkey - ContainerType 1.4660 us/op 1.0490 us/op 1.40
get_pubkey - ContainerNodeStructType 410.00 ns/op 244.00 ns/op 1.68
hashTreeRoot - ContainerType 686.00 ns/op 378.00 ns/op 1.81
hashTreeRoot - ContainerNodeStructType 673.00 ns/op 349.00 ns/op 1.93
createProof - ContainerType 7.2120 us/op 3.6640 us/op 1.97
createProof - ContainerNodeStructType 35.922 us/op 20.192 us/op 1.78
serialize - ContainerType 2.2000 us/op 1.6440 us/op 1.34
serialize - ContainerNodeStructType 2.2140 us/op 1.3800 us/op 1.60
set_exitEpoch_and_hashTreeRoot - ContainerType 3.8280 us/op 2.5930 us/op 1.48
set_exitEpoch_and_hashTreeRoot - ContainerNodeStructType 10.522 us/op 7.3570 us/op 1.43
Array - for of 6.0930 us/op 5.6860 us/op 1.07
Array - for(;;) 6.1080 us/op 5.5880 us/op 1.09
basicListValue.readonlyValuesArray() 6.6152 ms/op 4.2120 ms/op 1.57
basicListValue.readonlyValuesArray() + loop all 5.6112 ms/op 4.3320 ms/op 1.30
compositeListValue.readonlyValuesArray() 35.068 ms/op 27.013 ms/op 1.30
compositeListValue.readonlyValuesArray() + loop all 31.357 ms/op 31.819 ms/op 0.99
Number64UintType - get balances list 4.5447 ms/op 4.3351 ms/op 1.05
Number64UintType - set balances list 10.221 ms/op 9.9302 ms/op 1.03
Number64UintType - get and increase 10 then set 41.024 ms/op 44.843 ms/op 0.91
Number64UintType - increase 10 using applyDelta 16.857 ms/op 16.475 ms/op 1.02
Number64UintType - increase 10 using applyDeltaInBatch 17.126 ms/op 16.751 ms/op 1.02
tree_newTreeFromUint64Deltas 20.947 ms/op 22.391 ms/op 0.94
unsafeUint8ArrayToTree 38.866 ms/op 38.551 ms/op 1.01
bitLength(50) 238.00 ns/op 250.00 ns/op 0.95
bitLengthStr(50) 240.00 ns/op 231.00 ns/op 1.04
bitLength(8000) 257.00 ns/op 223.00 ns/op 1.15
bitLengthStr(8000) 280.00 ns/op 258.00 ns/op 1.09
bitLength(250000) 249.00 ns/op 235.00 ns/op 1.06
bitLengthStr(250000) 331.00 ns/op 293.00 ns/op 1.13
merkleize 32 chunks 16.659 us/op
merkleizeBlocksBytes 32 chunks 3.7200 us/op
merkleizeBlockArray 32 chunks 6.7950 us/op
merkleize 128 chunks 66.537 us/op
merkleizeBlocksBytes 128 chunks 8.1710 us/op
merkleizeBlockArray 128 chunks 19.505 us/op
merkleize 512 chunks 266.89 us/op
merkleizeBlocksBytes 512 chunks 23.684 us/op
merkleizeBlockArray 512 chunks 67.083 us/op
merkleize 1024 chunks 538.02 us/op
merkleizeBlocksBytes 1024 chunks 43.430 us/op
merkleizeBlockArray 1024 chunks 128.44 us/op
floor - Math.floor (53) 1.2436 ns/op 1.2476 ns/op 1.00
floor - << 0 (53) 1.2429 ns/op 1.2435 ns/op 1.00
floor - Math.floor (512) 1.2424 ns/op 1.2442 ns/op 1.00
floor - << 0 (512) 1.2452 ns/op 1.2451 ns/op 1.00
fnIf(0) 1.5540 ns/op 1.5543 ns/op 1.00
fnSwitch(0) 2.1742 ns/op 2.1771 ns/op 1.00
fnObj(0) 1.5782 ns/op 1.5612 ns/op 1.01
fnArr(0) 1.5537 ns/op 1.5542 ns/op 1.00
fnIf(4) 2.1970 ns/op 2.1744 ns/op 1.01
fnSwitch(4) 2.1765 ns/op 2.1900 ns/op 0.99
fnObj(4) 1.5697 ns/op 1.5607 ns/op 1.01
fnArr(4) 1.5536 ns/op 1.5567 ns/op 1.00
fnIf(9) 3.1089 ns/op 3.1105 ns/op 1.00
fnSwitch(9) 2.2004 ns/op 2.1769 ns/op 1.01
fnObj(9) 1.5598 ns/op 1.5777 ns/op 0.99
fnArr(9) 1.5638 ns/op 1.5544 ns/op 1.01
Container {a,b,vec} - as struct x100000 124.56 us/op 124.57 us/op 1.00
Container {a,b,vec} - as tree x100000 902.04 us/op 559.89 us/op 1.61
Container {a,vec,b} - as struct x100000 155.55 us/op 156.41 us/op 0.99
Container {a,vec,b} - as tree x100000 560.18 us/op 560.39 us/op 1.00
get 2 props x1000000 - rawObject 318.50 us/op 311.32 us/op 1.02
get 2 props x1000000 - proxy 73.498 ms/op 74.039 ms/op 0.99
get 2 props x1000000 - customObj 311.42 us/op 311.95 us/op 1.00
Simple object binary -> struct 602.00 ns/op 947.00 ns/op 0.64
Simple object binary -> tree_backed 1.7690 us/op 2.4820 us/op 0.71
Simple object struct -> tree_backed 2.4040 us/op 2.8430 us/op 0.85
Simple object tree_backed -> struct 1.6490 us/op 2.4990 us/op 0.66
Simple object struct -> binary 1.0060 us/op 1.2110 us/op 0.83
Simple object tree_backed -> binary 1.3620 us/op 1.7360 us/op 0.78
aggregationBits binary -> struct 540.00 ns/op 678.00 ns/op 0.80
aggregationBits binary -> tree_backed 2.5480 us/op 2.7350 us/op 0.93
aggregationBits struct -> tree_backed 2.8190 us/op 3.1540 us/op 0.89
aggregationBits tree_backed -> struct 1.1860 us/op 1.2780 us/op 0.93
aggregationBits struct -> binary 715.00 ns/op 835.00 ns/op 0.86
aggregationBits tree_backed -> binary 886.00 ns/op 1.1010 us/op 0.80
List(uint8) 100000 binary -> struct 1.7035 ms/op 1.7192 ms/op 0.99
List(uint8) 100000 binary -> tree_backed 328.98 us/op 275.31 us/op 1.19
List(uint8) 100000 struct -> tree_backed 1.4472 ms/op 1.3975 ms/op 1.04
List(uint8) 100000 tree_backed -> struct 1.1765 ms/op 1.1459 ms/op 1.03
List(uint8) 100000 struct -> binary 1.0905 ms/op 1.1435 ms/op 0.95
List(uint8) 100000 tree_backed -> binary 115.88 us/op 111.57 us/op 1.04
List(uint64Number) 100000 binary -> struct 1.5243 ms/op 1.3440 ms/op 1.13
List(uint64Number) 100000 binary -> tree_backed 5.0719 ms/op 4.4901 ms/op 1.13
List(uint64Number) 100000 struct -> tree_backed 7.4616 ms/op 6.5066 ms/op 1.15
List(uint64Number) 100000 tree_backed -> struct 2.8613 ms/op 2.6367 ms/op 1.09
List(uint64Number) 100000 struct -> binary 1.5506 ms/op 1.6063 ms/op 0.97
List(uint64Number) 100000 tree_backed -> binary 1.6451 ms/op 1.2282 ms/op 1.34
List(Uint64Bigint) 100000 binary -> struct 4.3723 ms/op 4.4331 ms/op 0.99
List(Uint64Bigint) 100000 binary -> tree_backed 5.1731 ms/op 4.8654 ms/op 1.06
List(Uint64Bigint) 100000 struct -> tree_backed 8.0250 ms/op 7.9153 ms/op 1.01
List(Uint64Bigint) 100000 tree_backed -> struct 5.6914 ms/op 5.1174 ms/op 1.11
List(Uint64Bigint) 100000 struct -> binary 2.1096 ms/op 2.0749 ms/op 1.02
List(Uint64Bigint) 100000 tree_backed -> binary 1.7328 ms/op 1.4173 ms/op 1.22
Vector(Root) 100000 binary -> struct 39.109 ms/op 39.591 ms/op 0.99
Vector(Root) 100000 binary -> tree_backed 43.704 ms/op 43.458 ms/op 1.01
Vector(Root) 100000 struct -> tree_backed 55.091 ms/op 54.542 ms/op 1.01
Vector(Root) 100000 tree_backed -> struct 53.071 ms/op 52.560 ms/op 1.01
Vector(Root) 100000 struct -> binary 3.3867 ms/op 2.7957 ms/op 1.21
Vector(Root) 100000 tree_backed -> binary 6.6173 ms/op 6.2213 ms/op 1.06
List(Validator) 100000 binary -> struct 121.54 ms/op 107.68 ms/op 1.13
List(Validator) 100000 binary -> tree_backed 371.71 ms/op 376.74 ms/op 0.99
List(Validator) 100000 struct -> tree_backed 389.75 ms/op 397.50 ms/op 0.98
List(Validator) 100000 tree_backed -> struct 227.77 ms/op 225.28 ms/op 1.01
List(Validator) 100000 struct -> binary 29.793 ms/op 29.560 ms/op 1.01
List(Validator) 100000 tree_backed -> binary 118.27 ms/op 113.67 ms/op 1.04
List(Validator-NS) 100000 binary -> struct 118.37 ms/op 120.94 ms/op 0.98
List(Validator-NS) 100000 binary -> tree_backed 174.54 ms/op 181.62 ms/op 0.96
List(Validator-NS) 100000 struct -> tree_backed 219.87 ms/op 213.35 ms/op 1.03
List(Validator-NS) 100000 tree_backed -> struct 179.51 ms/op 170.59 ms/op 1.05
List(Validator-NS) 100000 struct -> binary 30.806 ms/op 28.769 ms/op 1.07
List(Validator-NS) 100000 tree_backed -> binary 38.031 ms/op 34.762 ms/op 1.09
get epochStatuses - MutableVector 96.341 us/op 115.09 us/op 0.84
get epochStatuses - ViewDU 204.28 us/op 212.71 us/op 0.96
set epochStatuses - ListTreeView 2.3612 ms/op 2.1502 ms/op 1.10
set epochStatuses - ListTreeView - set() 462.39 us/op 472.51 us/op 0.98
set epochStatuses - ListTreeView - commit() 847.61 us/op 830.74 us/op 1.02
bitstring 516.08 ns/op 514.25 ns/op 1.00
bit mask 13.366 ns/op 13.867 ns/op 0.96
struct - increase slot to 1000000 933.43 us/op 932.63 us/op 1.00
UintNumberType - increase slot to 1000000 28.404 ms/op 28.312 ms/op 1.00
UintBigintType - increase slot to 1000000 217.57 ms/op 172.44 ms/op 1.26
UintBigint8 x 100000 tree_deserialize 7.7126 ms/op 5.8397 ms/op 1.32
UintBigint8 x 100000 tree_serialize 1.1393 ms/op 1.1299 ms/op 1.01
UintBigint16 x 100000 tree_deserialize 7.3242 ms/op 5.5272 ms/op 1.33
UintBigint16 x 100000 tree_serialize 1.9874 ms/op 1.3855 ms/op 1.43
UintBigint32 x 100000 tree_deserialize 7.7263 ms/op 5.6201 ms/op 1.37
UintBigint32 x 100000 tree_serialize 2.4656 ms/op 1.8839 ms/op 1.31
UintBigint64 x 100000 tree_deserialize 8.0561 ms/op 6.0277 ms/op 1.34
UintBigint64 x 100000 tree_serialize 3.3957 ms/op 2.5647 ms/op 1.32
UintBigint8 x 100000 value_deserialize 436.85 us/op 435.37 us/op 1.00
UintBigint8 x 100000 value_serialize 1.2741 ms/op 789.63 us/op 1.61
UintBigint16 x 100000 value_deserialize 466.81 us/op 467.59 us/op 1.00
UintBigint16 x 100000 value_serialize 1.3815 ms/op 839.45 us/op 1.65
UintBigint32 x 100000 value_deserialize 499.38 us/op 497.67 us/op 1.00
UintBigint32 x 100000 value_serialize 1.4398 ms/op 869.33 us/op 1.66
UintBigint64 x 100000 value_deserialize 563.73 us/op 561.69 us/op 1.00
UintBigint64 x 100000 value_serialize 1.6443 ms/op 1.0797 ms/op 1.52
UintBigint8 x 100000 deserialize 4.0365 ms/op 3.6284 ms/op 1.11
UintBigint8 x 100000 serialize 1.8402 ms/op 1.8198 ms/op 1.01
UintBigint16 x 100000 deserialize 4.1477 ms/op 3.1410 ms/op 1.32
UintBigint16 x 100000 serialize 2.0062 ms/op 1.5117 ms/op 1.33
UintBigint32 x 100000 deserialize 4.2397 ms/op 3.1881 ms/op 1.33
UintBigint32 x 100000 serialize 3.0976 ms/op 2.8694 ms/op 1.08
UintBigint64 x 100000 deserialize 4.9399 ms/op 4.1369 ms/op 1.19
UintBigint64 x 100000 serialize 1.6433 ms/op 1.6342 ms/op 1.01
UintBigint128 x 100000 deserialize 5.9101 ms/op 5.3990 ms/op 1.09
UintBigint128 x 100000 serialize 15.579 ms/op 14.347 ms/op 1.09
UintBigint256 x 100000 deserialize 8.9218 ms/op 8.0526 ms/op 1.11
UintBigint256 x 100000 serialize 45.626 ms/op 43.221 ms/op 1.06
Slice from Uint8Array x25000 1.5491 ms/op 1.3258 ms/op 1.17
Slice from ArrayBuffer x25000 17.245 ms/op 16.751 ms/op 1.03
Slice from ArrayBuffer x25000 + new Uint8Array 19.540 ms/op 16.154 ms/op 1.21
Copy Uint8Array 100000 iterate 2.6830 ms/op 2.6558 ms/op 1.01
Copy Uint8Array 100000 slice 142.81 us/op 108.81 us/op 1.31
Copy Uint8Array 100000 Uint8Array.prototype.slice.call 142.97 us/op 110.74 us/op 1.29
Copy Buffer 100000 Uint8Array.prototype.slice.call 146.27 us/op 107.64 us/op 1.36
Copy Uint8Array 100000 slice + set 269.21 us/op 218.60 us/op 1.23
Copy Uint8Array 100000 subarray + set 139.15 us/op 106.86 us/op 1.30
Copy Uint8Array 100000 slice arrayBuffer 142.93 us/op 106.16 us/op 1.35
Uint64 deserialize 100000 - iterate Uint8Array 2.2722 ms/op 1.9314 ms/op 1.18
Uint64 deserialize 100000 - by Uint32A 2.2543 ms/op 1.9477 ms/op 1.16
Uint64 deserialize 100000 - by DataView.getUint32 x2 2.2378 ms/op 2.0050 ms/op 1.12
Uint64 deserialize 100000 - by DataView.getBigUint64 5.3180 ms/op 5.0519 ms/op 1.05
Uint64 deserialize 100000 - by byte 41.009 ms/op 40.703 ms/op 1.01

by benchmarkbot/action

@twoeths
Contributor Author

twoeths commented Oct 21, 2024

tested this on feat1, see ChainSafe/lodestar#7171 (comment)
ready to review

@twoeths twoeths marked this pull request as ready for review October 21, 2024 03:18
@twoeths twoeths requested a review from a team as a code owner October 21, 2024 03:18
@twoeths twoeths marked this pull request as draft October 31, 2024 03:39
@twoeths
Contributor Author

twoeths commented Oct 31, 2024

sha256 works in blocks of 64 bytes each, so it is perhaps more meaningful to reflect that in the chunkBytesBuffer variable

also, on holesky there are 1.7M validators. For every 8 deposits we have to reallocate the whole 1.7M * 8 bytes = 13.6MB for BeaconState.balances, which is not ideal. Instead we need to allocate just another 64 bytes in this case. This applies to all list types.

Update:

  • the hash of BeaconState.balances and everything else inside BeaconState goes through ViewDU, so this is not relevant there
  • it's more related to the hash of BeaconBlock, for example transaction data and lists like number of transactions
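
The arithmetic behind "every 8 deposits" can be checked with a small sketch. The constant and function names are hypothetical; the only facts taken from the comment are the 64-byte sha256 block, the 8-byte balance, and the 1.7M validator count.

```typescript
// Sizes taken from the comment above: sha256 consumes 64-byte blocks,
// and each balance is a uint64, i.e. 8 bytes.
const BLOCK_SIZE = 64;
const BALANCE_SIZE = 8;

// Hypothetical helper: byte length rounded up to a whole number of 64-byte blocks.
function blockAlignedByteLength(byteLen: number): number {
  return Math.ceil(byteLen / BLOCK_SIZE) * BLOCK_SIZE;
}

// Hypothetical helper: number of 64-byte blocks needed for `itemCount` fixed-size items.
function blocksNeeded(itemCount: number, itemSize: number): number {
  return Math.ceil((itemCount * itemSize) / BLOCK_SIZE);
}

const validators = 1700000;
// 1.7M * 8 bytes = 13,600,000 bytes, which is already block aligned.
const balancesBytes = blockAlignedByteLength(validators * BALANCE_SIZE);
// One more balance opens a new block; the next seven fit into that same block.
const blocksBefore = blocksNeeded(validators, BALANCE_SIZE);
const blocksAfterOne = blocksNeeded(validators + 1, BALANCE_SIZE);
const blocksAfterEight = blocksNeeded(validators + 8, BALANCE_SIZE);
const blocksAfterNine = blocksNeeded(validators + 9, BALANCE_SIZE);
```

Since 64 / 8 = 8 balances fit in one block, a block-aligned buffer only needs one extra 64-byte block per 8 appended items, rather than a fresh allocation of the whole list.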

@twoeths twoeths force-pushed the te/improve_type_dot_hash_tree_root branch 2 times, most recently from 9e32c5c to 7ed3ced Compare November 9, 2024 02:19
@philknows philknows modified the milestones: v1.0, v1.1 Jan 22, 2025
@nazarhussain nazarhussain force-pushed the te/improve_type_dot_hash_tree_root branch from d3821ee to cbb30a2 Compare February 12, 2025 12:18
@nazarhussain nazarhussain marked this pull request as ready for review February 12, 2025 14:05
Member

@matthewkeil matthewkeil left a comment

I left a few comments but I think this PR really needs to be reviewed by @wemeetagain


blocksBuffer.set(value);
const valueLen = value.length;
const blockByteLen = Math.ceil(valueLen / 64) * 64;
Member

It might be helpful to break out a helper function so it's clear why this is happening everywhere.

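// Number of 32-byte chunks needed to hold buf, rounding up.
export function getPaddedByte32Count(buf: ArrayBuffer): number {
    // ArrayBuffer exposes byteLength, not length
    return Math.ceil(buf.byteLength / 32);
}

// Number of 64-byte sha256 blocks needed to hold buf, rounding up.
export function getPaddedByte64Count(buf: ArrayBuffer): number {
    return Math.ceil(buf.byteLength / 64);
}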

@nazarhussain nazarhussain requested a review from matthewkeil March 4, 2025 12:21
@@ -30,11 +30,18 @@ export abstract class BasicType<V> extends Type<V> {
}

hashTreeRoot(value: V): Uint8Array {
// TODO: Optimize
const uint8Array = new Uint8Array(32);
// cannot use allocUnsafe() here because hashTreeRootInto() may not fill the whole 32 bytes
Member

is this comment correct? it looks like hashTreeRootInto is filling all 32 bytes

Contributor Author

I'd remove that comment @nazarhussain. I remember I tried allocUnsafe() at some point and got an "Invalid state root" issue. If we want to apply it, we need to audit carefully and apply it to other places as well.
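
As general background on why uninitialized memory needs care here: the SSZ root of a basic value is its serialization right-padded with zeros to 32 bytes, so a routine that writes only the value bytes into a non-zeroed buffer would leave stale data in the rest of the chunk. The sketch below uses hypothetical function names and a uint32 for simplicity. The thread does not establish that this was the cause of the "Invalid state root" error.

```typescript
// Hypothetical: write the 32-byte root of a uint32 into `output` at `offset`.
// The value occupies 4 little-endian bytes; the remaining 28 bytes must be zero.
function uint32RootInto(value: number, output: Uint8Array, offset: number): void {
  const view = new DataView(output.buffer, output.byteOffset, output.byteLength);
  view.setUint32(offset, value, true);
  // Explicit zero fill makes the result independent of the buffer's previous contents.
  output.fill(0, offset + 4, offset + 32);
}

// Hypothetical buggy variant: correct only if `output` was zero-initialized.
function uint32RootIntoNoFill(value: number, output: Uint8Array, offset: number): void {
  const view = new DataView(output.buffer, output.byteOffset, output.byteLength);
  view.setUint32(offset, value, true);
}

// Simulate uninitialized memory with a buffer full of stale 0xff bytes.
const dirtyA = new Uint8Array(32).fill(0xff);
const dirtyB = new Uint8Array(32).fill(0xff);
uint32RootInto(1, dirtyA, 0); // bytes 4..31 are now zero
uint32RootIntoNoFill(1, dirtyB, 0); // bytes 4..31 still hold 0xff
```

With `new Uint8Array(32)` both variants give the same result, because the buffer starts zeroed. Only a buffer with unspecified contents exposes the difference.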

5 participants