feat(cid): raw size helpers in Cid library#266
Conversation
|
This doesn't require an update of PDPVerifier but it will require a submodule update in filecoin-services so we could consider tagging it for that purpose alone without a deploy, or just submodule to an untagged commit, which would also be fine but slightly less clean. |
| // Creates a CommPv2 CID from a raw size and hash digest according to FRC-0069. | ||
| // leafCountToRawSize gives an upper bound on raw bytes from a leaf count. Use rawPieceSize | ||
| // when padding and height are available; this is for aggregate counts (e.g. data set total). | ||
| // Overestimates by up to ~31 bytes per piece because per-piece leaf counts round up to 32. |
There was a problem hiding this comment.
It can overestimate by up to 126 in case of a 1-byte piece (height=2, padding=126), it would have 4 leaves.
There was a problem hiding this comment.
Adding a 1-byte piece (which we've not enabled in the stack yet for various reasons, though PDPVerifier supports it): addOnePiece runs Cids.leafCount(126, 2) = 4 - ((128*126)/127 >> 5) = 4 - 3 = 1, so the contract records 1 data-bearing leaf. leafCountToRawSize(1) = (1*32*127)/128 = 31 against an actual 1 byte gives overestimate 30.
rawPieceSize stays exact (1*127 - 126 = 1) because FRC-0069 keeps the padding count in the CID, even though the data itself is zero-padded up to 127 for hashing.
I guess the 126 reading would land if you fed 2^height = 4 (the merkle tree's leaf count) in instead, but the contract has no tree-level accounting, only Cids.leafCount. We don't expect anyone to have reason to call with a notion of a tree. So per-piece overestimate is still bounded at 31 bytes, not 126.
Unless you're seeing something I'm missing here?
There was a problem hiding this comment.
updated the comment for a little more clarity
There was a problem hiding this comment.
Ahhh, I confused the leafcount of the CommP tree and leafcount stored in PDP/FWSS
Kubuxu
left a comment
There was a problem hiding this comment.
LGTM apart from the comment nit
…471) `updatePaymentRates` was multiplying leafCount by 32 to derive total bytes, which is the Fr32-expanded size, not the raw payload size. Switch to the new `Cids.leafCountToRawSize` to apply the 127/128 ratio, removing the ~0.787% per-byte overcharge. Also marks `getDataSetSizeInBytes` deprecated and to-be-removed; it now delegates to the same helper rather than computing Fr32 size. Drops the `BYTES_PER_LEAF` constant. Bumps lib/pdp submodule to FilOzone/pdp#266. Fixes: #451
Ref: FilOzone/filecoin-services#451