This repository has been archived by the owner on Mar 24, 2023. It is now read-only.

Add share namespace ID field #45

Merged: 3 commits, Jul 1, 2020

adlerjohn
Member

  • Add namespace ID field to share data structure.
  • Add note on serialization of shares to only use raw data.

Note: I intentionally did not fix the conflicting use of "message" for NMTs. That'll be handled in #41.

Rendered:

@adlerjohn added the `bug` and `documentation` labels Jul 1, 2020
@adlerjohn added this to the Pre-implementation draft milestone Jul 1, 2020
@adlerjohn requested a review from liamsi July 1, 2020 12:09
@adlerjohn self-assigned this Jul 1, 2020
| name          | type                         | description                |
| ------------- | ---------------------------- | -------------------------- |
| `namespaceID` | [NamespaceID](#type-aliases) | Namespace ID of the share. |
| `rawData`     | `byte[SHARE_SIZE]`           | Raw share data.            |
Member

An LL message is associated with a namespaceID. What is the namespace of a share, though? Can't a share contain the end of one message and the beginning of another one (with different NIDs)?

Member Author

No. From the rationale doc (https://github.com/lazyledger/lazyledger-specs/blob/adlerjohn-share_namespace/rationale/message_block_layout.md):

  1. Data must be ordered by namespace ID. This makes queries into a NMT commitment of that data more efficient.
  2. Since non-message data are not naturally intended for particular namespaces, we assign reserved namespaces for them. A range of namespaces is reserved for this purpose, starting from the lowest possible namespace ID.
  3. By construction, the above two rules mean that non-message data always precedes message data in the row-major matrix, even when considering single rows or columns.
  4. Data with different namespaces must not be in the same share. This might cause a small amount of wasted block space, but makes the NMT easier to reason about in general since leaves are guaranteed to belong to a single namespace.

By (4), each share has exactly one namespace ID associated with it. The reason shares have namespace IDs is that the NMT has shares as its leaves, and we need to know the NID of each leaf (i.e. share).
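Rules (1) and (4) can be sketched in Go as follows. This is only an illustration, not the spec's method: the `NamespaceIDSize` value, the `Share` struct, and the helpers `SplitIntoShares` and `SortedByNamespace` are all hypothetical names chosen for this example. Each message is split on its own, so no share ever mixes namespaces, at the cost of some zero padding in the last share.

```go
package main

import "bytes"

// NamespaceIDSize is an assumed size for illustration only.
const NamespaceIDSize = 8

// NamespaceID is a fixed-size namespace identifier (hypothetical type).
type NamespaceID [NamespaceIDSize]byte

// Share is a hypothetical in-memory share: one namespace ID plus raw data.
type Share struct {
	NamespaceID NamespaceID
	RawData     []byte
}

// SplitIntoShares splits the data of a single namespace into fixed-size
// shares. Because it is called once per namespace, a share never holds
// data from two namespaces (rule 4). The last share is zero-padded.
func SplitIntoShares(nid NamespaceID, data []byte, shareSize int) []Share {
	var shares []Share
	for start := 0; start < len(data); start += shareSize {
		end := start + shareSize
		if end > len(data) {
			end = len(data)
		}
		raw := make([]byte, shareSize) // zero-initialized, so the tail is padding
		copy(raw, data[start:end])
		shares = append(shares, Share{NamespaceID: nid, RawData: raw})
	}
	return shares
}

// SortedByNamespace reports whether shares are in non-decreasing
// namespace ID order (rule 1).
func SortedByNamespace(shares []Share) bool {
	for i := 1; i < len(shares); i++ {
		prev, cur := shares[i-1].NamespaceID, shares[i].NamespaceID
		if bytes.Compare(prev[:], cur[:]) > 0 {
			return false
		}
	}
	return true
}
```

Calling `SplitIntoShares` once per message, in ascending namespace order, and concatenating the results yields a share list that `SortedByNamespace` accepts.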

Member Author

To expand on that, the NMT doesn't care if it's hashing messages or transactions or evidence. It just sees shares. And each share needs to have a NID somehow.

Member

I see. And the order of account-balance-related txs is preserved because they end up in the right order in their reserved namespace?


An example layout of the share's internal bytes is shown below. For non-parity shares _with a reserved namespace_, the first `SHARE_RESERVED_BYTES` bytes (`*` in the figure) hold the starting byte of the first request in the share as an unsigned integer, or `0` if there is none. In this example, the first byte would be `80` (or `0x50` in hex). For shares _with a non-reserved namespace_ (and parity shares), the first `SHARE_RESERVED_BYTES` bytes have no special meaning and simply store data like all the other bytes in the share.

![fig: Reserved share.](./figures/share.svg)

For non-parity shares, if there is insufficient request data to fill the share, the remaining bytes are padded with `0`.
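A minimal Go sketch of this layout for a reserved-namespace share is shown here. The constant values (`ShareSize = 256`, `ShareReservedBytes = 1`) and the helper `BuildReservedShare` are assumptions made for illustration; the spec defines the real constants, and the caller is assumed to already know the offset of the first request.

```go
package main

import "errors"

const (
	ShareSize          = 256 // assumed value of SHARE_SIZE, for illustration
	ShareReservedBytes = 1   // assumed value of SHARE_RESERVED_BYTES
)

// BuildReservedShare lays out the raw bytes of a non-parity share with a
// reserved namespace. firstRequestStart is the byte offset within the share
// at which the first request begins, or 0 if no request starts in it.
// (Hypothetical helper, not taken from the spec.)
func BuildReservedShare(payload []byte, firstRequestStart uint8) ([ShareSize]byte, error) {
	var share [ShareSize]byte // zero-valued, so unused bytes are already padding
	if len(payload) > ShareSize-ShareReservedBytes {
		return share, errors.New("payload does not fit in one share")
	}
	share[0] = firstRequestStart              // the reserved byte ("*" in the figure)
	copy(share[ShareReservedBytes:], payload) // request data follows the reserved byte
	return share, nil
}
```

With a first-request offset of `80`, the first byte of the resulting share is `0x50`, and every byte after the payload stays `0`.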

### Share Serialization

Shares are [canonically serialized](#serialization) using only the raw share data, i.e. `serialize(share) = serialize(share.rawData)`.
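As a rough Go sketch of this rule: the namespace ID lives in the in-memory structure but is left out of the serialized bytes. The constant values, the `Share` struct, and the assumption that a fixed-size byte array serializes to its bytes with no length prefix are illustrative choices, not confirmed by the spec.

```go
package main

const (
	ShareSize       = 256 // assumed value of SHARE_SIZE, for illustration
	NamespaceIDSize = 8   // assumed namespace ID size, for illustration
)

// Share mirrors the two fields of the share data structure.
type Share struct {
	NamespaceID [NamespaceIDSize]byte
	RawData     [ShareSize]byte
}

// Serialize returns only the raw share data, so that
// serialize(share) = serialize(share.rawData). The namespace ID is
// deliberately omitted. Assumes a fixed-size byte array serializes to
// exactly its bytes.
func (s Share) Serialize() []byte {
	out := make([]byte, ShareSize)
	copy(out, s.RawData[:])
	return out
}
```

One consequence of this choice is that two shares with identical raw data but different namespace IDs serialize to the same bytes, so the namespace ID has to be supplied separately wherever it is needed.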
Member

@liamsi Jul 1, 2020

Just for my understanding: in the context of the NMT, this means that a leaf (`share.rawData`) will get hashed in the following way:
`nsid||nsid||hash(leafPrefix||leaf)`, where `leaf = rawData`
as opposed to:
`nsid||nsid||hash(leafPrefix||leaf)`, where `leaf = nid || rawData`

https://github.com/liamsi/nmt-experiments/blob/70b02631b0e79986ec266b514e6da607e5c8e9d7/trillian_based/hasher.go#L47-L48
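The two variants in the question can be compared with a small Go sketch. It assumes SHA-256 as the hash and `0x00` as the leaf prefix (the RFC 6962 convention); neither is confirmed in this thread, and `HashLeaf` is a hypothetical helper, not the function in the linked file.

```go
package main

import "crypto/sha256"

// leafPrefix is assumed to be 0x00, following the RFC 6962 convention.
const leafPrefix = 0x00

// HashLeaf computes nsid || nsid || hash(leafPrefix || leaf).
// What goes into leaf (rawData alone, or nid || rawData) is up to the caller.
func HashLeaf(nid, leaf []byte) []byte {
	preimage := make([]byte, 0, 1+len(leaf))
	preimage = append(preimage, leafPrefix)
	preimage = append(preimage, leaf...)
	digest := sha256.Sum256(preimage)

	out := make([]byte, 0, 2*len(nid)+len(digest))
	out = append(out, nid...) // minimum namespace ID of the leaf
	out = append(out, nid...) // maximum namespace ID of the leaf (same for a leaf)
	out = append(out, digest[:]...)
	return out
}
```

Calling `HashLeaf(nid, rawData)` gives the first variant, and `HashLeaf(nid, append(append([]byte{}, nid...), rawData...))` gives the second. Both start with the same `nsid||nsid` prefix and differ only in the trailing digest.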

Member

@liamsi left a comment

LGTM 👍

I need to revisit the block layout (rationale and this again)

@adlerjohn merged commit 184c30c into master Jul 1, 2020
@adlerjohn deleted the adlerjohn-share_namespace branch July 1, 2020 12:44