doc: revises the namespace specifications and includes some clarifications #2124
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,27 +4,40 @@ | |
|
||
## Abstract | ||
|
||
One of Celestia's core data structures is the namespace. When a user submits a `MsgPayForBlobs` transaction to Celestia they MUST associate each blob with exactly one namespace. After their transaction has been included in a block, the namespace enables users to take an interest in a subset of the blobs published to Celestia by allowing the user to query for blobs by namespace. | ||
One of Celestia's core data structures is the namespace. | ||
When a user submits a transaction encapsulating a `MsgPayForBlobs` message to Celestia, they MUST associate each blob with exactly one namespace. | ||
After their transaction has been included in a block, the namespace enables users to take an interest in a subset of the blobs published to Celestia by allowing the user to query for blobs by namespace. | ||
|
||
In order to enable efficient retrieval of blobs by namespace, Celestia makes use of a [Namespaced Merkle Tree](https://github.com/celestiaorg/nmt). See section 5.2 of the [LazyLedger whitepaper](https://arxiv.org/pdf/1905.09274.pdf) for more details. | ||
In order to enable efficient retrieval of blobs by namespace, Celestia makes use of a [Namespaced Merkle Tree](https://github.com/celestiaorg/nmt). | ||
See section 5.2 of the [LazyLedger whitepaper](https://arxiv.org/pdf/1905.09274.pdf) for more details. | ||
|
||
## Overview | ||
|
||
A namespace is composed of two fields: [version](#version) and [id](#id). A namespace is encoded as a byte slice with the version and id concatenated. Each [share](./shares.md) is prefixed with exactly one namespace. | ||
A namespace is composed of two fields: [version](#version) and [id](#id). | ||
A namespace is encoded as a byte slice with the version and id concatenated. | ||
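The encoding described above can be sketched as follows. This is a minimal illustration, not the code in `pkg/namespace`: the names `VersionSize`, `IDSize`, `NamespaceSize`, and `Encode` are hypothetical, and the sizes come from the Version and ID sections of this document (1 version byte, 28 id bytes).

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// Sizes taken from the spec text: a 1-byte version and a 28-byte id.
// The constant and function names are illustrative assumptions.
const (
	VersionSize   = 1
	IDSize        = 28
	NamespaceSize = VersionSize + IDSize
)

// Encode concatenates the version byte and the id into one byte slice.
// It rejects ids that are not exactly IDSize bytes long.
func Encode(version uint8, id []byte) ([]byte, error) {
	if len(id) != IDSize {
		return nil, fmt.Errorf("id must be %d bytes, got %d", IDSize, len(id))
	}
	out := make([]byte, 0, NamespaceSize)
	out = append(out, version)
	out = append(out, id...)
	return out, nil
}

func main() {
	// Build an id whose last byte is 0x01 and whose other bytes are zero.
	id := make([]byte, IDSize)
	id[IDSize-1] = 0x01

	ns, err := Encode(0, id)
	if err != nil {
		panic(err)
	}
	fmt.Printf("0x%s (%d bytes)\n", hex.EncodeToString(ns), len(ns))
}
```

The version always occupies the first byte of the encoded slice, so a reader can determine the format of the remaining bytes before interpreting them.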
|
||
![namespace](./figures/namespace.svg) | ||
|
||
### Version | ||
|
||
The namespace version is an 8-bit unsigned integer that indicates the version of the namespace. The version is used to determine the format of the namespace. The only supported user-specifiable namespace version is `0`. The version is encoded as a single byte. | ||
The namespace version is an 8-bit unsigned integer that indicates the version of the namespace. | ||
The version is used to determine the format of the namespace and | ||
is encoded as a single byte. | ||
A new namespace version MUST be introduced if the namespace format changes in a backwards incompatible way. | ||
|
||
Note: The `PARITY_SHARE_NAMESPACE` uses the namespace version `255` so that it can be ignored via the `IgnoreMaxNamespace` feature from [nmt](https://github.com/celestiaorg/nmt). The `TAIL_PADDING_NAMESPACE` uses the namespace version `255` so that it remains ordered after all blob namespaces even in the case a new namespace version is introduced. | ||
Below we explain the supported user-specifiable namespace versions. | ||
Note, however, that Celestia MAY utilize other namespace versions for internal use. | ||
For more details, see the [Reserved Namespaces](#reserved-namespaces) section. | ||
|
||
A namespace with version `0` must contain an id with a prefix of 18 leading `0` bytes. The remaining 10 bytes of the id are user-specified. | ||
#### Version 0 | ||
|
||
The only supported user-specifiable namespace version is `0`. | ||
A namespace with version `0` MUST contain an id with a prefix of 18 leading `0` bytes. | ||
The remaining 10 bytes of the id are user-specified. | ||
Below, we provide examples of valid and invalid encoded user-supplied namespaces with version `0`. | ||
|
||
```go | ||
// Valid encoded namespaces | ||
0x0000000000000000000000000000000000000000000000000000000001 // transaction namespace | ||
|
||
0x0000000000000000000000000000000000000001010101010101010101 // valid blob namespace | ||
0x0000000000000000000000000000000000000011111111111111111111 // valid blob namespace | ||
|
||
|
@@ -34,14 +47,26 @@ A namespace with version `0` must contain an id with a prefix of 18 leading `0` | |
0x1111111111111111111111111111111111111111111111111111111111 // invalid because it does not have version 0 | ||
``` | ||
|
||
A new namespace version MUST be introduced if the namespace format changes in a backwards incompatible way (i.e. the number of leading `0` bytes in the id prefix is reduced). | ||
Any change in the number of leading `0` bytes in the id of a namespace with version `0` is considered a backwards incompatible change and MUST be introduced as a new namespace version. | ||
|
||
### ID | ||
|
||
The namespace ID is a 28 byte identifier that uniquely identifies a namespace. The ID is encoded as a byte slice of length 28. | ||
The namespace ID is a 28 byte identifier that uniquely identifies a namespace. | ||
The ID is encoded as a byte slice of length 28. | ||
<!-- It may be useful to indicate the endianness of the encoding. --> | ||
[question] Is endianness applicable for a byte slice that doesn't inherently represent some value?

Can you provide further clarification on this specific section?

AFAIK https://en.wikipedia.org/wiki/Endianness is important to clarify for types like a

Exactly, as soon as we want to read and use that byte slice, the endianness becomes important. Imagine a namespace query whose namespace is encoded in little-endian order (in its Protobuf format) while the receiver expects namespaces in big-endian order: the query won't be resolved correctly, because no entry in the tree will match that namespace. |
||
|
||
## Reserved Namespaces | ||
|
||
Celestia reserves certain namespaces with specific meanings. | ||
Celestia makes use of the reserved namespaces to properly organize and order transactions and blobs inside the [data square](./data_square_layout.md). | ||
Applications MUST NOT use these reserved namespaces for their blob data. | ||
|
||
Below is a list of reserved namespaces, along with a brief description of each. | ||
In the table, you will notice that the `PARITY_SHARE_NAMESPACE` and `TAIL_PADDING_NAMESPACE` utilize the namespace version `255`, which differs from the supported user-specified versions. | ||
The reason for employing version `255` for the `PARITY_SHARE_NAMESPACE` is to enable more efficient proof generation within the context of [nmt](https://github.com/celestiaorg/nmt), where it is used in conjunction with the `IgnoreMaxNamespace` feature. | ||
|
||
Similarly, the `TAIL_PADDING_NAMESPACE` utilizes the namespace version `255` to ensure that padding shares are always properly ordered and placed at the end of the Celestia data square even if a new namespace version is introduced. | ||
For additional information on the significance and application of the reserved namespaces, please refer to the [Data Square Layout](./data_square_layout.md) specifications. | ||
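The ordering argument for version `255` can be illustrated with a short sketch. It assumes that namespaces are ordered by lexicographic comparison of their encoded bytes, which this section does not state explicitly, and it uses made-up ids rather than the real reserved values. `newNamespace` and `Less` are hypothetical helpers.

```go
package main

import (
	"bytes"
	"fmt"
)

const namespaceSize = 29 // 1 version byte + 28 id bytes

// newNamespace builds an encoded namespace with the given version and an id
// in which every byte equals idFill. The ids are illustrative only.
func newNamespace(version, idFill byte) []byte {
	ns := bytes.Repeat([]byte{idFill}, namespaceSize)
	ns[0] = version
	return ns
}

// Less reports whether a orders before b under lexicographic byte comparison
// (assumed here to be the ordering applied to namespaces).
func Less(a, b []byte) bool {
	return bytes.Compare(a, b) < 0
}

func main() {
	// The largest possible id under a hypothetical future version 1.
	largestV1 := newNamespace(1, 0xFF)
	// The smallest possible id under version 255.
	smallestV255 := newNamespace(255, 0x00)

	fmt.Println("v1 orders before v255:", Less(largestV1, smallestV255))
}
```

Because the version is the first byte of the encoding, a namespace with version `255` orders after every namespace with a smaller version, whatever the id contains.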
|
||
| name | type | value | description | | ||
|-------------------------------------|-------------|----------------------------------------------------------------|------------------------------------------------------------------------------------------------------| | ||
| `TRANSACTION_NAMESPACE` | `Namespace` | `0x0000000000000000000000000000000000000000000000000000000001` | Transactions: requests that modify the state. | | ||
|
@@ -54,11 +79,20 @@ The namespace ID is a 28 byte identifier that uniquely identifies a namespace. T | |
|
||
## Assumptions and Considerations | ||
|
||
Applications MUST refrain from using the [reserved namespaces](#reserved-namespaces) for their blob data. | ||
|
||
## Implementation | ||
|
||
See [pkg/namespace](../../../pkg/namespace). | ||
|
||
## Protobuf Definition | ||
|
||
<!-- TODO: Add protobuf definition for namespace --> | ||
Comment on lines +88 to +90

Does namespace actually have protobuf definitions?

Not sure either; it is a placeholder in case we have, or want to have, a protobuf definition. Please see #2128.

Personally I don't see the use. I think it's adequate to have it represented as an array of bytes. |
||
|
||
## References | ||
|
||
1. [ADR-014](../../../docs/architecture/adr-014-versioned-namespaces.md) | ||
1. [ADR-015](../../../docs/architecture/adr-015-namespace-id-size.md) | ||
1. [Namespaced Merkle Tree](https://github.com/celestiaorg/nmt) | ||
1. [LazyLedger whitepaper](https://arxiv.org/pdf/1905.09274.pdf) | ||
1. [Data Square Layout](./data_square_layout.md) |
[question] Semantic line breaks are new for me, so I don't understand why a line break was introduced in the middle of this sentence.
Was it added to conform to maximum line length character requirements?

The line break is added here because the second part, "is encoded as a single byte," conveys a separate and independent message from the first part. By moving it to the next line, we ensure clarity and emphasize the distinction between the two ideas.

Thanks for clarifying! TBH it's not immediately obvious to me when to introduce a semantic line break, but I think it's safe to keep one here :)
I guess we could also rewrite this as two sentences:

As a general guideline, it is usually recommended to add a line break after each ending period. However, there are other instances where we may want to break a line in the middle. In such cases, a helpful mental model is to view the lines as commit messages. Similar to keeping each commit short and self-contained, we can break the semantically independent messages of a line into separate lines.
Yes, your suggested approach also works.