Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

namespace: PrefixedData/PrefixedData8 are easy to misuse #71

Open
elias-orijtech opened this issue Oct 7, 2022 · 1 comment
Open

namespace: PrefixedData/PrefixedData8 are easy to misuse #71

elias-orijtech opened this issue Oct 7, 2022 · 1 comment
Labels
API enhancement New feature or request

Comments

@elias-orijtech
Copy link
Contributor

The PrefixedData and PrefixedData8 are both byte slices:

// PrefixedData simply represents a slice of bytes which consists of
// a namespace.ID and raw data.
// The user has to guarantee that the bytes are valid namespace prefixed data.
// Go's type system does not allow enforcing the structure we want:
// [namespaceID, rawData ...], especially as this type does not expect any
// particular size for the namespace.
type PrefixedData []byte

// PrefixedData8 like PrefixedData is just a slice of bytes.
// It assumes that the slice it represents is at least 8 bytes.
// This assumption is not enforced by the type system though.
type PrefixedData8 []byte

The definitions are problematic, because (1) casting byte slices may violate the type invariants (as pointed out in the comments) (2) they're mutable (3) PrefixedData has no way to indicate the namespace identifier length. For an example where mutability is a problem, see

nmt/nmt.go

Lines 253 to 257 in 6274243

// Push adds data with the corresponding namespace ID to the tree.
// Returns an error if the namespace ID size of the input
// does not match the tree's NamespaceSize() or the leaves are not pushed in
// order (i.e. lexicographically sorted by namespace ID).
func (n *NamespacedMerkleTree) Push(namespacedData namespace.PrefixedData) error {

where it is not clear to the caller that the parameter namespacedData is retained by the Push method.

The missing namespace length is problematic for validateAndExtractNamespace,

nmt/nmt.go

Lines 336 to 340 in 6274243

func (n *NamespacedMerkleTree) validateAndExtractNamespace(ndata namespace.PrefixedData) (namespace.ID, error) {
nidSize := int(n.NamespaceSize())
if len(ndata) < nidSize {
return nil, fmt.Errorf("%w: got: %v, want >= %v", ErrMismatchedNamespaceSize, len(ndata), nidSize)
}

because it can only validate that the namespace+data total length is at least as long as the expected namespace length.

Perhaps the definitions of PrefixedData and PrefixedData8 are caused by wanting to support zero-copy and zero-allocation use, where the user generates PrefixedData values by slicing existing byte slices. If so, the behaviour of Push should document that argument values are retained.

Otherwise, I recommend changing the definition to hide the internal structure, and add validating constructors. There are several possibilites:

type PrefixedData struct {
    // namespace and data can be sliced from the same
    // underlying []byte allocation.
    namespace []byte
    data []byte
}

func NewPrefixedData(namespace, data []byte) PrefixedData {
    bytes := make([]byte, len(namespace)+len(data))
    d := PrefixedData{
        namespace: bytes[:len(namespace)],
        data: bytes[len(namespace)],
    }
    copy(d.namespace, namespace)
    copy(d.data, data)
    return d
}

or equivalently

type PrefixedData struct {
    namespaceAndData []byte
    namespaceLen int
}

@odeke-em

@liamsi liamsi added enhancement New feature or request API labels Dec 16, 2022
@liamsi
Copy link
Member

liamsi commented Dec 16, 2022

I think both options existed in the past. See:

nmt/namespace/data.go

Lines 3 to 6 in dd6191c

type PrefixedData struct {
namespaceLen int
prefixedData []byte
}

(I think I've never pushed the other version). I am not sure what the rationale was to make it just byte aliases. Probably because of the way it is consumed (so we do not have to convert back and forth between types maybe). Whoever tackles this should also check in core/app/node to see how this is used proper.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants