Add digestable witnesses of binary codecs #34

craigfe · 2021-01-06T15:15:44Z

Fix #32.

Not to be merged until we have a sense of how this is used in Irmin – which won't be until after the 2.3.0 release.

Ngoguey42 · 2021-02-02T13:06:46Z

src/repr/type_binary.ml

+        (fun c acc ->
+          let* acc = acc in
+          match c with
+          | C0 _ -> return (Empty :: acc)


If I am not mistaken, this means that the order of the elements in a Repr.enum is not reflected in a shape.

In general, a shape check cannot detect certain changes in the order of the cases and fields. This seems to be a problem.

The problem with trying to distinguish those cases is that they're not actually distinguished in the underlying binary codec, so this becomes a bit of a slippery slope of trying to forbid certain "unwanted" but valid transformations on types. For instance, re-ordering a pair of fields of the same type is equivalent to renaming each one individually, so forbidding the former requires forbidding the latter (in order to keep transitivity on the shape equivalence relation). I can't think of a good way around that problem.

The current approach tries to stay away from such questions by only capturing information contained in the types themselves. If two fields of the same shape can't be permuted, the answer is probably to give them different shapes (by versioning their semantic interpretation with Custom).
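The argument above can be illustrated with a toy model. This is only a sketch: the `shape` type, the `Custom` tag format, and the field names below are hypothetical and are not Repr's actual `Shape.t`. It assumes a shape records only what the binary codec depends on, so field names are dropped.

```ocaml
(* Toy model, NOT Repr's real Shape.t: only codec-relevant structure is kept. *)
type shape =
  | Int
  | String
  | Record of shape list (* field shapes in declaration order, names dropped *)
  | Custom of string * shape (* hypothetical version tag wrapping a shape *)

(* Hypothetical field lists given as (name, shape) pairs. *)
let shape_of_fields fields = Record (List.map snd fields)

let original = [ ("width", Int); ("height", Int); ("label", String) ]

(* Swapping two fields of the same shape is the same as renaming each one,
   so the resulting shape is unchanged. *)
let reordered = [ ("height", Int); ("width", Int); ("label", String) ]

(* Moving a field past one of a different shape does change the shape. *)
let moved = [ ("label", String); ("width", Int); ("height", Int) ]

(* Tagging each field's interpretation makes the permutation visible. *)
let tagged =
  Record [ Custom ("width-v1", Int); Custom ("height-v1", Int); String ]

let tagged_swapped =
  Record [ Custom ("height-v1", Int); Custom ("width-v1", Int); String ]

let () =
  assert (shape_of_fields original = shape_of_fields reordered);
  assert (shape_of_fields original <> shape_of_fields moved);
  assert (tagged <> tagged_swapped)
```

In this model the undetected permutation is a direct consequence of dropping names, which is also what makes harmless renames compatible.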

I was expecting the equivalence of shapes to imply the equivalence of the whole encode_bin/decode_bin process. I understand the subtle difference with the equivalence of codecs. I don't see a solution to this problem either.

Fortunately, such a change in the order of cases/fields should be fairly easy to detect during a code review.

Ngoguey42 · 2021-02-02T13:19:13Z

src/repr/type_binary.mli

+      decoders defined here. Shapes are represented canonically such that
+      equality of shapes of two types [t1] and [t2] implies equivalence of the
+      binary codecs derived from [t1] and [t2].


As the Shape module is disjoint from the Encode and Decode modules, there is no guarantee of this equivalence. There may be accidental discrepancies between the shape and the codecs.

One solution to avoid this problem would be to derive codecs from shapes.

Agreed. I'd like to do that, but it has a problem: deriving codecs from shapes requires Shape.t to be a GADT w/ witnesses for the algebraic cases – and we don't support type representations for GADTs. This would force us to write out comparisons / pretty-printing / serialisers for Shape.t by hand. Those functions would also have to unfold recursive loops (e.g. with the Unrolling.t monad), so this starts to look quite a lot like defining a third intermediate typed AST that sits between Type.t and (the current) Shape.t – I'm not sure it's worth the effort.

Maybe if shape was the core of Type.t it would be less of a problem to implement, but it might not be worth the effort indeed.
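For readers unfamiliar with the GADT point, here is a minimal sketch of what "deriving codecs from shapes" could look like. The `shape` GADT, `encode`, and `to_string` are all hypothetical names, and the fixed-width integer encoding is an arbitrary choice for the example, not Repr's wire format.

```ocaml
(* Hypothetical typed shape: the type index is the witness that lets an
   encoder be derived from the shape alone. *)
type _ shape =
  | Int : int shape
  | String : string shape
  | Pair : 'a shape * 'b shape -> ('a * 'b) shape

(* Derive an encoder by recursion on the shape. The locally abstract type
   [a] is refined in each branch, which is what requires a GADT. *)
let rec encode : type a. a shape -> Buffer.t -> a -> unit =
 fun shape buf v ->
  match shape with
  | Int -> Buffer.add_int64_be buf (Int64.of_int v)
  | String ->
      (* Length prefix followed by the raw bytes. *)
      Buffer.add_int64_be buf (Int64.of_int (String.length v));
      Buffer.add_string buf v
  | Pair (a, b) ->
      let x, y = v in
      encode a buf x;
      encode b buf y

let to_string shape v =
  let buf = Buffer.create 16 in
  encode shape buf v;
  Buffer.contents buf
```

An untyped shape (like the one used for digests) cannot drive `encode` this way, because nothing ties the shape to the OCaml type of the value being encoded.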

Ngoguey42 · 2021-02-02T13:38:54Z

src/repr/type_binary.ml

+  (** Deriving shapes from type reps is straightforward, but requires some care
+      with recursive types. We convert recursive loops to use De Bruijn indexing
+      by unfolding each one with a fresh placeholder and tracking recursion
+      depth so that we can replace the placeholders with the right index.


I don't understand all the magic yet, but the code is interesting to read!

I'll try and make it clearer on a subsequent pass 🙂

Thank you for all your clarifications!
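As a rough illustration of the De Bruijn idea in the doc comment, here is a small sketch. It is not the PR's implementation: it uses a name environment instead of fresh placeholders with depth tracking, and the `ty` and `shape` types are hypothetical stand-ins for Repr's type representations and shapes.

```ocaml
(* Toy source representation with named recursion binders. *)
type ty =
  | Int
  | Pair of ty * ty
  | Option of ty
  | Mu of string * ty (* recursive type binding a name *)
  | Var of string (* reference to an enclosing binder *)

(* Toy shape using De Bruijn indices: S_var 0 is the innermost binder. *)
type shape =
  | S_int
  | S_pair of shape * shape
  | S_option of shape
  | S_mu of shape
  | S_var of int

let to_shape ty =
  (* [env] lists the binder names in scope, innermost first. *)
  let rec go env = function
    | Int -> S_int
    | Pair (a, b) -> S_pair (go env a, go env b)
    | Option t -> S_option (go env t)
    | Mu (name, body) -> S_mu (go (name :: env) body)
    | Var name ->
        (* The index is the number of binders crossed to reach [name]. *)
        let rec index i = function
          | [] -> invalid_arg ("unbound recursion variable: " ^ name)
          | n :: _ when String.equal n name -> S_var i
          | _ :: rest -> index (i + 1) rest
        in
        index 0 env
  in
  go [] ty

(* A list of ints, written with an arbitrary binder name. *)
let int_list name = Mu (name, Option (Pair (Int, Var name)))
```

Because indices replace names, `int_list "a"` and `int_list "b"` map to the same shape, which is the canonical-form property that shape equality relies on.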

craigfe added 6 commits January 6, 2021 16:13

Initial implementation of digestable witnesses for binary codecs

70fef0a

extract type combinators to lower-level compilation unit

4aaccba

binary-shape: use type combinators for shape type

ff38545

binary-shape: use monadic traversal to track De Bruijn indices

3b4e29a

binary-shape: group codecs into larger equivalence classes

ba458da

binary-shape: improve description of handling of recursive types

6d44903

Ngoguey42 reviewed Feb 2, 2021

craigfe mentioned this pull request Oct 21, 2021

repr: use attributes for local function overrides #82

Open
