repr: use attributes for local function overrides #82

craigfe · 2021-10-11T14:55:31Z

Currently, the functions Repr.{like,map} behave somewhat oddly when passed custom behaviour for one or more of the generic options: it evaluates all derivers on the typerep and packs the resulting functions into a Custom typerep containing those implementations.

This hides the structure of the type, but otherwise behaves sensibly as long as the derivers behave "linearly", e.g.:

derive (pair t1 t2) = derive (pair (partial ~derive:(derive t1) ()) (partial ~derive:(derive t2) ()))

but this is not always the case. An example is pre_hash which attempts to unbox the pre-hash representation of "simple" types and so behaves differently when complex types have simple components "pre-evaluated" as is currently the case.

This commit changes the implementation of Repr.{like,map} to instead add custom implementations as type attributes on the original type. These attributes don't hide the internal structure of the type in a Custom typerep, making function overrides transparent to derivers that don't care about those overrides.

CC @Ngoguey42. This doesn't quite solve our problem, since the user's custom pre_hash implementation may be implemented using pre_hash (and so sensitive to the "primitive" heuristic), but I think it gets closer.

Currently, the functions `Repr.{like,map}` behave somewhat oddly when passed custom behaviour for one or more of the generic options: it evaluates all derivers on the typerep and packs the resulting functions into a `Custom` typerep containing those implementations. This hides the structure of the type, but otherwise behaves sensibly as long as the derivers behave "linearly", e.g.: ```ocaml derive (pair t1 t2) = derive (pair (partial ~derive:(derive t1) ()) (partial ~derive:(derive t2) ())) ``` but this is not always the case. An example is `pre_hash` which attempts to unbox the pre-hash representation of "simple" types and so behaves differently when complex types have simple components "pre-evaluated" as is currently the case. This commit changes the implementation of `Repr.{like,map}` to instead add custom implementations as _type attributes_ on the original type. These attributes don't hide the internal structure of the type in a `Custom` typerep, making function overrides transparent to derivers that don't care about those overrides.

Ngoguey42

It took me a while to understand everything that is happening here. I too think that this is a step in the right direction 👍 .

Ngoguey42 · 2021-10-18T11:23:26Z

src/repr/type.ml

+    | None -> (
+        let rec is_prim : type a. a t -> bool = function
+          | Self s -> is_prim s.self_fix
+          | Attributes { attr_type; _ } -> is_prim attr_type


What if the attribute is a Type_pp.Attr_pp, a Type_pp.Attr_of_string or a new fancy attribute that is not yet part of repr?

I'm not sure, but it's a good point that this is a change of behaviour. It seems best to me to keep this is_prim logic as small as possible (and, if we're going to have it at all, put it in the derivers themselves). What do you think?

I think that this new connection between json and pp/of_string looks a lot like the connection between encode_bin and pre_hash. Both connections have a complicated semantic that may bite the repr user at unexpected moments.

One solution to fixing the problem of custom encode_bin shadowing custom pre_hash would be to remove that match bin with Some (x, _, _) -> Custom x | None -> Structural) case that can be found below.

Are you sure that introducing this new special case for json primitives is worth the trouble?

This new logic is mostly a copy of the old relationship between json and pp under like, trying to keep the behaviour as similar as possible. I agree that it's a complex semantic, but if we drop it then we should be careful to check for existing call-sites that rely on it.

icristescu · 2021-10-19T13:20:29Z

src/repr/type.ml

+    match pre_hash with
+    | Some x -> Custom x
+    | None -> (
+        match bin with Some (x, _, _) -> Custom x | None -> Structural)


This is similar to the previous pre_hash in partially_abstract ? (here https://github.com/mirage/repr/pull/82/files#diff-c7a2d1479dffeba7f7a7f341542999388b5c650e2392244de254809b44072e3aL336 ); why do we revert to encode_bin when there is not a pre_hash explicitly passed (instead of using the pre_hash of t as we do for instance for pp)?

Yep, I put this here to preserve the old behaviour.

I'd guess that this was added because (a) pre_hash is similar to encode_bin for most types and (b) a valid encode_bin has the properties we'd expect of pre_hash, since it must decode values property. e.g. if I have some complex type with an efficient binary encoding, I can use this for free to get an efficient pre_hash.

I'd like to make this connection between encode_bin / decode_bin / size_of / pre_hash more explicit in the future, perhaps as part of something like #34.

I see how it makes sense for most types, but in the case of structured keys, the pre_hash and the encode_bin are different for commits, no? (I'm just trying to understand that issue, we don't need to fix it here :) )

Hmm, yes; good point. Custom pre_hash implementations within the children will be lost this way.

Perhaps we want to do this only if there are no pre_hash attributes in the type? (... and the type contains no Custom _ nodes?) These cases are sort-of like having pre_hash = Some _, just not specified inline as an argument.

Ngoguey42 · 2021-10-19T12:06:46Z

src/repr/type_binary.ml

  let rec t : type a. a t -> a pre_hash = function
    | Self s -> self s
    | Custom c -> stage c.pre_hash
    | Map m -> map m
    | Boxed b -> t b
-    | Attributes { attr_type; _ } -> t attr_type
+    | Attributes { attrs; attr_type } -> (


Is the attribute check missing in the second definition of t at the end of the module?

Ngoguey42 · 2021-10-22T13:45:44Z

src/repr/type.ml

+    | None -> (
+        let rec is_prim : type a. a t -> bool = function
+          | Self s -> is_prim s.self_fix
+          | Attributes { attr_type; _ } -> is_prim attr_type


I think that this new connection between json and pp/of_string looks a lot like the connection between encode_bin and pre_hash. Both connections have a complicated semantic that may bite the repr user at unexpected moments.

One solution to fixing the problem of custom encode_bin shadowing custom pre_hash would be to remove that match bin with Some (x, _, _) -> Custom x | None -> Structural) case that can be found below.

Are you sure that introducing this new special case for json primitives is worth the trouble?

craigfe force-pushed the attributes-everywhere branch from a3ad58f to a21d498 Compare October 11, 2021 14:59

craigfe added 2 commits October 11, 2021 20:24

repr: derive custom ~pre_hash from ~encode_bin if passed

990f313

craigfe force-pushed the attributes-everywhere branch from a21d498 to 990f313 Compare October 11, 2021 19:24

Ngoguey42 approved these changes Oct 18, 2021

View reviewed changes

icristescu reviewed Oct 19, 2021

View reviewed changes

craigfe force-pushed the attributes-everywhere branch from 48b12dd to 990f313 Compare October 21, 2021 12:28

Ngoguey42 reviewed Oct 22, 2021

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

repr: use attributes for local function overrides #82

repr: use attributes for local function overrides #82

craigfe commented Oct 11, 2021 •

edited

Loading

Ngoguey42 left a comment

Ngoguey42 Oct 18, 2021

craigfe Oct 21, 2021

Ngoguey42 Oct 22, 2021

craigfe Oct 22, 2021

icristescu Oct 19, 2021

craigfe Oct 21, 2021

icristescu Oct 21, 2021

craigfe Oct 21, 2021

Ngoguey42 Oct 19, 2021

Ngoguey42 Oct 22, 2021

repr: use attributes for local function overrides #82

Are you sure you want to change the base?

repr: use attributes for local function overrides #82

Conversation

craigfe commented Oct 11, 2021 • edited Loading

Ngoguey42 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

craigfe commented Oct 11, 2021 •

edited

Loading