-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
repr: use attributes for local function overrides #82
base: main
Are you sure you want to change the base?
Conversation
a3ad58f
to
a21d498
Compare
Currently, the functions `Repr.{like,map}` behave somewhat oddly when passed custom behaviour for one or more of the generic options: it evaluates all derivers on the typerep and packs the resulting functions into a `Custom` typerep containing those implementations. This hides the structure of the type, but otherwise behaves sensibly as long as the derivers behave "linearly", e.g.: ```ocaml derive (pair t1 t2) = derive (pair (partial ~derive:(derive t1) ()) (partial ~derive:(derive t2) ())) ``` but this is not always the case. An example is `pre_hash` which attempts to unbox the pre-hash representation of "simple" types and so behaves differently when complex types have simple components "pre-evaluated" as is currently the case. This commit changes the implementation of `Repr.{like,map}` to instead add custom implementations as _type attributes_ on the original type. These attributes don't hide the internal structure of the type in a `Custom` typerep, making function overrides transparent to derivers that don't care about those overrides.
a21d498
to
990f313
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me a while to understand everything that is happening here. I too think that this is a step in the right direction 👍 .
| None -> ( | ||
let rec is_prim : type a. a t -> bool = function | ||
| Self s -> is_prim s.self_fix | ||
| Attributes { attr_type; _ } -> is_prim attr_type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the attribute is a Type_pp.Attr_pp
, a Type_pp.Attr_of_string
or a new fancy attribute that is not yet part of repr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure, but it's a good point that this is a change of behaviour. It seems best to me to keep this is_prim
logic as small as possible (and, if we're going to have it at all, put it in the derivers themselves). What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this new connection between json
and pp
/of_string
looks a lot like the connection between encode_bin
and pre_hash
. Both connections have a complicated semantic that may bite the repr user at unexpected moments.
One solution to fixing the problem of custom encode_bin
shadowing custom pre_hash
would be to remove that match bin with Some (x, _, _) -> Custom x | None -> Structural)
case that can be found below.
Are you sure that introducing this new special case for json primitives is worth the trouble?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This new logic is mostly a copy of the old relationship between json
and pp
under like
, trying to keep the behaviour as similar as possible. I agree that it's a complex semantic, but if we drop it then we should be careful to check for existing call-sites that rely on it.
match pre_hash with | ||
| Some x -> Custom x | ||
| None -> ( | ||
match bin with Some (x, _, _) -> Custom x | None -> Structural) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is similar to the previous pre_hash in partially_abstract
? (here https://github.com/mirage/repr/pull/82/files#diff-c7a2d1479dffeba7f7a7f341542999388b5c650e2392244de254809b44072e3aL336 ); why do we revert to encode_bin
when there is not a pre_hash
explicitly passed (instead of using the pre_hash
of t
as we do for instance for pp
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I put this here to preserve the old behaviour.
I'd guess that this was added because (a) pre_hash
is similar to encode_bin
for most types and (b) a valid encode_bin
has the properties we'd expect of pre_hash
, since it must decode values property. e.g. if I have some complex type with an efficient binary encoding, I can use this for free to get an efficient pre_hash
.
I'd like to make this connection between encode_bin
/ decode_bin
/ size_of
/ pre_hash
more explicit in the future, perhaps as part of something like #34.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see how it makes sense for most types, but in the case of structured keys, the pre_hash
and the encode_bin
are different for commits, no? (I'm just trying to understand that issue, we don't need to fix it here :) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yes; good point. Custom pre_hash
implementations within the children will be lost this way.
Perhaps we want to do this only if there are no pre_hash
attributes in the type? (... and the type contains no Custom _
nodes?) These cases are sort-of like having pre_hash = Some _
, just not specified inline as an argument.
48b12dd
to
990f313
Compare
let rec t : type a. a t -> a pre_hash = function | ||
| Self s -> self s | ||
| Custom c -> stage c.pre_hash | ||
| Map m -> map m | ||
| Boxed b -> t b | ||
| Attributes { attr_type; _ } -> t attr_type | ||
| Attributes { attrs; attr_type } -> ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the attribute check missing in the second definition of t
at the end of the module?
| None -> ( | ||
let rec is_prim : type a. a t -> bool = function | ||
| Self s -> is_prim s.self_fix | ||
| Attributes { attr_type; _ } -> is_prim attr_type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this new connection between json
and pp
/of_string
looks a lot like the connection between encode_bin
and pre_hash
. Both connections have a complicated semantic that may bite the repr user at unexpected moments.
One solution to fixing the problem of custom encode_bin
shadowing custom pre_hash
would be to remove that match bin with Some (x, _, _) -> Custom x | None -> Structural)
case that can be found below.
Are you sure that introducing this new special case for json primitives is worth the trouble?
Currently, the functions
Repr.{like,map}
behave somewhat oddly when passed custom behaviour for one or more of the generic options: it evaluates all derivers on the typerep and packs the resulting functions into aCustom
typerep containing those implementations.This hides the structure of the type, but otherwise behaves sensibly as long as the derivers behave "linearly", e.g.:
but this is not always the case. An example is
pre_hash
which attempts to unbox the pre-hash representation of "simple" types and so behaves differently when complex types have simple components "pre-evaluated" as is currently the case.This commit changes the implementation of
Repr.{like,map}
to instead add custom implementations as type attributes on the original type. These attributes don't hide the internal structure of the type in aCustom
typerep, making function overrides transparent to derivers that don't care about those overrides.CC @Ngoguey42. This doesn't quite solve our problem, since the user's custom
pre_hash
implementation may be implemented usingpre_hash
(and so sensitive to the "primitive" heuristic), but I think it gets closer.