-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactoring of encode and decode to be zero-copy #57
Conversation
We should make sure whatever we do in |
Encode and Decode have been implemented (decode not completely). Right now we need to make sure that these are the types we want and the direction we want to take. A channel is potentially a stream without a fixed length so the unboxed version may be trickier to implement. |
Before going further I think we should see what kind of impact this API will have on its users, eg. index and irmin. For instance, for both of these use-cases the data to read and write will come through Also, is |
Actually that's my mistake, I believe decoding is just as (if not more) important as encoding during merges, as every |
To relate a discussion had with @mattiasdrp offline: the approach currently taken here isn't feasible as it forks the universe of typereps according to the channel types. (Repr.t values built for different functor instantiations would be incompatible, when the overall goal is to have a "canonical" type for typereps that can then be consumed uniformly.) This was motivated by wanting to make One path forward is to get a minimal working diff for an externally-allocated generic codec in Index (that simply fails in the corner cases), in order to demonstrate viable user code in Index and Irmin. As @samoht suggests, this is the main source of uncertainty (and fortunately these use-cases have simplifying assumptions that we can exploit for a proof of concept). |
bench/main.ml
Outdated
@@ -1,6 +1,28 @@ | |||
open Bechamel | |||
open Toolkit | |||
module T = Repr | |||
|
|||
module IO = struct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure to understand why we need to abstract over IO here. I dont think it's needed for the irmin/index use-case. What's wrong with something like this:
type 'a encode_bin = 'a -> bytes -> off:int -> len:int -> int
type 'a decode_bin = bytes -> off:int -> len:int -> ('a * int)
(these are tentative signatures - the idea is to match the signatures of pread
and pwrite
as explained in my previous comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you would get rid of the functor and enforce the type of the output and input channel to Repr's users if I understand correctly?
If someone wanted to write directly in a different output they'd have to allocate a Byte
, encode values in it and write its content in their output, right?
I don't dislike this solution because it makes everything simpler from an implementation point of view but it makes it less generic over the IO the user is using (I'd say it's a one-copy solution instead of zero-copy)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you would get rid of the functor and enforce the type of the output and input channel to Repr's users if I understand correctly?
yes, the only users are irmin
and index
, we don't need a generic solution. That library was called irmin-type
a few months ago, it was just pulled out to be shared between index
and irmin
. It's a non goal to cover all the generic usage of a dynamic type library.
I don't dislike this solution because it makes everything simpler from an implementation point of view but it makes it less generic over the IO the user is using (I'd say it's a one-copy solution instead of zero-copy)
which is already (at least) one copy less than it is today. Let's see how it performs on index/irmin before trying to build the perfect solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'm fine with that! Will do immediately :-)
Need to make sure that everything is actually ok
As discussed offline with @mattiasdrp we should focus on changing The API changes to |
This should be cancelled when compiling with flambda but due to the fact that it is not the default compiler it is better right now to manually externalise them to avoid extra allocations
No description provided.