Use MsgPack bin instead of base64 #2

Open
drewcrawford opened this issue Jul 23, 2014 · 4 comments

@drewcrawford

I maintain what is probably the only reasonable MsgPack library for ObjC (and, soon, Swift). I also support a large number of what you call "extension types", but in my case they go way beyond primitive types like URLs/dates into things like custom classes. What I'm saying is, I have spent a lot of time in this problem space.

To me the advantage of a non-JSON scheme is performance. Sure, you could define a set of extensions just for JSON, but why do that when alternate encoders like MsgPack are so much more efficient for non-JS implementations? We're on the same page there.

However, the decision to base64-encode the bytes is a complete non-starter for me. It bloats the size (base64 emits 4 characters for every 3 bytes, roughly a 33% increase), so payloads take longer in transit and longer to encode and decode. MsgPack v2 has a perfectly adequate, binary, non-string type for you to target. Efficient transport of byte arrays is actually the thing that motivated me to write a MsgPack library in the first place.

Sure, it means you have a difference between JSON/MsgPack representations but that's already the case for other types.
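
To put a rough number on the size argument above, here is a minimal sketch in Python. It hand-rolls just enough of the MessagePack wire format (the bin and str families) to compare the two encodings of the same payload. The helper names `pack_bin` and `pack_str` are hypothetical and are not taken from any MsgPack or Transit library, and the sketch ignores whatever tag prefix Transit itself adds to a base64 string.

```python
import base64
import struct


def pack_bin(data: bytes) -> bytes:
    """Encode bytes as a MessagePack bin value (bin 8 / bin 16 / bin 32)."""
    n = len(data)
    if n < 2**8:
        return b"\xc4" + struct.pack(">B", n) + data
    if n < 2**16:
        return b"\xc5" + struct.pack(">H", n) + data
    if n < 2**32:
        return b"\xc6" + struct.pack(">I", n) + data
    raise ValueError("payload too large for bin 32")


def pack_str(text: str) -> bytes:
    """Encode text as a MessagePack str value (fixstr / str 8 / str 16 / str 32)."""
    raw = text.encode("utf-8")
    n = len(raw)
    if n < 32:
        return bytes([0xA0 | n]) + raw
    if n < 2**8:
        return b"\xd9" + struct.pack(">B", n) + raw
    if n < 2**16:
        return b"\xda" + struct.pack(">H", n) + raw
    if n < 2**32:
        return b"\xdb" + struct.pack(">I", n) + raw
    raise ValueError("text too large for str 32")


# 3072 bytes of arbitrary binary data (an illustrative payload, not real data).
payload = bytes(range(256)) * 12

as_bin = pack_bin(payload)
as_b64 = pack_str(base64.b64encode(payload).decode("ascii"))

print(len(as_bin))  # 3075: 3 header bytes + 3072 payload bytes
print(len(as_b64))  # 4099: 3 header bytes + 4096 base64 characters
```

The sketch only measures size on the wire; the extra encode and decode pass that base64 requires on each side is a separate cost that it does not attempt to time.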

@timewald
Contributor

The original plan was to use binary in msgpack, and we will return to that if we can. We moved away from it because some of the msgpack libs we're using do not (yet) distinguish binary and string types while reading, presumably aligned with (one interpretation of) the implementation guidance here: https://github.com/msgpack/msgpack/blob/master/spec.md#impl-upgrade. Once we get msgpack libs across the required platforms that implement the full string vs binary split, we will move back to binary data in msgpack in place of base64.
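
For readers unfamiliar with the split being discussed, the following Python sketch shows what "distinguishing binary and string types while reading" can look like at the wire level. It is an illustration under stated assumptions, not the behavior of any particular msgpack library: `read_str_or_bin` and its `split_types` flag are hypothetical names, and the function handles only a single str or bin value at the start of a buffer.

```python
import struct

# Length-prefix widths (in bytes) for each header tag, per the MessagePack spec.
_STR_TAGS = {0xD9: 1, 0xDA: 2, 0xDB: 4}  # str 8 / str 16 / str 32
_BIN_TAGS = {0xC4: 1, 0xC5: 2, 0xC6: 4}  # bin 8 / bin 16 / bin 32
_LEN_FMT = {1: ">B", 2: ">H", 4: ">I"}   # big-endian unsigned lengths


def read_str_or_bin(buf: bytes, split_types: bool = True):
    """Decode one str or bin value from the start of buf.

    With split_types=True, str values come back as text and bin values as
    bytes. With split_types=False, both come back as bytes, which mimics a
    reader that has a single undifferentiated "raw" type (an assumption made
    for illustration).
    """
    tag = buf[0]
    if 0xA0 <= tag <= 0xBF:  # fixstr: length lives in the low 5 bits
        is_bin, start, n = False, 1, tag & 0x1F
    elif tag in _STR_TAGS or tag in _BIN_TAGS:
        is_bin = tag in _BIN_TAGS
        width = _BIN_TAGS[tag] if is_bin else _STR_TAGS[tag]
        (n,) = struct.unpack_from(_LEN_FMT[width], buf, 1)
        start = 1 + width
    else:
        raise ValueError(f"not a str or bin value: tag 0x{tag:02x}")

    body = bytes(buf[start:start + n])
    if not split_types:
        return body  # caller cannot tell text from binary
    return body if is_bin else body.decode("utf-8")


# A reader that honors the split returns different Python types:
print(read_str_or_bin(b"\xa2hi"))              # hi
print(read_str_or_bin(b"\xc4\x03\x00\xff\x10"))  # b'\x00\xff\x10'

# A reader without the split returns bytes for both, losing the distinction:
print(read_str_or_bin(b"\xa2hi", split_types=False))  # b'hi'
```

When both kinds of value come back as the same type, a format layered on top has no way to know which byte strings were meant as text, which is one plausible reason to fall back to an in-band marker such as a base64 string.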

@drewcrawford
Author

The problem is that these implementation changes will not be forthcoming. Or, to put it another way: I realized the existing implementations weren't going to implement the new spec back in 2013 and so I wrote my own. So far my hypothesis has largely been correct.

Transit didn't create this mess, but nobody else is going to clean it up. The existing MsgPack users are all pretty satisfied with how things are.

The trouble is that Transit doesn't have any users (outside of Cognitect?) and the burden it has to overcome to collect early adopters is that it has to be better than whatever duct tape approach the early adopters are using today.

My duct tape approach includes MsgPack bin support, and I've already got working implementations for the 2 languages I care about today.

A response of "Well, nobody is working on this for [language you don't use] so we can't support bin yet" will not convince me to put away my roll of duct tape and join forces with Transit. As long as the duct tape actually works better (!) than a more formal solution I'm likely to stick with it.


@jrus

jrus commented Jun 28, 2015

Is this still the case a year later? Any plans to fix it ever? Seems to me like it makes transit-msgpack a significantly less desirable format for any kind of data with lots of raw data, e.g. images, audio, geospatial data, etc. etc.

Is there a list of the offending messagepack implementations somewhere, so we can go pester them to fix their shit, or even submit patches? Which ones are the “required platforms” from Cognitect’s perspective?

I’d love to have some kind of good generic base to build on whenever I need to make a new data serialization and exchange format, or even to transcode various existing files into a semantically equivalent but more easily parsed / more standard format, instead of writing and optimizing special-purpose parsers in every language where I need one. Transit-msgpack seemed like a decent basis for that kind of thing, but if every bit of raw data needs to be base64-encoded, that seems like a huge waste of both filesize and parsing time.

@Qqwy

Qqwy commented Jun 24, 2022

What about adding support for Msgpack v2 as a separate encoding format option, just like both json and json_verbose are supported?
