This page describes how ocaml-protoc maps the protobuf type system to OCaml.
- Basic Types
- Oneof fields
- Field rules
- Default values
- Message
- Enumerations
- File name
- Package
- Extensions
- Nested Types
- Maps
- Groups and Services
| .proto type | OCaml Type | Extensions | Notes |
|---|---|---|---|
| double | float | | |
| float | float | | |
| int32 | int32 | int | |
| int64 | int64 | int | |
| uint32 | int32 | int | |
| uint64 | int64 | int | |
| sint32 | int32 | int | |
| sint64 | int64 | int | |
| fixed32 | int32 | int | |
| fixed64 | int64 | int | |
| sfixed32 | int32 | | |
| sfixed64 | int64 | | |
| bool | bool | | |
| string | string | | |
| bytes | bytes | | |
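To make the table concrete, here is a hand-written sketch of the record one would expect for a message with one field of several basic types. The `Sample` message and its field names are hypothetical, and the record is written by hand from the "OCaml Type" column rather than copied from generator output.

```ocaml
(* Hypothetical message, for illustration only:

   message Sample {
     required double ratio   = 1;
     required int32  count   = 2;
     required int64  total   = 3;
     required bool   enabled = 4;
     required string label   = 5;
     required bytes  payload = 6;
   }
*)

(* Hand-written record following the default OCaml Type column. *)
type sample = {
  ratio : float;
  count : int32;
  total : int64;
  enabled : bool;
  label : string;
  payload : bytes;
}

let example : sample = {
  ratio = 0.5;
  count = 42l;         (* int32 literals take the [l] suffix *)
  total = 1_000_000L;  (* int64 literals take the [L] suffix *)
  enabled = true;
  label = "hello";
  payload = Bytes.of_string "\001\002";
}
```

Note that `int32` and `int64` are boxed types in OCaml with their own literal suffixes and arithmetic modules (`Int32`, `Int64`), which is presumably why the table lists `int` as an alternative for several integer types.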
`oneof` fields are encoded as an OCaml `variant`. The variant name is the concatenation of the enclosing message name and the `oneof` field name.
Note that because the variant type cannot be encoded outside of its enclosing message, no standalone encoding/decoding functions are generated for it.
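The naming rule can be sketched with a hypothetical `Request` message that has a regular field next to a `oneof`. The message, its field names, and the exact shape of the types below are assumptions made for illustration; they follow the rule described above but are not copied from generator output.

```ocaml
(* Hypothetical message, for illustration only:

   message Request {
     required int32 id = 1;
     oneof payload {
       string text   = 2;
       int32  number = 3;
     }
   }
*)

(* Variant named after the message (request) and the oneof (payload). *)
type request_payload =
  | Text of string
  | Number of int32

type request = {
  id : int32;
  payload : request_payload;
}

(* Consumers pattern match on the variant carried by the record. *)
let describe (r : request) : string =
  match r.payload with
  | Text s -> Printf.sprintf "#%ld text %s" r.id s
  | Number n -> Printf.sprintf "#%ld number %ld" r.id n
```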
An `optional` field generates an `option` type in OCaml, while a `repeated` field generates an OCaml `list`.
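As a minimal sketch of these two rules, consider a hypothetical `Profile` message with one required, one optional, and one repeated field. The message and the helper `display_name` are illustrative assumptions, not generated code.

```ocaml
(* Hypothetical message, for illustration only:

   message Profile {
     required string name     = 1;
     optional string nickname = 2;
     repeated string tags     = 3;
   }
*)

type profile = {
  name : string;
  nickname : string option;  (* optional field *)
  tags : string list;        (* repeated field *)
}

(* Fall back to the required name when the optional field is absent. *)
let display_name (p : profile) : string =
  match p.nickname with
  | Some n -> n
  | None -> p.name
```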
`ocaml-protoc` supports the majority of the default values that can be specified in a `.proto` file:
- double/float: Decimal notation (e.g. 12.345) is supported, while scientific notation (e.g. 2E8 or -8e2) is not. `nan` and `inf` are not supported.
- int types: Decimal notation (e.g. 123) is supported, while hexadecimal notation (e.g. 0xFF) is not.
- string: default ASCII strings are supported, but escaped byte notation (e.g. \001\002) is not.
- bytes: not supported
Messages are compiled to OCaml `records` with all fields immutable, while `oneof` fields are compiled to OCaml variants.
Oneof optimization
Note that if the protobuf message contains only a single `oneof` field, then a single `variant` is generated, with no enclosing record.
This greatly simplifies the generated code; for instance:
message IntOrString {
  oneof t {
    int32 intVal = 1;
    string stringVal = 2;
  }
}
will generate the compact representation:
type int_or_string =
  | Int_val of int32
  | String_val of string
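A short usage sketch shows why the compact form is convenient: values are built with a bare constructor and consumed with a single `match`. The type is repeated so the example is self-contained; the helper `to_string` is a hypothetical name, not part of the generated code.

```ocaml
type int_or_string =
  | Int_val of int32
  | String_val of string

(* No intermediate record to unwrap: match on the message value directly. *)
let to_string (v : int_or_string) : string =
  match v with
  | Int_val i -> Int32.to_string i
  | String_val s -> s

let values = [ Int_val 3l; String_val "three" ]
let rendered = List.map to_string values
```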
An additional simplification is applied to empty messages used in a oneof field: in this case a constant constructor is generated, which greatly simplifies the type:
message string_some {
  message none {
  }
  oneof t {
    none none = 1;
    string some = 2;
  }
}
Will generate the compact OCaml type:
type string_some =
  | None
  | Some of string
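One practical consequence of this particular example is that the constructors `None` and `Some` share their names with those of the standard `option` type. The sketch below places the type in a module (named `Generated` here purely as an assumption, standing in for the generated module) and qualifies the constructors to keep the two apart. The conversion function `to_option` is hypothetical.

```ocaml
(* Stand-in for the generated module; the name is an assumption. *)
module Generated = struct
  type string_some =
    | None
    | Some of string
end

(* Convert the generated variant to a standard option.
   Unqualified None and Some on the right refer to the standard option type,
   because the Generated module is not opened. *)
let to_option (v : Generated.string_some) : string option =
  match v with
  | Generated.None -> None
  | Generated.Some s -> Some s
```

Avoiding `open Generated` in consumer code keeps the standard constructors unshadowed.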
Recursive message
Recursive messages are supported and compiled to recursive types in OCaml. For instance, the following protobuf:
message IntList {
  message Nil { }
  message Cons {
    required int32 value = 1 [(ocaml_type) = int_t];
    required IntList next = 2;
  }
  oneof t {
    Cons cons = 1;
    Nil nil = 2;
  }
}
Will compile to the following OCaml type:
type int_list_cons = {
  value : int;
  next : int_list;
}
and int_list =
  | Cons of int_list_cons
  | Nil
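Because the result is an ordinary recursive OCaml type, it can be built and traversed with plain recursive functions. The sketch below repeats the type so it is self-contained; `of_list` and `sum` are hypothetical helpers written for illustration.

```ocaml
type int_list_cons = {
  value : int;
  next : int_list;
}
and int_list =
  | Cons of int_list_cons
  | Nil

(* Build the protobuf-style list from a standard OCaml list. *)
let rec of_list (l : int list) : int_list =
  match l with
  | [] -> Nil
  | x :: rest -> Cons { value = x; next = of_list rest }

(* Sum the elements by walking the Cons cells. *)
let rec sum (l : int_list) : int =
  match l with
  | Nil -> 0
  | Cons { value; next } -> value + sum next
```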
Enumerations are fully supported and map to an OCaml variant with constant constructors.
For example:
enum Corpus {
  UNIVERSAL = 0;
  WEB = 1;
  IMAGES = 2;
  LOCAL = 3;
  NEWS = 4;
  PRODUCTS = 5;
  VIDEO = 6;
}
Will generate:
type corpus =
  | Universal
  | Web
  | Images
  | Local
  | News
  | Products
  | Video
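The following sketch shows how such a variant relates back to the numeric values declared in the `.proto` file. The functions `corpus_to_int` and `corpus_of_int` are hand-written, hypothetical helpers; they are not claimed to match the encoders that the tool generates.

```ocaml
type corpus =
  | Universal
  | Web
  | Images
  | Local
  | News
  | Products
  | Video

(* Map each constructor to the number declared in the enum. *)
let corpus_to_int = function
  | Universal -> 0
  | Web -> 1
  | Images -> 2
  | Local -> 3
  | News -> 4
  | Products -> 5
  | Video -> 6

(* Inverse mapping; unknown numbers yield None. *)
let corpus_of_int = function
  | 0 -> Some Universal
  | 1 -> Some Web
  | 2 -> Some Images
  | 3 -> Some Local
  | 4 -> Some News
  | 5 -> Some Products
  | 6 -> Some Video
  | _ -> None
```

Since the match in `corpus_to_int` is exhaustive, the compiler warns if a new constructor is added and not handled.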
`ocaml-protoc` generates one OCaml module (an interface file and an implementation file) for each protobuf file, following a convention similar to protoc:
- <file name>_pb.mli
- <file name>_pb.ml
While `ocaml-protoc` honors the package compilation rules, it does not use the package name in the generated OCaml code. Therefore any package semantics or convention is lost in the OCaml code.
Extensions are parsed by `ocaml-protoc` but are otherwise ignored. The main reason is that I have not reached a conclusion as to how they should be represented.
Nested types are fully supported and generate records whose name is the concatenation of the outer and inner message names.
For example:
message ma {
  message mb {
    required int32 bfield = 1;
  }
  required mb bfield = 1;
}
Will generate:
type ma_mb = {
  bfield : int32;
}
(* ... *)
type ma = {
  bfield : ma_mb;
}
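In this example both records declare a field called `bfield`, so consumer code may need type annotations to tell OCaml which record is meant. The sketch below is a hand-written illustration of that point; the values and the helper `inner_value` are hypothetical.

```ocaml
type ma_mb = {
  bfield : int32;
}

type ma = {
  bfield : ma_mb;
}

(* The annotation selects the ma_mb record; without it the most recently
   defined label (the one from ma) would be assumed. *)
let inner : ma_mb = { bfield = 7l }
let outer : ma = { bfield = inner }

(* Annotating the intermediate value keeps each field access unambiguous. *)
let inner_value (m : ma) : int32 =
  let (nested : ma_mb) = m.bfield in
  nested.bfield
```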
Maps are fully supported in `ocaml-protoc`, and the OCaml type used to represent the associative container can be configured with a field option.
By default a `map<a, b> = 1` protobuf field generates an OCaml list: `('a * 'b) list`. Setting `(ocaml_container) = hashtbl` in the `.proto` file generates `('a, 'b) Hashtbl.t` instead.
example 1 (default):
message M {
  map<string, string> s2s = 1;
}
will generate
type m = {
  s2s : (string * string) list;
}
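With the default representation, the map is an association list and can be queried with the standard `List` functions. The sketch below repeats the type for self-containment; `lookup` is a hypothetical helper.

```ocaml
type m = {
  s2s : (string * string) list;
}

(* List.assoc_opt returns the first binding for the key, if any. *)
let lookup (msg : m) (key : string) : string option =
  List.assoc_opt key msg.s2s

let msg = { s2s = [ ("lang", "ocaml"); ("format", "protobuf") ] }
```

Lookups in an association list are linear in the number of entries, which is one reason to consider the `Hashtbl.t` representation for larger maps.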
example 2 (Hashtbl.t):
message M {
  map<string, string> s2s = 1 [(ocaml_container) = hashtbl];
}
will generate
type m = {
  s2s : (string, string) Hashtbl.t;
}
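A brief usage sketch for the `Hashtbl.t` representation follows. The type is repeated so the example stands alone; `make` and `lookup` are hypothetical helpers, not generated functions.

```ocaml
type m = {
  s2s : (string, string) Hashtbl.t;
}

(* Build a message value from a list of key/value pairs. *)
let make (pairs : (string * string) list) : m =
  let tbl = Hashtbl.create (List.length pairs) in
  List.iter (fun (k, v) -> Hashtbl.replace tbl k v) pairs;
  { s2s = tbl }

let lookup (msg : m) (key : string) : string option =
  Hashtbl.find_opt msg.s2s key

let msg = make [ ("lang", "ocaml"); ("format", "protobuf") ]
```

Although the record field itself is immutable, the table it points to is a mutable structure, so entries can still be added or replaced after the message value is created.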
Thanks to Laurent Mazare for the initial implementation of map fields.
Groups and Services are currently NOT supported.
Groups will most likely never be supported since they are being deprecated. Services require a lot more work, but I think generating a Mirage Cohttp server would be pretty awesome.