From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: Mauricio Fernandez <mfp@acm.org>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Re: Serialisation of PXP DTDs
Date: Fri, 24 Oct 2008 00:18:50 +0200
Message-ID: <1224800330.7340.65.camel@flake.lan.gerd-stolpmann.de>
In-Reply-To: <20081023210527.GB32611@NANA.localdomain>
On Thursday, 23.10.2008 at 23:05 +0200, Mauricio Fernandez wrote:
> I have been working for a while on a self-describing, compact, extensible
> binary protocol, along with an OCaml implementation which I intend to release
> before too long.
>
> It differs from sexplib and bin-prot in two main ways:
> * the data model is deliberately more limited, as the format is meant to be
> de/encodable in multiple languages.
> * it is extensible at several levels, achieving both forward and backward
> compatibility across changes in the data type
>
> You can think of it as an extensible Protocol Buffers[1] with a richer data
> model (albeit not in 1:1 accordance with OCaml's for the above mentioned
> reason).
Have you looked at ICEP (see zeroc.com)? It has bindings for many
languages, including OCaml (http://oss.wink.com/hydro/).
It is, however, not self-describing. Still, you may find ideas for
portability there.
Gerd
> In the criteria you gave in another message, namely
> (1) ease of use
> (2) "future-proofness"
> (3) portability
> (4) human-readability,
>
> it does fairly well at the first three, especially at (2) and (3), which
> were poorly supported by existing solutions (I looked into bin-prot, sexplib,
> Google's Protocol Buffers, Thrift and XDR; I also referred to IIOP and ITU-T
> X.690 DER during the design). Being a binary format, it obviously doesn't do
> that well at (4), but it is possible to get a human-readable dump of the
> binary data even in the absence of the interface definition, making
> reverse-engineering no harder than sexplib (and arguably easier in some ways).
>
> For example, here's a bogus message definition to illustrate (2) and (4).
> This protocol definition is fed to the compiler, which generates the OCaml
> type definitions, as well as the encoders/decoders and pretty-printers (as you
> can see, the specification uses a mix of OCaml, Haskell and C++ syntax, but
> it's pretty clear IMO)
>
>   type sum_type 'a 'b 'c = A 'a | B 'b | C 'c
>
>   message complex_rtt =
>       A {
>         a1 : [(int * [|bool|])];
>         a2 : [ sum_type<int, string, long> ]
>       }
>     | B {
>         b1 : bool;
>         b2 : (string * [int])
>       }
>
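As a rough illustration of what the generated OCaml types for this
definition might look like, here is a minimal sketch. The module names
Sum_type and Complex_rtt follow the pretty-printer output quoted further
down; everything else is an assumption, not the compiler's actual output:
the record names a and b, the mapping of long to Int64.t, [x] to list and
[|x|] to array.

```ocaml
(* Hypothetical shape of the generated types; not the real compiler output. *)
module Sum_type = struct
  type ('a, 'b, 'c) t = A of 'a | B of 'b | C of 'c
end

module Complex_rtt = struct
  (* One record per message constructor (assumed layout). *)
  type a = {
    a1 : (int * bool array) list;
    a2 : (int, string, Int64.t) Sum_type.t list;
  }

  type b = {
    b1 : bool;
    b2 : string * int list;
  }

  type t = A of a | B of b
end

(* Building a value of the A variant. *)
let example : Complex_rtt.t =
  Complex_rtt.A
    { Complex_rtt.a1 = [ (-5378, [| false; true |]) ];
      a2 = [ Sum_type.A 83 ] }
```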
> The protocol is extensible in the sense that you can add new constructors to a
> sum or message type, add new elements to a tuple, and replace any primitive
> type by a sum type including the original type. For instance, if at some point
> in time we find that the b1 field should have a different type, we can do
>
> type bool_or_something 'a = Orig unboxed_bool | New_constructor 'a
>
> and then
> ...
> | B { b1 : bool_or_something<some_type>; ... }
>
> This, along with a way to specify default values, allows both forward and
> backward compatibility.
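To make the compatibility argument concrete, here is a minimal sketch of
how a reader could handle the widened b1 field. It is only an illustration
under stated assumptions: the wire type, the constructor tags 0 and 1, and
decode_b1 are hypothetical and not part of the protocol described above.
An old writer sends a bare bool (backward compatibility), and a constructor
the reader does not know falls back to a default value (forward
compatibility).

```ocaml
(* OCaml-side view of the widened field, mirroring bool_or_something above. *)
type 'a bool_or_something = Orig of bool | New_constructor of 'a

(* Hypothetical generic wire value: a bare primitive or a tagged constructor. *)
type wire =
  | Prim_bool of bool
  | Tagged of int * wire list

(* Decode b1 whether it was written with the old or the new definition. *)
let decode_b1
    ~(decode_new : wire list -> 'a option)
    ~(default : 'a bool_or_something)
    (w : wire) : 'a bool_or_something =
  match w with
  | Prim_bool b -> Orig b                  (* old writer: plain bool *)
  | Tagged (0, [ Prim_bool b ]) -> Orig b  (* new writer, original case *)
  | Tagged (1, args) ->
      (match decode_new args with
       | Some v -> New_constructor v
       | None -> default)
  | Tagged _ -> default                    (* unknown or malformed constructor *)
```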
>
> The compiler generates a pretty printer for these structures, useful for
> debugging. Here's a message generated randomly:
>
> {
> Complex_rtt.a1 =
> [ ((-5378), [| false; false; false; true; true |]);
> (3942717140522000971, [| false; true; true; true; false |]);
> ((-6535386320450295), [| false |]); ((-238860767206), [| |]);
> (1810196202, [| false; false; true; true |]) ];
> Complex_rtt.a2 =
> [ Sum_type.A (-13830); Sum_type.A 369334576; Sum_type.A 83;
> Sum_type.A (-3746796577167465774); Sum_type.A (-1602586945) ] }
>
> Now, this is the information decoded in the absence of the above definitions
> (in other words, what you'd have to work with if you were reverse-engineering
> the protocol):
>
> T0 {
> T0 [
> T0 { Vint_t0 (-5378);
> T0 [ Vint_t0 0; Vint_t0 0; Vint_t0 0; Vint_t0 (-1);
> Vint_t0 (-1)]};
> T0 { Vint_t0 3942717140522000971;
> T0 [ Vint_t0 0; Vint_t0 (-1); Vint_t0 (-1); Vint_t0 (-1);
> Vint_t0 0]};
> T0 { Vint_t0 (-6535386320450295); T0 [ Vint_t0 0]};
> T0 { Vint_t0 (-238860767206); T0 [ ]};
> T0 { Vint_t0 1810196202;
> T0 [ Vint_t0 0; Vint_t0 0; Vint_t0 (-1); Vint_t0 (-1)]}];
> T0 [ T0 { Vint_t0 (-13830)}; T0 { Vint_t0 369334576}; T0 { Vint_t0 83};
> T0 { Vint_t0 (-3746796577167465774)}; T0 { Vint_t0 (-1602586945)}]}
>
> (I'm still changing some details so it might look better than this shortly.)
>
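A schema-less dump like the one above can be modelled as a small generic
tree plus a printer. The sketch below is an assumption about the structure,
inferred only from the printed output: tagged integers, tagged tuples
printed with braces, and tagged homogeneous sequences printed with
brackets. The type and function names are hypothetical.

```ocaml
(* Hypothetical generic value tree; each node carries a numeric tag. *)
type value =
  | Vint of int * int          (* tag, integer payload *)
  | Tuple of int * value list  (* printed as  T<tag> { ... } *)
  | Seq of int * value list    (* printed as  T<tag> [ ... ] *)

(* Render a value in the style of the dump, on a single line. *)
let rec show (v : value) : string =
  let items vs = String.concat "; " (List.map show vs) in
  match v with
  | Vint (tag, n) ->
      if n < 0 then Printf.sprintf "Vint_t%d (%d)" tag n
      else Printf.sprintf "Vint_t%d %d" tag n
  | Tuple (tag, vs) -> Printf.sprintf "T%d { %s}" tag (items vs)
  | Seq (tag, vs) -> Printf.sprintf "T%d [ %s]" tag (items vs)

(* In the dump, booleans appear as integers: false as 0 and true as -1. *)
let of_bool (b : bool) : value = Vint (0, if b then -1 else 0)
```

Without the interface definition a reader only sees tags and shapes, which
is why field names such as a1 and a2 are missing from the dump.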
> It's not a drop-in solution like sexplib's "with sexp", by design (since it is
> meant to allow interoperability between different languages), but it's still
> fairly easy to use.
>
> If you're interested in this, tell me and I'll let you know when it's ready for
> serious usage.
>
> [1] http://code.google.com/p/protobuf/
>
--
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
gerd@gerd-stolpmann.de http://www.gerd-stolpmann.de
Phone: +49-6151-153855 Fax: +49-6151-997714
------------------------------------------------------------