* picking / marshaling to strings in ocaml-revision-stable way
From: Luca de Alfaro @ 2008-05-31 6:43 UTC
To: Inria Ocaml Mailing List

I need a way to convert data structures to strings that is robust across different versions of OCaml. What I need to translate are mostly mixes of tuples, lists and variant types. A typical example of data to marshal/pickle may look like:

    (3.4, [Move (4, 3, 5); Del (4, 2); Ins (4, 2)], "an example")

I heard that the marshaling done by the Marshal module is not robust with respect to changes in the version of OCaml, and since I need to insert the data in a database for long-term use, this is a serious drawback. I need the marshaling and unmarshaling to be completely independent of the version of OCaml, and of the particular architecture where the marshaling occurs.

I could of course write my own solution, but I am wondering if there are any suitable modules available that I could use.

Many thanks!

Luca
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: asmadeus77 @ 2008-05-31 7:24 UTC
To: Luca de Alfaro; +Cc: Inria Ocaml Mailing List

Hello,

You can try the OCaml library sexplib, which uses Lisp-like structures (s-expressions) to store data. I don't think that format will change anytime soon :)

(I don't know whether it can store anything "worse" than a 3-tuple, but it works with pairs, triples, lists and arrays, which should be enough for you.)

Regards,
Dominique Martinet
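A minimal sketch of the s-expression idea, applied to the edit type that appears later in the thread. It deliberately does not use Sexplib: the `sexp` type only mirrors the Atom/List shape quoted further down, and `sexp_of_edit` and `to_string` are hand-written, hypothetical helpers. Atoms are printed raw, so this is only safe for atoms without spaces or parentheses.

```ocaml
(* Hand-rolled s-expression tree; same shape as the type quoted in the
   thread, but NOT Sexplib's own module. *)
type sexp = Atom of string | List of sexp list

type edit = Ins of int * int | Del of int * int | Mov of int * int * int

let sexp_of_int n = Atom (string_of_int n)

(* Hypothetical converter: constructor name first, then its arguments. *)
let sexp_of_edit = function
  | Ins (a, b) -> List [Atom "Ins"; sexp_of_int a; sexp_of_int b]
  | Del (a, b) -> List [Atom "Del"; sexp_of_int a; sexp_of_int b]
  | Mov (a, b, c) ->
    List [Atom "Mov"; sexp_of_int a; sexp_of_int b; sexp_of_int c]

(* Print atoms as-is and lists in parentheses, separated by spaces.
   No quoting is done, so atoms must not contain spaces or parentheses. *)
let rec to_string = function
  | Atom s -> s
  | List l -> "(" ^ String.concat " " (List.map to_string l) ^ ")"
```

The resulting text depends only on these few functions, not on the compiler's internal value representation, which is the property asked for in the original question.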
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Jacques Garrigue @ 2008-05-31 8:43 UTC
To: luca; +Cc: caml-list

AFAIK, OCaml's marshalling doesn't depend on the version and is architecture independent (the only limitation is integer overflow when passing integer values wider than 31 bits from a 64-bit to a 32-bit machine). So marshalling should be sufficient for your needs.

It is, however, sensitive to the data format you use (the types of your data), and there is currently no way to verify that the type has not changed between two versions of a program.

---------------------------------------------------------------------------
Jacques Garrigue      Nagoya University     garrigue at math.nagoya-u.ac.jp
http://www.math.nagoya-u.ac.jp/~garrigue/
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Berke Durak @ 2008-05-31 9:38 UTC
To: Jacques Garrigue; +Cc: luca, caml-list

I second Dominique's suggestion to use Sexplib. At the very least, use a plaintext format.

Don't use Marshal for long-term storage of values. Avoid it if you can. Been there, done that.

Why?

(1) Not type-safe. Translation: your program *will segfault* and you won't know why.
(2) Neither human-readable nor editable.
(3) Not future-proof. What happens if you change your type definition? Your program will segfault. So you'll have to migrate your data. But how? You'll have to find the exact revision used to generate the binary data. Good luck with that. Did you put a revision number in your data? Are you sure it was up-to-date? Then you'll have to hand-write a converter that uses type declarations from the old and the new modules. I hope your dependencies are not too complex. Not fun *at all*.

However, there are some situations where Marshal is appropriate:

(1) Your data is not acyclic, contains closures, or needs sharing to be compact enough. Sexplib doesn't handle these.
(2) The data won't live long anyway. As in: you're doing IPC between known versions of OCaml programs.
(3) You desperately need speed. As in: you're processing 200GB of Wikipedia data. Then I can understand.

--
Berke Durak
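The "revision number in your data" point can be sketched as follows, for cases where Marshal is used anyway. This is only an illustration: the tag `schema_version` and the two helper names are hypothetical, and the tag is only as reliable as the discipline of changing it whenever the type changes. It does not make Marshal type-safe.

```ocaml
type edit = Ins of int * int | Del of int * int | Mov of int * int * int

(* Hypothetical schema tag; it must be changed by hand whenever the
   marshalled type below changes. *)
let schema_version = "edit-v1"

(* Prefix the marshalled bytes with the tag and a newline. *)
let to_tagged_string (v : float * edit list * string) : string =
  schema_version ^ "\n" ^ Marshal.to_string v []

(* Refuse to unmarshal unless the tag matches exactly. *)
let of_tagged_string (s : string) : (float * edit list * string) option =
  match String.index_opt s '\n' with
  | Some i when String.sub s 0 i = schema_version ->
    Some (Marshal.from_string s (i + 1))
  | _ -> None
```

A mismatching tag yields `None` instead of a crash, which at least tells you that a converter is needed and for which revision of the type.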
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Luca de Alfaro @ 2008-05-31 16:54 UTC
To: Berke Durak; +Cc: Jacques Garrigue, caml-list

Thanks for this insight... I suspected that Marshal was not robust, but without all the details you mentioned! Actually, I DO desperately need speed, as I am processing TBs of Wikipedia data, but precisely because the datasets are so large, I cannot afford to recompute or convert them often, and so I want a robust format. Furthermore, I think the bottleneck for me is the speed of MySQL and the disk anyway, not the small amount of time that natively compiled OCaml would take for the conversion (unfortunately, I have to do more complex computation than converting a few lists and datatypes to ASCII anyway). Moreover, a plaintext format greatly helps debugging; it also helps that I can read the same data with other programming languages.

Speaking of debugging, and said in passing, I cannot say enough how much I LOVE ocamldebug's ability to execute code backwards. It is such a revelation. You simply go to the error, then back off a bit to see how you got there. But this is a topic for another thread.

Many thanks,

Luca
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Robert Fischer @ 2008-05-31 17:00 UTC
To: caml-list

How far is the Jane St S-exp library from producing JSON? I've not actually looked at it, but that'd be super nifty in the interoperation world.

~~ Robert.
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Luca de Alfaro @ 2008-05-31 17:24 UTC
To: Robert Fischer; +Cc: caml-list

Is there a standard way to represent variant types in JSON? As in:

    type edit = Ins of int * int | Del of int * int | Mov of int * int * int

Would it be a list of two elements, the first being the name of the variant and the second the encoding of its arguments? Is this the standard way?

Luca
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Martin Jambon @ 2008-05-31 22:18 UTC
To: Luca de Alfaro; +Cc: Robert Fischer, caml-list

On Sat, 31 May 2008, Luca de Alfaro wrote:

> Is there a standard way to represent variant types in Json?
> As in:
> type edit = Ins of int * int | Del of int * int | Mov of int * int * int
>
> Using a list of two elements the first the name of the variant, the second
> the encoding of the variant itself?
> Is this the standard way?

For this type definition using json-static:

    type json t = A | B of int | C of int * int | D of (int * int)

here's the mapping:

    A        -> "A"
    B 1      -> [ "B", 1 ]
    C (1, 2) -> [ "C", 1, 2 ]
    D (1, 2) -> [ "D", [ 1, 2 ] ]

See http://martin.jambon.free.fr/json-static-readme.txt for more.

It is not standard at all, because mainstream programming languages ignore the notion of variants.

Note that the option type uses null for None and x for Some x, which is very handy for loading foreign data, but has the problem of representing both None and Some None as null.

Overall, json-static and the conventions it uses are not usable for arbitrary OCaml data supported by Marshal. The purpose is to be able to exchange data with other applications that use JSON as well. There are lots of them, and the big advantage of JSON is its great simplicity.

And finally you don't get nude pictures when you look for "json" in Google... ;-)

Martin

--
http://wink.com/profile/mjambon
http://mjambon.com
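The mapping described above (constructor name first, arguments following in the same array) can be written by hand for the edit type from the question. This sketch uses only the standard library, not json-static itself; `json_of_edit` and `json_of_edits` are hypothetical names, and the output convention is assumed from the table in the message.

```ocaml
type edit = Ins of int * int | Del of int * int | Mov of int * int * int

(* Follow the convention  C (1, 2) -> [ "C", 1, 2 ] :
   a JSON array holding the constructor name, then each argument. *)
let json_of_edit = function
  | Ins (a, b) -> Printf.sprintf "[\"Ins\",%d,%d]" a b
  | Del (a, b) -> Printf.sprintf "[\"Del\",%d,%d]" a b
  | Mov (a, b, c) -> Printf.sprintf "[\"Mov\",%d,%d,%d]" a b c

(* A list of edits becomes a JSON array of such arrays. *)
let json_of_edits edits =
  "[" ^ String.concat "," (List.map json_of_edit edits) ^ "]"
```

Since the constructor names are fixed ASCII strings and the arguments are integers, no string escaping is needed here; any string payload would need a proper JSON escaper.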
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: blue storm @ 2008-05-31 17:25 UTC
To: Robert Fischer; +Cc: caml-list

On 5/31/08, Robert Fischer <robert@fischerventure.com> wrote:
> How far is the reach from the Jane St S-exp library from producing JSON?
> I've not actually looked
> at it, but that'd be super nifty in the interoperation world.

You may be interested in the json-static syntax extension from Martin Jambon:
http://martin.jambon.free.fr/json-static.html
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Berke Durak @ 2008-05-31 21:34 UTC
To: Robert Fischer; +Cc: caml-list

On Sat, May 31, 2008 at 7:00 PM, Robert Fischer <robert@fischerventure.com> wrote:
> How far is the reach from the Jane St S-exp library from producing JSON? I've not actually looked at it, but that'd be super nifty in the interoperation world.

If you just want JSON syntax, you can use Sexplib to convert an arbitrary type to a Sexp.t

    type t = Atom of string | List of t list

and then output in JSON format:

    open Printf

    let rec output_json oc = function
      | Atom u -> fprintf oc "%S" u
      | List xl ->
        output_char oc '[';
        (* commas go between elements; JSON forbids a trailing comma *)
        List.iteri (fun i x -> if i > 0 then output_char oc ','; output_json oc x) xl;
        output_char oc ']'

You can then do the same thing for parsing.

--
Berke
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Stefano Zacchiroli @ 2008-05-31 22:51 UTC
To: caml-list

On Sat, May 31, 2008 at 11:34:34PM +0200, Berke Durak wrote:
> and then output in Json format:

Aren't you being naive about escaping needs here?

I don't know the details of the two languages involved, but you would have to be very lucky for their escaping conventions to be the same ...

Cheers.

--
Stefano Zacchiroli -*- PhD in Computer Science ............... now what?
zack@{upsilon.cc,cs.unibo.it,debian.org} -<%>- http://upsilon.cc/zack/
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Berke Durak @ 2008-06-02 9:04 UTC
To: caml-list

On Sun, Jun 1, 2008 at 12:51 AM, Stefano Zacchiroli <zack@upsilon.cc> wrote:
> On Sat, May 31, 2008 at 11:34:34PM +0200, Berke Durak wrote:
>> and then output in Json format:
>
> Aren't you being naive about escaping needs here?
>
> I don't know the details of the two involved languages, but it would
> mean to be very lucky if the escaping conventions are the same ...

Of course, the code was intended for illustrative purposes. JSON escapes characters using 4-digit Unicode hex codes, except for \n, \r, etc., so it would probably work for ASCII.

Martin Jambon:
> You won't obtain anything useful if you treat Atoms as JSON strings and Lists as JSON arrays because JSON has also null, numbers, booleans and objects.

The real issue is that records are mapped to lists of lists, making lookup difficult and cumbersome. But that's still JSON syntax, formally... You could theoretically write a piece of code to "recognize" a record and emit a JSON object, but that wouldn't be very elegant.

> This is the JSON standard: http://www.json.org/

I know, we are using JSON (but not json-wheel) to pass the annotated syntax tree from the C legacy front-end to the OCaml JVM backend. However, we are using Sexplib for all the internal serialization needs (mostly for debugging) since it integrates so well with OCaml.

> And that is the concrete type used to represent JSON trees in the json-wheel library (which json-static uses):

That's a blessing but also a curse. You retain more information about your format, but that also complicates anything that manipulates syntax trees. The nice thing about Sexp is that its Path module provides a small manipulation language for migrating your data. If that's not enough, you can always load it in any Scheme or Lisp and spit it back.

--
Berke Durak
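To make the escaping concern concrete: OCaml's `%S` follows OCaml's own lexical conventions (for example decimal `\ddd` escapes for non-printable bytes), which JSON does not accept. Below is a minimal sketch of a JSON string escaper using only the standard library. The name `json_escape` is hypothetical, and the input is assumed to be valid UTF-8, so bytes at or above 0x80 are passed through unchanged.

```ocaml
(* Quote a string as a JSON string literal.
   Assumption: s is valid UTF-8; bytes >= 0x80 are copied unchanged. *)
let json_escape (s : string) : string =
  let b = Buffer.create (String.length s + 2) in
  Buffer.add_char b '"';
  String.iter
    (fun c ->
       match c with
       | '"' -> Buffer.add_string b "\\\""
       | '\\' -> Buffer.add_string b "\\\\"
       | '\n' -> Buffer.add_string b "\\n"
       | '\r' -> Buffer.add_string b "\\r"
       | '\t' -> Buffer.add_string b "\\t"
       | c when Char.code c < 0x20 ->
         (* remaining control characters use the 4-digit hex form *)
         Buffer.add_string b (Printf.sprintf "\\u%04x" (Char.code c))
       | c -> Buffer.add_char b c)
    s;
  Buffer.add_char b '"';
  Buffer.contents b
```

Replacing the `%S` output of an atom with `json_escape` would address the string-level difference only; the structural mismatch between s-expressions and JSON values discussed in the thread remains.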
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Stefano Zacchiroli @ 2008-06-02 9:21 UTC
To: caml-list

On Mon, Jun 02, 2008 at 11:04:58AM +0200, Berke Durak wrote:
> > I don't know the details of the two involved languages, but it would
> > mean to be very lucky if the escaping conventions are the same ...
> Of course, the code was intended for illustrative purposes. JSON
> escapes characters using 4-digit Unicode hex codes,
> except for \n, \r, etc., so it would probably work for ASCII.

OK then, but the original question was "how far" Sexplib is from producing JSON, and the answer should then be "still a bit far": it is not just a function of a couple of lines. You have to write some code to obtain fully compliant JSON.

Instead of asking people to write it over and over again, it would probably be a good idea to provide a patch for Sexplib adding serialization to JSON. That assumes, of course, that the Sexplib authors are interested in integrating such a feature.

Cheers.

--
Stefano Zacchiroli -*- PhD in Computer Science \ PostDoc @ Univ. Paris 7
zack@{upsilon.cc,pps.jussieu.fr,debian.org} -<>- http://upsilon.cc/zack/
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Martin Jambon @ 2008-06-01 11:14 UTC
To: Berke Durak; +Cc: Robert Fischer, caml-list

On Sat, 31 May 2008, Berke Durak wrote:

> If you just want JSON syntax, you can use Sexplib to convert an
> arbitrary type to a
> Sexp.t
>
> type t = Atom of string | List of t list
>
> and then output in Json format:
> [...]
> You can then do the same thing for parsing.

You won't obtain anything useful if you treat Atoms as JSON strings and Lists as JSON arrays, because JSON also has null, numbers, booleans and objects.

This is the JSON standard: http://www.json.org/

And this is the concrete type used to represent JSON trees in the json-wheel library (which json-static uses):

    type json_type =
        Object of (string * json_type) list
      | Array of json_type list
      | String of string
      | Int of int
      | Float of float
      | Bool of bool
      | Null

Cheers,

Martin

--
http://wink.com/profile/mjambon
http://mjambon.com
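The objection can be seen by writing the naive translation down. The sketch below redeclares both tree types locally (copied from the thread, not imported from Sexplib or json-wheel) and maps one to the other; `json_of_sexp` is a hypothetical name. Every atom becomes a `String` and every list an `Array`, so the other five constructors are never produced.

```ocaml
(* Sexplib-style tree, as quoted in the thread. *)
type sexp = Atom of string | List of sexp list

(* Tree type as quoted from json-wheel above, redeclared locally. *)
type json_type =
  | Object of (string * json_type) list
  | Array of json_type list
  | String of string
  | Int of int
  | Float of float
  | Bool of bool
  | Null

(* Naive structural translation: only String and Array ever appear. *)
let rec json_of_sexp : sexp -> json_type = function
  | Atom s -> String s
  | List l -> Array (List.map json_of_sexp l)
```

With this mapping the integer 4 arrives as the JSON string "4" and a record arrives as nested arrays, which is why a type-directed converter is needed to obtain idiomatic JSON.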
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Richard Jones @ 2008-06-02 11:13 UTC
To: Robert Fischer; +Cc: caml-list

On Sat, May 31, 2008 at 12:00:17PM -0500, Robert Fischer wrote:
> How far is the reach from the Jane St S-exp library from producing
> JSON? I've not actually looked at it, but that'd be super nifty in
> the interoperation world.

It's worth noting:

  http://martin.jambon.free.fr/json-wheel.html
  http://code.google.com/p/deriving/

and I guess maybe even:

  http://code.google.com/p/bitmatch/

if you wanted a way to generate and parse a stable binary format.

Rich.

--
Richard Jones
Red Hat
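For comparison, here is a hand-rolled sketch of what a stable binary format for the edit type could look like. It uses none of the libraries linked above; the layout (one tag byte followed by 4-byte big-endian fields) and all function names are assumptions made for illustration. Fields are restricted to 0 .. 0x3fffffff so that values also fit the 31-bit integers of a 32-bit OCaml, the limitation mentioned earlier in the thread.

```ocaml
type edit = Ins of int * int | Del of int * int | Mov of int * int * int

(* Append n as 4 big-endian bytes. Fields are assumed to lie in
   0 .. 0x3fffffff so that they also fit a 31-bit OCaml int. *)
let put_int b n =
  if n < 0 || n > 0x3fffffff then invalid_arg "put_int: out of range";
  for shift = 3 downto 0 do
    Buffer.add_char b (Char.chr ((n lsr (8 * shift)) land 0xff))
  done

(* Read 4 big-endian bytes starting at offset off. *)
let get_int s off =
  let byte i = Char.code s.[off + i] in
  (byte 0 lsl 24) lor (byte 1 lsl 16) lor (byte 2 lsl 8) lor byte 3

(* Layout: one tag byte ('I', 'D' or 'M'), then the fields in order. *)
let encode (e : edit) : string =
  let b = Buffer.create 13 in
  (match e with
   | Ins (x, y) -> Buffer.add_char b 'I'; put_int b x; put_int b y
   | Del (x, y) -> Buffer.add_char b 'D'; put_int b x; put_int b y
   | Mov (x, y, z) ->
     Buffer.add_char b 'M'; put_int b x; put_int b y; put_int b z);
  Buffer.contents b

(* Reject anything whose tag or length does not match the layout. *)
let decode (s : string) : edit option =
  if String.length s = 0 then None
  else
    match s.[0], String.length s with
    | 'I', 9 -> Some (Ins (get_int s 1, get_int s 5))
    | 'D', 9 -> Some (Del (get_int s 1, get_int s 5))
    | 'M', 13 -> Some (Mov (get_int s 1, get_int s 5, get_int s 9))
    | _ -> None
```

Because the byte order and field widths are fixed in the code rather than inherited from the runtime, the bytes mean the same thing on any architecture and compiler version, at the cost of writing and maintaining the codec by hand.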
* Re: [Caml-list] picking / marshaling to strings in ocaml-revision-stable way
From: Yaron Minsky @ 2008-05-31 17:06 UTC
To: Caml-list List

If you're willing to sacrifice readability for speed and compactness, you might want to consider Jane Street's bin-prot library as well...

Yaron Minsky
Thread overview: 16+ messages

2008-05-31  6:43 picking / marshaling to strings in ocaml-revision-stable way Luca de Alfaro
2008-05-31  7:24 ` [Caml-list] " asmadeus77
2008-05-31  8:43 ` Jacques Garrigue
2008-05-31  9:38   ` Berke Durak
2008-05-31 16:54     ` Luca de Alfaro
2008-05-31 17:00       ` Robert Fischer
2008-05-31 17:24         ` Luca de Alfaro
2008-05-31 22:18           ` Martin Jambon
2008-05-31 17:25         ` blue storm
2008-05-31 21:34         ` Berke Durak
2008-05-31 22:51           ` Stefano Zacchiroli
2008-06-02  9:04             ` Berke Durak
2008-06-02  9:21               ` Stefano Zacchiroli
2008-06-01 11:14           ` Martin Jambon
2008-06-02 11:13         ` Richard Jones
2008-05-31 17:06       ` Yaron Minsky