* [Caml-list] String Problem
From: Thomas Fischbacher @ 2004-06-09 9:58 UTC
To: caml-list

Dear Caml hackers,

I am just doing some quite large (string theory) calculation which
basically runs through a huge tree and does some computation at
every node in ocaml which I have to parallelize in an effective
way. My present approach is to set an alarm for the process doing
the calculation, then splitting into chunks and serializing all the
work that corresponds to nodes that have been touched but for which
the calculation has not yet been finished. The serialized strings
are then compressed and sent out via the net to other machines to
help with the calculation.

I'd love to avoid temporary files, as these are not necessary, and
my design is simpler and cleaner without having to worry about
filesystem issues.

Now I encounter the problem that ocaml can only serialize to
strings, but these are limited to 16 MB in size. If my data set
(which is structured in a complicated way, i.e. it would be quite
some effort to write specialized readers and printers) gets large
enough, this entire approach therefore breaks down.

So, would it be that much of a problem to take the length
information for strings out of the type word (I suppose that's the
problem here) and use a proper 32-bit quantity on 32-bit machines
here? I simply cannot believe it's not many more people experiencing
similar difficulties with this 16 MB limitation on string lengths.
-- 
regards,               tf@cip.physik.uni-muenchen.de              (o_
 Thomas Fischbacher -  http://www.cip.physik.uni-muenchen.de/~tf  //\
(lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y)           V_/_
(if (= x 0) y (g g (- x 1) (* x y)))) n 1))                  (Debian GNU)

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
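The 16 MB figure follows from the block header layout on 32-bit systems. The following is a minimal sketch of the arithmetic, assuming 22 bits of the 32-bit header word hold the block size in words; the constant names are made up for illustration and are not part of any library.

```ocaml
(* Assumption: on a 32-bit system, 22 bits of the block header store the
   block size, counted in 4-byte words. *)
let wosize_bits_32 = 22
let bytes_per_word_32 = 4

(* Largest string on such a system: the maximal block size in bytes,
   minus the one byte that is always reserved for padding. *)
let max_string_length_32 =
  ((1 lsl wosize_bits_32) - 1) * bytes_per_word_32 - 1

let () =
  (* 16777211 bytes, i.e. just under 16 MB *)
  Printf.printf "32-bit limit: %d bytes\n" max_string_length_32;
  (* The limit of the system this runs on, as reported by the runtime *)
  Printf.printf "this system:  %d bytes\n" Sys.max_string_length
```

On a 32-bit build the two printed numbers should agree; on a 64-bit build the second is far larger.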
* Re: [Caml-list] String Problem
From: Olivier Andrieu @ 2004-06-09 10:35 UTC
To: Thomas.Fischbacher; +Cc: caml-list

 Thomas Fischbacher [Wed, 9 Jun 2004]:
 > 
 > Dear Caml hackers,
 > 
 > I am just doing some quite large (string theory) calculation which
 > basically runs through a huge tree and does some computation at
 > every node in ocaml which I have to parallelize in an effective
 > way. My present approach is to set an alarm for the process doing
 > the calculation, then splitting into chunks and serializing all the
 > work that corresponds to nodes that have been touched but for which
 > the calculation has not yet been finished. The serialized strings
 > are then compressed and sent out via the net to other machines to
 > help with the calculation.
 > 
 > I'd love to avoid temporary files, as these are not necessary, and
 > my design is simpler and cleaner without having to worry about
 > filesystem issues.
 > 
 > Now I encounter the problem that ocaml can only serialize to
 > strings, but these are limited to 16 MB in size. If my data set
 > (which is structured in a complicated way, i.e. it would be quite
 > some effort to write specialized readers and printers) gets large
 > enough, this entire approach therefore breaks down.

It's quite easy to serialize to a Bigarray with a bit of C code
(warning, not tested):

,----
| #include "intext.h"
| #include "bigarray.h"
| 
| /* Marshal v into a malloc'ed buffer and wrap that buffer in a
|    one-dimensional Bigarray of unsigned bytes. */
| CAMLprim value ml_marshal_to_bigarray(value v, value flags)
| {
|   char *buf;
|   long len;
|   output_value_to_malloc(v, flags, &buf, &len);
|   /* BIGARRAY_MANAGED: the buffer is freed when the Bigarray is collected */
|   return alloc_bigarray(BIGARRAY_UINT8 | BIGARRAY_C_LAYOUT | BIGARRAY_MANAGED,
|                         1, buf, &len);
| }
| 
| /* Rebuild a value from the marshalled bytes held in the Bigarray. */
| CAMLprim value ml_demarshal_from_bigarray(value b)
| {
|   struct caml_bigarray *b_arr = Bigarray_val(b);
|   return input_value_from_block(b_arr->data, b_arr->dim[0]);
| }
`----

,----
| open Bigarray
| 
| external marshal_to_bigarray :
|   'a -> Marshal.extern_flags list ->
|   (char, int8_unsigned_elt, c_layout) Array1.t
|   = "ml_marshal_to_bigarray"
| 
| external demarshal_from_bigarray :
|   (char, int8_unsigned_elt, c_layout) Array1.t -> 'a
|   = "ml_demarshal_from_bigarray"
`----

Alternatively, buy a 64 bits computer :)

-- 
   Olivier
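A buffer of the Bigarray type used in the externals above still has to be handed to the network in pieces that fit into ordinary byte sequences. The following is a rough sketch of that step in plain OCaml. The names `buffer`, `chunk_to_bytes` and `iter_chunks` are hypothetical, and the callback `f` stands in for whatever compresses or sends a chunk.

```ocaml
open Bigarray

(* Same element kind and layout as in the external declarations. *)
type buffer = (char, int8_unsigned_elt, c_layout) Array1.t

(* Copy [len] bytes starting at [pos] out of the Bigarray into fresh Bytes. *)
let chunk_to_bytes (buf : buffer) pos len =
  Bytes.init len (fun i -> Array1.get buf (pos + i))

(* Call [f] on successive chunks of at most [chunk_size] bytes, in order. *)
let iter_chunks ~chunk_size (f : Bytes.t -> unit) (buf : buffer) =
  if chunk_size <= 0 then invalid_arg "iter_chunks: chunk_size";
  let total = Array1.dim buf in
  let rec loop pos =
    if pos < total then begin
      let len = min chunk_size (total - pos) in
      f (chunk_to_bytes buf pos len);
      loop (pos + len)
    end
  in
  loop 0
```

Choosing a `chunk_size` well below `Sys.max_string_length` keeps every chunk within the string limit regardless of how large the Bigarray is.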
* Re: [Caml-list] String Problem
From: Thomas Fischbacher @ 2004-06-09 11:05 UTC
To: Olivier Andrieu; +Cc: caml-list

On Wed, 9 Jun 2004, Olivier Andrieu wrote:

> It's quite easy to serialize to a Bigarray with a bit of C code
> (warning, not tested):

At least the language should provide out-of-the-box support for this.

> Alternatively, buy a 64 bits computer :)

How funny. The only reason why I am doing this in Ocaml is that I want
to be able to abuse a few of the windows boxen here to help with the
calculation. Otherwise, I'd have done it in LISP.

-- 
regards,               tf@cip.physik.uni-muenchen.de              (o_
 Thomas Fischbacher -  http://www.cip.physik.uni-muenchen.de/~tf  //\
(lambda (n) ((lambda (p q r) (p p q r)) (lambda (g x y)           V_/_
(if (= x 0) y (g g (- x 1) (* x y)))) n 1))                  (Debian GNU)
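Without any C, the standard library alone can sidestep the limit in one special case: when the pending work is a list of independent items, it can be marshalled in batches so that no single string has to hold everything. This is only a sketch under that assumption, not what either poster proposed. The function names are hypothetical, and the batch size has to be picked so that each marshalled batch stays below `Sys.max_string_length`.

```ocaml
(* Take up to [k] items from the front of a list; return them in order
   together with the remainder. *)
let rec take k acc = function
  | x :: tl when k > 0 -> take (k - 1) (x :: acc) tl
  | rest -> (List.rev acc, rest)

(* Split a list into batches of at most [n] items each. *)
let rec batches n items =
  if n <= 0 then invalid_arg "batches: n must be positive";
  match items with
  | [] -> []
  | _ ->
    let batch, rest = take n [] items in
    batch :: batches n rest

(* Marshal each batch into its own string. Sharing between items that
   end up in different batches is not preserved. *)
let serialize_in_batches n items =
  List.map (fun batch -> Marshal.to_string batch []) (batches n items)

(* Inverse: unmarshal every string and concatenate the batches.
   Marshal is not type-safe, so the caller must know the item type. *)
let deserialize_batches (strings : string list) : 'a list =
  List.concat
    (List.map (fun s -> (Marshal.from_string s 0 : 'a list)) strings)
```

Each string produced this way can be compressed and sent on its own, and the receiver rebuilds the list with `deserialize_batches`.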