* [Caml-list] Gripes with array @ 2004-09-09 2:10 Jon Harrop 2004-09-09 5:08 ` Ville-Pertti Keinonen ` (3 more replies) 0 siblings, 4 replies; 31+ messages in thread From: Jon Harrop @ 2004-09-09 2:10 UTC (permalink / raw) To: caml-list I'm increasingly finding the outrageously finite size limit of arrays to be a pain. In particular, I'm peeved that the size limit is itself a function of the type which, therefore, makes writing polymorphic functions over arrays nay-on impossible (e.g. to make an array of maximum-sized arrays). Can anything be done about this? Am I right in thinking that the maximum non-float array size on a 64-bit machine is 18,014,398,509,481,983? Also, can Array.init be made to fill the elements only once? This would make quite a few things twice as fast (Indeed, I'd always assumed that this was the point of having Array.init, having read some of Skaller's previous ramblings ;-). Array.copy could then be written more succinctly and efficiently in terms of Array.init as: let copy a = init (length a) (fun i -> a.(i)) Does anyone have any pointers to information about the origin of the size limit for arrays? I assume it is something to do with the garbage collector using a fixed-size tag instead of a variable-size one but I'd be interested in the details. Cheers, Jon. ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 2:10 [Caml-list] Gripes with array Jon Harrop @ 2004-09-09 5:08 ` Ville-Pertti Keinonen 2004-09-09 7:17 ` Jean-Christophe Filliatre ` (2 subsequent siblings) 3 siblings, 0 replies; 31+ messages in thread From: Ville-Pertti Keinonen @ 2004-09-09 5:08 UTC (permalink / raw) To: Jon Harrop; +Cc: caml-list Jon Harrop wrote: >Does anyone have any pointers to information about the origin of the size >limit for arrays? I assume it is something to do with the garbage collector >using a fixed-size tag instead of a variable-size one but I'd be interested >in the details. > > You're correct, see byterun/mlvalues.h for details. I suspect variable-length tags would probably increase complexity and decrease performance considerably...and personally I hope we'll all be migrating to 64-bit architectures soon, anyhow. ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 2:10 [Caml-list] Gripes with array Jon Harrop 2004-09-09 5:08 ` Ville-Pertti Keinonen @ 2004-09-09 7:17 ` Jean-Christophe Filliatre 2004-09-09 8:23 ` Richard Jones 2004-09-09 9:37 ` Damien Doligez 2004-09-10 23:48 ` brogoff 3 siblings, 1 reply; 31+ messages in thread From: Jean-Christophe Filliatre @ 2004-09-09 7:17 UTC (permalink / raw) To: Jon Harrop; +Cc: caml-list Jon Harrop writes: > > Does anyone have any pointers to information about the origin of the size > limit for arrays? I assume it is something to do with the garbage collector > using a fixed-size tag instead of a variable-size one but I'd be interested > in the details. In ocaml sources, the file byterun/mlvalues.h gives all details about the block header structure. There you can see that, on 32 bits architecture, the block size (in words) is stored on 22 bits. And indeed Sys.max_array_length is equal to 2^22-1. But I must agree with you: this is definitely too small and we could imagine that, when the tag says a block is an array, the size is stored within the first (or the last) field instead. -- Jean-Christophe Filliâtre ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 7:17 ` Jean-Christophe Filliatre @ 2004-09-09 8:23 ` Richard Jones 2004-09-09 9:08 ` Olivier Andrieu 2004-09-09 10:42 ` Gerd Stolpmann 0 siblings, 2 replies; 31+ messages in thread From: Richard Jones @ 2004-09-09 8:23 UTC (permalink / raw) Cc: caml-list [-- Attachment #1: Type: text/plain, Size: 1668 bytes --] On Thu, Sep 09, 2004 at 09:17:25AM +0200, Jean-Christophe Filliatre wrote: > > Jon Harrop writes: > > > > Does anyone have any pointers to information about the origin of the size > > limit for arrays? I assume it is something to do with the garbage collector > > using a fixed-size tag instead of a variable-size one but I'd be interested > > in the details. > > In ocaml sources, the file byterun/mlvalues.h gives all details about > the block header structure. There you can see that, on 32 bits > architecture, the block size (in words) is stored on 22 bits. And > indeed Sys.max_array_length is equal to 2^22-1. > > But I must agree with you: this is definitely too small and we could > imagine that, when the tag says a block is an array, the size is > stored within the first (or the last) field instead. I have a similar problem with the maximum size of strings. In practical terms, it limits the size of file uploads to COCANWIKI to around 6 MB (ie., not very much) [not the full 16 MB because of character escaping, but even 16 MB would be far too small]. Does the tag field need to be so wide? What does the tag mean if it has different values < No_scan_tag (251)? Agree with the comment about us all migrating to 64 bit architectures very soon, although I've been waiting to do this since around '92. Rich. -- Richard Jones. http://www.annexia.org/ http://www.j-london.com/ Merjis Ltd. http://www.merjis.com/ - improving website return on investment MOD_CAML lets you run type-safe Objective CAML programs inside the Apache webserver. http://www.merjis.com/developers/mod_caml/ [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 8:23 ` Richard Jones @ 2004-09-09 9:08 ` Olivier Andrieu 2004-09-09 12:08 ` Basile Starynkevitch [local] 2004-09-09 10:42 ` Gerd Stolpmann 1 sibling, 1 reply; 31+ messages in thread From: Olivier Andrieu @ 2004-09-09 9:08 UTC (permalink / raw) To: rich; +Cc: caml-list Richard Jones [Thu, 9 Sep 2004]: > On Thu, Sep 09, 2004 at 09:17:25AM +0200, Jean-Christophe Filliatre wrote: > > > > Jon Harrop writes: > > > > > > Does anyone have any pointers to information about the origin of the size > > > limit for arrays? I assume it is something to do with the garbage collector > > > using a fixed-size tag instead of a variable-size one but I'd be interested > > > in the details. > > > > In ocaml sources, the file byterun/mlvalues.h gives all details about > > the block header structure. There you can see that, on 32 bits > > architecture, the block size (in words) is stored on 22 bits. And > > indeed Sys.max_array_length is equal to 2^22-1. > > > > But I must agree with you: this is definitely too small and we could > > imagine that, when the tag says a block is an array, the size is > > stored within the first (or the last) field instead. > > I have a similar problem with the maximum size of strings. In > practical terms, it limits the size of file uploads to COCANWIKI to > around 6 MB (ie., not very much) [not the full 16 MB because of > character escaping, but even 16 MB would be far too small]. You can use Bigarrays: open Bigarray type bigstring = (char, int8_unsigned_elt, c_layout) Array1.t all you need is to write some blitting functions for conversions to and from regular strings. > Does the tag field need to be so wide? What does the tag mean if it > has different values < No_scan_tag (251)? it's for variants (with or without arguments) -- Olivier ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 9:08 ` Olivier Andrieu @ 2004-09-09 12:08 ` Basile Starynkevitch [local] 2004-09-09 12:31 ` Damien Doligez 0 siblings, 1 reply; 31+ messages in thread From: Basile Starynkevitch [local] @ 2004-09-09 12:08 UTC (permalink / raw) To: caml-list On Thu, Sep 09, 2004 at 11:08:50AM +0200, Olivier Andrieu wrote: > Richard Jones [Thu, 9 Sep 2004]: > > On Thu, Sep 09, 2004 at 09:17:25AM +0200, Jean-Christophe Filliatre wrote: > > > > > > Jon Harrop writes: > > > > > > > > Does anyone have any pointers to information about the origin of the size > > > > limit for arrays? [....] > > > > > > In ocaml sources, the file byterun/mlvalues.h gives all details about > > > the block header structure. [....] > > > > > > But I must agree with you: this is definitely too small and we could > > > imagine that, when the tag says a block is an array, the size is > > > stored within the first (or the last) field instead. I do agree that using Bigarrays is the way to go, until everyone has a 64 bits machine. For completeness, we could also consider the following scheme (which I am NOT volunteering to implement!) The header layout remains the same (so only 22 bits for size on 32 bits machine), but if the size is all bit ones, the block is actually a fixed block, and the real array size is the word before the header. However, I see another potential problem (which could already potentially appear today, with standard 0caml 3.08, on 64 bits machines with more than 1Gbyte of RAM). If you have a huge array of pointers, the garbage collector (even the minor one) has to scan the full array - and this scan is "atomic" in the sense that it is not interruptible (and I believe that designing a GC which incrementally scans big values by chunks is not trivial, given Ocaml GC performance needs). So for an array of say 300 million pointers, the GC has to scan it, which takes a significant time (I would guess several tenths of seconds at least, just scanning this single array). I am asking, do lucky people with a 64 bits machines and plenty of RAM did experiment some bizarre GC behavior when handling such monster pointer arrays in Ocaml - for example, let monster = Array.init 300_000_000 (fun i -> ((Printf.sprintf "int%d" i), i)) More practically, I would be curious to hear from people having run Ocaml programs (on rather big 64 bits hardware, with multigigabyte RAM) in processes of more than a gigabyte of Ocaml heap! Does the GC works well in its default setting, or have they to tune it? > > I have a similar problem with the maximum size of strings. In > > practical terms, it limits the size of file uploads to COCANWIKI to > > around 6 MB (ie., not very much) [not the full 16 MB because of > > character escaping, but even 16 MB would be far too small]. [....] FWIW, I had similar limitations in Poesia more than a year ago. I solved it by specifying that Poesia (a web filter) won't handle web content of more than ten million bytes. (Maybe an enhanced buffer package, representing buffer contents in array of strings, could help). > > > Does the tag field need to be so wide? What does the tag mean if it > > has different values < No_scan_tag (251)? > > it's for variants (with or without arguments) Apparently, people are less bitten by the maximal number of variants. I guess that most applications don't have big sum (ie variant) types with more than a hundred of non-empty choices (ie Choice of ... construct). Regarding very big data structures, I tend to believe that they should be more organized than just a single monster array (hence the current array limits on 32 bits machine is a sensible tradeoff), even on 64 bits irons. But I have no practical experience on these. -- Basile STARYNKEVITCH -- basile dot starynkevitch at inria dot fr Project cristal.inria.fr - temporarily http://cristal.inria.fr/~starynke --- all opinions are only mine ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 12:08 ` Basile Starynkevitch [local] @ 2004-09-09 12:31 ` Damien Doligez 0 siblings, 0 replies; 31+ messages in thread From: Damien Doligez @ 2004-09-09 12:31 UTC (permalink / raw) To: caml users On Sep 9, 2004, at 14:08, Basile Starynkevitch [local] wrote: > The header layout remains the same (so only 22 bits for size on 32 > bits machine), but if the size is all bit ones, the block is > actually a fixed block, and the real array size is the word before > the header. The real size would have to be after the header, not before it, because you cannot store anything before the header. > If you have a huge array of > pointers, the garbage collector (even the minor one) has to scan the > full array - and this scan is "atomic" in the sense that it is not > interruptible (and I believe that designing a GC which incrementally > scans big values by chunks is not trivial, given Ocaml GC performance > needs). The minor GC will never scan such a huge array because you won't find it in the minor heap. What you say is still true of the major GC. > Regarding very big data structures, I tend to believe that they should > be more organized than just a single monster array (hence the current > array limits on 32 bits machine is a sensible tradeoff), even on 64 > bits irons. But I have no practical experience on these. I fixed a bug in a camlp4 lexer recently, where it needed an (extensible) array larger than 16 million entries. Implementing an array of arrays increased the code size by as much as 6 lines... -- Damien ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 8:23 ` Richard Jones 2004-09-09 9:08 ` Olivier Andrieu @ 2004-09-09 10:42 ` Gerd Stolpmann 1 sibling, 0 replies; 31+ messages in thread From: Gerd Stolpmann @ 2004-09-09 10:42 UTC (permalink / raw) To: Richard Jones; +Cc: caml-list Am Don, 2004-09-09 um 10.23 schrieb Richard Jones: > On Thu, Sep 09, 2004 at 09:17:25AM +0200, Jean-Christophe Filliatre wrote: > > > > Jon Harrop writes: > > > > > > Does anyone have any pointers to information about the origin of the size > > > limit for arrays? I assume it is something to do with the garbage collector > > > using a fixed-size tag instead of a variable-size one but I'd be interested > > > in the details. > > > > In ocaml sources, the file byterun/mlvalues.h gives all details about > > the block header structure. There you can see that, on 32 bits > > architecture, the block size (in words) is stored on 22 bits. And > > indeed Sys.max_array_length is equal to 2^22-1. > > > > But I must agree with you: this is definitely too small and we could > > imagine that, when the tag says a block is an array, the size is > > stored within the first (or the last) field instead. > > I have a similar problem with the maximum size of strings. In > practical terms, it limits the size of file uploads to COCANWIKI to > around 6 MB (ie., not very much) [not the full 16 MB because of > character escaping, but even 16 MB would be far too small]. You can switch to OCamlnet as base library which does not have this limitation. One can directly write the uploaded data to disk. Gerd -- ------------------------------------------------------------ Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany gerd@gerd-stolpmann.de http://www.gerd-stolpmann.de ------------------------------------------------------------ ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 2:10 [Caml-list] Gripes with array Jon Harrop 2004-09-09 5:08 ` Ville-Pertti Keinonen 2004-09-09 7:17 ` Jean-Christophe Filliatre @ 2004-09-09 9:37 ` Damien Doligez 2004-09-09 10:34 ` Jean-Christophe Filliatre ` (2 more replies) 2004-09-10 23:48 ` brogoff 3 siblings, 3 replies; 31+ messages in thread From: Damien Doligez @ 2004-09-09 9:37 UTC (permalink / raw) To: caml users On Sep 9, 2004, at 04:10, Jon Harrop wrote: > I'm increasingly finding the outrageously finite size limit of arrays > to be a > pain. How I would like to get rid of that limit! > Am I right in thinking that the maximum > non-float array size on a 64-bit machine is 18,014,398,509,481,983? That's correct. It's a good thing that 32-bitters are on their way out. > Also, can Array.init be made to fill the elements only once? No, that's impossible without breaking the GC invariants. > This would make quite a few things twice as fast Twice? I doubt it very much. > Array.copy could then be written more succinctly and > efficiently in terms of Array.init as: > > let copy a = init (length a) (fun i -> a.(i)) Exactly how it's written now, except that it's inlined by hand for performance reasons. > Does anyone have any pointers to information about the origin of the > size > limit for arrays? I assume it is something to do with the garbage > collector > using a fixed-size tag instead of a variable-size one but I'd be > interested > in the details. Yes, but the main use of the tag is not garbage collection. On Sep 9, 2004, at 09:17, Jean-Christophe Filliatre wrote: > But I must agree with you: this is definitely too small and we could > imagine that, when the tag says a block is an array, the size is > stored within the first (or the last) field instead. The last, really? On Sep 9, 2004, at 10:23, Richard Jones wrote: > I have a similar problem with the maximum size of strings. In > practical terms, it limits the size of file uploads to COCANWIKI to > around 6 MB (ie., not very much) [not the full 16 MB because of > character escaping, but even 16 MB would be far too small]. Maybe you should use bigarrays instead of strings. A general remark: for more details on the internal representation of O'Caml values, you can read < http://caml.inria.fr/ocaml/htmlman/manual032.html >. -- Damien ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 9:37 ` Damien Doligez @ 2004-09-09 10:34 ` Jean-Christophe Filliatre 2004-09-09 12:15 ` Igor Pechtchanski 2004-09-09 13:01 ` Brian Hurt 2004-09-09 16:58 ` [Caml-list] Gripes with array Jon Harrop 2 siblings, 1 reply; 31+ messages in thread From: Jean-Christophe Filliatre @ 2004-09-09 10:34 UTC (permalink / raw) To: Damien Doligez; +Cc: caml users Damien Doligez writes: > > > But I must agree with you: this is definitely too small and we could > > imagine that, when the tag says a block is an array, the size is > > stored within the first (or the last) field instead. > > The last, really? How stupid I am! I was thinking of not adding an extra addition to the array access, keeping the first array element at field 0 but it is of course ridiculous. -- Jean-Christophe ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 10:34 ` Jean-Christophe Filliatre @ 2004-09-09 12:15 ` Igor Pechtchanski 0 siblings, 0 replies; 31+ messages in thread From: Igor Pechtchanski @ 2004-09-09 12:15 UTC (permalink / raw) To: Jean-Christophe Filliatre; +Cc: caml-list On Thu, 9 Sep 2004, Jean-Christophe Filliatre wrote: > Damien Doligez writes: > > > > > But I must agree with you: this is definitely too small and we could > > > imagine that, when the tag says a block is an array, the size is > > > stored within the first (or the last) field instead. > > > > The last, really? > > How stupid I am! I was thinking of not adding an extra addition to the > array access, keeping the first array element at field 0 but it is of > course ridiculous. I believe the usual solution for this is to store the array size at negative offset from the array header, but that changes the heap layout slightly, and affects the GC logic, among other things... Igor -- http://cs.nyu.edu/~pechtcha/ |\ _,,,---,,_ pechtcha@cs.nyu.edu ZZZzz /,`.-'`' -. ;-;;,_ igor@watson.ibm.com |,4- ) )-,_. ,\ ( `'-' Igor Pechtchanski, Ph.D. '---''(_/--' `-'\_) fL a.k.a JaguaR-R-R-r-r-r-.-.-. Meow! "Happiness lies in being privileged to work hard for long hours in doing whatever you think is worth doing." -- Dr. Jubal Harshaw ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 9:37 ` Damien Doligez 2004-09-09 10:34 ` Jean-Christophe Filliatre @ 2004-09-09 13:01 ` Brian Hurt 2004-09-09 20:08 ` [Caml-list] 32-bit is sticking around Brandon J. Van Every 2004-09-09 16:58 ` [Caml-list] Gripes with array Jon Harrop 2 siblings, 1 reply; 31+ messages in thread From: Brian Hurt @ 2004-09-09 13:01 UTC (permalink / raw) To: Damien Doligez; +Cc: caml users On Thu, 9 Sep 2004, Damien Doligez wrote: > That's correct. It's a good thing that 32-bitters are on their way out. Absent the x86, 32-bitters are already out on servers. They're on their way out on desktops. But they are still huge in the embedded world, and likely to remain so for quite some time. -- "Usenet is like a herd of performing elephants with diarrhea -- massive, difficult to redirect, awe-inspiring, entertaining, and a source of mind-boggling amounts of excrement when you least expect it." - Gene Spafford Brian ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* [Caml-list] 32-bit is sticking around 2004-09-09 13:01 ` Brian Hurt @ 2004-09-09 20:08 ` Brandon J. Van Every 2004-09-09 21:04 ` Jon Harrop 0 siblings, 1 reply; 31+ messages in thread From: Brandon J. Van Every @ 2004-09-09 20:08 UTC (permalink / raw) To: caml-list Brian Hurt wrote: > > Absent the x86, 32-bitters are already out on servers. > They're on their > way out on desktops. But they are still huge in the embedded > world, and likely to remain so for quite some time. Saying that 32-bit is "on the way out" on the desktop is tremendous hyperbole. What's really happening is soon we'll have a forward path on the x86 desktop *to* 64-bit. That ain't here yet in volume, and the whole point of the AMD architecture is assuring backwards compatibility for 32-bit. It's going to be 5 years before 64-bit is happening on the desktop in any dominant quantity, and 10 years before 32-bit actually goes away. To think otherwise is just moooing about what you want to happen rather than what is actually going to happen and has always happened. Consider that only in Longhorn will Microsoft finally kill *16*-bit! You guys can all dream on about your 64-bit machines. I mastered 64-bit on the DEC Alpha in 1996. Then Compaq and Intel killed it. I'm still sad about that, and I hate Intel ASM. The register poverty *sucks* ! The x87 FPU stack *sucks* ! SSE *sucks* ! Intel has always won by getting shoddy products to market quickly, never by producing anything that's any good. They don't call it "the Wintel hegemony" for nuthin', it's the hardware equivalent of Microsoft's business model. Cheers, www.indiegamedesign.com Brand*n Van Every S*attle, WA Praise Be to the caml-list Bayesian filter! It blesseth my postings, it is evil crap! evil crap! Bigarray! Unboxed overhead group! Wondering! chant chant chant... Is my technical content showing? // return an array of 100 packed tuples temps int $[tvar0][2*100]; // what the c function needs value $[tvar1]; // one int value $[tvar2]; // one tuple int $[tvar3] // loop control var oncePre eachPre $[cvar0]=&($[tvar0][0]); eachPost $[lvar0] = alloc(2*100, 0 /*NB: zero-tagged block*/ ); for(int $[tvar3]=0;$[tvar3]<100;$[tvar3]++) { $[tvar2] = alloc_tuple(2); $[tvar1] = Val_int($[cvar0][0+2*$[tvar3]]); Store_field($[tvar2],0,$[tvar1]); $[tvar1] = Val_int($[cvar0][1]); Store_field($[tvar2],1,$[tvar1+2*$[tvar3]]); Array_store($[lvar0],$[tvar3],$[tvar0]); } oncePost ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] 32-bit is sticking around 2004-09-09 20:08 ` [Caml-list] 32-bit is sticking around Brandon J. Van Every @ 2004-09-09 21:04 ` Jon Harrop 2004-09-11 15:30 ` Lars Nilsson 0 siblings, 1 reply; 31+ messages in thread From: Jon Harrop @ 2004-09-09 21:04 UTC (permalink / raw) To: Brandon J. Van Every; +Cc: caml-list On Thursday 09 September 2004 21:08, Brandon J. Van Every wrote: > Intel has always won by getting shoddy products to market quickly, never by > producing anything that's any good. StrongARM? Cheers, Jon. ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] 32-bit is sticking around 2004-09-09 21:04 ` Jon Harrop @ 2004-09-11 15:30 ` Lars Nilsson 2004-09-11 16:24 ` [off topic] " David MENTRE [not found] ` <200409111656.11952.jon@jdh30.plus.com> 0 siblings, 2 replies; 31+ messages in thread From: Lars Nilsson @ 2004-09-11 15:30 UTC (permalink / raw) To: caml-list On Thu, 9 Sep 2004 22:04:02 +0100, Jon Harrop <jon@jdh30.plus.com> wrote: > On Thursday 09 September 2004 21:08, Brandon J. Van Every wrote: > > Intel has always won by getting shoddy products to market quickly, never by > > producing anything that's any good. > > StrongARM? > > Cheers, > Jon. You mean the technology they bought instead of develop it? http://en.wikipedia.org/wiki/StrongARM Lars ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* [off topic] Re: [Caml-list] 32-bit is sticking around 2004-09-11 15:30 ` Lars Nilsson @ 2004-09-11 16:24 ` David MENTRE 2004-09-11 17:52 ` Lars Nilsson [not found] ` <200409111656.11952.jon@jdh30.plus.com> 1 sibling, 1 reply; 31+ messages in thread From: David MENTRE @ 2004-09-11 16:24 UTC (permalink / raw) To: Lars Nilsson; +Cc: caml-list Lars Nilsson <chamaeleon@gmail.com> writes: > You mean the technology they bought instead of develop it? > > http://en.wikipedia.org/wiki/StrongARM And you haven't seen all the gory details (e.g. doubles are stored in little endian but long long in big endian). :) Yours, d. -- David Mentré <dmentre@linux-france.org> ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [off topic] Re: [Caml-list] 32-bit is sticking around 2004-09-11 16:24 ` [off topic] " David MENTRE @ 2004-09-11 17:52 ` Lars Nilsson 0 siblings, 0 replies; 31+ messages in thread From: Lars Nilsson @ 2004-09-11 17:52 UTC (permalink / raw) To: caml-list On Sat, 11 Sep 2004 18:24:37 +0200, David MENTRE <dmentre@linux-france.org> wrote: > Lars Nilsson <chamaeleon@gmail.com> writes: > > > You mean the technology they bought instead of develop it? > > > > http://en.wikipedia.org/wiki/StrongARM > > And you haven't seen all the gory details (e.g. doubles are stored in > little endian but long long in big endian). :) > > Yours, > d. Was it necessary to bring up the achilles heal of the ARM family (floating point)? :) Lars ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
[parent not found: <200409111656.11952.jon@jdh30.plus.com>]
* Re: [Caml-list] 32-bit is sticking around [not found] ` <200409111656.11952.jon@jdh30.plus.com> @ 2004-09-11 17:47 ` Lars Nilsson 0 siblings, 0 replies; 31+ messages in thread From: Lars Nilsson @ 2004-09-11 17:47 UTC (permalink / raw) To: caml-list On Sat, 11 Sep 2004 16:56:11 +0100, Jon Harrop <jon@jdh30.plus.com> wrote: > On Saturday 11 September 2004 16:30, Lars Nilsson wrote: > > > StrongARM? > > > > You mean the technology they bought instead of develop it? > > Intel have developed it quite substantially, IMHO. AFAIK, nobody else has made > such a superscalar/deeply-pipelined CPU with such a conditional instruction > set before, and that's not easy. > > Cheers, > Jon. Sure, they have probably done quite a bit of innovation with it by now. All I know is that back in 1987-88 I was happily doing assembly GUI programming just because it (ARM) was so darn easy to work with, while x86 made my stomach turn. Lars ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 9:37 ` Damien Doligez 2004-09-09 10:34 ` Jean-Christophe Filliatre 2004-09-09 13:01 ` Brian Hurt @ 2004-09-09 16:58 ` Jon Harrop 2004-09-10 5:56 ` Array.init (was [Caml-list] Gripes with array) Christophe Raffalli 2004-09-10 13:45 ` [Caml-list] Gripes with array Damien Doligez 2 siblings, 2 replies; 31+ messages in thread From: Jon Harrop @ 2004-09-09 16:58 UTC (permalink / raw) To: caml users; +Cc: Damien Doligez On Thursday 09 September 2004 10:37, Damien wrote: > ... > > Am I right in thinking that the maximum > > non-float array size on a 64-bit machine is 18,014,398,509,481,983? > > That's correct. It's a good thing that 32-bitters are on their way out. If I upgrade I'll surely end up playing Doom 3 and not get any work done... :-) > > Also, can Array.init be made to fill the elements only once? > > No, that's impossible without breaking the GC invariants. Could it not even be done by dragging Array.init inside the compiler, giving it the same status as Array.make? > > This would make quite a few things twice as fast > > Twice? I doubt it very much. That was an estimate based upon the assumption that, on large arrays (well, ~4M elements ;-), the time taken is limited by the filling of elements which is currently done twice but which only needs to be done once. I believe this is justified because of the high-cost of memory writes (to main memory for out-of-cache sized arrays), even sequential ones, compared to (trivial, inlineable) function calls, a single heap allocation etc. What is the bottleneck in the asymptotic limit? For measurements on 4,000,000 element int arrays (using the code at the end of this mail) I get: Array.make took 0.131528823272 secs. Array.init took 0.311059344899 secs. array_init took 0.179279577732 secs. Measuring memset from C gives me 0.0311secs. So element-setting must be at least 10% of Array.init. Also, the array_init function is surprisingly fast, presumably due to "f" not being inlined into Array.init but being inlined into array_init. This came up because my wavelet transform code in OCaml is within 15% of the performance of my equivalent C version excluding the cost of creating the array. Including that cost (even with calloc), the C version is twice as fast. Admittedly calloc will use memset, and not set the elements "properly", but even so... > > let copy a = init (length a) (fun i -> a.(i)) > > Exactly how it's written now, except that it's inlined by hand for > performance reasons. Array.copy took 0.298557505888 secs. array_copy took 0.315200943696 secs. This optimisation gives a <6% performance improvement (and this is really best-case for large arrays because the filling-function is trivial in this case). I'd have gone for five times less code in the array module and more code in the compiler... ;-) Perhaps the current versions are significantly faster on smaller data structures... Cheers, Jon. ----- let f i = 1+i let array_init l = if l = 0 then [||] else let res = Array.make l (f 0) in for i = 1 to pred l do Array.unsafe_set res i (f i) done; res let array_copy a = Array.init (Array.length a) (fun i -> Array.unsafe_get a i) let time f = let time = Unix.gettimeofday in let t = time () in ignore (f ()); (time ()) -. t let _ = let timings = Array.make 5 (0., 0) in let l = 4000000 in let a = Array.make l 0 in for i=0 to 100 do let entry = Random.int 5 in let t = time (match entry with 0 -> fun () -> Array.make l 0 | 1 -> fun () -> Array.init l f | 2 -> fun () -> array_init l | 3 -> fun () -> Array.copy a | 4 -> fun () -> array_copy a) in timings.(entry) <- let (ot, n) = timings.(entry) in (ot +. t, n+1); done; let entry = [| "Array.make"; "Array.init"; "array_init"; "Array.copy"; "array_copy" |] in for i=0 to 4 do print_endline (entry.(i)^": "^(string_of_int (snd timings.(i)))) done; let timings = Array.map (fun (t, n) -> string_of_float (t /. float_of_int n)) timings in for i=0 to 4 do print_endline (entry.(i)^" took "^timings.(i)^" secs.") done ----- ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Array.init (was [Caml-list] Gripes with array) 2004-09-09 16:58 ` [Caml-list] Gripes with array Jon Harrop @ 2004-09-10 5:56 ` Christophe Raffalli 2004-09-10 8:53 ` Richard Jones 2004-09-13 7:02 ` Christophe Raffalli 2004-09-10 13:45 ` [Caml-list] Gripes with array Damien Doligez 1 sibling, 2 replies; 31+ messages in thread From: Christophe Raffalli @ 2004-09-10 5:56 UTC (permalink / raw) To: Jon Harrop; +Cc: caml users, Damien Doligez [-- Attachment #1: Type: text/plain, Size: 1388 bytes --] > >>>Also, can Array.init be made to fill the elements only once? >> >>No, that's impossible without breaking the GC invariants. > This is not true if the runtime maintains a list of array in construction with the current position of the index. This list will stay rather small, but each addresse read by the GC will have to be checked for membership in the list. If the list is rather small it will be in the cache, but still I am afraid it will slow down noticably the GC. Did someone tried to implement such a list of partially initialized objects in the GC ? You could also lie in the tag about the size of array (if the way the runtime finds free block of memory does not use it). It will cost an increment of integer at each step in the initialisation process which should not be much since the beginning of array may stay in the cache if the initialisation function is simple and this will be neggligeable if not. -- Christophe Raffalli Université de Savoie Batiment Le Chablais, bureau 21 73376 Le Bourget-du-Lac Cedex tél: (33) 4 79 75 81 03 fax: (33) 4 79 75 87 42 mail: Christophe.Raffalli@univ-savoie.fr www: http://www.lama.univ-savoie.fr/~RAFFALLI --------------------------------------------- IMPORTANT: this mail is signed using PGP/MIME At least Enigmail/Mozilla, mutt or evolution can check this signature --------------------------------------------- [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 252 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Array.init (was [Caml-list] Gripes with array) 2004-09-10 5:56 ` Array.init (was [Caml-list] Gripes with array) Christophe Raffalli @ 2004-09-10 8:53 ` Richard Jones 2004-09-10 14:50 ` Damien Doligez 2004-09-13 7:02 ` Christophe Raffalli 1 sibling, 1 reply; 31+ messages in thread From: Richard Jones @ 2004-09-10 8:53 UTC (permalink / raw) Cc: caml users [-- Attachment #1: Type: text/plain, Size: 965 bytes --] On Fri, Sep 10, 2004 at 07:56:16AM +0200, Christophe Raffalli wrote: > > > > >>>Also, can Array.init be made to fill the elements only once? > >> > >>No, that's impossible without breaking the GC invariants. > > You could also lie in the tag about the size of array (if the way the > runtime finds free block of memory does not use it). It will cost an > increment of integer at each step in the initialisation process which > should not be much since the beginning of array may stay in the cache if > the initialisation function is simple and this will be neggligeable if not. Can one set the tag first to Custom_tag, then once array initialization is complete, set the tag again to 0 (or Array_double_tag as appropriate)? Rich. -- Richard Jones. http://www.annexia.org/ http://www.j-london.com/ Merjis Ltd. http://www.merjis.com/ - improving website return on investment http://www.winwinsales.co.uk/ - CRM improvement consultancy [-- Attachment #2: Digital signature --] [-- Type: application/pgp-signature, Size: 189 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Array.init (was [Caml-list] Gripes with array) 2004-09-10 8:53 ` Richard Jones @ 2004-09-10 14:50 ` Damien Doligez 0 siblings, 0 replies; 31+ messages in thread From: Damien Doligez @ 2004-09-10 14:50 UTC (permalink / raw) To: caml users On Sep 10, 2004, at 10:53, Richard Jones wrote: > Can one set the tag first to Custom_tag, then once array > initialization is complete, set the tag again to 0 (or > Array_double_tag as appropriate)? No. If you do that, the pointers that you have already set will not be traced by the GC -> Bus Error. -- Damien ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: Array.init (was [Caml-list] Gripes with array) 2004-09-10 5:56 ` Array.init (was [Caml-list] Gripes with array) Christophe Raffalli 2004-09-10 8:53 ` Richard Jones @ 2004-09-13 7:02 ` Christophe Raffalli 1 sibling, 0 replies; 31+ messages in thread From: Christophe Raffalli @ 2004-09-13 7:02 UTC (permalink / raw) To: Christophe Raffalli; +Cc: Jon Harrop, caml users, Damien Doligez [-- Attachment #1: Type: text/plain, Size: 2454 bytes --] Christophe Raffalli wrote: > >> >>>> Also, can Array.init be made to fill the elements only once? >>> >>> >>> No, that's impossible without breaking the GC invariants. >> >> > > This is not true if the runtime maintains a list of array in > construction with the current position of the index. This list will stay > rather small, but each addresse read by the GC will have to be checked > for membership in the list. If the list is rather small it will be in > the cache, but still I am afraid it will slow down noticably the GC. > > Did someone tried to implement such a list of partially initialized > objects in the GC ? (I do not think it is worth it ?) > You could also lie in the tag about the size of array (if the way the > runtime finds free block of memory does not use it). It will cost an > increment of integer at each step in the initialisation process which > should not be much since the beginning of array may stay in the cache if > the initialisation function is simple and this will be neggligeable if not. > > This second solution is only for non copying GC, so not for OCaml And yet another solution would have to have a tag with the total size + the number of non pointers at the end of the block. This would be usefull not only for Arrays. It just makes again a (too) strong limit on the size of arrays. I am now wondering why the structure of a block does not allow a second tag word (and even a third tag word) for the tag in case of big objects ? A cons cell keeps a tag of only one word (that's still too much :-), but an array (or even only an array of size > 2^n) uses a bigger tag. Moreover this extra tag word could be at address (p-1) if p is the address of the object so that some code like unsafe.get do not need to access it. The Gc and few other function like Array.length, Array.get in the bound check,.. would have an extra test to do, but the GC would have less memory to scan (because it would have to scan not all word in a block). -- Christophe Raffalli Université de Savoie Batiment Le Chablais, bureau 21 73376 Le Bourget-du-Lac Cedex tél: (33) 4 79 75 81 03 fax: (33) 4 79 75 87 42 mail: Christophe.Raffalli@univ-savoie.fr www: http://www.lama.univ-savoie.fr/~RAFFALLI --------------------------------------------- IMPORTANT: this mail is signed using PGP/MIME At least Enigmail/Mozilla, mutt or evolution can check this signature --------------------------------------------- [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 252 bytes --] ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 16:58 ` [Caml-list] Gripes with array Jon Harrop 2004-09-10 5:56 ` Array.init (was [Caml-list] Gripes with array) Christophe Raffalli @ 2004-09-10 13:45 ` Damien Doligez 2004-09-11 1:43 ` skaller 2004-09-11 14:36 ` Jon Harrop 1 sibling, 2 replies; 31+ messages in thread From: Damien Doligez @ 2004-09-10 13:45 UTC (permalink / raw) To: caml users On Sep 9, 2004, at 18:58, Jon Harrop wrote: >>> Also, can Array.init be made to fill the elements only once? >> No, that's impossible without breaking the GC invariants. > > Could it not even be done by dragging Array.init inside the compiler, > giving > it the same status as Array.make? Not without implementing such horrors as Christophe described. I don't much like the idea of introducing lots of bugs while slowing down the whole GC, even for programs that don't use arrays. An intermediate solution would be to make a "Array.unsafe_make" primitive, which would use memset instead of initialising the array properly. If you enter this as a feature wish in the BTS, I'll look into it. > Measuring memset from C gives me 0.0311secs. So element-setting must > be at > least 10% of Array.init. Also, the array_init function is surprisingly > fast, > presumably due to "f" not being inlined into Array.init but being > inlined > into array_init. That would confirm my intuition that the calls to f dominates the initialisation time. -- Damien ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-10 13:45 ` [Caml-list] Gripes with array Damien Doligez @ 2004-09-11 1:43 ` skaller 2004-09-11 3:16 ` skaller 2004-09-11 14:36 ` Jon Harrop 1 sibling, 1 reply; 31+ messages in thread From: skaller @ 2004-09-11 1:43 UTC (permalink / raw) To: Damien Doligez; +Cc: caml users On Fri, 2004-09-10 at 23:45, Damien Doligez wrote: > An intermediate solution would > be to make a "Array.unsafe_make" primitive, which would use memset > instead of initialising the array properly. Yeah but that doesn't solve the problem of filling the array initially with some 'non-binary-zero' value. You'd still need two passes: the memset and the proper fill (hence paging all the memory in twice). AFAICS tracking how much of an array is initialised with an index the GC can use costs a single comparison when you're not initialising arrays. EG: the GC has a list of blocks undergoing initialisation, determining the list is empty should be a single machine instruction. If you're initialising an array without any recursion, that's two comparisons. So the only time there would be a serious impact on the GC would be if you're initialising many big arrays all at once (EG if you're making a matrix as an array of arrays), and that cost would be small compared to scanning twice. I can't predict the performance of the one extra comparison needed when no arrays are being initialised because it probably depends on the processor cache design -- how much can one machine instruction + one memory reference cost? -- John Skaller, mailto:skaller@users.sf.net voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-11 1:43 ` skaller @ 2004-09-11 3:16 ` skaller 0 siblings, 0 replies; 31+ messages in thread From: skaller @ 2004-09-11 3:16 UTC (permalink / raw) To: Damien Doligez; +Cc: caml users On Sat, 2004-09-11 at 11:43, skaller wrote: > On Fri, 2004-09-10 at 23:45, Damien Doligez wrote: > > > An intermediate solution would > > be to make a "Array.unsafe_make" primitive, which would use memset > > instead of initialising the array properly. > > AFAICS tracking how much of an array is initialised > with an index the GC can use costs a single > comparison when you're not initialising arrays. Specifically: I assume there is some GC 'object' containing the state data for the collector. Suppose we add a doubly linked list of the type struct pia { pia *next; pia *prev; caml_heap_block *block; unsigned long index; } *pial; to the GC state data. pial means 'pointer to initialising array list' In order to sweep a block *p for pointers, you'd need to do this: if (pial == NULL) usual_scan(p) else { unsigned long index = find_index(pias,p); if (index) scan_array(p,index)); else usual_scan(p); } usual_scan() is the usual scanning algorithm for a heap block, array_scan is a special one where the initialised block length is tracked. The index is incremented when filling in the array. If the fill is using a fixed value, every 4K elements, if a function, each element. All the scan calls are tail recurisve. I imagine usual_scan and array_scan would be the different entry points to the same routine. Clearly the array initialiser has to be able to add and remove a the pia descriptor from the list. In addition, the compactor would need to adjust the heap pointers in the list. So in the case we're not currently initialising an array, we require one memory fetch, a load of the value 'pial'. Perhaps 'pial' would reside in the cache, even if it does it is displacing one other value. So there is a definite cost for all code with this technique. BTW: the Felix collector has an array index count in every heap header block. The actual block size is independent of this count. As I understand it the block size indicator in Ocaml is *also* used to determine string and array sizes which prevents it being adjusted dynamically during initialisation? This would be in case there is an exception thrown and the array become prematurely unreachable, since otherwise the array can't be disposed until it is initialised, and the disparity between the dynamic and static length would not matter. The static length is required to free the memory. This suggests an alternate solution with zero impact on the sweep: swap the meaning of the length count in the heap block and pial list. This means the 'free' routine that disposes of unreachable heap memory would have to check the list to find the real memory length. -- John Skaller, mailto:skaller@users.sf.net voice: 061-2-9660-0850, snail: PO BOX 401 Glebe NSW 2037 Australia Checkout the Felix programming language http://felix.sf.net ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-10 13:45 ` [Caml-list] Gripes with array Damien Doligez 2004-09-11 1:43 ` skaller @ 2004-09-11 14:36 ` Jon Harrop 2004-09-11 20:53 ` Damien Doligez 1 sibling, 1 reply; 31+ messages in thread From: Jon Harrop @ 2004-09-11 14:36 UTC (permalink / raw) To: caml users On Friday 10 September 2004 14:45, Damien Doligez wrote: > Not without implementing such horrors as Christophe described. I don't > much > like the idea of introducing lots of bugs while slowing down the whole > GC, > even for programs that don't use arrays. Yes indeed. I like Christophe's last idea though: On Friday 10 September 2004 06:56, Christophe Raffalli wrote: > You could also lie in the tag about the size of array (if the way the > runtime finds free block of memory does not use it). It will cost an > increment of integer at each step in the initialisation process which > should not be much since the beginning of array may stay in the cache if > the initialisation function is simple and this will be neggligeable if not. What is the trickiest/most-error-prone part of doing this and could this be used to implement amortised extensible arrays within the compiler? I would like to see such a thing in the compiler (not that I distrust Markus ;-). On Friday 10 September 2004 14:45, Damien Doligez wrote: > An intermediate solution would > be to make a "Array.unsafe_make" primitive, which would use memset > instead > of initialising the array properly. If you enter this as a feature wish > in the BTS, I'll look into it. No, I don't think the performance improvement would justify your time or the loss of safety. Could you add a memset to String.create though? :-) > That would confirm my intuition that the calls to f dominates the > initialisation time. In the case of Array.init, yes. From my array_init, inlining and type-specialisation decrease the time taken by ~43% and, hence, these would be the most productive optimisations. An array_init2 function which specialises the array type but uses an external "f" function betters the time taken by Array.init by ~36% (perhaps not significantly different): let array_init2 l f = if l = 0 then [||] else let x : int = f 0 in let res = Array.make l x in for i = 1 to pred l do Array.unsafe_set res i (f i) done; res I'm not sure what the impact of the type-specialisation on inlining is though. I'd imagine that doing such partial specialisations in ocamlopt is arbitrarily dodgy because you don't get any feedback on the benfits of your results. Perhaps this is a challenge for Basile's JIT? :-) Cheers, Jon. ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-11 14:36 ` Jon Harrop @ 2004-09-11 20:53 ` Damien Doligez 2004-09-12 15:33 ` Jon Harrop 0 siblings, 1 reply; 31+ messages in thread From: Damien Doligez @ 2004-09-11 20:53 UTC (permalink / raw) To: caml users On Sep 11, 2004, at 16:36, Jon Harrop wrote: > On Friday 10 September 2004 06:56, Christophe Raffalli wrote: >> You could also lie in the tag about the size of array (if the way the >> runtime finds free block of memory does not use it). It will cost an >> increment of integer at each step in the initialisation process which >> should not be much since the beginning of array may stay in the cache >> if >> the initialisation function is simple and this will be neggligeable >> if not. > > What is the trickiest/most-error-prone part of doing this and could > this be > used to implement amortised extensible arrays within the compiler? I > would > like to see such a thing in the compiler (not that I distrust Markus > ;-). The "if" part is false... I suppose it might be possible to change representations (true length in header and initialized part stored off-line / initialized part in header / true length off-line) when the GC switches between sweeping and marking. We would also need several additional primitives to create, update, and destroy the "off-line" part. Frankly, I find it hard to imagine that it would give a noticeable run-time improvement on any program. Most likely, it would make them all a fraction of a percent slower, except for the most synthetic of benchmarks. > On Friday 10 September 2004 14:45, Damien Doligez wrote: >> An intermediate solution would >> be to make a "Array.unsafe_make" primitive, which would use memset >> instead of initialising the array properly. If you enter this as >> a feature wish in the BTS, I'll look into it. > > No, I don't think the performance improvement would justify your time > or the > loss of safety. No loss of safety if we don't export that primitive and use it only in Array.init. After all, it only breaks type safety, and only if you use it wrong. > Could you add a memset to String.create though? :-) String.make > An array_init2 function which specialises the array type but uses an > external > "f" function betters the time taken by Array.init by ~36% (perhaps not > significantly different): That's from getting rid of the float/non-float test. Nothing to do with the cost of initializing twice. -- Damien ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-11 20:53 ` Damien Doligez @ 2004-09-12 15:33 ` Jon Harrop 2004-09-12 16:07 ` Basile Starynkevitch [local] 0 siblings, 1 reply; 31+ messages in thread From: Jon Harrop @ 2004-09-12 15:33 UTC (permalink / raw) To: caml users On Saturday 11 September 2004 21:53, Damien Doligez wrote: > ... Frankly, > I find it hard to imagine that it would give a noticeable run-time > improvement on any program. Most likely, it would make them all a > fraction > of a percent slower, except for the most synthetic of benchmarks. If it let you do extensible arrays safely then it could be used to reduce the asymptotic complexity of array append- or prepend-element from O(n) to amortized O(1) and "i"th insert from O(n) to amortized O(min{i, n-i}). > > No, I don't think the performance improvement would justify your time > > or the > > loss of safety. > > No loss of safety if we don't export that primitive and use it only in > Array.init. After all, it only breaks type safety, and only if you use > it wrong. I was assuming that you would export it. My discrete wavelet transform (DWT) code, for example, is easily proven to fill all array elements but it fills them out of order (I've considered ways to make it fill in-order but they all incur significant performance penalties elsewhere, e.g. lots of swapping). So I was thinking of using unsafe_make for that. Still, I don't think it would be worth your writing it. Having said that, it might be possible to abstract the array lookup and replace it with a clever function to work out where the "i"th element really is. Then you could give Array.init a continuation. Hmm... > > Could you add a memset to String.create though? :-) > > String.make I meant for safety reasons - either don't export String.create or make it safe, e.g. "let create n = make n '\000'". > > An array_init2 function which specialises the array type but uses an > > external > > "f" function betters the time taken by Array.init by ~36% (perhaps not > > significantly different): > > That's from getting rid of the float/non-float test. Nothing to do with > the cost of initializing twice. Yes. Optimising that would be much more productive. Am I right in thinking that ocamlopt doesn't hoist the test out of the loop? Such optimisations would be best done at run-time, e.g. by Basile's JIT. Where is he? ;-) Cheers, Jon. ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-12 15:33 ` Jon Harrop @ 2004-09-12 16:07 ` Basile Starynkevitch [local] 0 siblings, 0 replies; 31+ messages in thread From: Basile Starynkevitch [local] @ 2004-09-12 16:07 UTC (permalink / raw) To: caml-list On Sun, Sep 12, 2004 at 04:33:38PM +0100, Jon Harrop wrote: > > Having said that, it might be possible to abstract the array lookup and > replace it with a clever function to work out where the "i"th element really > is. Then you could give Array.init a continuation. Hmm... > > > > Could you add a memset to String.create though? :-) > > > > String.make > > I meant for safety reasons - either don't export String.create or make it > safe, e.g. "let create n = make n '\000'". > > > > An array_init2 function which specialises the array type but uses an > > > external > > > "f" function betters the time taken by Array.init by ~36% (perhaps not > > > significantly different): > > > > That's from getting rid of the float/non-float test. Nothing to do with > > the cost of initializing twice. > > Yes. Optimising that would be much more productive. Am I right in thinking > that ocamlopt doesn't hoist the test out of the loop? > Such optimisations would be best done at run-time, e.g. by Basile's > JIT. Where is he? ;-) I'm here (not for very long, I'm going back in a few days to CEA working on other things, probably unrelated to Ocaml). However, this optimisation is not for OcamlJIT, which is an unoptimizing JIT. The main requirement of OcamlJIT was full compatibility with Ocamlrun without touching a single line in the ocaml/**/*.ml* sources (the ** is zsh notation) of Ocaml (actually, I had to cheat for the toplevel where I added a single line, now incorporated in 3.08). Of course, one could dream of substantially change the bytecode instruction set (for better JIT translation). But it won't be compatible with Ocaml. If you ask me, the one feature I would have liked in the bytecode instruction set, is to add a LABEL bytecode, followed by a useless word (used by ocamljit to store a pointer to machine code), and to change the bytecode generator so that every closure had its code pointer pointing to this LABEL bytecode. But the rule of the OcamlJIT game was to leave Ocaml as much as possible unchanged. BTW, I would be delighted to have feedback of people using OcamlJit with Ocaml 3.08. Regards -- Basile STARYNKEVITCH -- basile dot starynkevitch at inria dot fr other email : basile at starynkevitch dot net Project cristal.inria.fr - for a few days http://cristal.inria.fr/~starynke --- all opinions are only mine ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
* Re: [Caml-list] Gripes with array 2004-09-09 2:10 [Caml-list] Gripes with array Jon Harrop ` (2 preceding siblings ...) 2004-09-09 9:37 ` Damien Doligez @ 2004-09-10 23:48 ` brogoff 3 siblings, 0 replies; 31+ messages in thread From: brogoff @ 2004-09-10 23:48 UTC (permalink / raw) To: caml-list On Thu, 9 Sep 2004, Jon Harrop wrote: > I'm increasingly finding the outrageously finite size limit of arrays to be a > pain. In particular, I'm peeved that the size limit is itself a function of > the type which, therefore, makes writing polymorphic functions over arrays > nay-on impossible (e.g. to make an array of maximum-sized arrays). I'm surprised that there were no comments on this. Since the distinction for floats is a performance hack, wouldn't it make more sense to just have a separate module for fast float arrays? -- Brian ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners ^ permalink raw reply [flat|nested] 31+ messages in thread
end of thread, other threads:[~2004-09-13 7:02 UTC | newest] Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2004-09-09 2:10 [Caml-list] Gripes with array Jon Harrop 2004-09-09 5:08 ` Ville-Pertti Keinonen 2004-09-09 7:17 ` Jean-Christophe Filliatre 2004-09-09 8:23 ` Richard Jones 2004-09-09 9:08 ` Olivier Andrieu 2004-09-09 12:08 ` Basile Starynkevitch [local] 2004-09-09 12:31 ` Damien Doligez 2004-09-09 10:42 ` Gerd Stolpmann 2004-09-09 9:37 ` Damien Doligez 2004-09-09 10:34 ` Jean-Christophe Filliatre 2004-09-09 12:15 ` Igor Pechtchanski 2004-09-09 13:01 ` Brian Hurt 2004-09-09 20:08 ` [Caml-list] 32-bit is sticking around Brandon J. Van Every 2004-09-09 21:04 ` Jon Harrop 2004-09-11 15:30 ` Lars Nilsson 2004-09-11 16:24 ` [off topic] " David MENTRE 2004-09-11 17:52 ` Lars Nilsson [not found] ` <200409111656.11952.jon@jdh30.plus.com> 2004-09-11 17:47 ` Lars Nilsson 2004-09-09 16:58 ` [Caml-list] Gripes with array Jon Harrop 2004-09-10 5:56 ` Array.init (was [Caml-list] Gripes with array) Christophe Raffalli 2004-09-10 8:53 ` Richard Jones 2004-09-10 14:50 ` Damien Doligez 2004-09-13 7:02 ` Christophe Raffalli 2004-09-10 13:45 ` [Caml-list] Gripes with array Damien Doligez 2004-09-11 1:43 ` skaller 2004-09-11 3:16 ` skaller 2004-09-11 14:36 ` Jon Harrop 2004-09-11 20:53 ` Damien Doligez 2004-09-12 15:33 ` Jon Harrop 2004-09-12 16:07 ` Basile Starynkevitch [local] 2004-09-10 23:48 ` brogoff
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox