From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from weis@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id BAA03414 for caml-redistribution; Sun, 10 Oct 1999 01:54:18 +0200 (MET DST) Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id AAA00279 for ; Sun, 10 Oct 1999 00:17:48 +0200 (MET DST) Received: from miss.wu-wien.ac.at (miss.wu-wien.ac.at [137.208.107.17]) by nez-perce.inria.fr (8.8.7/8.8.7) with ESMTP id AAA07595 for ; Sun, 10 Oct 1999 00:17:48 +0200 (MET DST) Received: (from mottl@localhost) by miss.wu-wien.ac.at (8.9.0/8.9.0) id AAA27378; Sun, 10 Oct 1999 00:17:37 +0200 (MET DST) From: Markus Mottl Message-Id: <199910092217.AAA27378@miss.wu-wien.ac.at> Subject: Re: Data structures in ocaml To: skaller@maxtal.com.au (skaller) Date: Sun, 10 Oct 1999 00:17:37 +0100 (MET DST) Cc: caml-list@inria.fr (OCAML) In-Reply-To: <37FE327A.ED47D428@maxtal.com.au> from "skaller" at Oct 9, 99 04:05:46 am X-Mailer: ELM [version 2.4 PL21] MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit Sender: weis > You see, it is NOT possible to implement the > 'give_room' function. Instead, you must provide special > code in EVERY function of the module to do the > reallocation, and you must put dummy values at the end > of the array, which will not be collected. This code > must be special purpose for each function, to find > a suitable dummy value: if there is no such value, > the array can be set to zero length: it CAN be done, > but it is a mess, it is inefficent, and it can > waste a lot of storage, particularly if, > as in my implementation, I chose to leave 'old' values > lying around in the unused part instead of resetting > them all to a single (shared) dummy value. My solution in the (soon to be released) resizable array module is as follows (I hope I haven't overlooked any problem with this approach): Elements of arrays which are not int- or float-arrays are always boxed. If we do not want to waste space in the "free" part of the array, we will have to initialize it with a special value. We could have the user pass one to us, but he would have to create it and this can cost memory if this value is very space-hungry and it is also not very elegant. Thus, we would like to have a default value for all kinds of types which does not need much space and needs not be known to the user. If I have understood the garbage collector correctly, it does not know anything about things going on in the type system, but it knows how to recognize the basic types (atoms). If we take a very "small" atom (e.g. an integer) and "cast" it to the required type (using Obj.magic), we can cheat and make the compiler believe that we have a correct value of the type of the array elements. The garbage collector on the other hand does not care about the "true" type of the array elements. Because they are boxed (otherwise the GC wouldn't scan through the array), it will believe that there are boxed integers in the "free" slots (more precisely, it's always the same integer which is "contained" in the box). With this trick we can initialize the array and (very important) have removed elements be reclaimed when we overwrite their "box" so that it points to the integer. > Yes, and this requires getting a dummy value to initialise > the whole array with, followed by immediately overwriting > that dummy value with the actual data, and thus is twice > as slow as C. In the case of unboxed arrays (the same can be said about strings), we need not do this "overwriting", because the values will not be reclaimed, anyway (wouldn't make much sense). Thus, for these kinds of "contiguous memory" it is ok to simply adjust the index to the last element. For strings, there is a create-function that doesn't initialize the memory. If there were a similar function for int- and float-arrays, I could greatly improve the speed of array creation in my module and (very important) for reallocation if the array is being resized. Maybe I will supply a C-function that allocates such arrays without initialization. But I will rather focus on functionality first... Regards, Markus Mottl -- Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl