Objective Caml 2.02

Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed

* Objective Caml 2.02
@ 1999-03-05 10:41 ` Xavier Leroy
  1999-03-05 13:34   ` Camlp4 2.02 Daniel de Rauglaudre
                     ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Xavier Leroy @ 1999-03-05 10:41 UTC (permalink / raw)
  To: caml-list

Version 2.02 of Objective Caml is now available.  In addition to minor
bug fixes, this release adds some new library functions and a number
of simple optimizations in the native-code compiler that reduce both
execution time and code size.

Sources, documentation and binaries for Linux/RedHat and Windows are
available from:

        http://caml.inria.fr/ocaml/distrib.html
        ftp://ftp.inria.fr/lang/caml-light/

For general info on Objective Caml, see http://caml.inria.fr/ocaml/

Please direct queries and bug reports to caml-light@inria.fr
and general discussions to the mailing-list caml-list@inria.fr or the
comp.lang.ml newsgroup.

- Xavier Leroy, for the Objective Caml team.

Objective Caml 2.02:
--------------------

* Type system:
  - Check that all components of a signature have unique names.
  - Fixed bug in signature matching involving a type component and
    a module component, both sharing an abstract type.
  - Bug involving recursive classes constrained by a class type fixed.
  - Fixed bugs in printing class types and in printing unification errors.

* Compilation:
  - Changed compilation scheme for "{r with lbl = e}" when r has many fields
    so as to avoid code size explosion.

* Native-code compiler:
  - Better constant propagation in boolean expressions and in conditionals.
  - Removal of unused arguments during function inlining.
  - Eliminated redundant tagging/untagging in bit shifts.
  - Static allocation of closures for functions without free variables,
    reduces the size of initialization code.
  - Revised compilation scheme for definitions at top level of compilation
    units, so that top level functions have no free variables.
  - Coalesced multiple allocations of heap blocks inside one expression
    (e.g. x :: y :: z allocates the two conses in one step).
  - Ix86: better handling of large integer constants in instruction selection.
  - MIPS: fixed wrong asm generated for String.length "literal".

* Standard library:
  - Added the "ignore" primitive function, which just throws away its
    argument and returns "()".  It allows to write
    "ignore(f x); y" if "f x" doesn't have type unit and you don't
    want the warning caused by "f x; y".
  - Added the "Buffer" module (extensible string buffers).
  - Module Format: added formatting to buffers and to strings.
  - Added "mem" functions (membership test) to Hashtbl and Map.
  - Module List: added find, filter, partition.
    Renamed remove and removeq to remove_assoc and remove_assq.
  - Module Marshal: fixed bug in marshaling functions when passed functional
    values defined by mutual recursion with other functions.
  - Module Printf: added Printf.bprintf (print to extensible buffer);
    added %i format as synonymous for %d (as per the docs).
  - Module Sort: added Sort.array (Quicksort).

* Runtime system:
  - New callback functions for callbacks with arbitrary many arguments
    and for catching Caml exceptions escaping from a callback.

* The ocamldep dependency generator: now performs full parsing of the
    sources, taking into account the scope of module bindings.

* The ocamlyacc parser generator: fixed sentinel error causing wrong
    tables to be generated in some cases.

* The str library:
  - Added split_delim, full_split as variants of split that control
    more precisely what happens to delimiters.
  - Added replace_matched for separate matching and replacement operations.

* The graphics library:
  - Bypass color lookup for 16 bpp and 32 bpp direct-color displays.
  - Larger color cache.

* The thread library:
  - Bytecode threads: more clever use of non-blocking I/O, makes I/O
    operations faster.
  - POSIX threads: gcc-ism removed, should now compile on any ANSI C compiler.
  - Both: avoid memory leak in the Event module when a communication
    offer is never selected.

* The Unix library:
  - Fixed inversion of ctime and mtime in Unix.stat, Unix.fstat, Unix.lstat.
  - Unix.establish_connection: properly reclaim socket if connect fails.

* The DBM library: no longer crashes when calling Dbm.close twice.

* Emacs mode:
  - Updated with Garrigue and Zimmerman's latest version.
  - Now include an "ocamltags" script for using etags on OCaml sources.

* Win32 port:
  - Fixed end-of-line bug in ocamlcp causing problems with generated sources.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Camlp4 2.02
  1999-03-05 10:41 ` Objective Caml 2.02 Xavier Leroy
@ 1999-03-05 13:34   ` Daniel de Rauglaudre
  1999-03-05 15:11   ` Objective Caml 2.02 Pierpaolo Bernardi
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: Daniel de Rauglaudre @ 1999-03-05 13:34 UTC (permalink / raw)
  To: caml-list

Version 2.02 of Camlp4 available. See info at:
     http://caml.inria.fr/camlp4/

-- 
Daniel de RAUGLAUDRE
daniel.de_rauglaudre@inria.fr
http://cristal.inria.fr/~ddr/

Camlp4 Version 2.02:
--------------------

* Parsing:
  - [Feb 27, 99] Fixed 2 bugs, resulting of incorrect Ocaml parsing of the
    program example: "type t = F(B).t"
  - [Jan 30, 99] Fixed bug "pa_op.ml", could not parse "parser | [<>] -> ()".
  - [Jan 16, 99] Added "define" and "undef" in "pa_ifdef.cmo".
  - [Dec 22, 98] Fixed precedence of "!=" in Ocaml syntax

* Printing:
  - [Mar 4, 99] Added pr_depend.cmo for printing file dependencies.
  - [Dec 28, 98] Fixed pretty printing of long strings starting with spaces;
    used to display "\\n<spaces>..." instead of "<spaces>\\n...".

* Camlp4:
  - [Feb 19, 99] Sort command line argument list in reverse order to
    avoid argument names conflicts when adding arguments.

* Olabl:
  - [Feb 26, 99] Started extensions for Olabl: directory "lablp4" and some
    changes in MLast. Olabl programs can be preprocessed by:
        camlp4 pa_labl.cma pr_ldump.cmo

* Internal:
  - Use of pr_depend.cmo instead of ocamldep for dependencies.




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Objective Caml 2.02
  1999-03-05 10:41 ` Objective Caml 2.02 Xavier Leroy
  1999-03-05 13:34   ` Camlp4 2.02 Daniel de Rauglaudre
@ 1999-03-05 15:11   ` Pierpaolo Bernardi
  1999-03-05 19:59   ` doligez
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: Pierpaolo Bernardi @ 1999-03-05 15:11 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list

On Fri, 5 Mar 1999, Xavier Leroy wrote:

> Version 2.02 of Objective Caml is now available.  In addition to minor
> bug fixes, this release adds some new library functions and a number
> of simple optimizations in the native-code compiler that reduce both
> execution time and code size.

The mind boggles!  8-)

Just a comment.  Neither on the WWW nor on the FTP site, there's the
.txt and .prn version of the documentation.

Please, oh please, don't stop providing the docs in text format!

I normally use the .prn version as an inline reference. 
It is way faster to have the .prn file in an emacs
buffer than navigate the html.

On monday I'll post the emacs functions that I use to lock at the
documentation.

Regards,
Pierpaolo Bernardi

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Objective Caml 2.02
  1999-03-05 10:41 ` Objective Caml 2.02 Xavier Leroy
  1999-03-05 13:34   ` Camlp4 2.02 Daniel de Rauglaudre
  1999-03-05 15:11   ` Objective Caml 2.02 Pierpaolo Bernardi
@ 1999-03-05 19:59   ` doligez
  1999-03-11  3:06   ` Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_! Alexey Nogin
  1999-03-11 23:42   ` List.filter in Ocaml 2.02 Alexey Nogin
  4 siblings, 0 replies; 25+ messages in thread
From: doligez @ 1999-03-05 19:59 UTC (permalink / raw)
  To: caml-list


The MacOS version of Objective Caml 2.02 is now available at:

<ftp://ftp.inria.fr/lang/caml-light/ocaml-2.02-mac-1.0a4.sea.bin>


And the sources, packaged in a Mac archive:

<ftp://ftp.inria.fr/lang/caml-light/ocaml-2.02-source.sea.bin>


-- Damien




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Sort.array easily degenerates
@ 1999-03-06  0:27 Markus Mottl
  1999-03-09 10:44 ` Xavier Leroy
  0 siblings, 1 reply; 25+ messages in thread
From: Markus Mottl @ 1999-03-06  0:27 UTC (permalink / raw)
  To: OCAML

Hello,

I have played around with the new functions and modules in the new
OCAML-release. Besides other things I have tested the new function for
sorting arrays (Sort.array).

I am not sure where the problem in the implementation is, but the
"qsort"-function, which is applied in "Sort.array" degenerates easily
on pre-sorted and/or non-unique data. E.g.:

-----  SNIP -----
let _ =
  let size = 5000 in
  let ar = Array.create size 0 in
  for i = 0 to (size-1) do ar.(i) <- i done;
  Sort.array (>=) ar
-----  SNIP -----

The array to be sorted is initialized with its index. Then it is sorted
with descending order.

Running the same test with larger arrays clearly shows that we encounter
the worst-case behaviour (n^2) of quicksort.

Even worse:

If we initialize the array with the same number, the time complexity
stays at its worst-case but with an even higher constant factor.

Initializing the array with random integers (of a large range) shows
that in such cases "Sort.array" does not perform this badly (actually
quite well).

I have compared this to "qsort" in "stdlib.h", the standard library of
C. It is faster than the OCAML-version, as was to be expected. But in
contrast to OCAML it behaves very nicely on low-entropy data.

E.g.: (C-Code):

-----  SNIP -----
#include <stdlib.h>

int int_comp (const void * a, const void * b) {
  return *(int *) a - *(int *) b;
}

const int size = 5000;

int main () {
  int ar[size];
  for (int i = 0; i < size; ++i) ar[i] = i;
  qsort (ar, size, sizeof(int), int_comp);
}
-----  SNIP -----

It would probably be a good idea to change the implementation of
"Sort.array" so as to make it more unlikely to encounter such worst-case
behaviour. Especially with data, where elements may occur more than once,
the current implementation performs really badly.

So here a question: has someone already written a quicksort-function
with in-place modification for arrays which demonstrates nicer behaviour?

Best regards,
Markus

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Sort.array easily degenerates
  1999-03-06  0:27 Sort.array easily degenerates Markus Mottl
@ 1999-03-09 10:44 ` Xavier Leroy
  1999-03-09 23:03   ` doligez
  1999-03-10  0:28   ` Markus Mottl
  0 siblings, 2 replies; 25+ messages in thread
From: Xavier Leroy @ 1999-03-09 10:44 UTC (permalink / raw)
  To: Markus Mottl, OCAML

> I have played around with the new functions and modules in the new
> OCAML-release. Besides other things I have tested the new function for
> sorting arrays (Sort.array).
> I am not sure where the problem in the implementation is, but the
> "qsort"-function, which is applied in "Sort.array" degenerates easily
> on pre-sorted and/or non-unique data.

The Sort.array implementation is Quicksort with insertion sort for
small partitions, as suggested in Sedgewick.  I should know better
than take some code out of an algorithms textbook and expect that it
will work well... 

At any rate, any one is welcome to send me a better implementation.

- Xavier Leroy




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Sort.array easily degenerates
  1999-03-09 10:44 ` Xavier Leroy
@ 1999-03-09 23:03   ` doligez
  1999-03-10 13:58     ` Xavier Leroy
  1999-03-10  0:28   ` Markus Mottl
  1 sibling, 1 reply; 25+ messages in thread
From: doligez @ 1999-03-09 23:03 UTC (permalink / raw)
  To: OCAML

>From: Xavier Leroy <Xavier.Leroy@inria.fr>

>The Sort.array implementation is Quicksort with insertion sort for
>small partitions, as suggested in Sedgewick.  I should know better
>than take some code out of an algorithms textbook and expect that it
>will work well... 

There's no way to implement Sedgewick's quicksort with the interface
given in sort.mli.  You'd need two comparison functions, one for ">="
and one for "<=".  That explains the degenerate case when all the
elements are equal.

And it degenerates on already-sorted data because you swap the pivot
with the right-most element.  As a consequence, one of the subarrays
has its two highest elements in first and last positions, which makes
the median-of-three degenerate.  I think this bug is also in
Sedgewick's pseudo-code.

Also, you should recurse on the smallest subarray first, not the
largest.

I vote for Shellsort.

-- Damien

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Sort.array easily degenerates
  1999-03-09 10:44 ` Xavier Leroy
  1999-03-09 23:03   ` doligez
@ 1999-03-10  0:28   ` Markus Mottl
  1 sibling, 0 replies; 25+ messages in thread
From: Markus Mottl @ 1999-03-10  0:28 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: OCAML

> The Sort.array implementation is Quicksort with insertion sort for
> small partitions, as suggested in Sedgewick.  I should know better
> than take some code out of an algorithms textbook and expect that it
> will work well... 
> 
> At any rate, any one is welcome to send me a better implementation.

I have also compared it to the Sedgewick-version and wondered, what was
wrong with the implementation - it seems that the version in the book
doesn't hold what it promises...

Someone suggested via mail to me that "sort" as can be found in the STL
is very efficient. I took a look at it and it makes indeed a very good
impression. There is an excellent paper about it on the following page:

  http://www.cs.rpi.edu/~musser/gp/timing.html
  Name of paper: Introspective Sorting and Searching Algorithms

  download paper from:
  http://www.cs.rpi.edu/~musser/gp/introsort.ps

It's a kind of hybrid version of various sorting algorithms. It does not
only guarantee a worst-case bound of N*log(N), but it is also as fast as
quicksort in the average case. The constant factor compared to quicksort
is just a little bit larger so it seems to be a true alternative.

The implementation requires heap-algorithms. If someone has time, he could
try to implement the sort algorithm with a suitable heap-implementation
from Okasaki's purely functional data structures - some of them are very
efficient. Take a look at the paper and on the page

  http://miss.wu-wien.ac.at/~mottl/ocaml_sources/intro.html

and download "pure_fun.tar.gz". In chapter 3 you will find "LeftistHeap"
and in chapter 5 "SplayHeap". Both are quite efficient (SplayHeap
seems to be faster (garbage collection parameters can change the
behaviour significantly), but is a bit more complicated). With some minor
changes/additions it should be possible to use them for heap-sorting.

As it seems, a collection of such algorithms and data structures would
really come handy in the OCAML-standard-library...

Another question is, whether to also support "stable_sort" as in the
STL. It guarantees that elements which are already sorted will stay in
the same order. This is important with "order"-functions that consider
only a part of the data representation to be sorted.

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Sort.array easily degenerates
  1999-03-09 23:03   ` doligez
@ 1999-03-10 13:58     ` Xavier Leroy
  0 siblings, 0 replies; 25+ messages in thread
From: Xavier Leroy @ 1999-03-10 13:58 UTC (permalink / raw)
  To: doligez, OCAML

I have revised my Quicksort implementation based on that found in
glibc 2, which contains some clever optimizations.  Interested parties
can see the code at

  http://camlcvs.inria.fr/cgi-bin/cvsweb.out/ocaml/stdlib/sort.ml

The behavior on extreme situations (input already sorted) is now much
better.

> There's no way to implement Sedgewick's quicksort with the interface
> given in sort.mli.  You'd need two comparison functions, one for ">="
> and one for "<=".  That explains the degenerate case when all the
> elements are equal.

It's true that ">=" vs. ">" can make a big difference, but >= is
easily definable from <= : a >= b is b <= a, a > b is not(a <= b),
a < b is not(b <= a).

> And it degenerates on already-sorted data because you swap the pivot
> with the right-most element.  As a consequence, one of the subarrays
> has its two highest elements in first and last positions, which makes
> the median-of-three degenerate.  I think this bug is also in
> Sedgewick's pseudo-code.

Sedgewick doesn't even give pseudo-code for the "median of three"
heuristic, but the glibc implementation avoids this swap altogether.

> Also, you should recurse on the smallest subarray first, not the
> largest.

Good point.

> I vote for Shellsort.

Well, I tried it, and it's noticeably slower than quicksort by almost
a factor of two on random input.  Perhaps because polymorphic array
access is quite expensive in OCaml.

- Xavier Leroy

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_!
  1999-03-05 10:41 ` Objective Caml 2.02 Xavier Leroy
                     ` (2 preceding siblings ...)
  1999-03-05 19:59   ` doligez
@ 1999-03-11  3:06   ` Alexey Nogin
  1999-03-11  9:44     ` Xavier Leroy
  1999-03-11 23:42   ` List.filter in Ocaml 2.02 Alexey Nogin
  4 siblings, 1 reply; 25+ messages in thread
From: Alexey Nogin @ 1999-03-11  3:06 UTC (permalink / raw)
  To: Xavier Leroy, caml-list; +Cc: Jason Hickey, Alexei Kopylov

Hi,

After I upgraded OCaml & Camlp4 from 2.01 to 2.02, our the native code
of our program became much smaller (5124462 instead of 7219378), but it
also became a little slower (1.5% - 3.5% for various inputs). Do you
have an idea what could have caused it?

Also, I was doing some performance mesurements (using P6 performance
counter support patches for Linux by Erik Hendriks -
http://beowulf.gsfc.nasa.gov/software/ ) when I upgraded, so I have some
information (and can get more of it) on the performance counters for my
program under both 2.01 and 2.02. In particular, the number of requests
from the processor to the L1 data cache became 2%-3% bigger.

I am using RedHat Linux 5.2 with kernel 2.2.2 on a dual-PII/400(Xeon).

Alexey
--------------------------------------------------------------
Home Page: http://www.cs.cornell.edu/nogin/
E-Mail: nogin@cs.cornell.edu (office), ayn2@cornell.edu (home)
Office: Upson 4139, tel: 1-607-255-4934
ICQ #: 24708107 (office), 24678341 (home)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_!
  1999-03-11  3:06   ` Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_! Alexey Nogin
@ 1999-03-11  9:44     ` Xavier Leroy
  1999-03-11 23:59       ` Alexey Nogin
  1999-04-06  2:06       ` Alexey Nogin
  0 siblings, 2 replies; 25+ messages in thread
From: Xavier Leroy @ 1999-03-11  9:44 UTC (permalink / raw)
  To: Alexey Nogin, caml-list; +Cc: Jason Hickey, Alexei Kopylov

> After I upgraded OCaml & Camlp4 from 2.01 to 2.02, our the native code
> of our program became much smaller (5124462 instead of 7219378), but it
> also became a little slower (1.5% - 3.5% for various inputs). Do you
> have an idea what could have caused it?

In the past, I've observed speed variations by at least +/- 5% caused
exclusively by minor variations in code placement (such as adding or
deleting instructions that are never executed).  Almost any
modification in the code generator affects code placement.  If only
for this reason, speed variations of less than 5% are essentially
meaningless: there's no way to attribute them to a particular
otpimization or to good/bad luck in code placement.  (Makes you very
suspicious of those PLDI papers where they report 1% speedups...)

> Also, I was doing some performance mesurements (using P6 performance
> counter support patches for Linux by Erik Hendriks -
> http://beowulf.gsfc.nasa.gov/software/ ) when I upgraded, so I have some
> information (and can get more of it) on the performance counters for my
> program under both 2.01 and 2.02. In particular, the number of requests
> from the processor to the L1 data cache became 2%-3% bigger.

That's more meaningful.  The two new optimizations in 2.02 (closed
toplevel functions and allocation coalescing) should reduce the number
of memory accesses.  Allocation coalescing might increase register
pressure locally, causing other stuff to be spilled on the stack,
though.  Is there any way you could get a per-function profile of
memory requests? (like on the Alpha with the Digital Unix tools).

- Xavier Leroy

^ permalink raw reply	[flat|nested] 25+ messages in thread

* List.filter in Ocaml 2.02
  1999-03-05 10:41 ` Objective Caml 2.02 Xavier Leroy
                     ` (3 preceding siblings ...)
  1999-03-11  3:06   ` Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_! Alexey Nogin
@ 1999-03-11 23:42   ` Alexey Nogin
  1999-03-12 10:10     ` Wolfram Kahl
                       ` (2 more replies)
  4 siblings, 3 replies; 25+ messages in thread
From: Alexey Nogin @ 1999-03-11 23:42 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list

Hi,

The filter function implementation does not seem to be too efficient. I did some
testing once and it turned out that the most efficient (for my applications) way
to write the filter function was:

let rec filter f = function
   [] -> []
 | (h::t) as l ->
      if f h then
         let rem = filter f t in
         if rem == t then l else h::rem
      else
         filter f t

The main gain here is that we do not allocate any new memory for sublist (or the
whole list) that does not change as a result of the filtering.

Alexey
--------------------------------------------------------------
Home Page: http://www.cs.cornell.edu/nogin/
E-Mail: nogin@cs.cornell.edu (office), ayn2@cornell.edu (home)
Office: Upson 4139, tel: 1-607-255-4934
ICQ #: 24708107 (office), 24678341 (home)





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_!
  1999-03-11  9:44     ` Xavier Leroy
@ 1999-03-11 23:59       ` Alexey Nogin
  1999-03-13 13:40         ` Anton Moscal
  1999-04-06  2:06       ` Alexey Nogin
  1 sibling, 1 reply; 25+ messages in thread
From: Alexey Nogin @ 1999-03-11 23:59 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list, Jason Hickey, Alexei Kopylov, Paul Stodghill

Xavier Leroy wrote:

> In the past, I've observed speed variations by at least +/- 5% caused
> exclusively by minor variations in code placement (such as adding or
> deleting instructions that are never executed).  Almost any
> modification in the code generator affects code placement.  If only
> for this reason, speed variations of less than 5% are essentially
> meaningless: there's no way to attribute them to a particular
> otpimization or to good/bad luck in code placement.  (Makes you very
> suspicious of those PLDI papers where they report 1% speedups...)
>
> > Also, I was doing some performance mesurements (using P6 performance
> > counter support patches for Linux by Erik Hendriks -
> > http://beowulf.gsfc.nasa.gov/software/ ) when I upgraded, so I have some
> > information (and can get more of it) on the performance counters for my
> > program under both 2.01 and 2.02. In particular, the number of requests
> > from the processor to the L1 data cache became 2%-3% bigger.
>
> That's more meaningful.  The two new optimizations in 2.02 (closed
> toplevel functions and allocation coalescing) should reduce the number
> of memory accesses.  Allocation coalescing might increase register
> pressure locally, causing other stuff to be spilled on the stack,
> though.

Well, in this case I should probably try to remove the allocation coalescing
and see what happens. Am I right assuming that in order to do that I have to
revert changes for versions 1.8 -> 1.9 and 1.10 -> 1.11 of the
asmcomp/selectgen.ml?


> Is there any way you could get a per-function profile of
> memory requests? (like on the Alpha with the Digital Unix tools).

I am not sure. I could probably write something gprof-like that would record
the values of the performance counters at each function call, but I am afraid
that's a lot of work. And I could probably get access to Alpha, but I do not
think I will see the same slowdown effect on Alpha as I see on x86, so the
Alpha memory access numbers would not help much.

Alexey
--------------------------------------------------------------
Home Page: http://www.cs.cornell.edu/nogin/
E-Mail: nogin@cs.cornell.edu (office), ayn2@cornell.edu (home)
Office: Upson 4139, tel: 1-607-255-4934
ICQ #: 24708107 (office), 24678341 (home)





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: List.filter in Ocaml 2.02
  1999-03-11 23:42   ` List.filter in Ocaml 2.02 Alexey Nogin
@ 1999-03-12 10:10     ` Wolfram Kahl
  1999-03-12 18:18       ` Alexey Nogin
  1999-03-13  2:43       ` David Monniaux
  1999-03-12 17:01     ` Jean-Francois Monin
       [not found]     ` <199903121011.LAA27611@lsun565.lannion.cnet.fr>
  2 siblings, 2 replies; 25+ messages in thread
From: Wolfram Kahl @ 1999-03-12 10:10 UTC (permalink / raw)
  To: nogin; +Cc: Xavier.Leroy, caml-list

Alexey Nogin <nogin@cs.cornell.edu> writes:

 > The filter function implementation does not seem to be too efficient.
 > I did some testing once and it turned out that the most efficient
 > (for my applications) way to write the filter function was:
 >
 > let rec filter f = function
 >    [] -> []
 >  | (h::t) as l ->
 >	 if f h then
 >	    let rem = filter f t in
 >	    if rem == t then l else h::rem
 >	 else
 >	    filter f t
 >
 > The main gain here is that we do not allocate any new memory for sublist
 > (or the whole list) that does not change as a result of the filtering.

The intended sharing here is not fully explicit, but partially implicit.
If this works as described, then it should not make a difference from:

let rec filter f = function
   [] as l -> l
   | ...

, where the sharing is now fully explicit.
The fact that this is reported to work anyway, implies
that the compiler shares these common subexpressions ``[]'',
and this gets me asking:

How far does this kind of common subexpression sharing extend?
Does it work for user-defined datatypes, too?
Does it work only for zero-ary constructors, or are some
more complicated constructions recognised, too?

Being curious...

Wolfram

P.S.: Does it work for ``filter f'', or is it useful to write
      (as I often do):

 > let filter f = 
 >  let rec f1 = function
 >     [] -> []
 >   | (h::t) as l ->
 >	 if f h then
 >	    let rem = f1 t in
 >	    if rem == t then l else h::rem
 >	 else
 >	    f1 t
 >  in f1

Will filter be expanded for short constant lists at compile time in
any way?
Or will e.g. List.fold_right or List.fold_left
(known to be primitively recursive at compile-time of user modules :-)
be expanded for short constant lists at compile time
by the inlining mechanism?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: List.filter in Ocaml 2.02
  1999-03-11 23:42   ` List.filter in Ocaml 2.02 Alexey Nogin
  1999-03-12 10:10     ` Wolfram Kahl
@ 1999-03-12 17:01     ` Jean-Francois Monin
  1999-03-12 18:41       ` Alexey Nogin
       [not found]     ` <199903121011.LAA27611@lsun565.lannion.cnet.fr>
  2 siblings, 1 reply; 25+ messages in thread
From: Jean-Francois Monin @ 1999-03-12 17:01 UTC (permalink / raw)
  To: Alexey Nogin; +Cc: Xavier Leroy, caml-list

> let rec filter f = function
>    [] -> []
>  | (h::t) as l ->
>       if f h then
>          let rem = filter f t in
>          if rem == t then l else h::rem
>       else
>          filter f t

Here is a version without "==" and "as", with the same property on
memory allocation. Pierre Weis told me that the trick (called
"sharing transducers") is due to G. Huet and was already present in
Caml V3.1 pervasives more than ten years ago.

exception Identity

let share f x = try f x with Identity -> x

let filter f l =
 let rec fil = function
   | [] -> raise Identity
   | h :: t ->
      if f h then h :: fil t else share fil t in
 share fil l

-- 
Jean-Francois Monin, CNET DTL/MSV,          Tel +33 2 96 05 26 79
2 av. Pierre Marzin, 22307 Lannion, France  Fax +33 2 96 05 39 45
SANS TRAIT D'UNION :     JeanFrancois.Monin@cnet.francetelecom.fr




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: List.filter in Ocaml 2.02
  1999-03-12 10:10     ` Wolfram Kahl
@ 1999-03-12 18:18       ` Alexey Nogin
  1999-03-13  2:43       ` David Monniaux
  1 sibling, 0 replies; 25+ messages in thread
From: Alexey Nogin @ 1999-03-12 18:18 UTC (permalink / raw)
  To: Wolfram Kahl; +Cc: caml-list

Wolfram Kahl wrote:

> Alexey Nogin <nogin@cs.cornell.edu> writes:
>
>  > The filter function implementation does not seem to be too efficient.
>  > I did some testing once and it turned out that the most efficient
>  > (for my applications) way to write the filter function was:
>  >
>  > let rec filter f = function
>  >    [] -> []
>  >  | (h::t) as l ->
>  >       if f h then
>  >          let rem = filter f t in
>  >          if rem == t then l else h::rem
>  >       else
>  >          filter f t
>  >
>  > The main gain here is that we do not allocate any new memory for sublist
>  > (or the whole list) that does not change as a result of the filtering.
>
> The intended sharing here is not fully explicit, but partially implicit.
> If this works as described, then it should not make a difference from:
>
> let rec filter f = function
>    [] as l -> l
>    | ...
>
> , where the sharing is now fully explicit.
> The fact that this is reported to work anyway, implies
> that the compiler shares these common subexpressions ``[]'',
> and this gets me asking:
>
> How far does this kind of common subexpression sharing extend?
> Does it work for user-defined datatypes, too?
> Does it work only for zero-ary constructors, or are some
> more complicated constructions recognised, too?

As far as I understand it, for unboxed values such as integers and zero-ary
constants (such as []) in user-defined datatypes == and = are equivalent. It
has nothing to do with the fact that they are common subexpressions - if you
write let x = [] in some module and let y = [] in another, it will still be
the case that x == y.

> P.S.: Does it work for ``filter f'', or is it useful to write
>       (as I often do):
>
>  > let filter f =
>  >  let rec f1 = function
>  >     [] -> []
>  >   | (h::t) as l ->
>  >       if f h then
>  >          let rem = f1 t in
>  >          if rem == t then l else h::rem
>  >       else
>  >          f1 t
>  >  in f1

This will allocate memory for the closure which is contrary to the main goal -
not allocating anything unless really necessary and not allocate anything at
all when list does not change.

> Will filter be expanded for short constant lists at compile time in
> any way?

I do not think so.

> Or will e.g. List.fold_right or List.fold_left
> (known to be primitively recursive at compile-time of user modules :-)
> be expanded for short constant lists at compile time
> by the inlining mechanism?

As far as I know, recursive functions are never inlined.

Alexey
--------------------------------------------------------------
Home Page: http://www.cs.cornell.edu/nogin/
E-Mail: nogin@cs.cornell.edu (office), ayn2@cornell.edu (home)
Office: Upson 4139, tel: 1-607-255-4934
ICQ #: 24708107 (office), 24678341 (home)





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: List.filter in Ocaml 2.02
       [not found]     ` <199903121011.LAA27611@lsun565.lannion.cnet.fr>
@ 1999-03-12 18:37       ` Alexey Nogin
  1999-03-15  9:06         ` Jean-Francois Monin
  0 siblings, 1 reply; 25+ messages in thread
From: Alexey Nogin @ 1999-03-12 18:37 UTC (permalink / raw)
  To: Jean-Francois Monin; +Cc: caml-list

Jean-Francois Monin wrote:

> > let rec filter f = function
> >    [] -> []
> >  | (h::t) as l ->
> >       if f h then
> >          let rem = filter f t in
> >          if rem == t then l else h::rem
> >       else
> >          filter f t
>
> Here is a version without "==" and "as", with the same property on
> memory allocation.
>
> exception No_change
> let rec filter f l =
>   try fil f l with No_change -> l
> and fil f = function
>      [] -> raise No_change
>    | h::t -> if f h then h :: fil f t else filter f t

If would be much slower. Here are the reasons:
1) Even when exception is not raised, the try ... with ... construction
executes lots of extra instructions and is quite expensive.
2) Even if the list is not changed, you first allocate the space and put
its copy there, then raise an exception and return the old value. That
means that you first waste time in memory allocation and then you waste
it in garbage collection.
3) "raise No_change" allocates 8 bytes (on x86), so you'll waste some
time in garbage collection.

Alexey
--------------------------------------------------------------
Home Page: http://www.cs.cornell.edu/nogin/
E-Mail: nogin@cs.cornell.edu (office), ayn2@cornell.edu (home)
Office: Upson 4139, tel: 1-607-255-4934
ICQ #: 24708107 (office), 24678341 (home)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: List.filter in Ocaml 2.02
  1999-03-12 17:01     ` Jean-Francois Monin
@ 1999-03-12 18:41       ` Alexey Nogin
  0 siblings, 0 replies; 25+ messages in thread
From: Alexey Nogin @ 1999-03-12 18:41 UTC (permalink / raw)
  To: Jean-Francois Monin; +Cc: caml-list

Jean-Francois Monin wrote:

> > let rec filter f = function
> >    [] -> []
> >  | (h::t) as l ->
> >       if f h then
> >          let rem = filter f t in
> >          if rem == t then l else h::rem
> >       else
> >          filter f t
>
> Here is a version without "==" and "as", with the same property on
> memory allocation. Pierre Weis told me that the trick (called
> "sharing transducers") is due to G. Huet and was already present in
> Caml V3.1 pervasives more than ten years ago.
>
> exception Identity
>
> let share f x = try f x with Identity -> x
>
> let filter f l =
>  let rec fil = function
>    | [] -> raise Identity
>    | h :: t ->
>       if f h then h :: fil t else share fil t in
>  share fil l

This version has all the problems that the previous version had plus it
also allocates some memory for closure which makes things much slower
when lists are short.

Alexey
--------------------------------------------------------------
Home Page: http://www.cs.cornell.edu/nogin/
E-Mail: nogin@cs.cornell.edu (office), ayn2@cornell.edu (home)
Office: Upson 4139, tel: 1-607-255-4934
ICQ #: 24708107 (office), 24678341 (home)





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: List.filter in Ocaml 2.02
  1999-03-12 10:10     ` Wolfram Kahl
  1999-03-12 18:18       ` Alexey Nogin
@ 1999-03-13  2:43       ` David Monniaux
  1 sibling, 0 replies; 25+ messages in thread
From: David Monniaux @ 1999-03-13  2:43 UTC (permalink / raw)
  To: Wolfram Kahl; +Cc: Liste CAML

On 12 Mar 1999, Wolfram Kahl wrote:

> The fact that this is reported to work anyway, implies
> that the compiler shares these common subexpressions ``[]'',
> and this gets me asking:

If I'm not mistaken, the reason is much more trivial. [] is a 0-ary
constructor and is as such stored as an unboxed integer. Two unboxed
integers of the same value are "physically equal". :-)

Regards,
D. Monniaux




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_!
  1999-03-11 23:59       ` Alexey Nogin
@ 1999-03-13 13:40         ` Anton Moscal
  1999-03-24  4:20           ` Alexey Nogin
  0 siblings, 1 reply; 25+ messages in thread
From: Anton Moscal @ 1999-03-13 13:40 UTC (permalink / raw)
  To: Alexey Nogin
  Cc: Xavier Leroy, caml-list, Jason Hickey, Alexei Kopylov, Paul Stodghill

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: multipart/mixed; boundary="On Thu, 11 Mar 1999, Alexey Nogin wrote:", Size: 3139 bytes --]

On Thu, 11 Mar 1999, Alexey Nogin wrote:

Sorry, I forgot attach patch in my first message.

> > pressure locally, causing other stuff to be spilled on the stack,
> > though.
> 
> Well, in this case I should probably try to remove the allocation coalescing
> and see what happens. Am I right assuming that in order to do that I have to
> revert changes for versions 1.8 -> 1.9 and 1.10 -> 1.11 of the
> asmcomp/selectgen.ml?
> 
Also you can try to apply this patch (attached to this message) to
asmcomp/selectgen.ml: this is another variant of allocation combiner.
Code, generated by this patch, is more close to code generated by the
ocaml-2.01 than the code from official ocaml-2.02 (Warning: I tested it
only on my own programs and on the Ocaml bootstrapping).

Regards,
Anton E.Moscal

Content-Type: APPLICATION/x-gunzip; name="selectgen.patch.gz"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.LNX.4.03.9903131640260.7348@post.tepkom.ru>
Content-Description: 
Content-Disposition: attachment; filename="selectgen.patch.gz"

H4sICBSA6TYAA3NlbGVjdGdlbi5wYXRjaACtV19v2zYQf44/xaF7mFTZXpw4
c+PUXYo+GUOLYdlbUQi0RMdEJcoj6bRuku++uyNlS7aSDMMMJJLIu9/95d1x
MBhAlYmyGJwNT89+EbbMqnL9i5WFzNyt1MOyOPlrtYGPwgCMYTSenl5Oxxcw
ury87CVJ0mAeaPmtG+Bmoz3AhADGF9PTcw9wfQ2D84tJfzSGhJ6/wvV1D+gn
CytBfndGZA4ipa0zaVZpC2qYS5vhQ5hb/G+kBfyL8VUjOShN/DtGqfOUmaMY
SRjFyr970BtEr+G9kfBN0rLKJQhAxRdKyxxEUaBZTlX6N3gd9wa9wZ0ooNw4
sSik302tE07CDKJPlZYwhehPeTt0IIwRW3iNoC6Gak0YDIDiiHAKuuqS0xug
1jdVKSPTBx1PQQpTKGkaJFDIpSNbNwWZiW+3yjok+Wy+9JkfINsYI7WDarm0
kqkCPwpb4MtXUBY+6y9DbxagVnNtpUG9Af2ykTqTyAxldYd+XZqqBLJubeUm
r1AgEKqrQOjKraRhGOhBKd2qysmRCJUSM1iTQW4dzDjG4zeT/vkZJOPJZX+0
D7L/Yaosf/K8EM2rdTTH8GWoeFQsir63IMbwkdcp6PREN1w9gcHyU+8nWxOD
yQEjln1Nq2Wb0+S9pP5+gDlLgxQG75pEf5gqG2L6OYEyUtLNwnQGzmxkC6xA
96CkGVAyZEai4+/AbdepyHMTcrNJa9UPyiF6pPL72oDUdxB9cJs15hkeJ7KX
cluH+PJvgYHXUAqXrVq5+E25VZMOOOPIktail+wqJ4o0yA8o9OEVwCB84MX+
XotDNbpD5/23R0e2zw8PX9ArV4fMTd3fDkL25332RnxEzaJkqRxyVBRR0rRW
jpy+kiKXhj0trVX69iWB5J0jGkyHxspD0Cr4AnMn7vCnWgZH2FSVpcwVCUBa
wDNy5DIkv9VoQNRM2WqN7sOSUa0JAb2Y515aDF426hUfInGJ5GQ4lsGJ2JGH
KKMjjE/rRCXXEVf0CYvQncS3oS8LrJwP7XGwXjASyMAje576Re+ppg7Feo3V
fOePuNMl+l+m2T6gkPy3hGOEF01A3bPV0JfjRmoeJys79IVsROt6yYGCHYfv
8Ngdsjx3iJrET2vPy9gztVNu29yIBrzF9Yw6YNyS3i61mAbtIss5O6JyuNPS
F8KWksdFtKPgEhX1ncn5GGeKZDI67Y/fhLZjA7RQBSNj3b0K635qwI6WYGP0
XZqs2vdgEg8LYVXm26kFHJ8+3vxOfTAhU0MnXBt1R/kWOn3KCNhQUVHvENZb
ZjXBKfDwMqu/IVriyMQFWfmtwTu4V1zigeefGcz5PDVCflWD0OMx9l9enMCD
Q78aPnPYjXbKeHWitwmWeIgIq8/MMY03/hNLXBiwZh6faxuScxGqNaONWZCO
EQjgvlOFuY2bVNh4OJlnvBepEYo4i7usDBScEwfuJK7O9bP4Ed4me+d61Rti
ndmykOcE72j+X9EWQckgnG0sYrjS2U7xbTpfAkux7pTqUV4QrJYUMo5XhKXP
9UF2Cj6g65LnupflSyoUFebsosq3XXJ5s9PZxPE8dOswaDarl5yEvG4n9O6E
MQEAHbQU1M+7YxbWAe7r84QrJ/dEwvrSFzSP2lVYap3LXTOH0M01F0Oioxl2
hnB0d6GVR15vcM9pguXFQApDnndxgWp2zcsLBTblUP5upBvKcu22OwseT4ik
cTfaeyVMfjq0v91e8BYViEb5iQPkocPDoA7Rq7LK1XL7qg9pzO5XjXD5cjMM
9agLCAcNhMGWkdOKf0fXpUywE5J6dGGxd/AcRliqMy86hKTPqNWgpBkDCaNT
LAzxfv2Iu52DdbXjhEM9+Uo629MudxQcdL553fCFyylRqB/+hoc3L0EJmfFX
LpfYf/i1dc3i9oVUucwKaim+012c9SfY6San+AiNbt9G6163HH4oyyHypnSm
uO3dk8BUi5LSYb9PC+FqQ5/cemf1FSzMKTXOrmX7/hl6/uHuYc1qcewELQVd
GhuK8MIjmU/jzz+IfRi5rxAAAA==



^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: List.filter in Ocaml 2.02
  1999-03-12 18:37       ` Alexey Nogin
@ 1999-03-15  9:06         ` Jean-Francois Monin
  0 siblings, 0 replies; 25+ messages in thread
From: Jean-Francois Monin @ 1999-03-15  9:06 UTC (permalink / raw)
  To: Alexey Nogin; +Cc: Jean-Francois Monin, caml-list

Alexey Nogin wrote :
 > > exception Identity
 > >
 > > let share f x = try f x with Identity -> x
 > >
 > > let filter f l =
 > >  let rec fil = function
 > >    | [] -> raise Identity
 > >    | h :: t ->
 > >       if f h then h :: fil t else share fil t in
 > >  share fil l
 > [...]
 > 2) Even if the list is not changed, you first allocate the space and put
 > its copy there, then raise an exception and return the old value. That
 > means that you first waste time in memory allocation and then you waste
 > it in garbage collection.

I don't agree, because this construction 
 > >       if f h then h :: fil t else share fil t in
                         ^^
should never be evaluated when Identity is raised in the "fil t"
occurring just after.  At least, if you replace "h :: fil t" with
"h $$ fil t", where

let ($$) x l = x :: l

you get the expected behavior (for mem alloc concerns).
It would be very surprising that the above program behaves not the same.

-- Jean-Francois




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_!
  1999-03-13 13:40         ` Anton Moscal
@ 1999-03-24  4:20           ` Alexey Nogin
  1999-03-26 11:49             ` Anton Moscal
  0 siblings, 1 reply; 25+ messages in thread
From: Alexey Nogin @ 1999-03-24  4:20 UTC (permalink / raw)
  To: Anton Moscal
  Cc: Xavier Leroy, caml-list, Jason Hickey, Alexei Kopylov, Paul Stodghill

Anton Moscal wrote:

> > > pressure locally, causing other stuff to be spilled on the stack,
> > > though.
> >
> > Well, in this case I should probably try to remove the allocation coalescing
> > and see what happens. Am I right assuming that in order to do that I have to
> > revert changes for versions 1.8 -> 1.9 and 1.10 -> 1.11 of the
> > asmcomp/selectgen.ml?
> >
> Also you can try to apply this patch (attached to this message) to
> asmcomp/selectgen.ml: this is another variant of allocation combiner.
> Code, generated by this patch, is more close to code generated by the
> ocaml-2.01 than the code from official ocaml-2.02 (Warning: I tested it
> only on my own programs and on the Ocaml bootstrapping).

Thanks for the patch, but unfortunately it only made thinks even slower than with
2.02. It raised (compared with 2.02) number of memory requests by 0.2%-2% and
made L2 miss rate grow (up to from 37% to 44% on some input).

Alexey
--------------------------------------------------------------
Home Page: http://www.cs.cornell.edu/nogin/
E-Mail: nogin@cs.cornell.edu (office), ayn2@cornell.edu (home)
Office: Upson 4139, tel: 1-607-255-4934
ICQ #: 24708107 (office), 24678341 (home)





^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_!
  1999-03-24  4:20           ` Alexey Nogin
@ 1999-03-26 11:49             ` Anton Moscal
  0 siblings, 0 replies; 25+ messages in thread
From: Anton Moscal @ 1999-03-26 11:49 UTC (permalink / raw)
  To: Alexey Nogin; +Cc: caml-list

Hello

> > >
> > Also you can try to apply this patch (attached to this message) to
> > asmcomp/selectgen.ml: this is another variant of allocation combiner.
> > Code, generated by this patch, is more close to code generated by the
> > ocaml-2.01 than the code from official ocaml-2.02 (Warning: I tested it
> > only on my own programs and on the Ocaml bootstrapping).
> 
> Thanks for the patch, but unfortunately it only made thinks even slower than with
> 2.02. It raised (compared with 2.02) number of memory requests by 0.2%-2% and

Can you send to me some small example?

> made L2 miss rate grow (up to from 37% to 44% on some input).

This can be normal result from elimination of useless commands (for
example many loading/storing of young_limit and young_ptr variables was
removed).

Regards, Anton




^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_!
  1999-03-11  9:44     ` Xavier Leroy
  1999-03-11 23:59       ` Alexey Nogin
@ 1999-04-06  2:06       ` Alexey Nogin
  1999-04-06  7:53         ` Xavier Leroy
  1 sibling, 1 reply; 25+ messages in thread
From: Alexey Nogin @ 1999-04-06  2:06 UTC (permalink / raw)
  To: Xavier Leroy; +Cc: caml-list, Jason Hickey, Alexei Kopylov, Paul Stodghill

Xavier Leroy wrote:

> > Also, I was doing some performance mesurements (using P6 performance
> > counter support patches for Linux by Erik Hendriks -
> > http://beowulf.gsfc.nasa.gov/software/ ) when I upgraded, so I have some
> > information (and can get more of it) on the performance counters for my
> > program under both 2.01 and 2.02. In particular, the number of requests
> > from the processor to the L1 data cache became 2%-3% bigger.
>
> That's more meaningful.  The two new optimizations in 2.02 (closed
> toplevel functions and allocation coalescing) should reduce the number
> of memory accesses.  Allocation coalescing might increase register
> pressure locally, causing other stuff to be spilled on the stack,
> though.

I tried backing up allocation coalescing (by backing up changes 1.8 -> 1.9
and 1.10 -> 1.11 of asmcomp/senectgen.ml), but that only increased the number
of memory accesses. Do you have another guess what else was changed between
2.01 and 2.02 that could have caused the increase of the number of memory
accesses?

> Is there any way you could get a per-function profile of
> memory requests? (like on the Alpha with the Digital Unix tools).

I do not think so. I could do something in the gprof-style to record the # of
memory requests at each function call, but such mesurement would probably be
too inefficient and inaccurate (since mesurement code would produce too many
memory accesses by itself). We also tried to run our code (MetaPRL,
http://ensemble01.cs.cornell.edu:12000/cvsweb/~checkout~/meta-prl/doc/htmlman/default.html
) under RSIM simulator, but could not make it work.

Alexey
--------------------------------------------------------------
Home Page: http://www.cs.cornell.edu/nogin/
E-Mail: nogin@cs.cornell.edu (office), ayn2@cornell.edu (home)
Office: Upson 4139, tel: 1-607-255-4934
ICQ #: 24708107 (office), 24678341 (home)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_!
  1999-04-06  2:06       ` Alexey Nogin
@ 1999-04-06  7:53         ` Xavier Leroy
  0 siblings, 0 replies; 25+ messages in thread
From: Xavier Leroy @ 1999-04-06  7:53 UTC (permalink / raw)
  To: Alexey Nogin; +Cc: caml-list, Jason Hickey, Alexei Kopylov, Paul Stodghill

> I tried backing up allocation coalescing (by backing up changes 1.8
> -> 1.9 and 1.10 -> 1.11 of asmcomp/senectgen.ml), but that only
> increased the number of memory accesses.

I'm not surprised that allocation coalescing reduces the number of
memory accesses, indeed.

> Do you have another guess
> what else was changed between 2.01 and 2.02 that could have caused
> the increase of the number of memory accesses?

Some library functions (on which your code may depend) have been
reimplemented differently, most notably Printf.sprintf.  That's all I
can see.  

- Xavier Leroy




^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~1999-04-06 16:31 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <Xavier.Leroy@inria.fr>
1999-03-05 10:41 ` Objective Caml 2.02 Xavier Leroy
1999-03-05 13:34   ` Camlp4 2.02 Daniel de Rauglaudre
1999-03-05 15:11   ` Objective Caml 2.02 Pierpaolo Bernardi
1999-03-05 19:59   ` doligez
1999-03-11  3:06   ` Upgrade from OCaml 2.01 to OCaml 2.02 made things _slower_! Alexey Nogin
1999-03-11  9:44     ` Xavier Leroy
1999-03-11 23:59       ` Alexey Nogin
1999-03-13 13:40         ` Anton Moscal
1999-03-24  4:20           ` Alexey Nogin
1999-03-26 11:49             ` Anton Moscal
1999-04-06  2:06       ` Alexey Nogin
1999-04-06  7:53         ` Xavier Leroy
1999-03-11 23:42   ` List.filter in Ocaml 2.02 Alexey Nogin
1999-03-12 10:10     ` Wolfram Kahl
1999-03-12 18:18       ` Alexey Nogin
1999-03-13  2:43       ` David Monniaux
1999-03-12 17:01     ` Jean-Francois Monin
1999-03-12 18:41       ` Alexey Nogin
     [not found]     ` <199903121011.LAA27611@lsun565.lannion.cnet.fr>
1999-03-12 18:37       ` Alexey Nogin
1999-03-15  9:06         ` Jean-Francois Monin
1999-03-06  0:27 Sort.array easily degenerates Markus Mottl
1999-03-09 10:44 ` Xavier Leroy
1999-03-09 23:03   ` doligez
1999-03-10 13:58     ` Xavier Leroy
1999-03-10  0:28   ` Markus Mottl

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox