Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed
From: Gerd Stolpmann <Gerd.Stolpmann@darmstadt.netsurf.de>
To: William Chesters <williamc@dai.ed.ac.uk>
Cc: caml-list@inria.fr
Subject: Re: speed versus C
Date: Thu, 7 Oct 1999 00:49:10 +0200	[thread overview]
Message-ID: <99100702163301.21475@ice> (raw)
In-Reply-To: <199910061521.QAA02123@toy.william.bogus>

On Wed, 06 Oct 1999, William Chesters wrote:
>Gerd Stolpmann writes:
> > - The recursive data types of Ocaml are often more problem-oriented than
> >   "imperative" data structures. Consider you want to fill an (C) array with an
> >   unknown number of elements. The typical solution puts the elements into the
> >   array until it is full, and then enlarges the array by reallocating new
> >   memory. If you use the Ocaml lists instead, you do not have this problem.
>
>   But you do have N little problems (N memory allocations and
>structure initialisations); you waste large amounts of memory in
>headers and pointers, thereby filling up your caches; and you end up
>with a data structure which takes at just as long to traverse and
>entirely fails to support any other kind of access.  If you do the
>sums, you will find that an array-based data structure (`Vector')
>works out cheaper any way you care to look at it---as long as you
>increase its capacity exponentially, e.g. doubling it each time it
>runs out.  This stipulation is not a problem because the space you
>thereby waste is still less than the amount you would lose to the
>spine of a list (and is less pernicious since it doesn't have to churn
>through the cache).

The basis of your argumentation is that the array solution works better with
current hardware and typical problem sizes. For example, memcpy is very fast
because it is specially supported by the hardware, much faster than
initializing the same number of elements of a list. This is absolutely true,
but I think that it only demonstrates that my example was too simple.

Caml can only be faster than C if you take into account that programmers are not
perfect. In C, they often do not choose the most efficient data structure
because it would be complicated and/or require an enormous programming
discipline. An example are C strings which are far from being optimal, and the
alternative would be to use a library which implements strings with a length
field. In order to be compatible with the existing kind of strings these
improved strings are typically not abstract, and simply define a struct such
that you can anywhere access their components: just another source of errors.
There are a lot more problems with sophisticated data structures, such as
uninitialized pointers, the question who is responsible for memory deallocation,
and indexes out of the bounds of arrays. Some of these problems can be solved
by additional layers, but then C is not faster any longer. This is always the
same bad alternative in C: Write an efficient program and accept the risk of
errors, or write an average program and have some more protection against your
own imperfectness.

In Caml, choosing sophisticated data structures is less error-prone, and
programmers are more willing to implement and use them. Even the basic data
types (i.e. without abstraction layers) provide often a good protection against
stupid faults, and you can use them directly without wrapping them in functions
or modules. This simply means that a comparable programming style which tries
to avoid errors leads to faster code in Caml than in C, because you do not need
to encapsulate the details of accessing the structure. For example, list
construction x::l leads to very efficient code in Caml, and the comparable
implementation in C requires an additional function abstraction which
guarantees that the new list cell is properly initialized.

I know that there are some hackers who can live without abstraction but the
avarage programmer is happy if he/she has a means to avoid unnecessary errors.

My first example looks now far better, because you have forgotten to count the
function calls to hide the array implementation. In Caml, they are not
necessary.

>   Incidentally, implementing a Vector in ocaml is slightly fiddly,
>because you have to keep valid pointers of the right type in even the
>unused part all the time.  This means o.a. delaying the creation of
>the underlying array until you have at least one element to put in it.

Caml is simply not good at arrays. I think it is not a good idea to adopt too
much imperative style.

Gerd

--
----------------------------------------------------------------------------
Gerd Stolpmann      Telefon: +49 6151 997705 (privat)
Viktoriastr. 100             
64293 Darmstadt     EMail:   Gerd.Stolpmann@darmstadt.netsurf.de (privat)
Germany                     
----------------------------------------------------------------------------




  reply	other threads:[~1999-10-07  7:59 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
1999-10-03 21:35 Jan Brosius
1999-10-04 21:59 ` skaller
1999-10-05 23:22   ` chet
1999-10-06 10:22     ` skaller
1999-10-05 20:20 ` Gerd Stolpmann
1999-10-06 15:21   ` William Chesters
1999-10-06 22:49     ` Gerd Stolpmann [this message]
1999-10-07 10:26       ` Michel Quercia
1999-10-07 10:46       ` William Chesters
1999-10-07 15:48         ` Pierre Weis
1999-10-07 19:21         ` Gerd Stolpmann
1999-10-08  0:26           ` William Chesters
1999-10-10 16:27             ` Gerd Stolpmann
1999-10-10 20:48               ` William Chesters
1999-10-10 23:54                 ` Alain Frisch
1999-10-11 17:58                   ` William Chesters
1999-10-12 14:36                     ` Ocaml Machine (was Re: speed versus C) Alain Frisch
1999-10-12 15:32                       ` David Monniaux
1999-10-12 15:42                         ` Alain Frisch
1999-10-11 19:32                   ` speed versus C John Prevost
1999-10-11 20:50                 ` Gerd Stolpmann
1999-10-12 20:07                   ` skaller
1999-10-08  9:56           ` Pierre Weis
1999-10-07 15:25     ` Markus Mottl
1999-10-07  6:56   ` skaller
1999-10-07 12:37     ` Xavier Urbain
1999-10-07 22:18     ` Gerd Stolpmann
1999-10-08 19:15       ` skaller
1999-10-08 13:40   ` Anton Moscal
1999-10-06  7:58 ` Reply to: " Jens Olsson
1999-10-07 13:00 STARYNKEVITCH Basile
1999-10-08  6:57 Pascal Brisset
     [not found] <Pine.LNX.4.03.9910081713230.31666-100001@post.tepkom.ru>
1999-10-10  4:51 ` skaller
1999-10-11  9:08   ` Anton Moscal
1999-10-12 13:21 Damien Doligez
1999-10-12 20:42 ` skaller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=99100702163301.21475@ice \
    --to=gerd.stolpmann@darmstadt.netsurf.de \
    --cc=caml-list@inria.fr \
    --cc=williamc@dai.ed.ac.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox