From mboxrd@z Thu Jan  1 00:00:00 1970
Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id WAA15361; Mon, 27 Sep 2004 22:27:12 +0200 (MET DST)
X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id WAA16376 for <caml-list@pauillac.inria.fr>; Mon, 27 Sep 2004 22:27:11 +0200 (MET DST)
Received: from gateway.imperium.ph ([202.175.240.154])
	by nez-perce.inria.fr (8.13.0/8.13.0) with ESMTP id i8RKR87S028574
	for <caml-list@inria.fr>; Mon, 27 Sep 2004 22:27:10 +0200
Received: from morgoth.imperium.ph (morgoth.imperium.ph [192.168.1.224])
	by gateway.imperium.ph (Postfix) with SMTP
	id 6588646CFD; Tue, 28 Sep 2004 04:16:32 +0800 (PHT)
Received: by morgoth.imperium.ph (sSMTP sendmail emulation); Tue, 28 Sep 2004 04:24:49 +0800
Date: Tue, 28 Sep 2004 04:24:49 +0800
From: "Rafael 'Dido' Sevilla" <dido@imperium.ph>
To: John Goerzen <jgoerzen@complete.org>
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] Observations on OCaml vs. Haskell
Message-ID: <20040927202449.GA548@imperium.ph>
References: <200409271408.51872.jgoerzen@complete.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200409271408.51872.jgoerzen@complete.org>
X-Operating-System: Linux morgoth 2.4.26-gentoo-r9
User-Agent: Mutt/1.5.6i
X-Miltered: at nez-perce with ID 4158779C.000 by Joe's j-chkmail (http://j-chkmail.ensmp.fr)!
X-Loop: caml-list@inria.fr
X-Spam: no; 0.00; rafael:01 'dido':01 dido:01 caml-list:01 observations:01 haskell:01 2004:99 haskell:01 re-use:01 notoriously:01 haskell's:01 tradeoff:01 boxing:01 tradeoff:01 arc:99 
Sender: owner-caml-list@pauillac.inria.fr
Precedence: bulk

On Mon, Sep 27, 2004 at 02:08:51PM -0500, John Goerzen wrote:
> 1. Haskell lists resemble OCaml Streams
> 
> This is, in fact, one of my main complaints about OCaml lists: that they 
> are a distinct type from OCaml streams.  Streams have a lot of power, 
> and having to convert back and forth between the two doesn't always 
> make a lot of sense.  I've doing things like written versions of map or 
> filter for streams, making them lazy, which results in a very powerful 
> approach to things like file reading, etc.  It's annoying to not be 
> able to re-use all the existing list-related functions on streams.
> 
> In Haskell, there is no separate stream type; a list is a stream.
> 

This is possible because Haskell is lazy.  OCaml is not.  Doing this in
a strict language like OCaml could be potentially messy.  It is
notoriously easy to shoot yourself in the foot if you do this wrong, and
end up reading the entire file in one fell swoop, because OCaml will
evaluate any stream arguments you use immediately, not on demand as
Haskell would.

> 2. Haskell strings are lists of characters
> 
> It's annoying that strings aren't normally processed this way in OCaml, 
> and even more annoying that (^) or (::) cannot be used in pattern 
> matching over strings.  I like Haskell's approach.  The list 
> concatenation operator is the same as the string concatenation operator 
> in Haskell.
> 

This is something of an efficiency/elegance tradeoff.  Making strings
act like lists means potentially boxing *every character in the string*.
In other words, it's potentially a very expensive way of doing business.
Paul Graham was mulling over this kind of tradeoff in his design of Arc,
as I recall.  Another language that does this type of thing is Erlang,
and both languages seem to be significantly slower than OCaml in string
handling, at least as far as this site goes:

http://shootout.alioth.debian.org/

For the word count benchmark OCaml scores 0.1850 seconds, while GHC is a
dismal last place at 105.2110 seconds!  Even the bytecode ocaml is an
order of magnitude faster.  The word frequency benchmark also shows this
kind of poor string handling performance for Haskell, with OCaml scoring
0.5669 seconds, while GHC scores a truly dismal 6.4540, more than an
order of magnitude slower, and even the bytecode ocaml is faster at
4.2644 seconds.

All in all, it would appear that Haskell's approach has been expensive
in terms of performance, if the benchmarks are to be taken at face
value.  Such are the tradeoffs language designers have to make.

> 3. The Num typeclass
> 
> I've written several functions that can work with a "number-like" type.  
> I don't really care if I get an integer, Int32, Int64, float, or what  
> But since these are all different types in OCaml, I am forced to care, 
> right down to using +, +., or Int64.add to perform basic arithmetic.  
> The Num typeclass in Haskell neatly solves that whole problem; I could 
> take a Num, use a unified set of operators, and produce the appropriate 
> result.
> 

Performance again.  The reason why there are so many numeric types in
OCaml is performance.  It is much, much cheaper to add two ints than it
is to add an int32.  And if you do care about accuracy, you would
definitely like to know if a division silently converted your previously
100% accurate integer types into a floating point value!  An int fits
into a machine integer on 32-bit machines, but has a bit or two
reserved.  An int32 or int64 needs to be boxed for garbage collection to
work properly, and hence incurs a performance hit.  As mentioned before,
you will definitely care whether you're using limited accuracy floating
point numbers or integers.

However, I do agree with you that the mess with OCaml having to add
totally different operators to do arithmetic for integral and floating
point types is a wart in what is otherwise an elegant language.  The
compiler could very easily make the arithmetic operators polymorphic,
and constrain the operands to be of the same (numeric) type, and signal
an error if you attempt to, say, multiply an integer with a floating
point number without doing the proper conversion.  Unless there's
something in the design of the type system that prevents this from
happening except at run-time...?

Look again at the benchmarks.  Haskell's flattening of the numeric types
has been costly in terms of performance: The simple integer sum
benchmark has ghc in last place, at 255.3892 seconds, where Ocaml is
near the top at 3.4465 seconds.  Nearly two orders of magnitude
difference.  I get the feeling this is because in OCaml an integer
addition is very nearly a machine language add instruction, as it is in
C, whereas GHC has to do unboxing and all that, and integers in GHC
bear little or no relation to the machine integers that an average CPU
is capable of handling directly.  The statistical moments benchmark
again has ocaml near the top at 0.1760 seconds and ghc at second to last
place with 6.5040...

Haskell doesn't fare so badly in the other benchmarks by comparison.
The Ackermann Function benchmark has GHC competitive with OCaml, so too
with the Fibonacci for instance, which can probably be improved by the
GHC team with some work.  These other benchmarks I point out, where
Haskell is orders of magnitude slower, point to something of an
inefficiency inherent in the way it does things.  Either that, or
whoever made those benchmarks needs help from people who know Haskell
better than he does. ;)

IMHO, Haskell is a truly elegant language, but I don't think it's
practical to use for most of the programming projects I do.  OCaml on
the other hand is the closest I've seen to a practical general-purpose
functional language that is able to balance elegance with efficiency,
which is why I love it in spite of its flaws.

-- 
dido
Te capiam, cuniculus sceleste!

-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners