From: John Prevost <j.prevost@gmail.com>
To: caml-list@yquem.inria.fr, Hal Daume III <hdaume@isi.edu>
Subject: Re: [Caml-list] bigarrays much lower than normal ones
Date: Sun, 31 Oct 2004 12:26:34 -0500
Message-ID: <d849ad2a0410310926466aae40@mail.gmail.com>
In-Reply-To: <Pine.LNX.4.44.0410310750180.22156-100000@albini.isi.edu>
On Sun, 31 Oct 2004 08:05:46 -0800 (PST), Hal Daume III <hdaume@isi.edu> wrote:
> I've been hitting the limiting size of normal float arrays and was having
> a look at the Bigarray module. Unfortunately, it seems roughly 3-4 times
> *slower* than the standard array, which is pretty much unacceptable for
> me. Am I doing something naively wrong, or are the Bigarrays truly this
> slow?
It is indeed possible to speed things up by quite a lot. There are a
few factors at play that make your code slower than it has to be, but
the dominant factor is that your Bigarray version of normalize is much
more polymorphic than it needs to be:
val normalize : (float, 'a, 'b) Bigarray.Array1.t -> unit = <fun>
Compared to this for the normal array version:
val normalize : float array -> unit = <fun>
In the Array version, there's only one type parameter to worry about
at compile time: what goes into the array. That defines everything
you need to know. The Bigarray type has three parameters, and your
version leaves two of them (element kind and layout) open. This is
important because the compiler specializes accesses to arrays of
floating point numbers when it knows the types statically. When a
function is polymorphic, on the other hand, it has to generate more
generic code.
You noted that float32 was slower than float64. That's because
O'Caml's native float representation is always a 64-bit value, so
float32 elements have to be converted on every read and write. In the
polymorphic version of normalize, the code also has to figure out at
run time whether it's working with a float32 or a float64
representation when it pulls the values out. The other type variable,
which defines the array layout (C or Fortran), also needs to be pinned
down to avoid over-generic code.
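To make the difference in inferred types concrete, here is a small sketch. The function names sum_generic and sum_float64_c and the test arrays are made up for illustration and are not from the original program. It assumes a compiler where Bigarray is available without extra link flags (the standard library since OCaml 4.07); on older systems you would link bigarray.cmxa.

```ocaml
open Bigarray

(* No annotation: the inferred type is
   (float, 'a, 'b) Array1.t -> float, so the element kind and the
   layout are not known where Array1.get is compiled. *)
let sum_generic a =
  let acc = ref 0. in
  for i = 0 to Array1.dim a - 1 do
    acc := !acc +. Array1.get a i
  done;
  !acc

(* Same body, but the annotation fixes kind and layout, giving
   (float, float64_elt, c_layout) Array1.t -> float. *)
let sum_float64_c (a : (float, float64_elt, c_layout) Array1.t) =
  let acc = ref 0. in
  for i = 0 to Array1.dim a - 1 do
    acc := !acc +. Array1.get a i
  done;
  !acc

(* Hypothetical test data, C layout so indices start at 0. *)
let a64 = Array1.of_array float64 c_layout [| 1.; 2.; 3.; 4. |]
let a32 = Array1.of_array float32 c_layout [| 1.; 2.; 3.; 4. |]

let () =
  (* sum_generic accepts both kinds; sum_float64_c only accepts a64. *)
  Printf.printf "%g %g %g\n"
    (sum_generic a64) (sum_generic a32) (sum_float64_c a64)
```

Passing a32 to sum_float64_c is rejected by the type checker, which is the price of the restriction: you need one annotated function per kind and layout you actually use.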
I tried some other modifications, trying to remove overhead from
bounds checking--but it turns out that those modifications actually
slowed things down. :) In any case, the version with restricted
polymorphism on normalize sped things up a *lot*.
Unmodified Array:
real 1m30.292s
user 1m30.190s
sys 0m0.110s
Unmodified Bigarray:
real 3m31.446s
user 3m31.310s
sys 0m0.130s
Modified Bigarray (restricted polymorphism):
real 1m37.916s
user 1m37.810s
sys 0m0.120s
------------
open Bigarray

let normalize
    (a : (float, Bigarray.float64_elt, Bigarray.c_layout) Bigarray.Array1.t) =
  let _N = Array1.dim a in
  let rec sum n acc =
    if n >= _N then acc
    else sum (n+1) (acc +. Array1.get a n) in
  let s = sum 0 0. in
  for i = 0 to _N - 1 do
    Array1.set a i (Array1.get a i /. s);
  done;
  ()

let _ =
  let a = Array1.create float64 c_layout 1000000 in
  for iter = 1 to 100 do
    for i = 0 to 999999 do
      let i' = float_of_int i in
      Array1.set a i (log (0.01 *. i' *. i' +. 3. *. i' +. 4.));
    done;
    normalize a;
  done;
  ()
------------
You see that the one thing I changed here was to add the type
constraint in the definition of normalize, and it became almost as
fast as the normal array version (about 1m38s versus 1m30s, down from
3m31s without the constraint).
The other thing I'll point out is that you can write:
  Array1.set a i x; Array1.get a i
as
  a.{i} <- x; a.{i}
which can be quite a bit easier to read. If I recall right, this
works for arrays of more than one dimension, as well. I can't seem to
find the documentation for this feature, however.
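A short sketch of the two spellings side by side, including the two-dimensional form a.{i, j}. The names v and m and the values stored are invented for the example. As far as I know the Bigarray module documentation lists a.{x} as the alternative to Array1.get a x, but check the manual for your compiler version.

```ocaml
open Bigarray

(* One-dimensional: v.{i} reads, v.{i} <- x writes. *)
let v = Array1.create float64 c_layout 3
let () =
  Array1.fill v 0.;
  v.{0} <- 1.5;            (* same as Array1.set v 0 1.5 *)
  Array1.set v 1 2.5       (* same as v.{1} <- 2.5 *)

(* Two-dimensional: indices are separated by commas. *)
let m = Array2.create float64 c_layout 2 3
let () =
  Array2.fill m 0.;
  m.{1, 2} <- 7.0          (* same as Array2.set m 1 2 7.0 *)

let () =
  Printf.printf "%g %g %g\n" v.{0} (Array1.get v 1) m.{1, 2}
```

Both spellings go through the same get and set operations, so the type constraint on the array matters for either one.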
John.