bug in floating point implementation ?

Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed

* bug in floating point implementation ?
@ 1999-02-25  0:43 Shyjan Mahamud
  1999-02-25 12:10 ` Xavier Leroy
  0 siblings, 1 reply; 4+ messages in thread
From: Shyjan Mahamud @ 1999-02-25  0:43 UTC (permalink / raw)
  To: caml-list

hi folks, 
i'm new to CAML, so i'm not sure if this is a known issue.
is there a known bug or difference in the implementation of floating
point arithmetic between the toplevel and that for the standalone
compiled version. i've run the same exact function from both the
toplevel and a standalone program. the numbers come out slightly
different.

from the toplevel -->

test sample 1 residue = 0.584302
0 : -0.964812
1 : 0.221230
2 : 0.006142
3 : 0.069594
4 : -0.059595
5 : -0.069939
6 : -0.001882
7 : 0.004956
8 : -0.046719
9 : 0.063383
10 : -0.012646
11 : 0.004402

from the stanalone -->

test sample 1 residue = 0.570106
0 : -0.952025
1 : 0.229649
2 : 0.023926
3 : 0.010044
4 : -0.036213
5 : -0.052544
6 : -0.005504
7 : 0.010667
8 : -0.054280
9 : 0.079775
10 : 0.079762
11 : -0.058593

all corresponding numbers are supposed to be exactly the same.
also, as can be seen the errors accumulate as u go from 0 to 11.
both call an SVD routine implemented in C (hence i did have to create my own
toplevel linking in the SVD routine, so this is *not* the standard
toplevel). i'm running a linux box, RH 5.2.

i could provide the code, but it is long and am not sure if it helps
anyway. if it helps, the following routine is what gives the values
above (except for the residue), excuse the imperative style, it was
just a transplant from Numerical Recipes :

let svbksb uwv b =
  let isnan (x : float) = x <> x in
  let (u, w, v) = uwv in
  let m = (Array.length u) in
  let n = (Array.length w) in
  let sum = ref 0.0 in
  let tmp = Array.make n 0.0 in
  let x = Array.make n 0.0 in
  for j = 0 to n - 1 do
    sum := 0.0;
    if ((not (isnan w.(j))) && (w.(j) != 0.0)) then 
      begin
      	for i = 0 to m - 1 do 
	  sum := !sum +. u.(i).(j) *. b.(i)
      	done;
    	sum := !sum /. w.(j)
      end;
    tmp.(j) <- !sum
  done;
  for j = 0 to n - 1 do
    sum := 0.0;
    for jj = 0 to n - 1 do
      sum := !sum +. v.(j).(jj) *. tmp.(jj)
    done;
    x.(j) <- !sum
  done;
  x;;

i'm baffled !!!

- shyjan mahamud
Carnegie Mellon University

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: bug in floating point implementation ?
  1999-02-25  0:43 bug in floating point implementation ? Shyjan Mahamud
@ 1999-02-25 12:10 ` Xavier Leroy
  1999-02-25 16:57   ` Stefan Monnier
  1999-03-09  3:24   ` mahamud
  0 siblings, 2 replies; 4+ messages in thread
From: Xavier Leroy @ 1999-02-25 12:10 UTC (permalink / raw)
  To: Shyjan Mahamud, caml-list

> i'm new to CAML, so i'm not sure if this is a known issue.
> is there a known bug or difference in the implementation of floating
> point arithmetic between the toplevel and that for the standalone
> compiled version. i've run the same exact function from both the
> toplevel and a standalone program. the numbers come out slightly
> different.

The toplevel and the bytecode compiler (ocamlc) should give exactly
the same results, because they use the same bytecode compiler, virtual
machine and C primitives.

The native-code compiler (ocamlopt) generates in-line assembly code
for floating-point operations, of course, and also has various
unboxing strategies that are not found in ocamlc and in the toplevel.

On all architectures except the Intel x86, the ocamlopt tricks
shouldn't make any differences, because the same floating-point format
(IEEE double) is used everywhere.

The Intel x86 has an interesting quirk: it computes internally in
extended precision (80 bits), and rounds to double precision (64 bits)
only when storing a result in memory.  This means that in a piece of
code such as

        let x = a * b + c

the virtual machine will compute a * b, round it (by storing it in
memory), add c, and round the result.  While the native-code compiler
will compute a * b, keep the result in extended precision in the top
float register, then add c, then round the result.

In extreme cases (badly conditioned algorithm, computation of errors),
you may thus get different results.

(Similar problems plague all x86 compilers.  It's really bad for Java,
which specifies that all computations must be done in 64 bits.)

If you don't think this explains your problem (e.g. you see the same
discrepancies on other processors), please send me a complete program
that reproduces the problem and I'll look at it.

- Xavier Leroy

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: bug in floating point implementation ?
  1999-02-25 12:10 ` Xavier Leroy
@ 1999-02-25 16:57   ` Stefan Monnier
  1999-03-09  3:24   ` mahamud
  1 sibling, 0 replies; 4+ messages in thread
From: Stefan Monnier @ 1999-02-25 16:57 UTC (permalink / raw)
  To: caml-list

>>>>> "Xavier" == Xavier Leroy <Xavier.Leroy@inria.fr> writes:
> The Intel x86 has an interesting quirk: it computes internally in
> extended precision (80 bits), and rounds to double precision (64 bits)
> only when storing a result in memory.  This means that in a piece of

But you can make it work in 64bit-mode (or 32bit-mode for that matter)
at which point the discrepencies between true-real 64bit and x86-64bit become
very small (mostly due to the exponent still allowing a wider range).
Of course, there might be good reasons for not using this mode-switch.

	Stefan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: bug in floating point implementation ?
  1999-02-25 12:10 ` Xavier Leroy
  1999-02-25 16:57   ` Stefan Monnier
@ 1999-03-09  3:24   ` mahamud
  1 sibling, 0 replies; 4+ messages in thread
From: mahamud @ 1999-03-09  3:24 UTC (permalink / raw)
  To: Xavier Leroy, caml-list

>>>>> "Xavier" == Xavier Leroy <Xavier.Leroy@inria.fr> writes:

    Xavier> The Intel x86 has an interesting quirk: it computes
    Xavier> internally in extended precision (80 bits), and rounds to
    Xavier> double precision (64 bits) only when storing a result in
    Xavier> memory.  This means that in a piece of code such as

    Xavier>         let x = a * b + c

    Xavier> the virtual machine will compute a * b, round it (by
    Xavier> storing it in memory), add c, and round the result.  While
    Xavier> the native-code compiler will compute a * b, keep the
    Xavier> result in extended precision in the top float register,
    Xavier> then add c, then round the result.

    Xavier> In extreme cases (badly conditioned algorithm, computation
    Xavier> of errors), you may thus get different results.

sorry about the late reply. but i was hung up with other stuff.
thanx for the explanation. i've found out that indeed my computation
was ill-conditioned and have fixed it now. 
in fact, were it not for this intel quirkiness, i might have
overlooked the bug for some-time. this definitely tops my 
list of ways i've debugged code! 

before i was able to spot the problem with my code, i tried
downloading the src on a different architecture. however, it
seems that the tar file is broken, since it creates files when it should
be creating dirs. i never faced this problem for my linux box since 
i downloaded the binary rpms. anyway it does'nt matter for now since
i know what the problem is. but in future it might be useful to
compile ocaml on a different architecture (i had tried both 2.01 &
2.02). 

- shyjan
carnegie mellon university

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~1999-03-09  7:42 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
1999-02-25  0:43 bug in floating point implementation ? Shyjan Mahamud
1999-02-25 12:10 ` Xavier Leroy
1999-02-25 16:57   ` Stefan Monnier
1999-03-09  3:24   ` mahamud

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox