* Strange behaviour of string_of_float
@ 2008-06-22 16:56 Paolo Donadeo
2008-06-22 19:58 ` [Caml-list] " Richard Jones
2008-06-22 20:32 ` Daniel Bünzli
0 siblings, 2 replies; 15+ messages in thread
From: Paolo Donadeo @ 2008-06-22 16:56 UTC (permalink / raw)
To: caml-list caml-list
Today I noticed this strange behaviour of string_of_float:
Let's start with:
# let pi = 4.0 *. atan 1.0;;
val pi : float = 3.14159265358979312
# let (|>) x f = f x;;
val ( |> ) : 'a -> ('a -> 'b) -> 'b = <fun>
Ok, I want to serialize pi:
# (pi |> string_of_float |> float_of_string) -. pi;;
- : float = 2.06945571790129179e-13
string_of_float is not the inverse of float_of_string, at least in this example.
Is this correct? It's not a problem at all, I used this workaround:
# let my_string_of_float = Printf.sprintf "%.1000g";;
val my_string_of_float : float -> string = <fun>
# (pi |> my_string_of_float |> float_of_string) -. pi;;
- : float = 0.
--
Paolo
~
~
:wq
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [Caml-list] Strange behaviour of string_of_float
2008-06-22 16:56 Strange behaviour of string_of_float Paolo Donadeo
@ 2008-06-22 19:58 ` Richard Jones
2008-06-22 20:45 ` Paolo Donadeo
2008-06-23 8:35 ` Jon Harrop
2008-06-22 20:32 ` Daniel Bünzli
1 sibling, 2 replies; 15+ messages in thread
From: Richard Jones @ 2008-06-22 19:58 UTC (permalink / raw)
To: Paolo Donadeo; +Cc: caml-list caml-list
On Sun, Jun 22, 2008 at 06:56:22PM +0200, Paolo Donadeo wrote:
> string_of_float is not the inverse of float_of_string, at least in
> this example.
Yes, you wouldn't expect it to be, because the string is an
approximate base 10 representation of the float (which is itself only
an approximate base 2 representation of the transcendental number pi).
You might want to read a presentation called "What every computer
programmer should know about floating point arithmetic". There's a
PDF version here:
http://blogs.sun.com/darcy/resource/Wecpskafpa-ACCU.pdf
Rich.
--
Richard Jones
Red Hat
* Re: [Caml-list] Strange behaviour of string_of_float
2008-06-22 16:56 Strange behaviour of string_of_float Paolo Donadeo
2008-06-22 19:58 ` [Caml-list] " Richard Jones
@ 2008-06-22 20:32 ` Daniel Bünzli
2008-06-22 20:50 ` Paolo Donadeo
2008-06-23 1:06 ` Brian Hurt
1 sibling, 2 replies; 15+ messages in thread
From: Daniel Bünzli @ 2008-06-22 20:32 UTC (permalink / raw)
To: caml-list caml-list
Richard gave you the reason.
If you can serialize to binary, you can achieve what you want by
serializing the 64-bit integers you get with Int64.bits_of_float and
applying Int64.float_of_bits to the integers you deserialize.
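
A minimal sketch of this suggestion, for readers who want the bits in a text file rather than a binary one. The names hex_of_float_bits and float_of_hex_bits are hypothetical helpers, not part of the standard library; they only wrap Int64.bits_of_float and Int64.float_of_bits and write the 64-bit pattern as 16 hexadecimal digits.

```ocaml
(* Hypothetical helpers: write the IEEE 754 bit pattern as 16 hex digits. *)
let hex_of_float_bits (x : float) : string =
  Printf.sprintf "%016Lx" (Int64.bits_of_float x)

(* Read the 16 hex digits back; the "0x" prefix makes Int64.of_string
   accept the full unsigned 64-bit range, including a set sign bit. *)
let float_of_hex_bits (s : string) : float =
  Int64.float_of_bits (Int64.of_string ("0x" ^ s))

let () =
  let pi = 4.0 *. atan 1.0 in
  Printf.printf "%s\n" (hex_of_float_bits pi);
  Printf.printf "%b\n" (float_of_hex_bits (hex_of_float_bits pi) = pi)
```

Because the round trip goes through the bit pattern, it also preserves the sign of zero, which a comparison with = cannot see.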
Best,
Daniel
* Re: [Caml-list] Strange behaviour of string_of_float
2008-06-22 19:58 ` [Caml-list] " Richard Jones
@ 2008-06-22 20:45 ` Paolo Donadeo
2008-06-23 1:25 ` Brian Hurt
2008-06-23 8:32 ` Mattias Engdegård
2008-06-23 8:35 ` Jon Harrop
1 sibling, 2 replies; 15+ messages in thread
From: Paolo Donadeo @ 2008-06-22 20:45 UTC (permalink / raw)
To: caml-list caml-list
I know what a float number is from my numerical analysis course :-).
In any case, what is the suggested way to serialize/deserialize a
float number in OCaml? Sexplib, for example, suffers from the same
problem as the string_of_float function.
My intent is to extract an ASCII representation of an OCaml float
value so that it can be used to recreate *exactly* the same value, at
least on the same architecture.
--
Paolo
~
~
:wq
* Re: [Caml-list] Strange behaviour of string_of_float
2008-06-22 20:32 ` Daniel Bünzli
@ 2008-06-22 20:50 ` Paolo Donadeo
2008-06-23 8:45 ` David Allsopp
2008-06-23 1:06 ` Brian Hurt
1 sibling, 1 reply; 15+ messages in thread
From: Paolo Donadeo @ 2008-06-22 20:50 UTC (permalink / raw)
To: caml-list caml-list
> If you can serialize to binary, you can achieve what you want by serializing
> the 64-bit integers you get with Int64.bits_of_float and applying
> Int64.float_of_bits to the integers you deserialize.
Just posted a useless message :-)
This is *exactly* what I was searching for, thanks Daniel.
--
Paolo
~
~
:wq
* Re: [Caml-list] Strange behaviour of string_of_float
2008-06-22 20:32 ` Daniel Bünzli
2008-06-22 20:50 ` Paolo Donadeo
@ 2008-06-23 1:06 ` Brian Hurt
2008-06-23 7:58 ` Xavier Leroy
1 sibling, 1 reply; 15+ messages in thread
From: Brian Hurt @ 2008-06-23 1:06 UTC (permalink / raw)
To: Daniel Bünzli; +Cc: caml-list caml-list
On Sun, 22 Jun 2008, Daniel Bünzli wrote:
> Richard gave you the reason.
>
> If you can serialize to binary, you can achieve what you want by serializing
> the 64-bit integers you get with Int64.bits_of_float and applying
> Int64.float_of_bits to the integers you deserialize.
Actually, for serialization, I strongly recommend first using
classify_float to split off and handle NaNs, infinities, etc., then using
frexp to split the float into a fraction and an exponent. The exponent is
just an int, and the fractional part can be multiplied by, say, 2^56 and
then converted into an integer.
The advantage of doing things this way, despite it being slightly more
complicated, is twofold: one, it guarantees you the ability to recover
the exact binary value of the float, and two, it sidesteps a huge number
of compatibility issues between architectures. IIRC, IEEE 754 specifies
how many bits have to be used to represent each part of the float, but not
where they have to be in the word. Also, if you use hexadecimal for
saving the integers, this can actually be faster than converting to
base 10, as conversion to base 10 isn't cheap. It's a couple more
branches, but a lot of divs and mods get turned into shifts and ands.
Brian
* Re: [Caml-list] Strange behaviour of string_of_float
2008-06-22 20:45 ` Paolo Donadeo
@ 2008-06-23 1:25 ` Brian Hurt
2008-06-23 7:50 ` Paolo Donadeo
2008-06-23 8:32 ` Mattias Engdegård
1 sibling, 1 reply; 15+ messages in thread
From: Brian Hurt @ 2008-06-23 1:25 UTC (permalink / raw)
To: Paolo Donadeo; +Cc: caml-list caml-list
On Sun, 22 Jun 2008, Paolo Donadeo wrote:
> I know what a float number is from my numerical analysis course :-).
>
> In any case, what is the suggested way to serialize/deserialize a
> float number in OCaml? Sexplib, for example, suffers from the same
> problem as the string_of_float function.
>
> My intent is to extract an ASCII representation of an OCaml float
> value so that it can be used to recreate *exactly* the same value, at
> least on the same architecture.
>
Code something like this should work:
let encode_float x =
  match classify_float x with
  (* x = -0.0 is also true for +0.0, so test the sign via 1/x instead *)
  | FP_zero -> if 1.0 /. x < 0.0 then "-0.0" else "0.0"
  | FP_infinite -> if x = neg_infinity then "-INF" else "INF"
  | FP_nan -> "NaN"
  | _ ->
    let s = x < 0.0 in
    let x = abs_float x in
    let frac, exp = frexp x in          (* x = frac * 2^exp, 0.5 <= frac < 1 *)
    let frac = frac *. 268435456.0 in   (* 2^28 *)
    let i1 = int_of_float frac in       (* top 28 bits of the mantissa *)
    (* next 28 bits of the mantissa *)
    let i2 = int_of_float ((frac -. floor frac) *. 268435456.0) in
    let exp = exp - 56 in               (* x = (i1 * 2^28 + i2) * 2^exp *)
    let s2 = exp < 0 in
    let exp = if exp < 0 then -exp else exp in
    Printf.sprintf "%c%07X%07XX%c%X" (if s then '-' else '+') i1 i2
      (if s2 then '-' else '+') exp
;;
I'll leave the decode to you- it should be obvious, once you discover the
ldexp function.
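
For readers who want to see the other half, here is one possible decoder, offered as a sketch rather than as the code Brian had in mind. It assumes the exact string layout produced by encode_float above: a sign character, two 7-digit hexadecimal fields, a literal X, the exponent sign, and the exponent in hexadecimal. The name decode_float is hypothetical.

```ocaml
(* Sketch of a decoder for the "+HHHHHHHLLLLLLLX-EE" layout, where H and L
   are the two 28-bit mantissa halves and EE is the binary exponent. *)
let decode_float (s : string) : float =
  match s with
  | "0.0" -> 0.0
  | "-0.0" -> (-0.0)
  | "INF" -> infinity
  | "-INF" -> neg_infinity
  | "NaN" -> nan
  | _ ->
    let hex sub = int_of_string ("0x" ^ sub) in
    let negative = s.[0] = '-' in
    let i1 = hex (String.sub s 1 7) in
    let i2 = hex (String.sub s 8 7) in
    (* s.[15] is the literal 'X' separating mantissa and exponent *)
    let exp_negative = s.[16] = '-' in
    let e = hex (String.sub s 17 (String.length s - 17)) in
    let e = if exp_negative then -e else e in
    (* i1 * 2^28 + i2 carries at most 53 significant bits, so it is exact *)
    let mantissa = float_of_int i1 *. 268435456.0 +. float_of_int i2 in
    let x = ldexp mantissa e in
    if negative then -. x else x
```

The sketch does no validation: malformed input raises whatever exception String.sub or int_of_string raises.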
Brian
* Re: [Caml-list] Strange behaviour of string_of_float
2008-06-23 1:25 ` Brian Hurt
@ 2008-06-23 7:50 ` Paolo Donadeo
0 siblings, 0 replies; 15+ messages in thread
From: Paolo Donadeo @ 2008-06-23 7:50 UTC (permalink / raw)
To: caml-list caml-list
> Code something like this should work:
Thanks, this is even better. I should pay more attention to the
Pervasives API: I never noticed frexp and ldexp.
--
Paolo
~
~
:wq
* Re: [Caml-list] Strange behaviour of string_of_float
2008-06-23 1:06 ` Brian Hurt
@ 2008-06-23 7:58 ` Xavier Leroy
0 siblings, 0 replies; 15+ messages in thread
From: Xavier Leroy @ 2008-06-23 7:58 UTC (permalink / raw)
To: Brian Hurt; +Cc: Daniel Bünzli, caml-list caml-list
>> If you can serialize to binary, you can achieve what you want by
>> serializing the 64-bit integers you get with Int64.bits_of_float and
>> applying Int64.float_of_bits to the integers you deserialize.
>
> Actually, for serialization, I strongly recommend first using
> classify_float to split off and handle NaNs, infinities, etc., then
> using frexp to split the float into a fraction and an exponent. The
> exponent is just an int, and the fractional part can be multiplied by,
> say, 2^56 and then converted into an integer.
>
> The advantage of doing things this way, despite it being slightly more
> complicated, is twofold: one, it guarantees you the ability to recover
> the exact binary value of the float, and two, it sidesteps a huge number
> of compatibility issues between architectures. IIRC, IEEE 754 specifies
> how many bits have to be used to represent each part of the float, but
> not where they have to be in the word.
The only architecture I know where this problem could occur is the old
(pre-EABI) ABI for ARM, which has "mixed-endian" floats. But the
implementation of Int64.{bits_of_float,float_of_bits} goes to some
length to rearrange bits as expected, i.e. with the sign bit in the
most significant bit of the int64, followed by the exponent bits,
followed by the mantissa bits in the least significant bits of the
int64.
So, the case analysis on the float that Brian suggests is a bit of an
overkill, and I strongly suggest using the result of
Int64.bits_of_float as the exact, serializable representation of a
Caml float.
- Xavier Leroy
* Re: [Caml-list] Strange behaviour of string_of_float
2008-06-22 20:45 ` Paolo Donadeo
2008-06-23 1:25 ` Brian Hurt
@ 2008-06-23 8:32 ` Mattias Engdegård
2008-06-23 8:50 ` Olivier Andrieu
1 sibling, 1 reply; 15+ messages in thread
From: Mattias Engdegård @ 2008-06-23 8:32 UTC (permalink / raw)
To: p.donadeo; +Cc: caml-list
>My intent is to extract an ASCII representation of an OCaml float
>value so that it can be used to recreate *exactly* the same value, at
>least on the same architecture.
A somewhat more portable (and readable, maybe) representation of
floating-point numbers is in hex (a la C99). It is independent of the
precision and binary format used. Unfortunately, ocaml's Printf has
already appropriated %a for a different purpose, but it remains a good
option for those willing to do some manual work.
I have used it in the past to good effect in text-based interchange
formats between applications written in C.
Of course the decimal notation can unambiguously represent any
(binary) floating-point number, so that representation is fine if you
have confidence in the output and reading routines. See, for instance,
William Clinger's _How to Read Floating Point Numbers Accurately_
(http://ftp.ccs.neu.edu/pub/people/will/retrospective.pdf).
But decimal handling will always be a little slower.
* Re: [Caml-list] Strange behaviour of string_of_float
2008-06-22 19:58 ` [Caml-list] " Richard Jones
2008-06-22 20:45 ` Paolo Donadeo
@ 2008-06-23 8:35 ` Jon Harrop
1 sibling, 0 replies; 15+ messages in thread
From: Jon Harrop @ 2008-06-23 8:35 UTC (permalink / raw)
To: caml-list
On Sunday 22 June 2008 20:58:31 Richard Jones wrote:
> On Sun, Jun 22, 2008 at 06:56:22PM +0200, Paolo Donadeo wrote:
> > string_of_float is not the inverse of float_of_string, at least in
> > this example.
>
> Yes, you wouldn't expect it to be, because the string is an
> approximate base 10 representation of the float...
That is not true. All finite floats have exact finite decimal representations.
So it is perfectly reasonable to expect the conversions to recover the
original number exactly.
As Paolo has shown, OCaml's current string_of_float function is approximate.
The accuracy of this routine is unspecified but a quick test indicates that
it is simply printing too few digits to be exact:
# string_of_float pi;;
- : string = "3.14159265359"
Fortunately, you can ask sprintf to generate a sufficiently accurate result:
# open Printf;;
# sprintf "%0.17g" pi;;
- : string = "3.1415926535897931"
The float_of_string function does then recover the number exactly in this
case:
# float_of_string "3.1415926535897931" -. pi;;
- : float = 0.
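
The same idea can be packaged as a small helper. This is a sketch, not a standard library function: the name exact_string_of_float is hypothetical, and the round trip relies on two assumptions, that floats are IEEE 754 doubles (for which 17 significant decimal digits are enough to identify each value) and that float_of_string rounds correctly on the platform in use.

```ocaml
(* Hypothetical helper: 17 significant digits identify any IEEE 754 double. *)
let exact_string_of_float (x : float) : string =
  Printf.sprintf "%.17g" x

let () =
  let pi = 4.0 *. atan 1.0 in
  let samples = [ pi; 0.1; -1e-300; max_float; min_float; epsilon_float ] in
  List.iter
    (fun x ->
       let y = float_of_string (exact_string_of_float x) in
       Printf.printf "%s round-trips: %b\n" (exact_string_of_float x) (x = y))
    samples
```

NaN is left out of the samples on purpose, since nan = nan is false and the check would need classify_float instead.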
Also, you should keep in mind in this context that calculations may be done
with 80-bit float arithmetic in registers or truncated to 64-bits when stored
to memory. Moreover, OCaml's bytecode and native code targets can behave
differently in this context. I do not believe that is a problem with Paolo's
code here though.
--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e
* RE: [Caml-list] Strange behaviour of string_of_float
2008-06-22 20:50 ` Paolo Donadeo
@ 2008-06-23 8:45 ` David Allsopp
2008-06-23 8:55 ` Olivier Andrieu
0 siblings, 1 reply; 15+ messages in thread
From: David Allsopp @ 2008-06-23 8:45 UTC (permalink / raw)
To: 'caml-list caml-list'
> > Richard gave you the reason.
Erm, please correct me if I'm wrong but every single possible floating point
value (on the same architecture) has a string representation that will be
reparsed to the same floating point value (on the same architecture). It's
the reverse that isn't true because floating point numbers are only an
approximation.
> > If you can serialize to binary, you can achieve what you want by
> > serializing the 64-bit integers you get with Int64.bits_of_float and
> > applying Int64.float_of_bits to the integers you deserialize.
>
> Just posted a useless message :-)
>
> This is *exactly* what I was searching for, thanks Daniel.
This is of course a better, more reliable and faster way of serialising, but
the real cause for your original spurious result is down to how
string_of_float is defined in pervasives.ml:
# pi;;
- : float = 3.1415926535897931
# string_of_float pi;;
- : string = "3.14159265359"
In other words, (pi |> string_of_float |> float_of_string) is never going to
be equal to your original pi. For some reason, string_of_float is defined
as:
let string_of_float f = valid_float_lexem (format_float "%.12g" f);;
Perhaps Xavier can say why it's only "%.12g" in the format (I imagine
there's a historical reason) but if you increase it to 16 then you'll get
the answer you expected (0.). All that said, the values given by
string_of_float cannot always be fed back to float_of_string anyway (e.g.
float_of_string (string_of_float nan))
David
* Re: [Caml-list] Strange behaviour of string_of_float
2008-06-23 8:32 ` Mattias Engdegård
@ 2008-06-23 8:50 ` Olivier Andrieu
0 siblings, 0 replies; 15+ messages in thread
From: Olivier Andrieu @ 2008-06-23 8:50 UTC (permalink / raw)
To: p.donadeo, Mattias Engdegård, caml-list
On Mon, Jun 23, 2008 at 10:32, Mattias Engdegård <mattias@virtutech.se> wrote:
>>My intent is to extract an ASCII representation of an OCaml float
>>value so that it can be used to recreate *exactly* the same value, at
>>least on the same architecture.
>
> A somewhat more portable (and readable, maybe) representation of
> floating-point numbers is in hex (a la C99). It is independent of the
> precision and binary format used. Unfortunately, ocaml's Printf has
> already appropriated %a for a different purpose, but it remains a good
> option for those willing to do some manual work.
>
> I have used it in the past to good effect in text-based interchange
> formats between applications written in C.
Indeed, that's a good solution. It's possible to use this %a
conversion directly, without writing external C code:
(* this external is in pervasives.ml *)
external format_float : string -> float -> string = "caml_format_float"
let hex_string_of_float f =
format_float "%a" f
# hex_string_of_float pi ;;
- : string = "0x1.921fb54442d18p+1"
Mind that this only works if the underlying C library knows how to
handle this C99 conversion specifier (MSVC6 doesn't for instance).
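
Note for later readers: OCaml 4.03 and newer, which postdate this thread, provide a %h conversion in Printf that prints a float in hexadecimal notation without the external declaration, and float_of_string reads that notation back. A minimal sketch under that version assumption, with hypothetical helper names:

```ocaml
(* Assumes OCaml >= 4.03, where Printf supports the %h conversion. *)
let hex_string_of_float' (x : float) : string =
  Printf.sprintf "%h" x

(* float_of_string accepts the hexadecimal notation on the same versions. *)
let float_of_hex_string (s : string) : float =
  float_of_string s

let () =
  let pi = 4.0 *. atan 1.0 in
  let s = hex_string_of_float' pi in
  Printf.printf "%s round-trips: %b\n" s (float_of_hex_string s = pi)
```

Since every hexadecimal digit maps to four mantissa bits, no rounding is involved in either direction.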
--
Olivier
* Re: [Caml-list] Strange behaviour of string_of_float
2008-06-23 8:45 ` David Allsopp
@ 2008-06-23 8:55 ` Olivier Andrieu
2008-06-23 12:06 ` David Allsopp
0 siblings, 1 reply; 15+ messages in thread
From: Olivier Andrieu @ 2008-06-23 8:55 UTC (permalink / raw)
To: David Allsopp; +Cc: caml-list caml-list
On Mon, Jun 23, 2008 at 10:45, David Allsopp <dra-news@metastack.com> wrote:
> All that said, the values given by
> string_of_float cannot always be fed back to float_of_string anyway (e.g.
> float_of_string (string_of_float nan))
Er, why do you say that? It does:
# float_of_string (string_of_float nan) ;;
- : float = nan
float_of_string is basically strtod which should correctly handle nan and inf.
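
One subtlety when checking this in code: nan = nan is false, so a round-trip test has to go through classify_float. A small sketch, assuming a version of OCaml where float_of_string accepts "nan" and "inf" (the helper name round_trips is hypothetical):

```ocaml
(* True when string_of_float followed by float_of_string gives back x.
   NaN is compared by class, because nan = nan is false. *)
let round_trips (x : float) : bool =
  let y = float_of_string (string_of_float x) in
  match classify_float x, classify_float y with
  | FP_nan, FP_nan -> true
  | FP_nan, _ | _, FP_nan -> false
  | _ -> x = y

let () =
  List.iter
    (fun (name, x) -> Printf.printf "%s: %b\n" name (round_trips x))
    [ "nan", nan; "infinity", infinity; "neg_infinity", neg_infinity;
      "0.5", 0.5; "pi", 4.0 *. atan 1.0 ]
```

The special values survive, while pi does not, because string_of_float keeps only 12 significant digits.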
--
Olivier
* RE: [Caml-list] Strange behaviour of string_of_float
2008-06-23 8:55 ` Olivier Andrieu
@ 2008-06-23 12:06 ` David Allsopp
0 siblings, 0 replies; 15+ messages in thread
From: David Allsopp @ 2008-06-23 12:06 UTC (permalink / raw)
To: 'caml-list caml-list'
> > All that said, the values given by
> > string_of_float cannot always be fed back to float_of_string anyway
> > (e.g. float_of_string (string_of_float nan))
>
> Er, why do you say that? It does:
>
> # float_of_string (string_of_float nan) ;;
> - : float = nan
Because:
Objective Caml version 3.09.3
# float_of_string (string_of_float nan);;
Exception: Failure "float_of_string".
but this is clearly fixed in 3.10!
David