* Strange behaviour of string_of_float
From: Paolo Donadeo @ 2008-06-22 16:56 UTC
To: caml-list caml-list

Today I noticed this strange behaviour of string_of_float.

Let's start with:

# let pi = 4.0 *. atan 1.0;;
val pi : float = 3.14159265358979312
# let (|>) x f = f x;;
val ( |> ) : 'a -> ('a -> 'b) -> 'b = <fun>

OK, I want to serialize pi:

# (pi |> string_of_float |> float_of_string) -. pi;;
- : float = 2.06945571790129179e-13

string_of_float is not the inverse of float_of_string, at least in this example. Is this correct?

It's not a problem at all, I used this workaround:

# let my_string_of_float = Printf.sprintf "%.1000g";;
val my_string_of_float : float -> string = <fun>
# (pi |> my_string_of_float |> float_of_string) -. pi;;
- : float = 0.

--
Paolo
~
~
:wq
* Re: [Caml-list] Strange behaviour of string_of_float
From: Richard Jones @ 2008-06-22 19:58 UTC
To: Paolo Donadeo; +Cc: caml-list caml-list

On Sun, Jun 22, 2008 at 06:56:22PM +0200, Paolo Donadeo wrote:
> string_of_float is not the inverse of float_of_string, at least in
> this example.

Yes, you wouldn't expect it to be, because the string is an approximate base 10 representation of the float (which is itself only an approximate base 2 representation of the transcendental number pi).

You might want to read a presentation called "What every computer programmer should know about floating point arithmetic". There's a PDF version here:

http://blogs.sun.com/darcy/resource/Wecpskafpa-ACCU.pdf

Rich.

--
Richard Jones
Red Hat
* Re: [Caml-list] Strange behaviour of string_of_float
From: Paolo Donadeo @ 2008-06-22 20:45 UTC
To: caml-list caml-list

I know what a float number is from my numerical analysis course :-).

In any case, what is the suggested way to serialize/deserialize a float number in OCaml? Sexplib, for example, suffers from the same problem as the string_of_float function.

My intent is to extract an ASCII representation of an OCaml float value so that it can be used to recreate *exactly* the same value, at least on the same architecture.

--
Paolo
~
~
:wq
* Re: [Caml-list] Strange behaviour of string_of_float
From: Brian Hurt @ 2008-06-23 1:25 UTC
To: Paolo Donadeo; +Cc: caml-list caml-list

On Sun, 22 Jun 2008, Paolo Donadeo wrote:
> I know what a float number is from my numerical analysis course :-).
>
> In any case, what is the suggested way to serialize/deserialize a
> float number in OCaml? The Sexplib, for example, suffers the same
> problem of the string_of_float function.
>
> My intent is to extract an ASCII representation of an OCaml float
> value so that it can be used to recreate *exactly* the same value, at
> least on the same architecture.

Code something like this should work:

let encode_float x =
  match classify_float x with
  (* x = -0.0 also holds for +0.0, so the sign of zero is read from 1/x *)
  | FP_zero -> if 1.0 /. x < 0.0 then "-0.0" else "0.0"
  | FP_infinite -> if x = neg_infinity then "-INF" else "INF"
  | FP_nan -> "NaN"
  | _ ->
    let s = x < 0.0 in
    let x = abs_float x in
    let frac, exp = frexp x in
    let frac = frac *. 268435456.0 in (* 2^28 *)
    let i1 = int_of_float frac in
    let i2 = int_of_float ((frac -. floor frac) *. 268435456.0) in
    let exp = exp - 56 in
    let s2 = exp < 0 in
    let exp = if exp < 0 then -exp else exp in
    Printf.sprintf "%c%07X%07XX%c%X"
      (if s then '-' else '+') i1 i2
      (if s2 then '-' else '+') exp
;;

I'll leave the decode to you. It should be obvious, once you discover the ldexp function.

Brian
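The decoder left as an exercise above might look like the following sketch. It assumes exactly the string layout that encode_float produces (sign character, two 7-digit hex fields, the letter X, exponent sign, hex exponent) plus the five special strings; the name decode_float and the helper hex are illustrative and do not come from the original post.

```ocaml
(* Hypothetical inverse of encode_float; assumes well-formed input. *)
let decode_float (s : string) : float =
  match s with
  | "0.0" -> 0.0
  | "-0.0" -> (-0.0)
  | "INF" -> infinity
  | "-INF" -> neg_infinity
  | "NaN" -> nan
  | _ ->
    (* Parse a run of hex digits as a non-negative int. *)
    let hex sub = int_of_string ("0x" ^ sub) in
    let negative = s.[0] = '-' in
    let i1 = hex (String.sub s 1 7) in   (* high 28 bits of the scaled fraction *)
    let i2 = hex (String.sub s 8 7) in   (* low 28 bits of the scaled fraction *)
    (* s.[15] is the 'X' separator, s.[16] the exponent sign. *)
    let exp_negative = s.[16] = '-' in
    let e = hex (String.sub s 17 (String.length s - 17)) in
    let e = if exp_negative then -e else e in
    (* Rebuild the 56-bit integer mantissa, then scale by 2^e with ldexp. *)
    let mantissa = float_of_int i1 *. 268435456.0 +. float_of_int i2 in
    let x = ldexp mantissa e in
    if negative then -. x else x

let () =
  (* "+80000000000000X-37" is what encode_float yields for 1.0 *)
  Printf.printf "%.17g\n" (decode_float "+80000000000000X-37")
```

Both steps are exact for any double: the mantissa has at most 53 significant bits, so the multiply-and-add introduces no rounding, and ldexp only adjusts the exponent.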
* Re: [Caml-list] Strange behaviour of string_of_float
From: Paolo Donadeo @ 2008-06-23 7:50 UTC
To: caml-list caml-list

> Code something like this should work:

Thanks, this is even better. I should pay more attention to the Pervasives API: I never noticed frexp and ldexp.

--
Paolo
~
~
:wq
* Re: [Caml-list] Strange behaviour of string_of_float
From: Mattias Engdegård @ 2008-06-23 8:32 UTC
To: p.donadeo; +Cc: caml-list

>My intent is to extract an ASCII representation of an OCaml float
>value so that it can be used to recreate *exactly* the same value, at
>least on the same architecture.

A somewhat more portable (and readable, maybe) representation of floating-point numbers is in hex (a la C99). It is independent of the precision and binary format used. Unfortunately, ocaml's Printf has already appropriated %a for a different purpose, but it remains a good option for those willing to do some manual work.

I have used it in the past to good effect in text-based interchange formats between applications written in C.

Of course the decimal notation can unambiguously represent any (binary) floating-point number, so that representation is fine if you have confidence in the output and reading routines. See, for instance, William Clinger's _How to Read Floating Point Numbers Accurately_ (http://ftp.ccs.neu.edu/pub/people/will/retrospective.pdf). But decimal handling will always be a little slower.
* Re: [Caml-list] Strange behaviour of string_of_float
From: Olivier Andrieu @ 2008-06-23 8:50 UTC
To: p.donadeo, Mattias Engdegård, caml-list

On Mon, Jun 23, 2008 at 10:32, Mattias Engdegård <mattias@virtutech.se> wrote:
>>My intent is to extract an ASCII representation of an OCaml float
>>value so that it can be used to recreate *exactly* the same value, at
>>least on the same architecture.
>
> A somewhat more portable (and readable, maybe) representation of
> floating-point numbers is in hex (a la C99). It is independent of the
> precision and binary format used. Unfortunately, ocaml's Printf has
> already appropriated %a for a different purpose, but it remains a good
> option for those willing to do some manual work.
>
> I have used it in the past to good effect in text-based interchange
> formats between applications written in C.

Indeed, that's a good solution. It's possible to use this %a conversion directly, without writing external C code:

(* this external is in pervasives.ml *)
external format_float : string -> float -> string = "caml_format_float"
let hex_string_of_float f = format_float "%a" f

# hex_string_of_float pi ;;
- : string = "0x1.921fb54442d18p+1"

Mind that this only works if the underlying C library knows how to handle this C99 conversion specifier (MSVC6 doesn't for instance).

--
Olivier
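On OCaml 4.03 and later (an assumption about the reader's compiler, since this postdates the thread), Printf accepts the %h conversion for C99-style hexadecimal floats and float_of_string reads that notation back, so the external declaration is not needed. A minimal sketch, with hex_of_float and float_of_hex as illustrative names:

```ocaml
(* Requires OCaml >= 4.03 for the %h conversion (assumption stated above). *)
let hex_of_float (x : float) : string = Printf.sprintf "%h" x

(* float_of_string accepts the hexadecimal notation with a "0x" prefix. *)
let float_of_hex (s : string) : float = float_of_string s

let () =
  let pi = 4.0 *. atan 1.0 in
  let s = hex_of_float pi in
  (* Print the hex form and whether parsing it recovers the same float. *)
  Printf.printf "%s round-trips: %b\n" s (float_of_hex s = pi)
```

Because each hex digit maps to four mantissa bits, no decimal rounding is involved in either direction.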
* Re: [Caml-list] Strange behaviour of string_of_float
From: Jon Harrop @ 2008-06-23 8:35 UTC
To: caml-list

On Sunday 22 June 2008 20:58:31 Richard Jones wrote:
> On Sun, Jun 22, 2008 at 06:56:22PM +0200, Paolo Donadeo wrote:
> > string_of_float is not the inverse of float_of_string, at least in
> > this example.
>
> Yes, you wouldn't expect it to be, because the string is an
> approximate base 10 representation of the float...

That is not true. All finite floats have exact finite decimal representations. So it is perfectly reasonable to expect the conversions to recover the original number exactly.

As Paolo has shown, OCaml's current string_of_float function is approximate. The accuracy of this routine is unspecified but a quick test indicates that it is simply printing too few digits to be exact:

# string_of_float pi;;
- : string = "3.14159265359"

Fortunately, you can ask sprintf to generate a sufficiently accurate result:

# open Printf;;
# sprintf "%0.17g" pi;;
- : string = "3.1415926535897931"

The float_of_string function does then recover the number exactly in this case:

# float_of_string "3.1415926535897931" -. pi;;
- : float = 0.

Also, you should keep in mind in this context that calculations may be done with 80-bit float arithmetic in registers or truncated to 64 bits when stored to memory. Moreover, OCaml's bytecode and native code targets can behave differently in this context. I do not believe that is a problem with Paolo's code here though.

--
Dr Jon D Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/products/?e
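The comparison between the default 12-digit output and a 17-digit one can be checked with a short sketch. The names exact_string_of_float and round_trips are illustrative, and the check relies on the platform's decimal conversions being correctly rounded, which is an assumption rather than something OCaml promises.

```ocaml
(* 17 significant decimal digits are enough to identify any IEEE 754 double. *)
let exact_string_of_float (x : float) : string = Printf.sprintf "%.17g" x

(* True when printing with 17 digits and parsing back yields the same float. *)
let round_trips (x : float) : bool =
  float_of_string (exact_string_of_float x) = x

let () =
  let pi = 4.0 *. atan 1.0 in
  (* string_of_float keeps 12 significant digits, so this comparison fails. *)
  Printf.printf "%s -> %b\n" (string_of_float pi)
    (float_of_string (string_of_float pi) = pi);
  Printf.printf "%s -> %b\n" (exact_string_of_float pi) (round_trips pi)
```

The comparison uses `=`, so it says nothing about NaN, which never compares equal to itself.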
* Re: [Caml-list] Strange behaviour of string_of_float
From: Daniel Bünzli @ 2008-06-22 20:32 UTC
To: caml-list caml-list

Richard gave you the reason.

If you can serialize to binary, you can achieve what you want by serializing the 64-bit integers you get with Int64.bits_of_float and applying Int64.float_of_bits to the integers you deserialize.

Best,

Daniel
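If a text form is still wanted, the same 64-bit pattern can be written as 16 hexadecimal digits. A minimal sketch of that idea; bits_string_of_float and float_of_bits_string are illustrative names, not library functions.

```ocaml
(* Write the IEEE 754 bit pattern as 16 hex digits, most significant first. *)
let bits_string_of_float (x : float) : string =
  Printf.sprintf "%016Lx" (Int64.bits_of_float x)

(* The "0x" prefix lets Int64.of_string accept the full unsigned 64-bit range,
   so patterns with the sign bit set are read back as negative int64 values. *)
let float_of_bits_string (s : string) : float =
  Int64.float_of_bits (Int64.of_string ("0x" ^ s))

let () =
  let pi = 4.0 *. atan 1.0 in
  let s = bits_string_of_float pi in
  Printf.printf "%s -> %.17g\n" s (float_of_bits_string s)
```

Since no numeric conversion takes place, this also preserves the sign of zero and the exact bits of a NaN.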
* Re: [Caml-list] Strange behaviour of string_of_float
From: Paolo Donadeo @ 2008-06-22 20:50 UTC
To: caml-list caml-list

> If you can serialize to binary, you can achieve what you want by serializing
> the 64-bit integers you get with Int64.bits_of_float and applying
> Int64.float_of_bits to the integers you deserialize.

Just posted a useless message :-)

This is *exactly* what I was searching for, thanks Daniel.

--
Paolo
~
~
:wq
* RE: [Caml-list] Strange behaviour of string_of_float
From: David Allsopp @ 2008-06-23 8:45 UTC
To: 'caml-list caml-list'

> > Richard gave you the reason.

Erm, please correct me if I'm wrong but every single possible floating point value (on the same architecture) has a string representation that will be reparsed to the same floating point value (on the same architecture). It's the reverse that isn't true because floating point numbers are only an approximation.

> > If you can serialize to binary, you can achieve what you want by
> > serializing the 64-bit integers you get with Int64.bits_of_float and
> > applying Int64.float_of_bits to the integers you deserialize.
>
> Just posted a useless message :-)
>
> This is *exactly* what I was searching for, thanks Daniel.

This is of course a better, more reliable and faster way of serialising, but the real cause for your original spurious result is down to how string_of_float is defined in pervasives.ml:

# pi;;
- : float = 3.1415926535897931
# string_of_float pi;;
- : string = "3.14159265359"

In other words, (pi |> string_of_float |> float_of_string) is never going to be equal to your original pi. For some reason, string_of_float is defined as:

let string_of_float f = valid_float_lexem (format_float "%.12g" f);;

Perhaps Xavier can say why it's only "%.12g" in the format (I imagine there's a historical reason) but if you increase it to 16 then you'll get the answer you expected (0.).

All that said, the values given by string_of_float cannot always be fed back to float_of_string anyway (e.g. float_of_string (string_of_float nan))

David
* Re: [Caml-list] Strange behaviour of string_of_float
From: Olivier Andrieu @ 2008-06-23 8:55 UTC
To: David Allsopp; +Cc: caml-list caml-list

On Mon, Jun 23, 2008 at 10:45, David Allsopp <dra-news@metastack.com> wrote:
> All that said, the values given by
> string_of_float cannot always be fed back to float_of_string anyway (e.g.
> float_of_string (string_of_float nan))

Er, why do you say that? It does:

# float_of_string (string_of_float nan) ;;
- : float = nan

float_of_string is basically strtod, which should correctly handle nan and inf.

--
Olivier
* RE: [Caml-list] Strange behaviour of string_of_float
From: David Allsopp @ 2008-06-23 12:06 UTC
To: 'caml-list caml-list'

> > All that said, the values given by
> > string_of_float cannot always be fed back to float_of_string anyway
> > (e.g. float_of_string (string_of_float nan))
>
> euh, why do you say that ? it does :
>
> # float_of_string (string_of_float nan) ;;
> - : float = nan

Because:

Objective Caml version 3.09.3

# float_of_string (string_of_float nan);;
Exception: Failure "float_of_string".

but this is clearly fixed in 3.10!

David
* Re: [Caml-list] Strange behaviour of string_of_float
From: Brian Hurt @ 2008-06-23 1:06 UTC
To: Daniel Bünzli; +Cc: caml-list caml-list

On Sun, 22 Jun 2008, Daniel Bünzli wrote:
> Richard gave you the reason.
>
> If you can serialize to binary, you can achieve what you want by serializing
> the 64-bit integers you get with Int64.bits_of_float and applying
> Int64.float_of_bits to the integers you deserialize.

Actually, for serialization, I strongly recommend first using classify_float to split off and handle NaNs, infinities, etc., then using frexp to split the float into a fraction and exponent. The exponent is just an int, and the fractional part can be multiplied by, say, 2^56 and then converted into an integer.

The advantage of doing things this way, despite it being slightly more complicated, is twofold: one, it guarantees you the ability to recover the exact binary value of the float, and two, it sidesteps a huge number of compatibility issues between architectures. IIRC, IEEE 754 specifies how many bits have to be used to represent each part of the float, but not where they have to be in the word.

Also, if you use hexadecimal for saving the integers, this can actually be faster than converting to base 10, as conversion to base 10 isn't cheap. It's a couple more branches, but a lot of divs and mods get turned into shifts and ands.

Brian
* Re: [Caml-list] Strange behaviour of string_of_float
From: Xavier Leroy @ 2008-06-23 7:58 UTC
To: Brian Hurt; +Cc: Daniel Bünzli, caml-list caml-list

>> If you can serialize to binary, you can achieve what you want by
>> serializing the 64-bit integers you get with Int64.bits_of_float and
>> applying Int64.float_of_bits to the integers you deserialize.
>
> Actually, for serialization, I strongly recommend first using
> classify_float to split off and handle NaNs, infinities, etc., then
> using frexp to split the float into a fraction and exponent. The
> exponent is just an int, and the fractional part can be multiplied by,
> say, 2^56 and then converted into an integer.
>
> The advantage of doing things this way, despite it being slightly more
> complicated, is twofold: one, it guarantees you the ability to recover
> the exact binary value of the float, and two, it sidesteps a huge number
> of compatibility issues between architectures. IIRC, IEEE 754 specifies
> how many bits have to be used to represent each part of the float, but
> not where they have to be in the word.

The only architecture I know where this problem could occur is the old (pre-EABI) ABI for ARM, which has "mixed-endian" floats. But the implementation of Int64.{bits_of_float,float_of_bits} goes to some length to rearrange bits as expected, i.e. with the sign bit in the most significant bit of the int64, followed by the exponent bits, followed by the mantissa bits in the least significant bits of the int64.

So, the case analysis on the float that Brian suggests is a bit of an overkill, and I strongly suggest using the result of Int64.bits_of_float as the exact, serializable representation of a Caml float.

- Xavier Leroy
Thread overview: 15+ messages

2008-06-22 16:56 Strange behaviour of string_of_float Paolo Donadeo
2008-06-22 19:58 ` [Caml-list] " Richard Jones
2008-06-22 20:45   ` Paolo Donadeo
2008-06-23  1:25     ` Brian Hurt
2008-06-23  7:50       ` Paolo Donadeo
2008-06-23  8:32     ` Mattias Engdegård
2008-06-23  8:50       ` Olivier Andrieu
2008-06-23  8:35   ` Jon Harrop
2008-06-22 20:32 ` Daniel Bünzli
2008-06-22 20:50   ` Paolo Donadeo
2008-06-23  8:45     ` David Allsopp
2008-06-23  8:55       ` Olivier Andrieu
2008-06-23 12:06         ` David Allsopp
2008-06-23  1:06   ` Brian Hurt
2008-06-23  7:58     ` Xavier Leroy