* int_of_string bug
From: Yaron Minsky @ 2007-03-29 16:27 UTC
To: caml-list

So, there's a weird int_of_string bug where positive decimal numbers are
sometimes read in as negative numbers without error.  Here's the bug:

  http://caml.inria.fr/mantis/view.php?id=0004210

This has been marked as "wontfix" in the bug database because apparently
there's some weird spot in the lexer that depends on the wrong behavior of
int_of_string.

First of all, people should be aware of this behavior and should defend
against it in their code.

Secondly, the justification for not fixing it seems really thin.  The
behavior seems obviously wrong, and it's hard to see why one wouldn't
simply fix the lexer (perhaps by providing an alternate broken
implementation of int_of_string) and leave the ordinary int_of_string
where it is.

y
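One way to "defend against it" is to wrap the conversion and reject any result whose sign disagrees with the text. The following is a minimal sketch, not the fix discussed in the bug report; `decimal_int_of_string` is a hypothetical helper name, and it assumes the input is meant to be plain decimal (the standard `int_of_string` also accepts prefixes such as `0x`, for which a negative result can be legitimate).

```ocaml
(* Hypothetical helper, not part of the standard library.
   Intended for decimal input only. *)
let decimal_int_of_string (s : string) : int option =
  match int_of_string s with
  | exception Failure _ -> None
  | n ->
    let has_minus = String.length s > 0 && s.[0] = '-' in
    (* A negative result from text that carries no minus sign can only
       come from wrap-around, so treat it as a failed conversion. *)
    if n < 0 && not has_minus then None else Some n

let () =
  match decimal_int_of_string "12345" with
  | Some n -> Printf.printf "parsed %d\n" n
  | None -> print_endline "rejected"
```

The wrapper returns an option rather than raising, so callers have to handle the rejected case explicitly.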
* Re: [Caml-list] int_of_string bug
From: Oliver Bandel @ 2007-03-29 21:29 UTC
To: caml-list

On Thu, Mar 29, 2007 at 12:27:05PM -0400, Yaron Minsky wrote:
> So, there's a weird int_of_string bug where positive decimal numbers are
> sometimes read in as negative numbers without error.  Here's the bug:
>
> http://caml.inria.fr/mantis/view.php?id=0004210
>
> This has been marked as "wontfix" in the bug database because apparently
> there's some weird spot in the lexer that depends on the wrong behavior of
> int_of_string.
[...]

Oh, that's bad. :(

But btw. it's also bad that, when overflowing of int occurs, no
exception is thrown. :(

Ciao,
   Oliver Bandel
* Re: [Caml-list] int_of_string bug
From: Yaron Minsky @ 2007-03-30 0:26 UTC
To: Oliver Bandel; +Cc: caml-list

On 3/29/07, Oliver Bandel <oliver@first.in-berlin.de> wrote:
> On Thu, Mar 29, 2007 at 12:27:05PM -0400, Yaron Minsky wrote:
> > So, there's a weird int_of_string bug where positive decimal numbers are
> > sometimes read in as negative numbers without error.  Here's the bug:
> >
> > http://caml.inria.fr/mantis/view.php?id=0004210
> >
> > This has been marked as "wontfix" in the bug database because apparently
> > there's some weird spot in the lexer that depends on the wrong behavior
> > of int_of_string.
> [...]
>
> Oh, that's bad. :(
>
> But btw. it's also bad that, when overflowing of int occurs, no
> exception is thrown. :(

That's a problem too, but there is at least a defensible reason for that,
which is that it is expensive to get integer overflow to throw an exception.

> Ciao,
>    Oliver Bandel
>
> _______________________________________________
> Caml-list mailing list. Subscription management:
> http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
> Archives: http://caml.inria.fr
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
* Re: [Caml-list] int_of_string bug
From: Florian Weimer @ 2007-03-30 7:30 UTC
To: Yaron Minsky; +Cc: Oliver Bandel, caml-list

* Yaron Minsky:

> That's a problem too, but there is at least a defensible reason for
> that, which is that it is expensive to get integer overflow to throw
> an exception.

i386 and amd64 have hardware support for that, so it's not very
expensive.  There are pretty short RISC sequences for the checks, too.

MLton uses the i386 hardware support, and I think you can disable the
checks, so measuring the overhead shouldn't be too hard.
* Re: [Caml-list] int_of_string bug
From: skaller @ 2007-03-30 8:44 UTC
To: Florian Weimer; +Cc: Yaron Minsky, Oliver Bandel, caml-list

On Fri, 2007-03-30 at 09:30 +0200, Florian Weimer wrote:
> * Yaron Minsky:
>
> > That's a problem too, but there is at least a defensible reason for
> > that, which is that it is expensive to get integer overflow to throw
> > an exception.
>
> i386 and amd64 have hardware support for that, so it's not very
> expensive.  There are pretty short RISC sequences for the checks, too.
>
> MLton uses the i386 hardware support, and I think you can disable the
> checks, so measuring the overhead shouldn't be too hard.

But there is a difference you may have missed: Ocaml integers
are 31 or 63 bits, not 32 or 64 bits.

--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
* Re: [Caml-list] int_of_string bug
From: Andreas Rossberg @ 2007-03-30 8:59 UTC
To: skaller; +Cc: Florian Weimer, Oliver Bandel, Yaron Minsky, caml-list

skaller wrote:
>>
>>> That's a problem too, but there is at least a defensible reason for
>>> that, which is that it is expensive to get integer overflow to throw
>>> an exception.
>> i386 and amd64 have hardware support for that, so it's not very
>> expensive.  There are pretty short RISC sequences for the checks, too.
>>
>> MLton uses the i386 hardware support, and I think you can disable the
>> checks, so measuring the overhead shouldn't be too hard.
>
> But there is a difference you may have missed: Ocaml integers
> are 31 or 63 bits, not 32 or 64 bits.

But it uses the most significant 31/63 bits for ints, so that becomes a
non-issue. ;-)

--
Andreas Rossberg, rossberg@ps.uni-sb.de
* Re: [Caml-list] int_of_string bug
From: skaller @ 2007-03-30 9:20 UTC
To: Andreas Rossberg; +Cc: Florian Weimer, Yaron Minsky, Oliver Bandel, caml-list

On Fri, 2007-03-30 at 10:59 +0200, Andreas Rossberg wrote:
> skaller wrote:
> >>
> >>> That's a problem too, but there is at least a defensible reason for
> >>> that, which is that it is expensive to get integer overflow to throw
> >>> an exception.
> >> i386 and amd64 have hardware support for that, so it's not very
> >> expensive.  There are pretty short RISC sequences for the checks, too.
> >>
> >> MLton uses the i386 hardware support, and I think you can disable the
> >> checks, so measuring the overhead shouldn't be too hard.
> >
> > But there is a difference you may have missed: Ocaml integers
> > are 31 or 63 bits, not 32 or 64 bits.
>
> But it uses the most significant 31/63 bits for ints, so that becomes a
> non-issue. ;-)

For addition maybe, certainly not for multiplication: one of the
operands has to be shifted right 1 place.  But it depends on the code
generator internal details.  You could always shift BOTH operands, do
the register calculation, then shift back .. in which case you'd lose
overflow detection.  The problem is you cannot use the carry bit after
the shift back because the bit will definitely be set if the result is
negative.

From what I've seen Ocaml actually uses tricks which also might defeat
detection, for example I've seen some use of LEA (load effective
address) with the scale by 2 option to load and shift one bit in a
single instruction.

Processors are quirky about flag bits .. some set sign bit on loading
and others don't, etc, so it could be quite messy.  This is why C
doesn't specify what happens on overflow: it would compromise
performance on some processors.
--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
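The messages above debate detecting overflow with hardware flags inside the code generator. For comparison, overflow can also be detected portably in library code, at some run-time cost. The sketch below is an assumption-laden illustration of that alternative, not what the OCaml compiler or MLton does; the names `Overflow`, `add_exn` and `mul_exn` are hypothetical.

```ocaml
(* Hypothetical checked arithmetic on OCaml's native int type. *)
exception Overflow

(* Addition overflows exactly when both operands have the same sign
   and the wrapped sum has the opposite sign. *)
let add_exn (a : int) (b : int) : int =
  let s = a + b in
  if (a >= 0) = (b >= 0) && (s >= 0) <> (a >= 0) then raise Overflow
  else s

(* Multiplication is checked by dividing the wrapped product back.
   The pair (min_int, -1) is tested separately because the division
   itself wraps in that case. *)
let mul_exn (a : int) (b : int) : int =
  if a = 0 || b = 0 then 0
  else
    let p = a * b in
    if (a = -1 && b = min_int) || (b = -1 && a = min_int) || p / b <> a
    then raise Overflow
    else p

let () =
  match add_exn max_int 1 with
  | n -> Printf.printf "sum = %d\n" n
  | exception Overflow -> print_endline "overflow detected"
```

Because the checks use only comparisons and one division, they behave the same on 31-bit and 63-bit ints.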
* Re: [Caml-list] int_of_string bug
From: Brian Hurt @ 2007-03-30 1:21 UTC
To: Yaron Minsky; +Cc: caml-list

On Thu, 29 Mar 2007, Yaron Minsky wrote:

> So, there's a weird int_of_string bug where positive decimal numbers are
> sometimes read in as negative numbers without error.  Here's the bug:
>
> http://caml.inria.fr/mantis/view.php?id=0004210

I'm actually not sure this is a bug either.  Note that ocaml will quite
happily compute max_int+1 without an error either.

Whether this behavior (silent wrap around) is correct or not is another
argument.  Elsewhere I have opined that the only purpose for having
more than one type of integer in your programming language is so that
programmers can pick the wrong one.  But I'm widely known to be a heretic.

Ocaml's behavior is, at least, *consistent*.

Brian
* Re: [Caml-list] int_of_string bug
From: Yaron Minsky @ 2007-03-30 1:26 UTC
To: Brian Hurt; +Cc: caml-list

On 3/29/07, Brian Hurt <bhurt@spnz.org> wrote:
>
> Whether this behavior (silent wrap around) is correct or not is another
> argument.  Elsewhere I have opined that the only purpose for having
> more than one type of integer in your programming language is so that
> programmers can pick the wrong one.  But I'm widely known to be a heretic.
>
> Ocaml's behavior is, at least, *consistent*.

Not really all that consistent:

# int_of_string "1073741824";;
- : int = -1073741824
# int_of_string "1073741825";;
Exception: Failure "int_of_string".
#

y

> Brian
* Re: [Caml-list] int_of_string bug
From: skaller @ 2007-03-30 4:23 UTC
To: Yaron Minsky; +Cc: Brian Hurt, caml-list

On Thu, 2007-03-29 at 21:26 -0400, Yaron Minsky wrote:
> On 3/29/07, Brian Hurt <bhurt@spnz.org> wrote:
> > Whether this behavior (silent wrap around) is correct or not is
> > another argument.  Elsewhere I have opined that the only purpose
> > for having more than one type of integer in your programming
> > language is so that programmers can pick the wrong one.  But I'm
> > widely known to be a heretic.
> >
> > Ocaml's behavior is, at least, *consistent*.
>
> Not really all that consistent:
>
> # int_of_string "1073741824";;
> - : int = -1073741824
> # int_of_string "1073741825";;
> Exception: Failure "int_of_string".
> #

skaller@rosella:/work/felix/svn/felix/felix/trunk$ ledit ocaml
        Objective Caml version 3.10+dev25 (2007-03-26)

# int_of_string "1073741824";;
- : int = 1073741824
# int_of_string "1073741825";;
- : int = 1073741825

--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
* Re: [Caml-list] int_of_string bug
From: Erik de Castro Lopo @ 2007-03-30 5:59 UTC
To: caml-list

skaller wrote:

> On Thu, 2007-03-29 at 21:26 -0400, Yaron Minsky wrote:
> > # int_of_string "1073741824";;
> > - : int = -1073741824
> > # int_of_string "1073741825";;
> > Exception: Failure "int_of_string".

That's the behaviour on 32 bit systems.

> # int_of_string "1073741824";;
> - : int = 1073741824
> # int_of_string "1073741825";;
> - : int = 1073741825

But 64 bit systems get it right.

Erik
--
+-----------------------------------------------------------+
  Erik de Castro Lopo
+-----------------------------------------------------------+
"Java, the best argument for Smalltalk since C++." -- Frank Winkler
* Re: [Caml-list] int_of_string bug
From: skaller @ 2007-03-30 6:22 UTC
To: caml-list

On Fri, 2007-03-30 at 15:59 +1000, Erik de Castro Lopo wrote:
> skaller wrote:
>
> > On Thu, 2007-03-29 at 21:26 -0400, Yaron Minsky wrote:
> > > # int_of_string "1073741824";;
> > > - : int = -1073741824
> > > # int_of_string "1073741825";;
> > > Exception: Failure "int_of_string".
>
> That's the behaviour on 32 bit systems.
>
> > # int_of_string "1073741824";;
> > - : int = 1073741824
> > # int_of_string "1073741825";;
> > - : int = 1073741825
>
> But 64 bit systems get it right.

The point being .. the behaviour for large values is platform
dependent anyhow, so in the abstract you can say the behaviour is
undefined for large values, where 'large' isn't specified.

If you want to get it RIGHT: if you have a user input string possibly
containing digits, and you want to convert it, you must already write
a parser to parse the input, so you won't be using int_of_string
anyhow.  If the input was generated (say by another Ocaml program),
then it will already be correct.

In the Felix compiler, after lexing 'string of digits' I use the
Big_int module to convert to an integer: that behaviour is platform
independent.

If I really want an int (say for indexing), and there's a risk of the
conversion overflowing .. there's a risk that even without overflowing
the data is wrong and will blow up, eg .. I'm not going to be indexing
arrays with max_int elements .. :)

If I really want to check, I'll use an application specific bound such
as 16000, and check the big_int against that before converting.

Thus, all the operations are deterministic and platform independent if
you do things properly.  So the 'bug' in int_of_string is just an
inconvenience.
IMHO there is a 'bug' in some Ocaml documentation, where the abstract
language is not clearly distinguished from the implementation.
Throwing exceptions on error should generally NOT be considered a
specified part of the language.  Undefined behaviour is sometimes the
right specification because it allows superior optimisation and
prevents programmers relying on exceptions.

This doesn't prevent the implementation throwing them, it just means
catching them locally in your code is a bug (because you can't be sure
they will be thrown).  Bounds violations are a good example of this,
and indeed since Ocaml allows the -unsafe switch to disable bound
checks you'd better NOT rely on catching them.  The same applies to
match failures -- use a wildcard if you want to catch unmatched cases
(otherwise be willing to sketch a proof to your boss that there can't
be a violation :)

--
John Skaller <skaller at users dot sf dot net>
Felix, successor to C++: http://felix.sf.net
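The "application specific bound" idea in the message above can be sketched without a bignum library by accumulating digits and stopping as soon as the bound would be exceeded. This swaps the Big_int conversion the message describes for an early-exit check, so it is an alternative under stated assumptions rather than the Felix compiler's code; `bounded_int_of_digits` and the error strings are hypothetical.

```ocaml
(* Hypothetical helper: convert a string of decimal digits to an int
   that must not exceed [bound].  No sign or prefix is accepted. *)
let bounded_int_of_digits ~(bound : int) (s : string) : (int, string) result =
  if bound < 0 then invalid_arg "bounded_int_of_digits: negative bound";
  let n = String.length s in
  if n = 0 then Error "empty input"
  else
    let rec loop i acc =
      if i = n then Ok acc
      else
        match s.[i] with
        | '0' .. '9' as c ->
          let d = Char.code c - Char.code '0' in
          (* Test acc * 10 + d <= bound without computing the product,
             so the accumulator itself can never wrap around. *)
          if d > bound || acc > (bound - d) / 10 then
            Error "value exceeds bound"
          else loop (i + 1) (acc * 10 + d)
        | _ -> Error "not a digit"
    in
    loop 0 0

let () =
  match bounded_int_of_digits ~bound:16000 "4096" with
  | Ok n -> Printf.printf "index %d accepted\n" n
  | Error msg -> Printf.printf "rejected: %s\n" msg
```

Since the accumulator never exceeds the bound, the result is the same on 32-bit and 64-bit platforms for any bound that fits in an int.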
* Re: [Caml-list] int_of_string bug
From: Markus Mottl @ 2007-03-30 13:38 UTC
To: Erik de Castro Lopo; +Cc: caml-list

On 3/30/07, Erik de Castro Lopo <mle+ocaml@mega-nerd.com> wrote:
> But 64 bit systems get it right.

Not really:

# int_of_string "4611686018427387903";;
- : int = 4611686018427387903
# int_of_string "4611686018427387904";;
- : int = -4611686018427387904
# int_of_string "4611686018427387905";;
Exception: Failure "int_of_string".

The problem is just shifted to bigger numbers.  This problem arises
with all integer conversion functions, i.e. Int64.of_string,
Int32.of_string, Nativeint.of_string, int_of_string.

Regards
Markus

--
Markus Mottl        http://www.ocaml.info        markus.mottl@gmail.com
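The boundary value in the transcripts above (2^30 on 32-bit systems, 2^62 on 64-bit systems) is max_int + 1 in both cases. A small probe can build that string portably and report what the running system does with it. This is a sketch for experimentation; `boundary_string` and `probe_boundary` are hypothetical names, and the outcome depends on the OCaml version in use, so no expected output is shown.

```ocaml
(* Decimal text of max_int + 1, built without a bignum library:
   it is string_of_int min_int with the leading '-' removed. *)
let boundary_string : string =
  let s = string_of_int min_int in
  String.sub s 1 (String.length s - 1)

(* Ask this runtime's int_of_string what it makes of that text. *)
let probe_boundary () : int option =
  match int_of_string boundary_string with
  | n -> Some n
  | exception Failure _ -> None

let () =
  match probe_boundary () with
  | Some n ->
    Printf.printf "%s was accepted and read as %d\n" boundary_string n
  | None ->
    Printf.printf "%s was rejected with Failure\n" boundary_string
```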
* Re: [Caml-list] int_of_string bug
From: Toby Kelsey @ 2007-04-03 17:51 UTC
To: caml-list

Markus Mottl wrote:

> The problem is just shifted to bigger numbers.  This problem arises
> with all integer conversion functions, i.e. Int64.of_string,
> Int32.of_string, Nativeint.of_string, int_of_string.
>
> Regards
> Markus

This bug is not just a conversion problem:

# let x = 1073741824;;
val x : int = -1073741824
# (x < 0) && (x >= -x);;
- : bool = true

Toby
* Re: [Caml-list] int_of_string bug
From: ls-ocaml-developer-2006 @ 2007-04-03 22:37 UTC
To: caml-list

Toby Kelsey <toby.kelsey@gmail.com> writes:

> Markus Mottl wrote:
>
>> The problem is just shifted to bigger numbers.  This problem arises
>> with all integer conversion functions, i.e. Int64.of_string,
>> Int32.of_string, Nativeint.of_string, int_of_string.
>> Regards
>> Markus
>
> This bug is not just a conversion problem:
>
> # let x = 1073741824;;
> val x : int = -1073741824
> # (x < 0) && (x >= -x);;
> - : bool = true

# let x = - 1073741824;;
val x : int = -1073741824
# -x;;
- : int = -1073741824

But this is as specified for modular ints.  No surprise here ...

Regards -- Markus
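The "modular ints" point can be stated with max_int and min_int so that it holds on any word size; on a 32-bit system the literal 1073741824 in the transcripts is max_int + 1, which wraps to min_int. The sketch below only restates those identities; `negation_wraps` and `succ_checked` are hypothetical helper names.

```ocaml
(* True exactly when negating [x] gives [x] back although x is not 0.
   With modular arithmetic this happens only for min_int. *)
let negation_wraps (x : int) : bool = x <> 0 && -x = x

(* Successor that reports wrap-around instead of returning min_int. *)
let succ_checked (x : int) : int option =
  if x = max_int then None else Some (x + 1)

let () =
  (* One past the largest int wraps to the smallest int. *)
  assert (max_int + 1 = min_int);
  (* min_int is negative and still >= its own negation. *)
  assert (min_int < 0 && min_int >= -min_int);
  (* abs cannot repair the sign of min_int either. *)
  assert (abs min_int = min_int);
  print_endline "wrap-around identities hold"
```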
end of thread, other threads:[~2007-04-03 22:30 UTC | newest]

Thread overview: 15+ messages
2007-03-29 16:27 int_of_string bug Yaron Minsky
2007-03-29 21:29 ` [Caml-list] " Oliver Bandel
2007-03-30  0:26   ` Yaron Minsky
2007-03-30  7:30     ` Florian Weimer
2007-03-30  8:44       ` skaller
2007-03-30  8:59         ` Andreas Rossberg
2007-03-30  9:20           ` skaller
2007-03-30  1:21 ` Brian Hurt
2007-03-30  1:26   ` Yaron Minsky
2007-03-30  4:23     ` skaller
2007-03-30  5:59       ` Erik de Castro Lopo
2007-03-30  6:22         ` skaller
2007-03-30 13:38         ` Markus Mottl
2007-04-03 17:51           ` Toby Kelsey
2007-04-03 22:37             ` ls-ocaml-developer-2006