From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.2 required=5.0 tests=AWL autolearn=disabled version=3.1.3 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id CAF0EBC6B for ; Fri, 9 Nov 2007 19:09:49 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAEgxNEfAXQImh2dsb2JhbACPAwIBCAopgQ8 X-IronPort-AV: E=Sophos;i="4.21,396,1188770400"; d="scan'208";a="19120820" Received: from discorde.inria.fr ([192.93.2.38]) by mail4-smtp-sop.national.inria.fr with ESMTP; 09 Nov 2007 19:09:49 +0100 Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83]) by discorde.inria.fr (8.13.6/8.13.6) with ESMTP id lA9I9m30018515 (version=TLSv1/SSLv3 cipher=RC4-SHA bits=128 verify=OK) for ; Fri, 9 Nov 2007 19:09:49 +0100 From: pierre.weis@inria.fr (Pierre Weis) X-IronPort-AV: E=Sophos;i="4.21,396,1188770400"; d="scan'208";a="4076929" Received: from yquem.inria.fr ([128.93.8.37]) by mail2-relais-roc.national.inria.fr with ESMTP; 09 Nov 2007 19:09:48 +0100 Received: by yquem.inria.fr (Postfix, from userid 24253) id AF5F9BC6B; Fri, 9 Nov 2007 19:09:48 +0100 (CET) Date: Fri, 9 Nov 2007 19:09:48 +0100 To: Oliver Bandel Cc: caml-list Subject: Re: [Caml-list] STOP (was: Search for the smallest possible possible Ocaml segfault) Message-ID: <20071109180948.GA15291@yquem.inria.fr> References: <9d3ec8300711080617g1b023711o1a8f9aa50b7874@mail.gmail.com> <47334DAA.80407@inria.fr> <1194548803.47335e4348ebb@webmail.in-berlin.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1194548803.47335e4348ebb@webmail.in-berlin.de> User-Agent: Mutt/1.5.9i X-Miltered: at discorde with ID 4734A26C.000 by Joe's j-chkmail (http://j-chkmail . ensmp . fr)! X-Spam: no; 0.00; weis:01 weis:01 ocaml:01 segfault:01 segfaults:01 ocaml:01 segfaults:01 bug:01 scanf:01 segfault:01 scanf:01 sscanf:01 printf:01 printf:01 compiler:01 Hello world, > Segfaults in OCaml are seldom, but nevertheless > those seldom seen segfaults should be fixed. > > The original poster stated out that the bug he > posted was four months on status "new". > > This was a littlebid astonishing, and possibly > the reason why this thread was started. I think this is my fault: I implemented Scanf in the first place. May be some of you missed the point in the segfault example involving Scanf that was given on this list: the example involves using positional parameters in the string argument passed to sscanf and using meta format specifications in the format string argument. Positional parameters: ---------------------- Positional parameters are parameters number specifications that allows format strings to refer to another parameter than the next in the presentation order. This is supposed to be useful for internationalization where you can change the printing order of the parameters to reflect the translation. This new feature has only been introduced in the documentation for Printf just after the successful correction of the long standing strange behaviour of printf with respect to partial evaluation. Positional parameters have never been mentioned in the Scanf documentation and Scanf is not supposed to supported them. To say the least, this feature is still experimental and not yet completely implemented in the current sources of the compiler. In fact, the typechecker abruptely rejects format strings with positional parameters, as exemplify here: # Printf.printf "%2$i %1$s" "toto" 1;; Bad conversion %$, at char number 0 in format string ``%2$i %1$s'' As mentioned above, Scanf is not supposed to handle positional parameters, and indeed rejects them at runtime in the format string (this could seem overkill, given that the type-checker rejects positional parameters in the first place, but well, 2 checks are better than one!). Meta format specifications: --------------------------- However, Scanf is still capable to read a format string lexem given in the input, provided the format used to read this lexem involves a meta format that properly describes the format string lexem to be read. For instance: Scanf.sscanf s "%{ %i %f %}" is supposed to read in the input (the string s) a format string that specify to read first an integer, then a floating point number. A more practical working example could be: # let fmt = Scanf.sscanf "\"Reference: %i Price : %f\"" "%{%i%f%}" (fun fmt -> fmt);; val fmt : (int -> float -> '_a, '_b, '_c, '_d, '_d, '_a) format6 = # string_of_format fmt;; - : string = "Reference: %i Price : %f" This features allows a procedure to read the format string it has to use to read a file: the procedure just reads the format as the first line of the file. The example seg-fault analysis: ------------------------------- So, the seg-faulting example given is quite involved, since it uses scanf's capability to read a format string in the input, in order to create a format string with positional specification. This was clearly unexpected, given the ``axiom'' that says "the type checker will prevent that in the first place since it rejects any positional parameter!". In this case, the typechecker cannot reject the format string given in the program, since it has no positional specification; and it cannot reject the format read, since this format is unknown at compile time! This simply means that there is a bug in the type compatibility runtime test for format strings that fails to properly reject positional parameters. This is not difficult to correct, if not at all satifactory. The corrections or problem suppression: --------------------------------------- I once thought that introducing positional parameters in Scanf, Printf, and Format, would be a piece of cake in comparison to correcting the hard bug of printf's treatment of partial evaluation (this ``misfeature'' stood there for more than 10 years, before a correction can be figured out). Unfortunately, I was wrong, positional parameters are not at all easy, even when the printf behaviour is corrected: admittedly, the runtime implementation for positional parameters was not too hard for Printf and it has been done quickly; on the other hand, the type checking of format strings with positional parameters proved to be untrivial: you need a deep breath to dig into the old code, you need once more understand it in depth, and basically you must rewrite it with a new logic that supports positional parameters. This has not yet been done, unfortunately. In conclusion, I have to correct the runtime compatibility check for formats to suppress the problem, and remove any mention of positional parameters from the documentation, until I achieve the new type checking stuff for format strings. Or just give up on this complex feature that has already upset too many people. I agree with any one who cares that I was wrong to introduce a not yet properly baked feature in Caml. The natural over optimistic tendancy of my researcher's enthousiasm caught me there ... [...] > I for myself have only stated that I hope the bugs will be fixed. I hope them to be fixed as well. Sometimes it's difficult. Sometimes we lack the time and concentration to find the solution. Even worse, sometimes we never find the solution for years... Best regards, PS: You can check in the bug tracking system that there are more than one message concerning positional parameters in format strings, and that I already answered to a lot of them. -- Pierre Weis INRIA Rocquencourt, http://bat8.inria.fr/~weis/