* [Caml-list] scanf and %2c
From: Alan Schmitt @ 2003-06-13 20:28 UTC
To: caml-list

Hi,

As I needed to parse some string representing time (of the form hh:mm),
I decided to use scanf. The correct code to do it is:

    # let time_parse s =
        Scanf.sscanf s "%2s:%2s" (fun a b -> a,b)
      ;;
    val time_parse : string -> string * string = <fun>

but of course this is not what I tried first, thinking that I wanted a
string of two chars:

    # let time_parse s =
        Scanf.sscanf s "%2c:%2c" (fun a b -> a,b)
      ;;
    val time_parse : string -> char * char = <fun>

this leads to the following:

    # time_parse "10:20" ;;
    Exception: Scanf.Scan_failure "scanf: bad input at char number 2: 0".
    # time_parse "1:2" ;;
    - : char * char = ('1', '2')

So shouldn't there be a warning (or an error) when using a size field
with chars?

Alan

--
The hacker: someone who figured things out and made something cool happen.
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners

* Re: [Caml-list] scanf and %2c
From: Pierre Weis @ 2003-06-19 8:57 UTC
To: Alan Schmitt; +Cc: caml-list

Hello Alan,

> As I needed to parse some string representing time (of the form hh:mm),
[...]

Welcome to the dates and times users' camp! Too bad that there is no
support for that kind of stuff in our favorite language :(

> So shouldn't there be a warning (or an error) when using a size field
> with chars ?
>
> Alan

We must be a bit more precise than that: we should check that the size
field is non-negative and at most 1. In effect:

- a 0-sized char scanf specification has a special, useful meaning (see
  Scanf.mli for details): it means ``peek at'' the current character
  without reading it (in order to test its value and decide what to do
  next),

- a 1-sized char scanf specification seems to be harmless.

I will try to add a static check in the type-checker (the usual Caml
way), or else a runtime failure in Scanf (the usual way of more
conventional programming languages).

Regards,

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis/
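
To illustrate the zero-width case described above, here is a minimal
sketch. It assumes a recent OCaml in which the Scanf documentation
describes %0c as testing the current input character without reading it.
The token type and the next_token function are hypothetical names
introduced only for this illustration; they do not come from the thread.

```ocaml
(* Hypothetical token type, used only for this sketch. *)
type token = Number of int | Word of string

(* Skip leading blanks, look at the next character with %0c (which, per
   the Scanf documentation, does not consume it), then choose the
   conversion used to actually read the token. *)
let next_token ib =
  Scanf.bscanf ib " %0c" (fun c ->
    match c with
    | '0' .. '9' -> Scanf.bscanf ib "%d" (fun n -> Number n)
    | _ -> Scanf.bscanf ib "%s" (fun w -> Word w))
```

With a buffer built by Scanf.Scanning.from_string "42 abc", two
successive calls to next_token should yield a Number and then a Word,
because the peeked character is still available to the second bscanf.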

* Re: [Caml-list] scanf and %2c
From: Nicolas George @ 2003-06-19 15:06 UTC
To: caml-list

On primidi, 1 Messidor, year CCXI, Pierre Weis wrote:
> > As I needed to parse some string representing time (of the form hh:mm),
> Welcome to the dates and time users' camp! Too bad that there is no
> support for that kind of stuff in our favorite language :(

I have written a quite complete date parser for the OCamlnet project. It
is available at Sourceforge:
<URL: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/ocamlnet/ocamlnet/src/netstring/ >
(netdate.mli and netdate.mlp).

* Re: [Caml-list] scanf and %2c
From: Pierre Weis @ 2003-06-20 9:06 UTC
To: Alan Schmitt; +Cc: caml-list

Hi Alan,

> As I needed to parse some string representing time (of the form hh:mm),
> I decided to use scanf. The correct code to do it is:
> # let time_parse s =
>     Scanf.sscanf s "%2s:%2s" (fun a b -> a,b)
>   ;;
> val time_parse : string -> string * string = <fun>

Just to implement stricter parsing rules (and BTW to show scanf
capabilities), I will elaborate a bit on this ``correct'' code.

To ensure that hh and mm are indeed decimal digits, we could write:

    # let scan_date s = Scanf.sscanf s "%2d:%2d";;
    val scan_date : string -> (int -> int -> 'a) -> 'a = <fun>

This way, the fields hh and mm are parsed and returned as integers, as
they are supposed to be. So far so good, but this is not precise enough,
since (small) negative hours are still accepted:

    # scan_date "-2:12" (fun hh mm -> hh, mm);;
    - : int * int = (-2, 12)

That's why I usually use:

    # let scan_date s = Scanf.sscanf s "%2[0-9]:%2[0-9]";;
    val scan_date : string -> (string -> string -> 'a) -> 'a = <fun>
    # scan_date "-2:12" (fun x y -> x, y);;
    Exception: Scanf.Scan_failure "scanf: bad input at char number 1: -".

Then, you may argue that we still parse times like 99:99, which are
meaningless. Scanning the characters one at a time, we could be more
precise and reject a large class of those erroneous times:

    # let scan_date s = Scanf.sscanf s "%1[0-2]%1[0-9]:%1[0-5]%1[0-9]";;
    val scan_date :
      string -> (string -> string -> string -> string -> 'a) -> 'a = <fun>

Minutes are now handled appropriately, but we still accept hours greater
than 23 (such as 29:59)!
To deal with that problem, we first define two auxiliary functions am
and pm that parse, respectively, times before 20:00 and times from 20:00
onward, once the first digit of the hour has already been properly
parsed:

    let am ib = Scanf.bscanf ib "%1[0-9]:%1[0-5]%1[0-9]";;
    let pm ib = Scanf.bscanf ib "%1[0-3]:%1[0-5]%1[0-9]";;

    let scan_date_ib ib f =
      Scanf.bscanf ib "%c"
        (function c ->
           let h0 = String.make 1 c in
           match c with
           | '0' | '1' -> am ib (f h0)
           | '2' -> pm ib (f h0)
           | _ -> failwith ("Illegal date char " ^ h0));;
    val scan_date_ib :
      Scanf.Scanning.scanbuf ->
      (string -> string -> string -> string -> 'a) -> 'a = <fun>

Note that we turned to bscanf, that is, scanning from scanning buffers
(and not strings), since the scanning is now split into several phases
that must all read from the same data structure (doing so with strings
would involve horrific substring manipulations to pass the remainder of
the string argument to the next step).

As a rule of thumb, scanning from buffers is much more general and easier
than scanning from strings or files: scanning phases compose smoothly,
and scanning from any other data structure is easily expressed in terms
of a basic function scanning from buffers. For instance, if you insist
on scanning from strings, you could define:

    let scan_date s = scan_date_ib (Scanf.Scanning.from_string s);;

Now:

    # scan_date "25:12" (fun h0 h1 m0 m1 -> h0 ^ h1, m0 ^ m1);;
    Exception: Scanf.Scan_failure "scanf: bad input at char number 2: 5".

Hope this helps,

Pierre Weis

INRIA, Projet Cristal, Pierre.Weis@inria.fr, http://pauillac.inria.fr/~weis/

PS: Using the new format manipulation primitives, we could have factored
the functions am and pm a bit, as:

    let minutes_fmt () = format_of_string ":%1[0-5]%1[0-9]";;
    let am_fmt () = "%1[0-9]" ^^ minutes_fmt ();;
    let pm_fmt () = "%1[0-3]" ^^ minutes_fmt ();;

    let am ib = Scanf.bscanf ib (am_fmt ());;
    let pm ib = Scanf.bscanf ib (pm_fmt ());;

(Note the additional () abstractions, which circumvent the value
restriction on polymorphic generalization.)
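
As an alternative to encoding the valid ranges in the format string, as
the message above does, one can scan two groups of digits and check the
ranges numerically afterwards. The following is only a sketch under
stated assumptions: it assumes a recent OCaml whose Scanf supports the
%! conversion (match end of input), and parse_time is a hypothetical
helper that does not appear in the thread.

```ocaml
(* Hypothetical helper: parse "hh:mm" and validate the ranges as
   integers. Returns None on any malformed or out-of-range input. *)
let parse_time (s : string) : (int * int) option =
  try
    Scanf.sscanf s "%2[0-9]:%2[0-9]%!" (fun hh mm ->
      (* A range conversion may match fewer than two digits (even none),
         so the lengths are checked explicitly. *)
      if String.length hh = 2 && String.length mm = 2 then begin
        let h = int_of_string hh and m = int_of_string mm in
        if h <= 23 && m <= 59 then Some (h, m) else None
      end else None)
  with Scanf.Scan_failure _ | End_of_file | Failure _ -> None
```

The trade-off is that the format string no longer documents the valid
ranges by itself, but the check stays in one place and the result is
returned as integers rather than as four one-character strings.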

* Re: [Caml-list] scanf and %2c
From: Alan Schmitt @ 2003-06-20 10:45 UTC
To: caml-list

* Pierre Weis (pierre.weis@inria.fr) wrote:
> Just to implement stricter parsing rules (and BTW to show scanf
> capabilities), I will elaborate a bit on this ``correct'' code.

Thanks a lot for this enlightening lecture.

Alan Schmitt

--
The hacker: someone who figured things out and made something cool happen.