Mailing list for all users of the OCaml language and system.
* String wishes for Ocaml
@ 1997-07-02 12:39 Basile STARYNKEVITCH
  1997-07-03  9:04 ` Xavier Leroy
  0 siblings, 1 reply; 3+ messages in thread
From: Basile STARYNKEVITCH @ 1997-07-02 12:39 UTC (permalink / raw)
  To: caml-list



Hello All,

[[abridged French version below]]

I have a few small (except the 3rd) wishes for the next Ocaml release,
regarding string processing:

1. some more basic string utilities in the standard Ocaml library (not
the Str package), like
   
   * strchr in C :
      String.pos str ch = 
        search ch in str returning first position or raise Not_found

   * strrchr in C :
      String.last_pos str ch = 
        search ch in str returning last position or raise Not_found

   * String.substr str first last = get a substring of str from index
     first to last      (String.sub takes a length, not a last index)

2. perhaps something similar to strtok in C (but reentrant)

3. Much harder. A sort of scanf facility. Perhaps the format could be
a list of formatting elements...

I know that this last point is difficult. Actually, the Format.printf
type inference mechanism (which seems built into the compiler, not
the library) is still a bit of magic to me.
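
The three helpers asked for in wish 1 can be sketched in a few lines.
This is only an illustration: the names pos, last_pos and substr are the
hypothetical ones proposed above, not standard library functions, and the
sketch assumes the String.index, String.rindex and String.sub functions
of the current standard library.

```ocaml
(* Hypothetical helpers named after the wishes above; they are not
   standard library functions. *)

(* Like C's strchr: first position of ch in str, or raise Not_found. *)
let pos str ch = String.index str ch

(* Like C's strrchr: last position of ch in str, or raise Not_found. *)
let last_pos str ch = String.rindex str ch

(* Substring from index first to index last, both inclusive.
   String.sub raises Invalid_argument if the range is not inside str. *)
let substr str first last = String.sub str first (last - first + 1)

let () =
  assert (pos "a/b/c" '/' = 1);
  assert (last_pos "a/b/c" '/' = 3);
  assert (substr "hello world" 6 10 = "world")
```

Making last inclusive is a choice made for this sketch; the wish only
says "from index first to last".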


....................
 
[[-- abridged French version, translated - better to read the English above --]]

  
A few wishes for the next version of Ocaml concerning character
strings.

1. A few missing utilities, in the manner of strchr & strrchr in
ANSI C, as well as substr, which returns the substring between the
given first and last indices.

2. Something like strtok in C.

3. Much harder. A scanf or equivalent. But it is difficult, and the
type inference of Format.printf (which seems to me to be in the
compiler, not the library) remains magic to me.


Thanks to all

N.B. Any opinions expressed here are solely mine, and not of my organization.
N.B. The opinions expressed here are my own and do not commit the CEA.


----------------------------------------------------------------------
Basile STARYNKEVITCH   ----  Commissariat à l Energie Atomique 
DRN/DMT/SERMA * CEA/Saclay bat.470 * 91191 GIF/YVETTE CEDEX * France
fax: (33) 01,69.08.85.68; phone: 01,69.08.40.66; home: 01,46.65.45.53
email: Basile . Starynkevitch @ cea . fr  (but remove white space)
I speak french, english, russian. Je parle français, anglais, russe.
----------------------------------------------------------------------







* Re: String wishes for Ocaml
  1997-07-02 12:39 String wishes for Ocaml Basile STARYNKEVITCH
@ 1997-07-03  9:04 ` Xavier Leroy
  0 siblings, 0 replies; 3+ messages in thread
From: Xavier Leroy @ 1997-07-03  9:04 UTC (permalink / raw)
  To: Basile STARYNKEVITCH; +Cc: caml-list

> I have some few small (except the 3rd) wishes for next Ocaml release,
> regarding string processing:
> 
> 1. some more basic string utilities in the standard Ocaml library (not
> the Str package), like [...] strchr in C [...and...] strrchr in C

This is reasonable. Indeed, the Filename standard library module
defines the equivalent of strrchr for its internal usage.

> 2. perhaps something similar to strtok in C (but reentrant)

Str.split (from the OCaml regexp library) is strictly more powerful
than strtok, since it supports arbitrary regexps as delimiters.
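
For the simpler case where the delimiters are single characters, a
reentrant strtok-like function can also be sketched without regexps.
This is an illustration only: next_token and tokens are hypothetical
names, not library functions. The scanning position is passed in and
returned explicitly, so there is no hidden state as in C's strtok.

```ocaml
(* next_token s start delims skips leading delimiter characters from
   offset start, then returns Some (token, next_offset), or None when
   no token is left.  Hypothetical helper, not a library function. *)
let next_token s start delims =
  let len = String.length s in
  let is_delim c = String.contains delims c in
  let rec skip i = if i < len && is_delim s.[i] then skip (i + 1) else i in
  let rec stop i = if i < len && not (is_delim s.[i]) then stop (i + 1) else i in
  let b = skip start in
  if b >= len then None
  else
    let e = stop b in
    Some (String.sub s b (e - b), e)

(* Collect every token by threading the offset through the calls. *)
let tokens s delims =
  let rec loop i acc =
    match next_token s i delims with
    | None -> List.rev acc
    | Some (tok, next) -> loop next (tok :: acc)
  in
  loop 0 []

let () =
  assert (tokens "a, b,,c" ", " = ["a"; "b"; "c"])
```

Like strtok, this treats a run of delimiters as a single separator, so
it never returns empty tokens.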

> 3. Much harder. A sort of scanf facility. Perhaps the format could be
> a list of formatting element...

When programming in C, I've never found scanf() very useful. It does
not allow enough flexibility in defining the scanning syntax.
I'd rather scan lines the Perl way, using regular expressions
(Str.string_match + extraction of \(...\) components using
Str.matched_group and conversion to int or float if needed).
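
A minimal sketch of this regexp style of scanning, assuming the Str
library is linked (str.cma or str.cmxa). It uses Str.string_match and
Str.matched_group, the Str function that returns the text of a numbered
\(...\) group. The line format and the names line_re and parse_line are
made up for the example.

```ocaml
(* Requires the Str library to be linked. *)

(* Hypothetical line format: "<name> <integer> <float>". *)
let line_re =
  Str.regexp "^\\([a-z]+\\) +\\([0-9]+\\) +\\([0-9]+\\.[0-9]*\\)$"

(* Return Some (name, count, ratio) if the line matches, else None.
   The groups are read before any other match is attempted, since Str
   keeps the result of the last match in global state. *)
let parse_line line =
  if Str.string_match line_re line 0 then
    let name = Str.matched_group 1 line in
    let count = int_of_string (Str.matched_group 2 line) in
    let ratio = float_of_string (Str.matched_group 3 line) in
    Some (name, count, ratio)
  else None

let () =
  assert (parse_line "foo 42 2.5" = Some ("foo", 42, 2.5));
  assert (parse_line "foo bar" = None)
```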

Regards,

- Xavier Leroy






* Re: String wishes for Ocaml
@ 1997-07-02 21:07 Robbert VanRenesse
  0 siblings, 0 replies; 3+ messages in thread
From: Robbert VanRenesse @ 1997-07-02 21:07 UTC (permalink / raw)
  To: Basile STARYNKEVITCH, caml-list

Here's a simple scanf facility.  It needs more work, probably.  The idea
is that you do something like

   sscanf "3 hello more" "%d%s"

and it returns the list

   [ Int 3; String "hello"; End 7 ]

(where 7 is the offset into the string where it stopped scanning).
iscanf is like sscanf, but you can specify a starting offset into
the string.
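
Since the result is an untyped list of tagged values, the caller has to
pattern-match on it. A small sketch of what that might look like for the
"%d%s" example; int_and_word is a hypothetical helper, and the result
list is written out by hand so the sketch does not depend on the
scanner itself.

```ocaml
(* Same variant type as in the interface posted below, repeated here so
   the sketch stands alone. *)
type value =
    Char of char
  | Int of int
  | Float of float
  | String of string
  | End of int

(* Hypothetical helper: interpret the result of scanning with "%d%s". *)
let int_and_word = function
  | [Int n; String s; End stop] -> Some (n, s, stop)
  | _ -> None

let () =
  (* The list from the example above, written out by hand. *)
  match int_and_word [Int 3; String "hello"; End 7] with
  | Some (n, s, stop) -> Printf.printf "%d %s (stopped at %d)\n" n s stop
  | None -> print_endline "unexpected shape"
```

The catch-all case matters: a format that does not agree with the
pattern is only detected at run time.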

Robbert

(**************************************************************)
(* SCANF.MLI *)
(* Author: Robbert vanRenesse, Cornell University *)
(**************************************************************)
exception Parse_error
type value =
    Char of char
  | Int of int
  | Float of float
  | String of string
  | End of int
val iscanf : string -> int -> string -> value list
val sscanf : string -> string -> value list
val print_result : value list -> unit


(**************************************************************)
(* SCANF.ML *)
(* Author: Robbert vanRenesse *)
(**************************************************************)
(*
 * This implements sscanf.  It returns a list of the matched items.
 *)

open Printf

exception Parse_error

type value
  = Char of char
  | Int of int
  | Float of float
  | String of string
  | End of int

let iscanf str offset fmt =
  (* See if c is one of the characters in the string chars.
   *)
  let included c chars =
    let len = String.length chars in
    let rec find i =
      if i = len then false
      else if c = (String.get chars i) then true
      else find (i + 1)
    in find 0
  in
  let len_str = String.length str in
  (* Return a substring of s, starting at offset i, consisting of
   * characters in the given string chars.  Also return the new offset.
   *)
  let scan_chunk s i chars =
    let len_s = String.length s in
    let j = ref i in
    while (!j < len_s) && (included (String.get s !j) chars) do
      incr j
    done;
    ((if i = !j then "" else String.sub s i (!j - i)), !j)
  (* Return a substring of s, starting at offset i, consisting of
   * characters *not* in the given string chars.  Also return the
   * new offset.
   *)
  and scan_but_chunk s i chars =
    let len_s = String.length s in
    let j = ref i in
    while (!j < len_s) && not (included (String.get s !j) chars) do
      incr j
    done;
    ((if i = !j then "" else String.sub s i (!j - i)), !j)
  in
  (* Skip all blanks starting at offset i.  Return the new offset.
   *)
  let skip_blanks i =
    let j = ref i in
    while (!j < len_str) && (included (String.get str !j) " \t\n") do
      incr j
    done;
    !j
  in
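  let scan_char i =
    (* Fail with Parse_error rather than Invalid_argument at end of input. *)
    if i < len_str then (String.get str i, i + 1) else raise Parse_error
  and scan_int i =
    let (s, i) = scan_chunk str i "0123456789" in
    (* An empty or out-of-range digit run is a parse error. *)
    (try (int_of_string s, i) with Failure _ -> raise Parse_error)
  and scan_float i =
    let (s, i) = scan_chunk str i "0123456789.eE" in
    (try (float_of_string s, i) with Failure _ -> raise Parse_error)
  and scan_string i =
    scan_but_chunk str i " \t\n"
  in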
  let len_fmt = String.length fmt in
  (* i is an offset in str, and j an offset in fmt.  Scan the next item
   * as specified in fmt.
   *)
  let rec doscan i j =
    (* Match the literal format character c against str at offset i. *)
    let do_match c j =
      if i < len_str && (String.get str i) = c then
        doscan (i + 1) j
      else
        raise Parse_error
    in
    if j = len_fmt then
      [End i]
    else
      let c = String.get fmt j in
      if j < (len_fmt - 1) && c = '%' then
        match String.get fmt (j + 1) with
          | 'c' ->
              let (v, i) = scan_char i in
              (Char v) :: (doscan i (j + 2))
          | 'd' ->
              let i = skip_blanks i in
              let (v, i) = scan_int i in
              (Int v) :: (doscan i (j + 2))
          | 'f' ->
              let i = skip_blanks i in
              let (v, i) = scan_float i in
              (Float v) :: (doscan i (j + 2))
          | 's' ->
              let i = skip_blanks i in
              let (v, i) = scan_string i in
              (String v) :: (doscan i (j + 2))
          | '[' ->
              (* %[^chars] scans up to one of chars; %[chars] scans chars. *)
              if j + 2 < len_fmt && (String.get fmt (j + 2)) = '^' then
                let (chars, j) = scan_but_chunk fmt (j + 3) "]" in
                let (v, i) = scan_but_chunk str i chars
                in (String v) :: (doscan i (j + 1))
              else
                let (chars, j) = scan_but_chunk fmt (j + 2) "]" in
                let (v, i) = scan_chunk str i chars
                in (String v) :: (doscan i (j + 1))
          | _ as c ->
              (* Any other character after '%' (e.g. "%%") is a literal. *)
              do_match c (j + 2)
      else
        do_match c (j + 1)
  in doscan offset 0

let sscanf str fmt =
  iscanf str 0 fmt

(* For debugging...
 *)
let rec print_result =
  let print = function
    | Char c ->
        printf "Char '%c'\n" c
    | Int v ->
        printf "Int '%d'\n" v
    | Float v ->
        printf "Float '%f'\n" v
    | String v ->
        printf "String '%s'\n" v
    | End o ->
        printf "End '%d'\n" o
  in function
    | hd :: tl ->
        print hd; flush stdout;
        print_result tl
    | [] ->
        ()






