From: Frank Atanassow <franka@cs.uu.nl>
To: Neale Pickett <neale-caml@woozle.org>
Cc: caml-list@pauillac.inria.fr
Subject: Re: [Caml-list] Str.string_match raising Invalid_argument "String.sub" in gc
Date: Thu, 23 Aug 2001 12:21:14 +0200
Message-ID: <20010823122114.A3873@cs.uu.nl>
In-Reply-To: <w53heuz25f4.fsf@woozle.org>; from neale-caml@woozle.org on Wed, Aug 22, 2001 at 01:41:51PM -0700
Neale Pickett wrote (on 22-08-01 13:41 -0700):
> Alain Frisch writes:
> > On 22 Aug 2001 neale-caml@woozle.org wrote:
>
> >> # let rec f l =
> >>     let sep = Str.regexp "^[ \t\n]*\\(.+\\)" in
> >>     match l with
> >>     | [] -> []
> >>     | [""] -> []
> >>     | s :: rest -> if (Str.string_match sep s 0) then
> >>         let foo = print_string ("match " ^ Str.matched_group 1 s ^ "\n") in
> >>         (Str.matched_group 1 s) :: (f rest)
> >                                  ^^
>
> > This is wrong; with the current OCaml implementation, the right
> > operand of (::) is called first; so (Str.matched_group 1 s) is called
> > after subsequent calls to Str.string_match, which is obviously
> > incorrect.
>
> Aha! Thank you.
>
> This makes sense, but it is certainly not obvious, especially in a
> language which purports to have no side-effects.
OCaml does not purport to have no side-effects. It has plenty of side-effects.
You must be thinking of Haskell or Miranda.
> I can't help thinking
> that s should be a different string for every invocation, but clearly it
> is somehow related to the initial input string. No doubt this is a
> clever optimization within OCaml which makes for drastically reduced
> memory usage when processing strings, but it does make things a bit
> confusing to the beginner.
I'm pretty sure there is no such optimization, but I'm not sure what you're
talking about here. Anyway, if an optimization affected the behavior of a
program, it would not be an optimization but rather a compiler bug.
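To make the cause Alain describes concrete, here is a minimal sketch that uses only the standard library. The names `last`, `try_match` and `matched` are hypothetical stand-ins for the state Str keeps between calls, for Str.string_match and for Str.matched_group; this is an illustration under those assumptions, not Str's actual implementation.

```ocaml
(* Hypothetical stand-in for the hidden state that Str keeps between calls. *)
let last : string option ref = ref None

(* Stand-in for Str.string_match: records its argument as the "last match". *)
let try_match (s : string) : bool =
  last := Some s;
  true

(* Stand-in for Str.matched_group: reads whatever the last match recorded. *)
let matched () : string =
  match !last with
  | Some s -> s
  | None -> raise Not_found

(* Order-dependent: the operands of (::) are evaluated in an unspecified
   order, so [matched ()] may run after the recursive call has already
   overwritten [last]. *)
let rec fragile = function
  | [] -> []
  | s :: rest ->
    if try_match s then matched () :: fragile rest else fragile rest

(* Order-independent: [let] forces the read to happen before the recursion. *)
let rec robust = function
  | [] -> []
  | s :: rest ->
    if try_match s then
      let m = matched () in
      m :: robust rest
    else robust rest
```

With an implementation that evaluates the right operand of (::) first, `fragile` reads the state left behind by the deepest recursive call rather than by its own match, while `robust` does not depend on that order at all.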
> I don't have any good suggestions on how else to do it, although my base
> desire is to have a regexp matching function which returns a string list
> of the matched groups.
There is no need to mutate the list/string(s).
If I understand you correctly (but I don't think I do):
let sep_list =
  let sep = Str.regexp "[ \t\n]+\\([^ \t\n]*\\)" in
  fun s ->
    let rec loop i =
      if Str.string_match sep s i then
        let m = Str.matched_group 1 s in
        m :: loop (Str.match_end ())
      else
        []
    in loop 0
# sep_list " abc def ghi j";;
- : string list = ["abc"; "def"; "ghi"; "j"]
But this is what the Str.split procedure does already:
# Str.split (Str.regexp "[ \t\n]+") " abc def ghi j";;
- : string list = ["abc"; "def"; "ghi"; "j"]
Your function has type string list -> string list, and it seems like it just
does the same match on every element of the list, so it's much easier:
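let map_match =
  let sep = Str.regexp "[ \t\n]*\\(.+\\)" in
  fun l ->
    (* Assumes every element matches; the group is read right after the
       match, before any other Str call can overwrite it. *)
    let f s = ignore (Str.string_match sep s 0); Str.matched_group 1 s in
    List.map f l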
# map_match [" arf"; " barf"];;
- : string list = ["arf"; "barf"]
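If what you want is a matching function that returns the matched groups as a string list, something along these lines might do. It is only a sketch: `groups` and `strip_leading` are names made up for this example, the caller has to say how many groups the regexp contains, and a group that did not take part in the match is returned as "". It needs to be linked against the Str library (str.cma or str.cmxa).

```ocaml
(* Requires the Str library (str.cma / str.cmxa). *)

(* [groups re n s] matches [re] at the start of [s] and, on success,
   returns groups 1..n as a list. All groups are read before any other
   Str call can overwrite the library's hidden match state. *)
let groups (re : Str.regexp) (n : int) (s : string) : string list option =
  if Str.string_match re s 0 then begin
    (* Collect from group n down to 1 in an explicit loop, so the order
       of the reads does not depend on how (::) is evaluated. *)
    let acc = ref [] in
    for i = n downto 1 do
      let g = try Str.matched_group i s with Not_found -> "" in
      acc := g :: !acc
    done;
    Some !acc
  end else
    None

(* Example use: the regexp from map_match, with its single group. *)
let strip_leading : string -> string list option =
  groups (Str.regexp "[ \t\n]*\\(.+\\)") 1
```

Returning an option makes a failed match visible to the caller instead of leaving Str.matched_group to read a stale result.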
--
Frank Atanassow, Information & Computing Sciences, Utrecht University
Padualaan 14, PO Box 80.089, 3508 TB Utrecht, Netherlands
Tel +31 (030) 253-3261 Fax +31 (030) 251-379