Mailing list for all users of the OCaml language and system.
 help / color / mirror / Atom feed
From: Gerd Stolpmann <info@gerd-stolpmann.de>
To: William Lovas <wlovas@stwing.upenn.edu>
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Str.string_match incorrect
Date: Wed, 22 Dec 2004 11:37:11 +0100	[thread overview]
Message-ID: <1103711830.1153.21.camel@ice.gerd-stolpmann.de> (raw)
In-Reply-To: <20041222080009.GA4501@force.stwing.upenn.edu>

On Mit, 2004-12-22 at 09:00, William Lovas wrote:
> On Tue, Dec 21, 2004 at 11:44:55PM -0800, Evan Martin wrote:
> > This is consistent with the docs, which say:
> >   [string_match r s start] tests whether the characters in s starting at
> >   position start match the regular expression r.
> > and in general with how regular expression systems work.  string_match
> > corresponds to running your automaton directly and seeing whether you
> > end up in an accept state, while string_partial_match effectively adds
> > an extra ".*" to the beginning.
> > (It's more debatable whether this makes sense.)
> 
> I concur with your assessment, but i think you're characterization of the
> semantics of string_partial_match is inaccurate:
> 
>     # Str.string_match (Str.regexp "ab") "a" 0;;
>     - : bool = false
>     # Str.string_partial_match (Str.regexp "ab") "a" 0;;
>     - : bool = true
>     # Str.string_match (Str.regexp ".*ab") "a" 0;;
>     - : bool = false
> 
> ... unless of course, i've misunderstood you.  I don't think there's a
> simple transformation on regular expressions that allow you to emulate
> string_partial_match's behavior using only string_match.  (Does anyone
> care to prove me wrong? :)

The difference between string_match and string_partial_match is quite
simple:

- string_match: whether a prefix of the string matches the regexp

- string_partial_match: whether the string is a prefix of another string
  matching the regexp

On the level of the automaton, string_match is true when the automaton
reaches an accept state whereas string_partial_match is true when the
automaton is in a valid state at the end of the string and there is
still a path to an accept state.

Of course, it is easy to emulate string_partial_match when you already
have the automaton - just make all states accepting that have a path to
an accept state. But there is not a comparably simple rule for the
regular expression. You can do an emulating transformation by generating
the automaton, adding the additional accept states, and transforming the
automaton back to the regexp. There is an algorithm for this, but the
length of the regexp explodes (exponentially when I remember correctly).

Gerd
-- 
------------------------------------------------------------
Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany 
gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
------------------------------------------------------------


  parent reply	other threads:[~2004-12-22 10:37 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-12-22  3:49 skaller
2004-12-22  7:44 ` [Caml-list] " Evan Martin
2004-12-22  8:00   ` William Lovas
2004-12-22  8:38     ` Evan Martin
2004-12-22 10:37     ` Gerd Stolpmann [this message]
2004-12-22 15:57     ` skaller
2004-12-22 16:58       ` David Brown
2004-12-23  2:33         ` skaller
2004-12-24 17:40           ` Christopher A. Watford
2004-12-25  0:57             ` skaller
2004-12-25  3:07               ` Christopher A. Watford
2004-12-25  4:24                 ` skaller
2004-12-26  1:14               ` William Lovas
2004-12-22 17:26       ` Kurt Welgehausen
2004-12-23  2:09         ` skaller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1103711830.1153.21.camel@ice.gerd-stolpmann.de \
    --to=info@gerd-stolpmann.de \
    --cc=caml-list@yquem.inria.fr \
    --cc=wlovas@stwing.upenn.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox