Mailing list for all users of the OCaml language and system.
From: Yutaka OIWA <oiwa@yl.is.s.u-tokyo.ac.jp>
To: caml-list@inria.fr
Subject: Re: [Caml-list] ANNOUNCE: mod_caml 1.0.6 - includes security patch
Date: Sat, 17 Jan 2004 03:52:42 +0900
Message-ID: <vfizncnlo91.fsf@tuba.is.s.u-tokyo.ac.jp>
In-Reply-To: <20040116093454.GA23909@redhat.com> (Richard Jones's message of "Fri, 16 Jan 2004 09:34:54 +0000")

Hello.

>> On Fri, 16 Jan 2004 09:34:54 +0000, Richard Jones <rich@annexia.org> said:

Richard> Being able to write:

Richard> var ~ /ab+/

Richard> and similar certainly makes string handling and simple parsing a lot
Richard> easier.

>> On Fri, 16 Jan 2004 13:05:15 -0600 (CST), Brian Hurt <bhurt@spnz.org> said:

Brian> What I'd like to see is to be able to pattern match on regexs, like:

Brian> match str with
Brian> 	| /ab+/ -> ...
Brian> 	| /foo(bar)*/ -> ...

Brian> etc.

My camlp4 macro package, Regexp/OCaml, may address most of these requests;
it is available from http://www.yl.is.s.u-tokyo.ac.jp/~oiwa/caml/ .

Using Regexp/OCaml, you can write code like

    Regexp.match str with
      "^(\d+)-(\d+)$" as f : int, t : int ->
        for i = f to t do
          printf "%d\n" i
        done
    | "^(\d+)$" as s : int ->
        printf "%d\n" s

to branch on multiple regular-expression patterns and to extract the
matched substrings automatically (bound to f, t and s respectively, after
conversion to int via int_of_string).  See
http://www.yl.is.s.u-tokyo.ac.jp/~oiwa/pub/caml/regexp-pp-0.9.3/README.match-regexp
for further details.
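
For comparison, a hand-written version of the same branching might look
like the sketch below. It is not the code Regexp/OCaml generates; it uses
the Str library from the standard OCaml distribution (which must be linked
in, e.g. str.cma) rather than PCRE/OCaml, and the function name `expand`
and the choice to return an option are assumptions made for illustration.

```ocaml
(* Hand-written sketch of the Regexp.match example above, using Str.
   Str has no \d class, so [0-9] is used instead. *)

(* Patterns are compiled once, when the module is initialised. *)
let range_re  = Str.regexp "^\\([0-9]+\\)-\\([0-9]+\\)$"
let single_re = Str.regexp "^\\([0-9]+\\)$"

(* Hypothetical helper: the integers described by [str], or None if
   neither pattern matches. *)
let expand (str : string) : int list option =
  if Str.string_match range_re str 0 then begin
    (* Groups 1 and 2 play the role of f and t in the macro version. *)
    let f = int_of_string (Str.matched_group 1 str) in
    let t = int_of_string (Str.matched_group 2 str) in
    Some (List.init (max 0 (t - f + 1)) (fun i -> f + i))
  end else if Str.string_match single_re str 0 then
    (* Group 1 plays the role of s. *)
    Some [int_of_string (Str.matched_group 1 str)]
  else
    None

let () =
  match expand "3-6" with
  | Some ns -> List.iter (Printf.printf "%d\n") ns   (* prints 3, 4, 5, 6 *)
  | None -> prerr_endline "no match"
```

The group extraction and the int_of_string calls are exactly the parts
that the `as f : int, t : int` clause is described as writing for you.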


Brian> The compiler could then combine all the matchings into a single DFA, 
Brian> improving performance over code like:

Brian> if (regex_match str "ab+") then
Brian>     ...
Brian> else if (regex_match str "foo(bar)*") then
Brian>     ...
Brian> else 
Brian>     ...

The code currently generated by Regexp/OCaml is similar to the
above (except that each pattern is compiled only once per
execution).  However, if the backend regexp engine (Regexp/OCaml
currently uses PCRE/OCaml) supported optimized matching against
multiple regular expressions, Regexp/OCaml could easily take advantage
of it.  The patterns could also be analyzed at compile time (the camlp4
translation phase), if required.
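
As a rough illustration of an if/else-style dispatch in which each pattern
is compiled at most once per run, consider the sketch below. It is only an
assumption about the general shape, not the actual Regexp/OCaml output: it
uses Str instead of PCRE/OCaml, compiles lazily on first use, and every
name in it (`compile_once`, `dispatch`, `classify`, `compilations`) is
hypothetical.

```ocaml
(* Sketch: ordered dispatch over several patterns, each compiled at most
   once per program run.  Requires linking the Str library. *)

(* Instrumentation only, to make the caching observable. *)
let compilations = ref 0

(* Wrap a pattern so that it is compiled on first use and then cached. *)
let compile_once (pat : string) : Str.regexp Lazy.t =
  lazy (incr compilations; Str.regexp pat)

(* Try each (pattern, handler) case in order; the first match wins.
   Str.string_match anchors the match at position 0 only. *)
let dispatch cases ~default str =
  let rec go = function
    | [] -> default str
    | (re, handler) :: rest ->
        if Str.string_match (Lazy.force re) str 0 then handler str
        else go rest
  in
  go cases

(* The two patterns from the quoted example, in Str syntax. *)
let cases = [
  (compile_once "ab+",           (fun _ -> "ab+"));
  (compile_once "foo\\(bar\\)*", (fun _ -> "foo(bar)*"));
]

let classify str = dispatch cases ~default:(fun _ -> "no match") str
```

Calling `classify` repeatedly leaves `!compilations` at no more than 2,
however many strings are tested. A combined DFA, as suggested in the
quoted text, would replace the linear scan in `go` with a single pass
over the input.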

Brian> The regex matching would also let the compiler know if there were possible 
Brian> unmatched strings (these would show up as transitions to the error state 
Brian> in the DFA).

This feature is not currently implemented in Regexp/OCaml, but
since the macro package has its own parser for regular-expression
patterns, it could be implemented once I have enough time.
(It is on my personal to-do list for Regexp/OCaml.)

-- 
Yutaka Oiwa              Yonezawa Lab., Dept. of Computer Science,
      Graduate School of Information Sci. & Tech., Univ. of Tokyo.
      <oiwa@yl.is.s.u-tokyo.ac.jp>, <yutaka@oiwa.shibuya.tokyo.jp>
PGP fingerprint = C9 8D 5C B8 86 ED D8 07  EA 59 34 D8 F4 65 53 61
