From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sun, 29 Dec 2002 19:41:30 +0100
From: Markus Mottl
To: Oleg
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] PCRE
Message-ID: <20021229184130.GA11651@fichte.ai.univie.ac.at>
References: <200212291042.49347.oleg_inconnu@myrealbox.com>
In-Reply-To: <200212291042.49347.oleg_inconnu@myrealbox.com>
Organization: Austrian Research Institute for Artificial Intelligence

On Sun, 29 Dec 2002, Oleg wrote:
> I'm new to PCRE. Can anyone explain to me why the output of
>
> # open Pcre;;
> # version;;
> - : string = "3.4 22-Aug-2000"
> # full_split ~pat:"^(\\w+)(,(\\w+))*$" "a,b,c,d";;
> - : Pcre.split_result list =
> [Delim "a,b,c,d"; Group (1, "a"); Group (2, ",d"); Group (3, "d")]
>
> does not contain Group(3, "b") and Group(3, "c") ?

Note that using "split" instead of "full_split" produces this list:

  - : string list = [""; "a"; ",d"; "d"]

This is correct Perl behaviour. "full_split" works the same way, but it
also lets you access the matched substrings. A grouped subpattern
captures only one substring per match: when the group is repeated, only
the last iteration is kept.

> Similarly, I expected
>
> # full_split ~pat:"S(a\\d)+" "Sa1a2";;
> - : Pcre.split_result list = [Delim "Sa1a2"; Group (1, "a2")]
>
> to produce [Delim "Sa1a2"; Group(1, "a1"); Group (1, "a2")]
>
> The above uses the latest pcre-ocaml-4.31.0.

The same applies here: the behaviour of PCRE-OCaml is correct with
respect to Perl semantics.

Regards,
Markus Mottl

--
Markus Mottl                                             markus@oefai.at
Austrian Research Institute for Artificial Intelligence
http://www.oefai.at/~markus
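
Since a repeated group keeps only its last iteration, one possible way
to collect every repetition is to check the overall shape of the string
first and then match the repeated element on its own. The following is
only a sketch of that idea, not something proposed in the message
above: the helper name "all_matches" is hypothetical, and it assumes a
pcre-ocaml version that provides Pcre.exec_all, Pcre.get_substring and
Pcre.pmatch.

```ocaml
(* Build with the pcre-ocaml library, for example:
   ocamlfind ocamlopt -package pcre -linkpkg example.ml *)

(* Hypothetical helper: the text of every non-overlapping match of
   [pat] in [s]. Pcre.exec_all raises Not_found when nothing matches,
   which is mapped to the empty list here. *)
let all_matches ~pat s =
  match Pcre.exec_all ~pat s with
  | matches ->
      Array.to_list matches
      |> List.map (fun subs -> Pcre.get_substring subs 0)
  | exception Not_found -> []

let () =
  (* Validate the overall shape, then extract each element separately. *)
  if Pcre.pmatch ~pat:"^\\w+(,\\w+)*$" "a,b,c,d" then
    List.iter print_endline (all_matches ~pat:"\\w+" "a,b,c,d");
  if Pcre.pmatch ~pat:"^S(a\\d)+$" "Sa1a2" then
    List.iter print_endline (all_matches ~pat:"a\\d" "Sa1a2")
```

Splitting the work into a validation pattern and an element pattern
keeps each regular expression simple, at the cost of scanning the
string twice.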