Date: Sun, 8 Dec 2002 14:51:55 +0100
From: Markus Mottl
To: onlyclimb
Cc: caml-list@inria.fr
Subject: Re: [Caml-list] pcre
Message-ID: <20021208135155.GA20297@fichte.ai.univie.ac.at>

On Sun, 08 Dec 2002, onlyclimb wrote:
> let s ="abcdbcd" ;;
> let t = pcre_exec ~pat:"bc" s ;;
>
> t = [| 1; 3; 0 |]
>
> what does the last zero mean?

The trailing zero is extra workspace that the matching engine needs; it
is not part of the match. This wasn't mentioned in the documentation
(added now). You'll rarely need this function anyway. Better use "exec",
"exec_all" or "extract", "extract_all" for your purposes.
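To make the layout of such an array concrete, here is a minimal sketch in plain OCaml that does not call the library at all. It assumes the usual PCRE convention that the vector has 3 * (groups + 1) entries, where the first two thirds hold (start, stop) pairs and the last third is workspace. The names offset_pairs and substring_of_pair are made up for this illustration and are not part of the Pcre module.

```ocaml
(* Split a raw offset vector into (start, stop) pairs.
   Assumption: the length is a multiple of 3; the first two thirds are
   offset pairs and the last third is workspace to be ignored. *)
let offset_pairs (ovector : int array) : (int * int) list =
  let n = Array.length ovector in
  if n mod 3 <> 0 then
    invalid_arg "offset_pairs: length is not a multiple of 3";
  (* One pair per third of the vector: whole match first, then groups. *)
  List.init (n / 3) (fun i -> (ovector.(2 * i), ovector.(2 * i + 1)))

(* Cut out the substring described by a pair; stop is exclusive. *)
let substring_of_pair (s : string) ((start, stop) : int * int) : string =
  String.sub s start (stop - start)

let () =
  let s = "abcdbcd" in
  (* The array from the question above: one pair, then one workspace slot. *)
  offset_pairs [| 1; 3; 0 |]
  |> List.iter (fun p -> print_endline (substring_of_pair s p))
```

For the array in the question this prints "bc". The sketch does not handle groups that did not take part in the match, which the underlying C library reports with negative offsets.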
On Sun, 08 Dec 2002, onlyclimb wrote:
> Does the int array returned by pcre_exec contain the offsets of all
> the matches or the first match from pos?

It contains the offsets of the first, whole match followed by the offsets
of matched subpatterns (introduced with parentheses in the pattern string).

> However, as I tried, it seems that it returned the first match offsets,
> then why does it return an int array and not a tuple of int*int which
> refers to (from, to)?

Because there can be arbitrarily many subpatterns. E.g. try this:

  let t = pcre_exec ~pat:"a(bc)" s ;;

  t = [|0; 3; 1; 3; 0; 0|]

The whole match ranges from character 0 to 3 (exclusive), the first
subgroup from 1 to 3. The remaining zeroes belong to the extra workspace.

Using "extract" will make things clearer. Then the result is:

  [|"abc"; "bc"|]

With "exec" you can extract strings of matched (sub)patterns more
efficiently if you do not want to access all of them.

Regards,
Markus Mottl

-- 
Markus Mottl                                             markus@oefai.at
Austrian Research Institute for Artificial Intelligence
http://www.oefai.at/~markus