From: Markus Mottl
Message-Id: <199907281902.VAA20083@miss.wu-wien.ac.at>
Subject: Announcement: PCRE-library for OCaml
To: caml-list@inria.fr (OCAML)
Date: Wed, 28 Jul 1999 21:02:12 +0100 (MET DST)

Hello all,

I have just released an OCaml interface library to PCRE (Perl Compatible
Regular Expressions). The original PCRE library is written in C.

Here are some highlights of this distribution:

* The PCRE library by Philip Hazel has been under development for about
  two years now and is fairly advanced and stable. It implements just
  about all of the convenient regular-expression functionality that you
  find in Perl.

* In contrast to Perl, the library creates DFAs (deterministic finite
  automata) instead of NFAs (nondeterministic finite automata). DFAs
  generally allow much faster pattern matching, because they need not
  backtrack on most patterns. Patterns with many alternations in
  particular can see a great speedup.

* It is reentrant and thus thread-safe. This is not the case with the
  "Str" module of OCaml, which builds on the GNU "regex" library. Using
  a reentrant library also means more convenience for the programmer:
  there is no need to reason about which state the library might be in.

* The higher-level functions such as substitution and splitting are all
  implemented in OCaml and are much faster than those of the "Str"
  module. In fact, when compiled to native code, they even seem to be
  significantly faster than those of Perl, which is written in C
  (possibly around 20-30% on average). Testing performance is a tricky
  business, but in some quick examples the speedup over Perl was
  somewhere between 20 and 100% for pattern substitution (including the
  speedup due to faster pattern matching).

* You can also rely on the returned data being unique. In other words:
  if the result of a function is a string, you can safely update it
  destructively without having to fear side effects.

* The library also implements all of the functionality you can find in
  the "Str" module. This should make it easier to switch. The
  compatibility functions have just one "incompatibility": they do not
  crash with a stack overflow if you use patterns that can match the
  null string (e.g. " *", which may match everywhere).

For downloads and more details, look at:

  http://miss.wu-wien.ac.at/~mottl/ocaml_sources/intro.html

Best regards,
Markus Mottl

-- 
Markus Mottl, mottl@miss.wu-wien.ac.at, http://miss.wu-wien.ac.at/~mottl
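
To illustrate the reentrancy point above, here is a minimal sketch that
uses only the standard "Str" module, not the API of the PCRE interface
library. The function name stale_match and the sample strings are made
up for this illustration. It shows how the result of one search can be
silently replaced by a later, unrelated search, because "Str" keeps the
last match in global state.

```ocaml
(* Needs the "str" library that ships with OCaml, for example:
   ocamlfind ocamlopt -package str -linkpkg example.ml *)

(* Hypothetical helper: runs two searches, then asks for the matched text
   of the first one. *)
let stale_match () =
  let digits = Str.regexp "[0-9]+" in
  let letters = Str.regexp "[a-z]+" in
  let subject = "abc 123" in
  (* First search: finds "123" at position 4 of subject. *)
  let _ = Str.search_forward digits subject 0 in
  (* An unrelated search in between, as might happen inside a helper
     function or another thread. It matches positions 0 to 3 of "xyz". *)
  let _ = Str.search_forward letters "xyz" 0 in
  (* matched_string reads the positions stored by the most recent search,
     so it extracts characters 0 to 3 of subject rather than "123". *)
  Str.matched_string subject

let () = print_endline (stale_match ())  (* prints "abc" *)
```

With a reentrant interface, the match result belongs to the call that
produced it, so an intervening search cannot change it.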