From: Diego Olivier Fernandez Pons
To: caml-list@inria.fr
Date: Thu, 24 Jun 2004 09:52:37 +0200 (DST)
Subject: [Caml-list] Substring search on an array of strings

Hello,

The Caml Str library does not seem to provide a function for efficiently matching a string against an array of strings. I was wondering whether there are already known algorithms for this before I try the workarounds I have in mind:

i) build a 'superstring' from the array of strings (the shortest superstring problem is NP-hard, but there are simple primal-dual approximation algorithms) and run a classical search algorithm on it;

ii) try to lift the suffix Knuth-Morris-Pratt automaton so that it works on the union of the words instead of a single word (is this possible?).

Diego Olivier
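
As a baseline for comparing the two workarounds above, here is a minimal sketch of a classical search applied to each string of the array in turn. It assumes the goal is to find one pattern inside several texts. It is not the superstring construction of (i) nor the lifted automaton of (ii): the Knuth-Morris-Pratt failure table is simply built once for the pattern and reused for every string. The names `failure_table`, `find_first` and `find_in_array` are hypothetical and do not come from the Str library.

```ocaml
(* Knuth-Morris-Pratt failure table: fail.(i) is the length of the longest
   proper prefix of pat.[0..i] that is also a suffix of pat.[0..i]. *)
let failure_table pat =
  let m = String.length pat in
  let fail = Array.make m 0 in
  let k = ref 0 in
  for i = 1 to m - 1 do
    while !k > 0 && pat.[i] <> pat.[!k] do
      k := fail.(!k - 1)
    done;
    if pat.[i] = pat.[!k] then incr k;
    fail.(i) <- !k
  done;
  fail

(* Offset of the first occurrence of pat in text, or None.
   By convention the empty pattern matches at offset 0. *)
let find_first pat fail text =
  let m = String.length pat and n = String.length text in
  if m = 0 then Some 0
  else begin
    let k = ref 0 and i = ref 0 and result = ref None in
    while !result = None && !i < n do
      (* Fall back along the failure table until the next character fits. *)
      while !k > 0 && text.[!i] <> pat.[!k] do
        k := fail.(!k - 1)
      done;
      if text.[!i] = pat.[!k] then incr k;
      if !k = m then result := Some (!i - m + 1);
      incr i
    done;
    !result
  end

(* For every string of the array that contains pat, return the pair
   (index in the array, offset of the first occurrence in that string). *)
let find_in_array pat texts =
  let fail = failure_table pat in
  let hits = ref [] in
  Array.iteri
    (fun idx text ->
      match find_first pat fail text with
      | Some off -> hits := (idx, off) :: !hits
      | None -> ())
    texts;
  List.rev !hits
```

For example, `find_in_array "ana" [| "banana"; "cabana"; "orange" |]` evaluates to `[(0, 1); (1, 3)]`. The cost is linear in the total length of the array plus the length of the pattern, and no concatenated copy of the strings is needed.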