From: Oliver Bandel
To: caml-list@inria.fr
Date: Mon, 8 Jan 2007 07:45:14 +0100
Subject: Re: [Caml-list] Before teaching OCaml
Message-ID: <20070108064513.GA336@first.in-berlin.de>
In-Reply-To: <1168193722.6133.38.camel@Blefuscu>

On Sun, Jan 07, 2007 at 07:15:22PM +0100, David Teller wrote:
> Dear list,
[...]
> * the task -- for the moment, I have no interesting idea of OCaml-based
> projects. Perhaps something like finding the shortest path along
> subway/train lines ?
[...]

A "real-world task": a spam filter. If this fits the mathematics being
taught while they are studying, it can be very interesting.

You can start with string comparisons against fixed strings from a
(file-based) list, strings that are definitely spam. This can then be
changed to strings or patterns that often occur in spam, so that you can
use probabilistic methods. Later you can try neural nets and support
vector machines.

The filter can also have an input language (a domain-specific language),
so that the global/general behaviour can be controlled by the admin
(fuzzy-logic rules for the filtering, and configuration of any kind) and
does not depend only on the built-in mathematical methods.

The module system can be very helpful in combining the approaches.

If the sources are published, other people would benefit too, and maybe
some students will keep using and developing it long after they have
finished their studies there...

Best wishes,
Oliver Bandel
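
Editorial illustration (not part of the original message): a rough sketch
of how the staged filter and the module system might fit together. It
assumes each stage returns a score between 0 and 1. Every name below
(FILTER, Fixed_strings, Word_weights, Combine, the sample patterns and
weights) is hypothetical, and the second stage is only a stand-in for a
real probabilistic method.

```ocaml
(* Hypothetical common interface: each stage maps a message to a
   spam score in [0, 1]. *)
module type FILTER = sig
  val score : string -> float
end

(* [contains ~sub s] is true when [sub] occurs somewhere in [s]. *)
let contains ~sub s =
  let n = String.length s and m = String.length sub in
  let rec at i = i + m <= n && (String.sub s i m = sub || at (i + 1)) in
  at 0

(* Stage 1: fixed strings that are definitely spam.
   Scores 1.0 on any hit, 0.0 otherwise (case-insensitive). *)
module Fixed_strings (P : sig val patterns : string list end) : FILTER =
struct
  let score msg =
    let msg = String.lowercase_ascii msg in
    let hit p = contains ~sub:(String.lowercase_ascii p) msg in
    if List.exists hit P.patterns then 1.0 else 0.0
end

(* Stage 2 (placeholder for a probabilistic method): each word carries
   an assumed weight; the score is the highest weight among the words
   present. Words are expected in lowercase. *)
module Word_weights (W : sig val weights : (string * float) list end) :
  FILTER = struct
  let score msg =
    let msg = String.lowercase_ascii msg in
    List.fold_left
      (fun acc (w, p) -> if contains ~sub:w msg then max acc p else acc)
      0.0 W.weights
end

(* Combine two stages by taking the higher of their scores. *)
module Combine (A : FILTER) (B : FILTER) : FILTER = struct
  let score msg = max (A.score msg) (B.score msg)
end

(* Sample data, invented for this sketch. *)
module Blacklist =
  Fixed_strings (struct let patterns = [ "buy cheap pills" ] end)

module Weighted =
  Word_weights (struct let weights = [ ("lottery", 0.8); ("winner", 0.6) ] end)

module Spam = Combine (Blacklist) (Weighted)

(* A message counts as spam when its combined score reaches the threshold. *)
let is_spam ?(threshold = 0.5) msg = Spam.score msg >= threshold
```

With this layout, a later stage (a neural net, say) would only need to
satisfy FILTER to be plugged into Combine. In a real setup the patterns
would be read from a file rather than written inline.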