Mailing list for all users of the OCaml language and system.
From: Yoann Padioleau <padator@wanadoo.fr>
To: "Grégoire Seux" <kamaradclimber@gmail.com>
Cc: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] porter's stemmer implementation
Date: Sun, 17 Jan 2010 11:33:37 -0800
Message-ID: <A8A64EBF-A0A2-48A2-9DCF-C634677A25A0@wanadoo.fr>
In-Reply-To: <1ae8fe881001162301y15093c32q23a3f274293fcc38@mail.gmail.com>


On Jan 16, 2010, at 11:01 PM, Grégoire Seux wrote:

> Hello,
> 
> I am looking for an implementation of Porter's stemmer in ocaml.

There is one in NLTK, a very complete Python library for NLP, and there is
ocamlpython to link OCaml and Python code. As the stemmer interface is
very simple (string -> string), it's easy to call it through ocamlpython.

Basically, you can put this in a Python file nltk_ocaml.py:

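import nltk

# Build the stemmer once at import time and reuse it for every call.
stemmer = nltk.PorterStemmer()

def stem(s):
    # string -> string: return the Porter stem of a single word
    return stemmer.stem(s)

def test_nltk_ocaml():
    # Smoke test: print a marker, then open NLTK's tree-drawing demo.
    print("test_nltk_ocaml")
    nltk.draw.tree.demo()
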
and in an OCaml file nltk.ml:
open Pycaml

module Py = Python
(* Import the Python module once and keep its dictionary of top-level names. *)
let modul = Py.cpr (Pycaml.pyimport_importmodule "nltk_ocaml")
let dict = Py.cpr (Pycaml.pymodule_getdict modul)

(* stem : string -> string; calls the Python function stem from nltk_ocaml.py *)
let stem s =
  let py_str = Pycaml.pystring_fromstring s in
  let f = Py.cpr (Pycaml.pydict_getitemstring(dict, "stem")) in
  let args = Py.cpr (Pycaml.pytuple_fromsingle py_str) in
  let res = Py.cpr (Pycaml.pyeval_callobject (f,args)) in
  Pycaml.guarded_pystring_asstring res
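
If the Pycaml calls look opaque: judging by their names, they mirror the CPython C API (PyImport_ImportModule, PyModule_GetDict, PyDict_GetItemString, PyEval_CallObject), so the OCaml above does roughly what the following plain Python does. This is only a sketch. The module nltk_ocaml_stand_in and the function fake_stem are hypothetical placeholders so that it runs without NLTK installed; fake_stem is not Porter's algorithm.

```python
import importlib
import sys
import types

# Hypothetical stand-in for nltk_ocaml.py, so the sketch runs without NLTK.
# fake_stem is a placeholder and is NOT Porter's algorithm.
def fake_stem(s):
    return s[:-1] if s.endswith("s") else s

stand_in = types.ModuleType("nltk_ocaml_stand_in")
stand_in.stem = fake_stem
sys.modules["nltk_ocaml_stand_in"] = stand_in

# Done once, like the two top-level lets in nltk.ml.
modul = importlib.import_module("nltk_ocaml_stand_in")  # pyimport_importmodule
module_dict = vars(modul)                               # pymodule_getdict

def stem(s):
    f = module_dict["stem"]  # pydict_getitemstring: look the function up by name
    args = (s,)              # pytuple_fromsingle: one-element argument tuple
    res = f(*args)           # pyeval_callobject: call f with that tuple
    return str(res)          # guarded_pystring_asstring: back to a plain string

print(stem("cats"))
```

With the real nltk_ocaml.py on the Python path, the import would name "nltk_ocaml" and the lookup would find the stem function defined there.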

> Erik Arneson published a link to his implementation a few years ago, but the file is no longer available. Does someone have a copy of the file, or another implementation?
> By the way, I am looking for any library that could be used for index construction in OCaml, of course!
> 
> Thanks in advance,
> 
> Gregoire Seux



