* [Caml-list] [ANN] Uunf 0.9.0 and Uucd 0.9.0
@ 2012-09-07 16:46 Daniel Bünzli
From: Daniel Bünzli @ 2012-09-07 16:46 UTC
To: Caml List; +Cc: caml-hump
Hello,
I'd like to announce the following two modules. First Uunf:
Uunf is an OCaml module for normalizing Unicode text. It supports all
Unicode normalization forms and is independent from any IO
mechanism or Unicode text data structure. Text can be processed
without a complete in-memory representation.
Uunf is made of a single independent module and distributed under the
BSD3 license.
Project homepage: http://erratique.ch/software/uunf
API doc & examples: http://erratique.ch/software/uunf/doc/Uunf
Note that if you use `findlib` to install you'll need Uutf because it is used by
the sample programs, but using Uunf itself doesn't actually require Uutf.
For those who wonder what this is about, Unicode normal forms are
needed if you want to test Unicode strings for equality via binary
equality or order them in a textually *unmeaningful* way via
binary comparison e.g. for Set.Make or Map.Make.
This is because in Unicode there is more than one way to represent the
same user perceived character, e.g. é can be represented by the sequence
<U+00E9> (precomposed character é) or <U+0065, U+0301> (character e
followed by non-spacing mark ´). Normalizing all your strings to a
given normal form ensures that all equivalent subsequences in them
are represented the same way.
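
To make the problem concrete, here is a minimal sketch that uses only the
OCaml standard library, not Uunf's API. The name `toy_compose` is a
hypothetical helper invented for this illustration: it rewrites just the one
pair <U+0065, U+0301> into <U+00E9>, whereas a real normalizer handles the
full Unicode tables and the reordering of combining marks.

```ocaml
(* Two UTF-8 encodings of the same user-perceived character "é". *)
let precomposed = "\xC3\xA9"   (* <U+00E9> *)
let decomposed = "e\xCC\x81"   (* <U+0065, U+0301> *)

(* Hypothetical toy normalizer, for illustration only: it replaces the byte
   sequence for <U+0065, U+0301> by the one for <U+00E9> and copies every
   other byte unchanged. This is NOT a Unicode normalization form. *)
let toy_compose s =
  let n = String.length s in
  let b = Buffer.create n in
  let rec loop i =
    if i >= n then ()
    else if i + 2 < n
         && s.[i] = 'e' && s.[i + 1] = '\xCC' && s.[i + 2] = '\x81'
    then (Buffer.add_string b precomposed; loop (i + 3))
    else (Buffer.add_char b s.[i]; loop (i + 1))
  in
  loop 0;
  Buffer.contents b

module S = Set.Make (String)

(* Binary comparison treats the two encodings as different keys... *)
let raw_keys = S.cardinal (S.of_list [ precomposed; decomposed ])

(* ...but after bringing both to the same form they collapse into one. *)
let normalized_keys =
  S.cardinal (S.of_list (List.map toy_compose [ precomposed; decomposed ]))
```

Without a common form the set holds two distinct keys for what a reader sees
as one string; once both strings are brought to the same form it holds one.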
The second module is Uucd:
Uucd is an OCaml module to decode the data of the Unicode Character
Database from its XML representation. It provides high-level (but
not necessarily efficient) access to the data so that efficient representations
can be extracted.
Uucd is made of a single module, depends on Xmlm and is
distributed under the BSD3 license.
Project home page: http://erratique.ch/software/uucd
API doc: http://erratique.ch/software/uucd/doc/Uucd
If you want to install the modules via odb here are a few lines
you can add to your odb package file:
http://erratique.ch/software/odb-packages.txt
(these things are still not in oasis-db because uploading .tbz files is
currently broken)
Best,
Daniel