From: Alain Frisch <alain.frisch@lexifi.com>
To: "Daniel Bünzli" <daniel.buenzli@erratique.ch>
Cc: caml-list <caml-list@inria.fr>
Subject: Re: [Caml-list] sedlex = ulex without camlp4
Date: Fri, 18 Jan 2013 16:38:49 +0100
Message-ID: <50F96C89.3030905@lexifi.com>
In-Reply-To: <21CE18DB888F4112A274E2BCBF1D0B6B@erratique.ch>
I have to admit that I don't know much about Unicode and surrogates
(moreover, support for utf-16 was contributed by someone else). I'll
happily update the documentation if someone looks at the source code and
tells me that the property you mention indeed holds.
-- Alain
On 01/18/2013 04:32 PM, Daniel Bünzli wrote:
> Hello Alain,
>
> I rapidly went through your documentation.
>
> If your UTF-8 and UTF-16 decoders are conformant, your module does not output arbitrary Unicode code points but Unicode scalar values (code points minus the UTF-16 surrogates [1]). If that is the case, it would be nice to state this invariant explicitly in the documentation.
>
> This would allow the data generated by sedlex to be passed directly to modules like Uunf without further checks, as those values belong to the Uunf.uchar type [2].
>
> Best,
>
> Daniel
>
>
> [1] http://www.unicode.org/glossary/#unicode_scalar_value
> [2] http://erratique.ch/software/uunf/doc/Uunf#TYPEuchar
>
>
>
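
For reference, the invariant discussed above can be sketched as a simple range check. This is only an illustration, assuming code points are represented as plain OCaml ints; the names is_scalar_value and all_scalar_values are hypothetical and are not part of sedlex or Uunf. A Unicode scalar value is any code point in 0x0000..0x10FFFF outside the surrogate range 0xD800..0xDFFF.

```ocaml
(* Hypothetical helper, not taken from sedlex or Uunf.
   A Unicode scalar value is a code point in 0x0000..0x10FFFF
   that is not a UTF-16 surrogate (0xD800..0xDFFF). *)
let is_scalar_value (cp : int) : bool =
  (0x0000 <= cp && cp <= 0xD7FF) || (0xE000 <= cp && cp <= 0x10FFFF)

(* Check that every element of a decoded buffer is a scalar value.
   The int array representation is an assumption for this sketch. *)
let all_scalar_values (cps : int array) : bool =
  Array.for_all is_scalar_value cps
```

If a decoder is conformant, a check like all_scalar_values should hold for everything it produces, which is what would make a downstream check redundant.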