From: "Daniel Bünzli" <daniel.buenzli@erratique.ch>
To: David Allsopp <dra-news@metastack.com>
Cc: Christophe TROESTLER <Christophe.Troestler@umons.ac.be>,
OCaml Mailing List <caml-list@inria.fr>
Subject: Re: [Caml-list] GSoC: better UTF-8 support
Date: Mon, 28 Feb 2011 13:32:02 +0100
Message-ID: <AANLkTikuGFN1CyxDyuyVG-8PaTPU8KjtFhn+WF6TQVKO@mail.gmail.com>
In-Reply-To: <E51C5B015DBD1348A1D85763337FB6D949100ED6@Remus.metastack.local>
> D:\>md "Paweł Łukaszewski"
> D:\>cd "Paweł Łukaszewski"
> D:\Paweł Łukaszewski>ocaml
> Objective Caml version 3.11.2
>
> # Sys.getcwd();;
> - : string = "D:\\Pawel Lukaszewski"
1) That's a very different problem from defining new Char- and
String-like modules for UTF-8 encoded strings.
2) Is that a Windows problem? Here on OS X:
> mkdir Łukaszewski
> cd Łukaszewski
> rlwrap ocaml
Objective Caml version 3.12.0
# Sys.getcwd ();;
- : string = "/private/tmp/?\129ukaszewski"
# Char.code (Sys.getcwd ()).[13];;
- : int = 197
# Char.code (Sys.getcwd ()).[14];;
- : int = 129
so we have 0xC5 0x81 for Ł which is the right UTF-8 representation for it.
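To make the byte arithmetic explicit, here is a minimal sketch that
decodes the two bytes read above back into a code point. The name
decode_2byte is made up for illustration; it only handles the two-byte
form 110xxxxx 10xxxxxx and does not reject overlong encodings.

```ocaml
(* Decode a two-byte UTF-8 sequence starting at index [i] of [s].
   Hypothetical helper: only the form 110xxxxx 10xxxxxx is handled,
   and overlong encodings (lead bytes 0xC0, 0xC1) are not rejected. *)
let decode_2byte s i =
  let b0 = Char.code s.[i] and b1 = Char.code s.[i + 1] in
  if b0 land 0xE0 <> 0xC0 || b1 land 0xC0 <> 0x80 then
    invalid_arg "decode_2byte: not a two-byte UTF-8 sequence"
  else ((b0 land 0x1F) lsl 6) lor (b1 land 0x3F)

let () =
  (* 197 and 129 are the two bytes returned by Sys.getcwd () above. *)
  let cp = decode_2byte "\197\129" 0 in
  Printf.printf "U+%04X\n" cp
```

The result is U+0141, LATIN CAPITAL LETTER L WITH STROKE, i.e. Ł.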
I'm currently not up to date on the problem of Unicode-encoded
filenames in OCaml, but isn't that something that should be handled by
the underlying libc?
Note, maybe a nice addition for the GSoC project would be to add an
option to ocaml so that it doesn't escape the bytes 127 to 159 when it
prints strings, allowing your UTF-8 aware tty to display UTF-8 encoded
strings correctly, not as above.
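As a rough sketch of what such an option could do (this is not an
existing ocaml flag, and escape_keep_utf8 is a hypothetical name): the
function below escapes like a string literal but copies every byte
>= 128 through unchanged. It assumes the input is already valid UTF-8
and does no validation; it also keeps escaping byte 127 (DEL), since
that is an ASCII control character and never part of a multi-byte
sequence.

```ocaml
(* Hypothetical helper, not part of the standard library: escape the
   ASCII control characters, '"' and '\\', but leave bytes >= 128
   untouched so a UTF-8 aware terminal can render them.
   No UTF-8 validation is performed. *)
let escape_keep_utf8 s =
  let b = Buffer.create (String.length s) in
  String.iter
    (fun c ->
      match c with
      | '"' -> Buffer.add_string b "\\\""
      | '\\' -> Buffer.add_string b "\\\\"
      | '\n' -> Buffer.add_string b "\\n"
      | '\t' -> Buffer.add_string b "\\t"
      | c when Char.code c < 32 || Char.code c = 127 ->
          (* Remaining control characters use the decimal \ddd form. *)
          Buffer.add_string b (Printf.sprintf "\\%03d" (Char.code c))
      | c -> Buffer.add_char b c)
    s;
  Buffer.contents b

let () =
  print_endline (escape_keep_utf8 "/private/tmp/\197\129ukaszewski")
```

On a UTF-8 terminal this should display the directory name with its Ł
intact instead of the escaped form shown above.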
> Fully aware - but just because you need to work with strings does *not* imply you ever need even to compare them. Granted, the documentation may note that to perform a canonical comparison you'll need a third party library but I still maintain that having basic support is better and more usable than having none
[...]
But then, if you only need byte-level comparison, String.compare is
fine. So I really don't see the benefits of these UTF-8 Char- and
String-like modules.
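To illustrate what byte-level comparison does and does not give you,
here is a small example. The two strings are my own choice: both
render as é, one as the single code point U+00E9 and the other as
e followed by the combining acute accent U+0301.

```ocaml
(* Two canonically equivalent UTF-8 encodings of "é". *)
let precomposed = "\195\169"  (* U+00E9 *)
let decomposed = "e\204\129"  (* U+0065 U+0301 *)

let () =
  (* Identical byte sequences compare equal. *)
  Printf.printf "same bytes equal: %b\n"
    (String.compare precomposed precomposed = 0);
  (* Canonically equivalent but differently encoded strings do not. *)
  Printf.printf "equivalent forms equal byte-wise: %b\n"
    (String.compare precomposed decomposed = 0)
```

The second comparison is false, which is the case where a
normalization library would be needed; a UTF-8 String-like module
without normalization would not change that.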
Best,
Daniel