From: tab@snarc.org (Vincent Hanquez)
To: Loup Vaillant <loup.vaillant@gmail.com>
Cc: Jon Harrop <jon@ffconsultancy.com>, caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Re: Rope is the new string
Date: Tue, 9 Oct 2007 21:51:19 +0200
Message-ID: <20071009195119.GA29263@snarc.org>
In-Reply-To: <6f9f8f4a0710091032r3dace80bi4f9b584ae5056675@mail.gmail.com>
On Tue, Oct 09, 2007 at 07:32:25PM +0200, Loup Vaillant wrote:
> > definitely we also need some UTFstring type library (which can use rope,
> > string, whatever internally), with all common type of operations
> > (appending, finding, ...), but it's a just a specific sub case and also
> > a different type not compatible with strings (in OCaml terminology).
>
> Then, we should have both byte arrays (the native Ocaml strings), and
> unicode strings. We will also need proper syntactic sugar for unicode
> strings. Operators, and literal values (like #"example"). Only then,
> ropes could feel like native strings --and be useful as such.
Not sure if I see your point here, since you are mixing rope and
unicode. However, I think we are missing some other type of string
implementation (maybe rope) *alongside* the current implementation of
string.
We are also missing properly integrated unicode support, but which
implementation of the underlying basic byte string is used is
irrelevant to that.
> > [...] it's a just a specific sub case [...]
>
> Internationalization is, mere text crunching is not. (You meant that,
> right?) With properly interfaced unicode strings, I can do my text
> crunching without worrying about internationalization, and with no
> programming overhead. Then, when (if) I have to internationalize, it
> is much easier.
Absolutely. What I meant basically boils down to this: unicode strings
are just a subset of strings (as arrays of bytes). You can store a
unicode string in a byte string, whereas you can't store an arbitrary
byte string in a unicode string.
I want a UTF library that lets me do something like:
  type ustring = unicode_type * string
  val of_string : string -> ustring  (* raises if not unicode compliant *)
  val to_string : ustring -> string
  val append : ustring -> ustring -> ustring
  ...etc
That way, when I'm manipulating unicode strings, I can't accidentally
append a binary string to a unicode string. I can code safely with my
unicode strings (whatever the format, utf-{8..32}), and I certainly expect
the type system to complain loudly when I do something that might break
unicode.
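
To make the idea concrete, here is a minimal sketch of such an interface. It is an illustration only, not an existing library: the module name Ustring is hypothetical, it handles UTF-8 only (instead of carrying a unicode_type tag), and it assumes OCaml 4.14 or later for String.is_valid_utf_8. The type is kept abstract so that of_string is the only way in.

```ocaml
(* Sketch only: Ustring is a hypothetical module, UTF-8 only.
   Assumes OCaml >= 4.14 for String.is_valid_utf_8. *)
module Ustring : sig
  type t                        (* abstract: always holds valid UTF-8 *)
  val of_string : string -> t   (* raises Invalid_argument on invalid UTF-8 *)
  val to_string : t -> string
  val append : t -> t -> t
end = struct
  type t = string

  let of_string s =
    if String.is_valid_utf_8 s then s
    else invalid_arg "Ustring.of_string: not valid UTF-8"

  let to_string u = u

  (* Concatenating two valid UTF-8 sequences gives a valid UTF-8 sequence. *)
  let append a b = a ^ b
end

let greeting =
  Ustring.append (Ustring.of_string "caf\xc3\xa9") (Ustring.of_string " au lait")

let () = print_endline (Ustring.to_string greeting)

(* Rejected by the type checker, since a plain string is not a Ustring.t:
   let _ = Ustring.append greeting "\xff\xfe" *)

(* Binary data is refused at the boundary instead. *)
let rejects_binary =
  match Ustring.of_string "\xff\xfe" with
  | _ -> false
  | exception Invalid_argument _ -> true
```

Because Ustring.t is abstract outside the module, a raw byte string can only become a unicode string by passing the check in of_string, which is the "type system complains loudly" behaviour described above.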
> About the incompatibility, the two types of strings are incompatible
> anyway.
>
> Maybe even more than ints and floats. Sure you once tried some
> "Obj.magic" conversions of an non-English text with emacs. :-)
I use vim ;), but heh after using Obj.magic you're on your own :)
--
Vincent Hanquez