From: Alain.Frisch@ens.fr
To: Xavier Leroy <xavier.leroy@inria.fr>
Cc: brogoff@speakeasy.net, Oliver Bandel <oliver@first.in-berlin.de>,
"caml-list@inria.fr" <caml-list@inria.fr>
Subject: Re: [Caml-list] Strings as arrays or lists...
Date: Sun, 2 Mar 2003 20:03:12 +0100 (MET) [thread overview]
Message-ID: <Pine.SOL.4.44.0303021938500.4941-100000@clipper.ens.fr> (raw)
In-Reply-To: <20030302193437.A6487@pauillac.inria.fr>
On Sun, 2 Mar 2003, Xavier Leroy wrote:
> > > in Haskell, strings are lists of chars.
> >
> > And what a horrible design decision that is!
>
> Agreed. Well, it's a great way to multiply the memory requirements
> for your strings by a factor of 12 (on 32-bit platforms) or 24 (on
> 64-bit platforms), while at the same time losing constant-time
> indexing :-)
Not necessarily. Considering strings as lists of chars is a design
decision about the semantics of the language, not its implementation. For
instance in CDuce, we choosed to define the type String as [Char*], that
is, homogeneous sequences of (Unicode) characters. The idea is that we
want to have characters and elements at the same level inside XML elements
(the other solution is to have elements and strings; but when you
concatenate two sequences of elements-or-strings, you want to perform
implicit concatenation of strings at the middle to avoid two consecutive
strings). For the implementation, we introduce a compact representation of
strings at runtime, namely a pointer to a segment of a buffer (OCaml
string) + a "continuation" (which can be a representation of the same
kind, or a real list cell). When a pattern tries to deconstruct such a
representation to see it as list cell (head,tail), the runtime system
extracts the first character from the buffer and compute a pointer to the
next character (depending on the internal Unicode encoding of the buffer).
When a pattern extracts a consecutive subsequence (substrings) - patterns
in CDuce are roughly regular expressions - it actually returns a pointer
to a subsegment of the buffer. Concatenation of sequences is computed
lazily, to avoid quadractic behavior when appending elements one by one at
the end. Basically, in many situations, we avoid space overhead and keep a
reasonable time overhead.
I'm not advocating this kind of techniques for a fully general
language such as OCaml; I just wanted to defend the "horrible" design for
specific applications...
-- Alain
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
next prev parent reply other threads:[~2003-03-03 21:43 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2003-02-27 22:31 Oliver Bandel
2003-02-28 1:03 ` brogoff
2003-03-02 18:34 ` Xavier Leroy
2003-03-02 19:03 ` Alain.Frisch [this message]
2003-03-03 8:50 ` Luc Maranget
2003-03-03 17:12 ` brogoff
2003-03-03 17:40 ` Diego Olivier Fernandez Pons
2003-03-04 2:49 ` Eric C. Cooper
2003-03-04 8:29 ` Fabrice Le Fessant
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Pine.SOL.4.44.0303021938500.4941-100000@clipper.ens.fr \
--to=alain.frisch@ens.fr \
--cc=brogoff@speakeasy.net \
--cc=caml-list@inria.fr \
--cc=oliver@first.in-berlin.de \
--cc=xavier.leroy@inria.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox