From: Kuba Ober <ober.14@osu.edu>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Re: Why OCaml sucks
Date: Tue, 13 May 2008 09:33:13 -0400
Message-ID: <200805130933.13952.ober.14@osu.edu>
In-Reply-To: <200805121516.13983.jon@ffconsultancy.com>

On Monday 12 May 2008, Jon Harrop wrote:
> On Monday 12 May 2008 13:54:45 Kuba Ober wrote:
> > > 5. Strings: pushing unicode throughout a general purpose language is a
> > > mistake, IMHO. This is why languages like Java and C# are so slow.
> >
> > Unicode by itself, when wider-than-byte encodings are used, adds "zero"
> > runtime overhead; the only overhead is storage (2 or 4 bytes per
> > character).
>
> You cannot degrade memory consumption without also degrading performance.
> Moreover, there are hidden costs such as the added complexity in a lexer
> which potentially has 256x larger dispatch tables or an extra indirection
> for every byte read.

In a typical programming language that accepts only ASCII characters outside
of string constants, your dispatch table will be short anyway (it covers the
ASCII subset only), and there will be an extra comparison or two, active only
when lexing strings. So no biggie.
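
A minimal sketch of that point, in Python, assuming a toy language whose
tokens are ASCII identifiers, numbers, punctuation and double-quoted strings
without escape sequences. The token class names and the tokenize function are
hypothetical, made up for this illustration. The dispatch table has 128
entries no matter how wide the character representation is.

```python
# Hypothetical token classes; the names are illustrative only.
IDENT, NUMBER, SPACE, PUNCT, QUOTE = "ident", "number", "space", "punct", "quote"

# One entry per ASCII code point: 128 entries, independent of character width.
ASCII_TABLE = [PUNCT] * 128
for c in range(128):
    ch = chr(c)
    if ("a" <= ch <= "z") or ("A" <= ch <= "Z") or ch == "_":
        ASCII_TABLE[c] = IDENT
    elif "0" <= ch <= "9":
        ASCII_TABLE[c] = NUMBER
    elif ch in " \t\r\n":
        ASCII_TABLE[c] = SPACE
ASCII_TABLE[ord('"')] = QUOTE


def tokenize(src):
    """Split src into (kind, text) pairs; non-ASCII is legal only in strings."""
    tokens = []
    i, n = 0, len(src)
    while i < n:
        code = ord(src[i])
        if code >= 128:
            # A single range check guards the table lookup.
            raise ValueError(f"non-ASCII character at offset {i}")
        kind = ASCII_TABLE[code]
        start = i
        if kind == QUOTE:
            # Inside a literal everything up to the closing quote is opaque,
            # so non-ASCII characters never reach the dispatch table.
            end = src.find('"', i + 1)
            if end < 0:
                raise ValueError("unterminated string literal")
            i = end + 1
            tokens.append(("string", src[start:i]))
        elif kind in (IDENT, NUMBER, SPACE):
            # Consume a run of characters of the same class.
            i += 1
            while i < n and ord(src[i]) < 128 and ASCII_TABLE[ord(src[i])] == kind:
                i += 1
            if kind != SPACE:
                tokens.append((kind, src[start:i]))
        else:
            i += 1
            tokens.append((PUNCT, src[start:i]))
    return tokens
```

For example, tokenize('s = "zażółć"') yields an identifier, a punctuation
token and one string token that carries the non-ASCII text untouched, while
tokenize("ż = 1") is rejected by the range check.
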
> > Given that storage is cheap, I'd much rather have Unicode support than
> > lack of it.
>
> Sure. I don't mind unicode being available. I just don't want to have to
> use it myself because it is of no benefit to me (or many other people) but
> is a significant cost.

Let's look at a relatively widely deployed example: the Qt toolkit.
Qt uses a 16-bit Unicode representation, and I really doubt that there are any
runtime-measurable costs associated with it. By "runtime-measurable" I mean
that, say, application startup would take longer. A typical Qt application
does quite a bit of string manipulation on startup (even file names are
stored in Unicode and converted to/from the OS's code page), yet startup time
of "major" applications was cut in half between Qt 3 and Qt 4 by
algorithmic optimizations unrelated to strings (reducing the number of
malloc calls, for one). So, unless you can show that one of your applications
actually runs faster with non-Unicode strings than with well-implemented
Unicode ones, I will not really consider Unicode to be a problem.

I do agree that many tools, like lexer generators, may not be Unicode-aware or
may have poorly implemented Unicode awareness. Still, the 256x lexer table
blowup shouldn't happen even if you were implementing APL with a fully
Unicode-aware lexer. The first-level lexer table should be split into two
pieces (ASCII and APL ranges), and everything else is either an error or goes
opaquely into string constants.
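
The split could be sketched roughly as follows, in Python. The APL range used
here (U+2336 to U+237A, the APL functional symbols in Unicode's Miscellaneous
Technical block) is an assumption for illustration; a real APL lexer would
need more code points, and the class names and the classify function are
hypothetical placeholders.

```python
# Assumed range for illustration only: APL functional symbols,
# U+2336 through U+237A. A real lexer would cover more code points.
APL_FIRST, APL_LAST = 0x2336, 0x237A

# Two small first-level tables instead of one 65536-entry table.
# The per-character classes are elided; every slot holds a placeholder.
ascii_table = ["ascii"] * 128
apl_table = ["apl"] * (APL_LAST - APL_FIRST + 1)


def classify(ch, in_string=False):
    """Return the first-level class of ch using the two-piece table."""
    code = ord(ch)
    if code < 128:
        return ascii_table[code]
    if APL_FIRST <= code <= APL_LAST:
        return apl_table[code - APL_FIRST]
    if in_string:
        # Anything else is carried opaquely inside a string constant.
        return "opaque"
    raise ValueError(f"unexpected character U+{code:04X}")
```

Under these assumptions the two tables hold 128 + 69 = 197 entries in total,
and the cost of the split is at most two range comparisons per character.
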
A lexer jump table only makes sense when it actually saves time compared to
a bunch of compare-and-jumps. On modern architectures some jump tables
may actually be slower than compare-and-jumps, because some hardware
optimizations done by the CPU (say, branch prediction) may simply ignore
jump tables, or only handle tables commonly generated by compilers...

Cheers, Kuba