Message-ID: <48299C67.1010905@fischerventure.com>
Date: Tue, 13 May 2008 08:49:27 -0500
From: Robert Fischer
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Re: Why OCaml sucks
References: <200805090139.54870.jon@ffconsultancy.com>
 <200805120854.45499.ober.14@osu.edu>
 <200805121516.13983.jon@ffconsultancy.com>
 <200805130933.13952.ober.14@osu.edu>
In-Reply-To: <200805130933.13952.ober.14@osu.edu>

>>>> 5. Strings: pushing unicode throughout a general purpose language is a
>>>> mistake, IMHO. This is why languages like Java and C# are so slow.
>>> Unicode by itself, when wider-than-byte encodings are used, adds "zero"
>>> runtime overhead; the only overhead is storage (2 or 4 bytes per
>>> character).
>> You cannot degrade memory consumption without also degrading performance.
>> Moreover, there are hidden costs such as the added complexity in a lexer
>> which potentially has 256x larger dispatch tables or an extra indirection
>> for every byte read.
>

Okay, I was going to let this slide, but it kept resurfacing and annoying
me.

Is there any empirical support for the assertion that Java and C# are slow
because of *unicode*? Of all things, *unicode*? The fact that they're
bytecode languages isn't a bigger hit? At least with the JVM, the
hypercomplicated GC should probably take some of the blame, too -- I've
seen 2x speed increases by *reducing* the space available to the GC, and
10x speed increases by boosting the space available to ridiculous levels so
that the full GC barely ever has to fire. The nigh-universal,
optimization-ruining mutable data and virtual function (e.g. method)
dispatch surely don't help, either. And this is to say nothing of
user-space problems like the explosion of nontrivial types associated with
the object-driven style.

With all that going on, you're blaming their *Unicode support* for why
they're slow? "This is why languages like Java and C# are so slow."
Really? Got evidence for that?

~~ Robert.