From: Gordon Henriksen
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Re: Why OCaml sucks [unicode]
Date: Wed, 14 May 2008 08:04:23 -0400

On May 14, 2008, at 01:04, Mike Lin wrote:

> Re: unicode support, anyone else who was around programming Win32
> 5-10 years ago (before .NET) might join me in offering a wary word
> of caution: the otherwise-disarming "let's just support both in one
> runtime" philosophy can be taken too far. In the nightmare scenario
> (which was reality until Windows 9x finally died an overdue death)
> you end up with two versions of every library function that takes a
> string. (This when we now have like 3 contenders for a standard
> library and n GUI toolkit bindings!)

FWIW, this continues to be the reality. Win32 entry points that handle
strings are declared in pairs, like so:

    BOOL DoFooW(...); // wide variant takes wchar_t; always UTF-16
    BOOL DoFooA(...); // "ANSI" code page variant takes char; could be UTF-8

    #ifdef _UNICODE
    # define DoFoo DoFooW
    #else
    # define DoFoo DoFooA
    #endif

Doubtless, in most cases the implementation transcodes the string in
one variant and passes it to the other.

> I'll take a couple of days to try to reason out more specifically
> how that got so out of control, but I just want to note that there
> is a precedent for attempting a sympatric speciation between ASCII
> and Unicode software, and, while it was necessary for historical
> reasons, it was pretty godawful to deal with!

-- Gordon
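P.S. For anyone who hasn't had the pleasure: here is a minimal sketch
of the caller's side of the convention, using the hypothetical DoFoo
pair above together with the real tchar.h machinery. The same source
compiles against either entry point, depending on whether _UNICODE is
defined.

    #include <windows.h>
    #include <tchar.h>

    /* Hypothetical A/W pair, as above. */
    BOOL DoFooA(const char *s);
    BOOL DoFooW(const wchar_t *s);

    #ifdef _UNICODE
    # define DoFoo DoFooW
    #else
    # define DoFoo DoFooA
    #endif

    int main(void)
    {
        /* _T("...") expands to L"..." when _UNICODE is defined and to
           a plain narrow literal otherwise, so this one line resolves
           to either DoFooW(L"hello") or DoFooA("hello"). */
        DoFoo(_T("hello"));
        return 0;
    }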
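And a sketch of the usual shape of the "A" shim itself, again assuming
the hypothetical DoFoo pair; MultiByteToWideChar is the real API such
shims use to transcode from the active ANSI code page to UTF-16 before
delegating to the wide variant.

    #include <windows.h>

    BOOL DoFooW(const wchar_t *s); /* hypothetical wide variant */

    BOOL DoFooA(const char *s)
    {
        /* With a NULL output buffer, this first call returns the
           required length in wchar_t units, including the terminating
           NUL (because the source length is given as -1). */
        int len = MultiByteToWideChar(CP_ACP, 0, s, -1, NULL, 0);
        if (len == 0)
            return FALSE;

        wchar_t *wide = (wchar_t *)HeapAlloc(GetProcessHeap(), 0,
                                             len * sizeof(wchar_t));
        if (wide == NULL)
            return FALSE;

        /* Second call performs the actual transcoding; then delegate
           to the wide entry point and clean up. */
        MultiByteToWideChar(CP_ACP, 0, s, -1, wide, len);
        BOOL ok = DoFooW(wide);
        HeapFree(GetProcessHeap(), 0, wide);
        return ok;
    }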