From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (from weis@localhost)
    by pauillac.inria.fr (8.7.6/8.7.3) id TAA21970
    for caml-redistribution; Thu, 21 Oct 1999 19:10:47 +0200 (MET DST)
Received: from nez-perce.inria.fr (nez-perce.inria.fr [192.93.2.78])
    by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id OAA13398
    for ; Thu, 21 Oct 1999 14:05:03 +0200 (MET DST)
Received: from mail.nap.com.ar (mail-in.nap.com.ar [200.49.40.90])
    by nez-perce.inria.fr (8.8.7/8.8.7) with SMTP id OAA27992
    for ; Thu, 21 Oct 1999 14:04:51 +0200 (MET DST)
Received: from [200.41.180.74] (HELO k-bell.com)
    by mail.nap.com.ar (Stalker SMTP Server 1.8b3) with ESMTP id S.0003814090;
    Thu, 21 Oct 1999 09:04:43 -0300
Message-ID: <380F0157.CDBBAD7D@k-bell.com>
Date: Thu, 21 Oct 1999 09:05:00 -0300
From: =?iso-8859-1?Q?Mat=EDas?= Giovannini
Reply-To: matias@k-bell.com
Organization: Script S.A.
X-Mailer: Mozilla 4.7 (Macintosh; I; PPC)
X-Accept-Language: en,es-AR,es
MIME-Version: 1.0
To: caml-list@inria.fr
CC: Gerd.Stolpmann@darmstadt.netsurf.de, skaller
Subject: Re: localization, internationalization and Caml
References: <380CB30E.56D1A8A2@maxtal.com.au> <99102100543400.15513@ice>
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit
Sender: weis

Gerd Stolpmann wrote:
>
> On Tue, 19 Oct 1999, John Skaller wrote:
> >Gerd Stolpmann wrote:
> >> The enlarged character sets become more and more important, and it is only a
> >> matter of time until every piece of software which wants to be taken seriously
> >> can process them, even a dumb terminal or simple text editor. So you will be
> >> able to put accented characters into your comments, and you will see them as
> >> such even if you 'cat' the program text to the terminal or printer; this will
> >> work everywhere...
> >
> > Yes. This time is not here yet, but it will come soon that
> >international support is mandatory for all large software purchases
> >by governments and large corporations.
>
> I do not believe that this will be the driving force because the current
> solutions exist, and it is VERY expensive to replace them. It is even cheaper
> to replace a language than a character set/encoding. Looks like another Year
> 2000 but without deadline.

I still don't understand the point of this discussion. As a MacOS
programmer of many years, I tend to view localization and
internationalization as tasks best performed by the operating system,
or at least by pluggable modules. This discussion of patching l10n and
i18n functions *into* OCaml is, to me at least, losing direction.

OCaml uses Latin1 for its *internal* encoding of identifiers. While
I'll agree that my view is chauvinistic (and selfish, perhaps: I
already have "¿¡áéíóúüñÁÉÍÓÚÜÑ" for writing in Spanish, why should I
ask for more?), I see no restriction in that (well, if I were Chinese,
or Egyptian, I would see things differently). What's more, the whole
syntactic apparatus of a programming language *assumes* a Latin
setting, where things make sense when read from left to right, from
top to bottom, and where punctuation is what we're used to.
Programming languages suited for a Han, or Arabic, or even a Hebrew
audience would have to be rethought from the ground up.

On the other hand, OCaml provides a String type that *can be* seen as
a variable-length sequence of uninterpreted bytes. We have
uninterpreted bytes! It's all we need to build whatever I18NString
type we may need. What is missing is *library* facilities to abstract
that view into a full-fledged i18n machinery. Of course, there's a
problem with the manipulation of 32-bit integer values, but if used
with care, the Nat datatype could serve perfectly well as the
underlying, low-level datatype.

Which makes me think, John, you already have variable-length int
arrays. Nats are as unsafe as they get :-)

Regards,

Matías.

--
I got your message. I couldn't read it. It was a cryptogram.
-- Laurie Anderson
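
Appended illustration of the "uninterpreted bytes" point made in the
message above: a minimal sketch of how a library-level string type
could be layered on ordinary OCaml strings. Everything here is an
assumption made for the example and not something the message
specifies: the module name I18n_string, its functions, and the choice
of UTF-8 as the byte encoding are all hypothetical. It also uses plain
int for code points instead of Nat, on the assumption that values up
to U+10FFFF fit comfortably in a native OCaml int.

```ocaml
(* Hypothetical sketch: an abstract text type stored as plain bytes.
   The underlying OCaml string is never interpreted by the language;
   only this module knows the bytes are UTF-8. *)
module I18n_string : sig
  type t
  val of_code_points : int list -> t  (* encode code points as UTF-8 *)
  val to_code_points : t -> int list  (* decode back to code points *)
  val of_bytes : string -> t          (* raises Invalid_argument if malformed *)
  val to_bytes : t -> string          (* the raw byte sequence *)
  val length : t -> int               (* length in code points, not bytes *)
end = struct
  type t = string

  let encode buf c =
    let add n = Buffer.add_char buf (Char.chr n) in
    if c < 0 || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF) then
      invalid_arg "I18n_string: not a valid code point"
    else if c < 0x80 then add c
    else if c < 0x800 then begin
      add (0xC0 lor (c lsr 6));
      add (0x80 lor (c land 0x3F))
    end else if c < 0x10000 then begin
      add (0xE0 lor (c lsr 12));
      add (0x80 lor ((c lsr 6) land 0x3F));
      add (0x80 lor (c land 0x3F))
    end else begin
      add (0xF0 lor (c lsr 18));
      add (0x80 lor ((c lsr 12) land 0x3F));
      add (0x80 lor ((c lsr 6) land 0x3F));
      add (0x80 lor (c land 0x3F))
    end

  let of_code_points cs =
    let buf = Buffer.create 16 in
    List.iter (encode buf) cs;
    Buffer.contents buf

  (* Simplified decoder: it checks the byte structure but does not
     reject overlong forms or encoded surrogates. *)
  let to_code_points s =
    let n = String.length s in
    let byte i = Char.code s.[i] in
    (* Read a continuation byte (10xxxxxx) and return its 6 payload bits. *)
    let cont i =
      if i >= n || byte i land 0xC0 <> 0x80 then
        invalid_arg "I18n_string: malformed UTF-8"
      else byte i land 0x3F
    in
    let rec go i acc =
      if i >= n then List.rev acc
      else
        let b = byte i in
        if b < 0x80 then go (i + 1) (b :: acc)
        else if b land 0xE0 = 0xC0 then
          let c = ((b land 0x1F) lsl 6) lor cont (i + 1) in
          go (i + 2) (c :: acc)
        else if b land 0xF0 = 0xE0 then
          let c = ((b land 0x0F) lsl 12)
                  lor (cont (i + 1) lsl 6)
                  lor cont (i + 2) in
          go (i + 3) (c :: acc)
        else if b land 0xF8 = 0xF0 then
          let c = ((b land 0x07) lsl 18)
                  lor (cont (i + 1) lsl 12)
                  lor (cont (i + 2) lsl 6)
                  lor cont (i + 3) in
          go (i + 4) (c :: acc)
        else invalid_arg "I18n_string: malformed UTF-8"
    in
    go 0 []

  let of_bytes s = ignore (to_code_points s); s
  let to_bytes s = s
  let length s = List.length (to_code_points s)
end

let () =
  (* 'E', U+00F1 (n with tilde), 'e': the middle one takes two bytes. *)
  let s = I18n_string.of_code_points [0x45; 0xF1; 0x65] in
  Printf.printf "%d code points, %d bytes\n"
    (I18n_string.length s)
    (String.length (I18n_string.to_bytes s))
```

Because the type is abstract, client code cannot confuse byte offsets
with character positions, and the representation could later be
swapped for another encoding without touching the compiler.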