From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail3-relais-sop.national.inria.fr (mail3-relais-sop.national.inria.fr [192.134.164.104]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id p1SAC8Xs022624 for ; Mon, 28 Feb 2011 11:12:08 +0100 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AvUBAIIDa01RZ90xkWdsb2JhbACEJKE9ZBUBAQEBCQsKBxEDIqtfj3+BJ4NEdgSMHYwR X-IronPort-AV: E=Sophos;i="4.62,238,1297033200"; d="scan'208";a="76725675" Received: from mtaout03-winn.ispmail.ntl.com ([81.103.221.49]) by mail3-smtp-sop.national.inria.fr with ESMTP; 28 Feb 2011 11:12:03 +0100 Received: from aamtaout03-winn.ispmail.ntl.com ([81.103.221.35]) by mtaout03-winn.ispmail.ntl.com (InterMail vM.7.08.04.00 201-2186-134-20080326) with ESMTP id <20110228101202.YFWW23441.mtaout03-winn.ispmail.ntl.com@aamtaout03-winn.ispmail.ntl.com>; Mon, 28 Feb 2011 10:12:02 +0000 Received: from romulus.metastack.com ([81.102.132.77]) by aamtaout03-winn.ispmail.ntl.com (InterMail vG.3.00.04.00 201-2196-133-20080908) with ESMTP id <20110228101202.YEDE28282.aamtaout03-winn.ispmail.ntl.com@romulus.metastack.com>; Mon, 28 Feb 2011 10:12:02 +0000 Received: from remus.metastack.local ([172.16.0.1]) by romulus.metastack.com (8.14.2/8.14.2) with ESMTP id p1SABw4Q003367 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=FAIL); Mon, 28 Feb 2011 10:11:59 GMT Received: from Remus.metastack.local ([fe80::547c:3c42:e1da:eda2]) by Remus.metastack.local ([fe80::547c:3c42:e1da:eda2%11]) with mapi; Mon, 28 Feb 2011 10:07:12 +0000 From: David Allsopp To: =?utf-8?B?J0RhbmllbCBCw7xuemxpJw==?= , "'Christophe TROESTLER'" CC: "'OCaml Mailing List'" Thread-Topic: [Caml-list] GSoC: better UTF-8 support Thread-Index: AQHL1yHbpO6NgLMzxEmbDuJZqeBtYpQWnSWAgAALgYA= Date: Mon, 28 Feb 2011 10:07:11 +0000 Message-ID: References: <20110228.093528.996524125295855263.Christophe.Troestler@umons.ac.be> In-Reply-To: Accept-Language: en-GB, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Organization: MetaStack Solutions Ltd. X-Scanned-By: MIMEDefang 2.65 on 81.102.132.77 X-Cloudmark-Analysis: v=1.1 cv=R50lirqlHffDPPkwUlkuVa99MrvKdVWo//yz83qex8g= c=1 sm=0 a=fAv-q_DikvIA:10 a=IkcTkHD0fZMA:10 a=xqWC_Br6kY4A:10 a=ZpFJ-VaCQrGO8RhLpKAA:9 a=vS7bXYMPKVtd-88181xOvhZJgWkA:4 a=QEXdDO2ut3YA:10 a=HpAAvcLHHh0Zw7uRqdWCyQ==:117 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from base64 to 8bit by walapai.inria.fr id p1SAC8Xs022624 Subject: RE: [Caml-list] GSoC: better UTF-8 support Daniel Bünzli wrote: > > - UTF8.Char and UTF8.String modules should be written with the same > >  interface as Char and String.  [Camomile should be adapted > >  consequently.] > > Is it a good idea to replicate the poor interface that the module Char > and String represent to manipulate strings ? If it's to go into the standard library then yes, it should exactly replicate the interface of the Char and String modules, that way within the standard library UTF8 can be used as a drop-in replacement (via a module or open statement at the top of some code). Anything else would be a very bad idea - two different interfaces over two representations of the same abstract concept (i.e. strings) would be far worse than one slightly inferior interface used on both. Out of interest, what are your complaints against the String and Char modules - missing functions or something deeper? Anything else sounds like too wild a change to have a cat in hell's chance of getting into the standard library as a patch. > > - Graphics: UTF-8 text printing > > Are there really a lot of people using the Graphics module ? Again, if the idea is to get UTF-8 support properly into the *standard* library then all aspects of string processing within the whole of the standard library should support it properly. > > - Str: (character ranges) > > This would be the only interesting thing to me. It is mildly amusing that you criticise the String and Char modules, yet have interest in this module given that Pcre is so often recommended as the practical/sensible one to use (I know that Str is a lot faster than it used to be and indeed use it myself when I don't want the external dependency) :o) David