From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.9 required=5.0 tests=AWL,SPF_NEUTRAL autolearn=disabled version=3.1.3 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id 3D99FBC6C for ; Fri, 18 Jan 2008 23:54:24 +0100 (CET) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AgAAAGC8kEfRVYC7mGdsb2JhbACQFQEBAQEBBgIGCQoYgRSVNYdo X-IronPort-AV: E=Sophos;i="4.25,218,1199660400"; d="scan'208";a="21491546" Received: from fk-out-0910.google.com ([209.85.128.187]) by mail4-smtp-sop.national.inria.fr with ESMTP; 18 Jan 2008 23:54:23 +0100 Received: by fk-out-0910.google.com with SMTP id b27so1017728fka.11 for ; Fri, 18 Jan 2008 14:54:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:from:to:content-type:content-transfer-encoding:mime-version:subject:date:x-mailer:sender; bh=qjphfwZg3Iqa58fEU5lMWwqEtpWmGlbGL+AMo5OjJgM=; b=a+4OFm6a4dOH5pObILWs+gQyL3IKZsjViRIteBdPjqr2pMHu7YRXGia2mS7gmCxhN/+dj8RJbVwHMGOZesUxznk2E/A0rndtN1ID+MJcpWd33v+sTz4qi3I5yLH7CtKNzwmCh/6FSCqGzO6H+W9A9Mm9DuGoWJzsJTkc0edkdRY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:from:to:content-type:content-transfer-encoding:mime-version:subject:date:x-mailer:sender; b=jvqpOfpC/HYSl1iSfxQhd9aVZABRuljtKWdNUae1zO0ifKIb6mSdSCJK1dtJz7JQm6XWHd7ZrNEvZPHzsnDsxc6/tfkAazXH5hVtJJhDxI0YFPt/AaFLKwL78SV18Y9H8b/VyPTfrV+UCTb76pIv0UhJi2HH4hV1MqP2bDiQSB8= Received: by 10.78.138.14 with SMTP id l14mr5716552hud.57.1200696863172; Fri, 18 Jan 2008 14:54:23 -0800 (PST) Received: from ?192.168.1.58? ( [85.2.14.75]) by mx.google.com with ESMTPS id 31sm2429831nfu.5.2008.01.18.14.54.22 (version=TLSv1/SSLv3 cipher=OTHER); Fri, 18 Jan 2008 14:54:22 -0800 (PST) Message-Id: <11B29ED2-4DE3-45F1-BF19-E19790944216@erratique.ch> From: =?ISO-8859-1?Q?B=FCnzli_Daniel?= To: caml-list caml-list Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: quoted-printable Mime-Version: 1.0 (Apple Message framework v915) Subject: Seeking feedback on Xmlm Date: Fri, 18 Jan 2008 23:54:20 +0100 X-Mailer: Apple Mail (2.915) Sender: =?ISO-8859-1?Q?Daniel=20B=FCnzli?= X-Spam: no; 0.00; bunzli:01 buenzli:01 semantics:01 parses:01 commited:01 polymorphic:01 syntactic:01 parsing:01 parsing:01 exception:01 functions:01 int:01 int:01 boolean:01 prefixes:01 Hello, I plan to make some changes to Xmlm. While it has been downloaded =20 approximatively a hundred time I never got any feedback on it (bots =20 only ?). If anybody found any issues I'd be happy to know about them =20 now. I would also gladly take comments on the following points. 1. I plan to remove the persistent cursor and the tree representation ('a =20= Xmlm.tree and 'a Xmlm.cursor). =46rom my experience 'a Xmlm.tree is =20 awkward to pattern match on and more than once I found it much cleaner =20= to input documents with the sequential interface into a custom data =20 structure corresponding to the document's semantics. Besides while I =20 really see the point of the zipper in the context in which it was =20 invented (ui for a structured text editor) I wasn't =13convinced by the =20= use of the cursor to make "batch" tree processing hence I don't think =20= it has its place at Xmlm's level. Finally using the sequential =20 interface to input/output a custom tree representation is only a few =20 lines of code and provided in the documentation's sample code [1]. By =20= removing these types I hope Xmlm's users won't waste their time to =20 reach to the same conclusions. 2. Instead of throwing an exception the input function will return a =20 value of type [ `Value of 'a | `Error of (int * int) * error ]. Since =20= this will break backward compatibility, I'll take the opportunity to =20 also change the 'error' type from a variant to a polymorphic one, for =20= syntactic convenience. 3. I will implement better xml namespace support. Currently Xmlm parses =20 qualified names however the client has to maintain its own structure =20 during parsing to know in which namespace he is. The idea is to add =20 the boolean label ?expand_names to the input function. When set to =20 true, you get expanded names instead of qualified names (an expanded =20 name is a couple (uri, name) where uri is the namespace uri). The =20 output_of_* functions will have a label ~expanded_names to indicate =20 that we will pass expanded names and xmlm will automatically take care =20= of the rest (though to have pretty prefixes you'll have to process =20 your names manually). 4. As an external add-on I commited the file test/xhtml.ml containing a =20 mapping from xhtml 1.1 entities to their corresponding utf-8 character =20= sequence. This can be used to construct a function to resolve xhtml =20 entities and hence get xhtml parsing at your fingertips. This will be all for this new version. I hope this is the last time I =20= break backward compatibility. Best, Daniel [1] http://erratique.ch/software/xmlm/doc/Xmlm#ex