Seeking feedback on Xmlm - Bünzli Daniel

From: "Bünzli Daniel" <daniel.buenzli@erratique.ch>
To: caml-list caml-list <caml-list@yquem.inria.fr>
Subject: Seeking feedback on Xmlm
Date: Fri, 18 Jan 2008 23:54:20 +0100
Message-ID: <11B29ED2-4DE3-45F1-BF19-E19790944216@erratique.ch>

Hello,

I plan to make some changes to Xmlm. While it has been downloaded
approximately a hundred times, I never got any feedback on it (bots
only?). If anybody found any issues, I'd be happy to know about them
now.

I would also gladly take comments on the following points.

1.
I plan to remove the persistent cursor and the tree representation ('a  
Xmlm.tree and 'a Xmlm.cursor). From my experience 'a Xmlm.tree is  
awkward to pattern match on and more than once I found it much cleaner  
to input documents with the sequential interface into a custom data  
structure corresponding to the document's semantics. Besides, while I
really see the point of the zipper in the context in which it was
invented (UI for a structured text editor), I wasn't convinced by the
use of the cursor for "batch" tree processing, hence I don't think
it has its place at Xmlm's level. Finally, using the sequential
interface to input/output a custom tree representation takes only a few
lines of code and is shown in the documentation's sample code [1]. By
removing these types I hope Xmlm's users won't waste their time
reaching the same conclusions.
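
As a rough illustration of point 1, here is a minimal sketch of folding
a sequential stream into a custom tree. The `signal` type, the `tree`
type and `tree_of_signals` are hypothetical names invented for this
example; they are not Xmlm's API, and the sample code in [1] remains
the reference.

```ocaml
(* Hypothetical signal type standing in for a sequential XML interface;
   this is an assumption for illustration, NOT Xmlm's actual API. *)
type signal =
  | El_start of string * (string * string) list  (* tag name, attributes *)
  | El_end
  | Data of string

(* A custom tree; a real application would shape this after the
   document's semantics rather than after XML. *)
type tree =
  | Element of string * (string * string) list * tree list
  | Text of string

(* Build a tree from a signal list using an explicit stack of open
   elements. Each frame holds the children seen so far, in reverse
   order. Returns None if the sequence is not a single well-nested
   element. *)
let tree_of_signals (signals : signal list) : tree option =
  let rec loop stack signals =
    match signals, stack with
    | El_start (name, atts) :: rest, _ ->
        loop ((name, atts, []) :: stack) rest
    | Data d :: rest, (name, atts, children) :: up ->
        loop ((name, atts, Text d :: children) :: up) rest
    | El_end :: rest, (name, atts, children) :: up ->
        let el = Element (name, atts, List.rev children) in
        (match up, rest with
         | [], [] -> Some el          (* root closed, input exhausted *)
         | [], _ -> None              (* signals after the root element *)
         | (pname, patts, pchildren) :: up', _ ->
             loop ((pname, patts, el :: pchildren) :: up') rest)
    | _ -> None                       (* malformed or truncated sequence *)
  in
  loop [] signals
```

The stack keeps the function tail-recursive in the nesting depth, which
is one reason to prefer it over a naive recursive descent.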

2.
Instead of raising an exception, the input function will return a
value of type [ `Value of 'a | `Error of (int * int) * error ]. Since
this will break backward compatibility, I'll take the opportunity to
also change the 'error' type from a regular variant to a polymorphic
variant, for syntactic convenience.
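
To make the shape of point 2 concrete, here is a small sketch of how
client code might match on such a result. The error constructors, the
toy `input` function and the reading of (int * int) as a (line, column)
pair are all assumptions made for this example; the message does not
specify them.

```ocaml
(* Hypothetical error type as a polymorphic variant; the constructor
   names are illustrative only, not Xmlm's. *)
type error = [ `Unexpected_eoi | `Illegal_char of char ]

(* Result shape described above. Treating the int pair as
   (line, column) is an assumption. *)
type 'a outcome = [ `Value of 'a | `Error of (int * int) * error ]

let error_message : error -> string = function
  | `Unexpected_eoi -> "unexpected end of input"
  | `Illegal_char c -> Printf.sprintf "illegal character %C" c

(* Toy stand-in for an input function: accepts only lowercase ASCII
   letters and reports the position of the first offending character. *)
let input (s : string) : string outcome =
  if s = "" then `Error ((1, 1), `Unexpected_eoi)
  else
    let rec check i =
      if i = String.length s then `Value s
      else
        match s.[i] with
        | 'a' .. 'z' -> check (i + 1)
        | c -> `Error ((1, i + 1), `Illegal_char c)
    in
    check 0

(* The caller handles both cases explicitly, with no exception handler. *)
let report (r : string outcome) : string =
  match r with
  | `Value v -> "ok: " ^ v
  | `Error ((line, col), e) ->
      Printf.sprintf "%d:%d: %s" line col (error_message e)
```

With polymorphic variants the client can match on `Illegal_char
without qualifying the constructor with a module path, which is
presumably the syntactic convenience meant above.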

3.
I will implement better XML namespace support. Currently Xmlm parses
qualified names, but the client has to maintain its own structure
during parsing to know which namespace is in scope. The idea is to add
the boolean label ?expand_names to the input function. When set to
true, you get expanded names instead of qualified names (an expanded
name is a pair (uri, name) where uri is the namespace URI). The
output_of_* functions will have a label ~expanded_names to indicate
that expanded names are being passed, and Xmlm will automatically take
care of the rest (though to have pretty prefixes you'll have to process
your names manually).
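
For readers unfamiliar with the bookkeeping that point 3 would take
over, here is a sketch of the structure a client currently has to
maintain by hand. The types and the functions `push_bindings` and
`expand` are hypothetical names for this example and say nothing about
how Xmlm will implement it.

```ocaml
type qname = string * string      (* (prefix, local name); "" = no prefix *)
type expanded = string * string   (* (namespace uri, local name) *)

(* Namespace bindings in scope as (prefix, uri) pairs, innermost first. *)
type scope = (string * string) list

(* Extend the scope with the xmlns declarations found among an
   element's attributes. *)
let push_bindings (scope : scope) (atts : (qname * string) list) : scope =
  List.fold_left
    (fun acc ((prefix, local), value) ->
      if prefix = "xmlns" then (local, value) :: acc       (* xmlns:p="uri" *)
      else if prefix = "" && local = "xmlns" then ("", value) :: acc
                                                           (* xmlns="uri" *)
      else acc)
    scope atts

(* Expand an element name. An unprefixed name with no default
   declaration gets the empty uri; an unbound prefix yields None. *)
let expand (scope : scope) ((prefix, local) : qname) : expanded option =
  match List.assoc_opt prefix scope with
  | Some uri -> Some (uri, local)
  | None -> if prefix = "" then Some ("", local) else None
```

Because inner bindings are consed onto the front, returning to the
parent element's scope is just a matter of reusing the old list.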

4.
As an external add-on I committed the file test/xhtml.ml containing a
mapping from XHTML 1.1 entities to their corresponding UTF-8 character
sequences. This can be used to construct a function to resolve XHTML
entities and hence get XHTML parsing at your fingertips.
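
A resolver built from such a mapping could look roughly like the
sketch below. The three entries and the names `entities`, `table` and
`resolve_entity` are chosen for this example; the actual contents and
layout of test/xhtml.ml may differ.

```ocaml
(* A few XHTML entities with their UTF-8 encodings, written as byte
   escapes. The full mapping would hold every XHTML 1.1 entity. *)
let entities = [
  "nbsp",   "\xC2\xA0";       (* U+00A0 no-break space *)
  "eacute", "\xC3\xA9";       (* U+00E9 e with acute *)
  "euro",   "\xE2\x82\xAC";   (* U+20AC euro sign *)
]

(* Index the list once so that lookups do not scan it every time. *)
let table : (string, string) Hashtbl.t =
  let h = Hashtbl.create 16 in
  List.iter (fun (name, utf8) -> Hashtbl.replace h name utf8) entities;
  h

(* Resolve an entity name (without '&' and ';') to its UTF-8 bytes;
   None means the entity is unknown. *)
let resolve_entity (name : string) : string option =
  Hashtbl.find_opt table name
```

Returning an option leaves it to the parser to decide whether an
unknown entity is an error.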

This will be all for this new version. I hope this is the last time I  
break backward compatibility.

Best,

Daniel

[1] http://erratique.ch/software/xmlm/doc/Xmlm#ex
