From: Alain Frisch <alain@frisch.fr>
To: Jim Miller <gordon.j.miller@gmail.com>
Cc: caml-list <caml-list@yquem.inria.fr>
Subject: Re: [Caml-list] [OSR] Suggested topic - XML processing API
Date: Wed, 30 Jan 2008 08:35:44 +0100
Message-ID: <47A028D0.2000909@frisch.fr>
In-Reply-To: <beed19130801291926u36e7fc30w958d0370c87d3bf0@mail.gmail.com>
Jim Miller wrote:
> type xmlNode =
> | XmlElement of (namespace: string * tagName: string * attributes:
> (string * string) list * (children:xmlNode list) )
> | XmlPCData of (text:string)
There was some discussion here a while ago about standardizing XML
types across OCaml libraries. You might want to look through the archives.
Here are some random remarks.
First, you need to specify several things in the type above.
- the encoding of strings; if the parser cannot be configured, I guess
that normalizing everything to utf-8 is the most natural choice.
- the handling of namespaces; does the first argument to XmlElement
refer to the namespace prefix as used in the document (which would make
matching impossible, because the document can use arbitrary prefixes), a
normalized version (you'd need to provide the parser with more info), or
the namespace URI (which makes pattern matching quite tedious)? Also, it
is sometimes necessary to keep the [prefix->uri] dictionary available
at every node (e.g. to deal with XML Schema documents, where prefixes
can be used in attribute values). Moreover, some XML documents may be
valid w.r.t. the XML spec without conforming to the XML Namespaces one.
- whether adjacent XmlPCData nodes are allowed or not.
- whether the parser performs whitespace normalization (and how).
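To make these choices concrete, here is a rough sketch of one possible
answer to each, not a proposal for the standard type. It assumes names
carry the namespace URI rather than the prefix, strings are UTF-8, the
prefix bindings are kept on every element, and adjacent text nodes are
merged. All names here (name, xml_node, prefixes, normalize_children,
resolve_prefix) are made up for illustration.

```ocaml
(* Hypothetical sketch: a name is a (namespace URI, local name) pair.
   uri = "" means "no namespace". All strings are assumed to be UTF-8. *)
type name = { uri : string; local : string }

type xml_node =
  | Element of element
  | PCData of string
and element = {
  tag : name;
  attributes : (name * string) list;
  prefixes : (string * string) list;  (* in-scope prefix -> URI bindings *)
  children : xml_node list;
}

(* Merge adjacent PCData nodes, recursively, so that consumers can rely
   on the "no adjacent text nodes" invariant. *)
let rec normalize_children = function
  | PCData a :: PCData b :: rest ->
      normalize_children (PCData (a ^ b) :: rest)
  | Element e :: rest ->
      Element { e with children = normalize_children e.children }
      :: normalize_children rest
  | node :: rest -> node :: normalize_children rest
  | [] -> []

(* Resolve a prefix that appears inside an attribute value, as in
   XML Schema's type="xs:string". *)
let resolve_prefix (e : element) (prefix : string) : string option =
  List.assoc_opt prefix e.prefixes
```

With URIs in the names, matching on a tag means comparing the full URI
string every time, which is the tedium mentioned above.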
Also, in many cases, the client of the parser might want to get more
information, like locations in the source document.
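One way to leave room for such information, sketched under the
assumption that a polymorphic annotation on each node is acceptable
(the names loc, node, map_annot and strip_locations are hypothetical,
and namespaces are left out to keep it short):

```ocaml
(* Hypothetical source position; a real parser might track more. *)
type loc = { line : int; column : int }

(* Each node carries an annotation of type 'a: a loc when it comes from
   a parser, unit when the tree is built by hand. *)
type 'a node =
  | Element of 'a * string * (string * string) list * 'a node list
  | PCData of 'a * string

(* Apply f to every annotation, leaving the tree structure unchanged. *)
let rec map_annot f = function
  | Element (a, tag, attrs, children) ->
      Element (f a, tag, attrs, List.map (map_annot f) children)
  | PCData (a, text) -> PCData (f a, text)

(* Drop locations, e.g. before comparing two trees structurally. *)
let strip_locations (n : loc node) : unit node =
  map_annot (fun _ -> ()) n
```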
If you intend to use the same type to produce XML documents from an
internal representation, I think you might want to add an extra constructor:
| XmlMany of xmlNode list
This makes it much easier to build and compose XML fragments in a
modular way.
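A small sketch of what this buys, using a plain-tuple version of the
type quoted above. The helpers optional_field, person and flatten are
invented for the example; flatten is one possible way for a printer to
get rid of XmlMany before output.

```ocaml
type xmlNode =
  | XmlElement of string * string * (string * string) list * xmlNode list
  | XmlPCData of string
  | XmlMany of xmlNode list

(* A builder may produce zero, one or several nodes; the caller does
   not need to know which. *)
let optional_field tag = function
  | None -> XmlMany []
  | Some v -> XmlElement ("", tag, [], [ XmlPCData v ])

let person ~name ~email =
  XmlElement ("", "person", [],
    [ XmlElement ("", "name", [], [ XmlPCData name ]);
      optional_field "email" email ])

(* Splice every XmlMany into its parent's child list, so the result
   contains only XmlElement and XmlPCData nodes. *)
let rec flatten node =
  match node with
  | XmlMany nodes -> List.concat (List.map flatten nodes)
  | XmlElement (ns, tag, attrs, children) ->
      [ XmlElement (ns, tag, attrs, List.concat (List.map flatten children)) ]
  | XmlPCData _ -> [ node ]
```

Without XmlMany, person would have to build its child list with
explicit list concatenation for each optional part.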
Also, you need to specify how the XML printer is supposed to deal with
namespaces.
-- Alain