From: "Till Varoquaux" <till.varoquaux@gmail.com>
To: "Vincent Hanquez" <tab@snarc.org>
Cc: "Dario Teixeira" <darioteixeira@yahoo.com>, caml-list@yquem.inria.fr
Subject: Re: [Caml-list] XML library for validating MathML
Date: Thu, 18 Sep 2008 10:12:26 +0100
Message-ID: <9d3ec8300809180212r7e3dcdf3wd13c5cff69d5034b@mail.gmail.com>
In-Reply-To: <20080918083853.GA15219@snarc.org>

PXP is tough to work with and feels a bit crazy, but it is good with
standards (it has sorted out every DTD I have ever thrown at it).
xml-light is, well, very broken (it doesn't even support charcode
switching). There are several XML parsers in OCaml and I've had a
stint with a few of them; the only two I would consider using are
expat and PXP, with a marked preference for the latter. PXP can be very
confusing and feels over-engineered at times, but it does the job. And
remember, parsing XML is a hard job, much harder than we often give it
credit for...

Hats off to Gerd for providing us with a proper parser.

Till

On Thu, Sep 18, 2008 at 9:38 AM, Vincent Hanquez <tab@snarc.org> wrote:
> On Wed, Sep 17, 2008 at 11:58:05AM -0700, Dario Teixeira wrote:
>> Given a string containing a mathematical expression in the MathML
>> markup, I need to verify that the expression is indeed valid MathML.
>> I am therefore looking for an XML library that can verify an expression
>> against a given DTD.
>>
>> Now, I have tried Xml-light, and the code I used is listed below.
>> Unfortunately, it fails when trying to parse MathML's DTD (it's the
>> standard DTD from the W3C). I have tried simpler DTDs, and it does work
>> with them; am I therefore correct in assuming that Xml-light can only
>> handle a particular version/subset of DTD features?
>
> I don't know about validation (I'd probably suggest looking at PXP, though),
> but xml-light is very bad at XML compliance. The library (happily) parses
> XML files that it shouldn't, which tells you a lot about its validation
> abilities ...
>
> for example, the XML supported character range is not even checked:
>
> XML 1.0 specification -- 2.2 Characters
>
> Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] |
> [#xE000-#xFFFD] | [#x10000-#x10FFFF]
>
> Other problems include (incomplete list):
> - complete Unicode unawareness
> - funny and wrong entity handling
>
> --
> Vincent
>
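
For reference, the Char production quoted above is small enough to check by hand. Here is a minimal sketch in OCaml, assuming the input has already been decoded into Unicode code points represented as plain ints. The names is_xml_char and all_xml_chars are made up for illustration; they are not part of xml-light, PXP, or expat.

```ocaml
(* Sketch of the XML 1.0 "Char" production (section 2.2).
   Assumption: [cp] is an already-decoded Unicode code point. *)
let is_xml_char (cp : int) : bool =
  cp = 0x9 || cp = 0xA || cp = 0xD
  || (cp >= 0x20 && cp <= 0xD7FF)
  || (cp >= 0xE000 && cp <= 0xFFFD)
  || (cp >= 0x10000 && cp <= 0x10FFFF)

(* Hypothetical helper: true when every code point in the list
   falls inside the ranges allowed by the production above. *)
let all_xml_chars (cps : int list) : bool =
  List.for_all is_xml_char cps
```

With this, is_xml_char 0x0B is false (vertical tab is outside the production), as are surrogates such as 0xD800. A conforming parser is expected to reject documents containing such characters, which is the check reported missing above.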