Mailing list for all users of the OCaml language and system.
From: "Till Varoquaux" <till.varoquaux@gmail.com>
To: "Vincent Hanquez" <tab@snarc.org>
Cc: "Dario Teixeira" <darioteixeira@yahoo.com>, caml-list@yquem.inria.fr
Subject: Re: [Caml-list] XML library for validating MathML
Date: Thu, 18 Sep 2008 10:12:26 +0100
Message-ID: <9d3ec8300809180212r7e3dcdf3wd13c5cff69d5034b@mail.gmail.com>
In-Reply-To: <20080918083853.GA15219@snarc.org>

PXP is tough to work with and feels a bit crazy, but it is good with
standards (it can sort out any DTD I have ever thrown at it).
xml-light is, well, very broken (it doesn't even support switching
character encodings). There are several XML parsers in OCaml and I've had a
stint with a few of them; the only two I would consider using are
expat and PXP, with a marked preference for the latter. PXP can be very
confusing and feels over-engineered at times, but it does the job. And
remember that parsing XML is a hard job, much harder than we often give it
credit for.

Hats off to Gerd for providing us with a proper parser.

Till

On Thu, Sep 18, 2008 at 9:38 AM, Vincent Hanquez <tab@snarc.org> wrote:
> On Wed, Sep 17, 2008 at 11:58:05AM -0700, Dario Teixeira wrote:
>> Given a string containing a mathematical expression in the MathML
>> markup, I need to verify that the expression is indeed valid MathML.
>> I am therefore looking for an XML library that can verify an expression
>> against a given DTD.
>>
>> Now, I have tried Xml-light, and the code I used is listed below.
>> Unfortunately, it fails when trying to parse MathML's DTD (it's the
>> standard DTD from the W3C).  I have tried simpler DTDs, and it does work
>> with them; am I therefore correct in assuming that Xml-light can only
>> handle a particular version/subset of DTD features?
>
> I don't know about validation (I'd probably suggest looking at PXP, though),
> but xml-light is very bad at XML compliance. The library (happily) parses
> XML files that it shouldn't, which says a lot about its validation
> abilities ...
>
> For example, the supported XML character range is not even checked:
>
> XML 1.0 specification -- 2.2 Characters
>
> Char       ::=          #x9 | #xA | #xD | [#x20-#xD7FF] |
>                [#xE000-#xFFFD] | [#x10000-#x10FFFF]
>
> Other problems include (incomplete list):
> - complete Unicode unawareness
> - odd and incorrect entity handling
>
> --
> Vincent
>
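
The character range Vincent quotes from section 2.2 of the XML 1.0
specification comes down to a handful of comparisons. A minimal sketch in
OCaml, assuming the input has already been decoded into integer code points
(decoding itself is not shown); `is_xml_char` and `all_xml_chars` are
hypothetical names for illustration and are not part of xml-light, expat, or
PXP:

```ocaml
(* Sketch of the XML 1.0 "Char" production (section 2.2) as a predicate
   on Unicode code points. The function names are hypothetical. *)
let is_xml_char (cp : int) : bool =
  cp = 0x9 || cp = 0xA || cp = 0xD          (* tab, line feed, carriage return *)
  || (cp >= 0x20 && cp <= 0xD7FF)           (* up to just below the surrogates *)
  || (cp >= 0xE000 && cp <= 0xFFFD)         (* excludes #xFFFE and #xFFFF *)
  || (cp >= 0x10000 && cp <= 0x10FFFF)      (* supplementary planes *)

(* True when every code point in the list is a legal XML 1.0 character. *)
let all_xml_chars (cps : int list) : bool =
  List.for_all is_xml_char cps
```

A parser that skips this check will accept, for instance, a NUL (#x0) or a
lone surrogate such as #xD800, neither of which may appear in a well-formed
XML 1.0 document.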



