From: Dario Teixeira <darioteixeira@yahoo.com>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] XML library for validating MathML
Date: Thu, 18 Sep 2008 10:58:03 -0700 (PDT)
Message-ID: <359336.28901.qm@web54601.mail.re2.yahoo.com>
In-Reply-To: <38632.29890.qm@web54602.mail.re2.yahoo.com>
Hi,
Well, as it turns out, building a basic "Hello World" in PXP is relatively
simple (I followed the manual, which is very helpful for getting started).
However, though the DTD validation works fine with the simple examples I tried,
it fails for a MathML document. Note that I am using the DTD as provided
by the W3C, available from here: http://www.w3.org/Math/DTD/mathml2.tgz
When processing the MathML DTD, PXP outputs a few warnings about entities
declared twice, about names reserved for future extensions, and quite a
lot of warnings about code points that cannot be represented. I can ignore
those for now.
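Should the noise become a problem, a warner that filters by message prefix would be one way to keep the output readable. The following is only a sketch: `filtering_warner` and `starts_with` are hypothetical names, the prefix strings are placeholders rather than PXP's actual warning texts, and the only thing assumed about PXP is that it calls `warn` with a string, as in the programme in the P.S. Because of the two extra counter methods, the object would probably need a coercion to the warner type PXP expects before it could go into the config record.

```ocaml
(* Hypothetical helper, not part of PXP: true if [s] begins with [prefix]. *)
let starts_with ~prefix s =
  String.length s >= String.length prefix
  && String.sub s 0 (String.length prefix) = prefix

(* Hypothetical warner: prints a warning unless it begins with one of the
   [ignored] prefixes, and counts both kinds. *)
class filtering_warner (ignored : string list) =
  object
    val mutable suppressed = 0
    val mutable shown = 0

    method warn (w : string) =
      if List.exists (fun prefix -> starts_with ~prefix w) ignored
      then suppressed <- suppressed + 1
      else begin
        shown <- shown + 1;
        print_endline ("WARNING: " ^ w)
      end

    (* Counters, so a summary can be printed once parsing is done. *)
    method suppressed = suppressed
    method shown = shown
  end
```

With something like `new filtering_warner ["Code point"]` (again, a placeholder prefix), the code point warnings would be counted but not printed, while everything else still shows up.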
When it does fail, this is the error produced:
In entity ent-isonum = PUBLIC "-//W3C//ENTITIES Numeric and Special Graphic for MathML 2.0//EN" "isonum.ent", at line 28, position 44:
Called from entity [dtd] = SYSTEM "mathml2.dtd", line 1969, position 0:
ERROR (Well-formedness constraint): The character '&' must be written as '&amp;'
Looking at the "isonum.ent" file (packaged with the W3C archive), these are
the contents of line 28, where the error occurs:
<!ENTITY amp "&&" ><!--=ampersand -->
Though 0x26 is indeed the codepoint for the ampersand character, I don't
get why it appears twice. Is this a case of double escaping? Could this
be the reason PXP chokes?
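To make the double-escaping question concrete, here is a small sketch of what a single pass of numeric character reference expansion does to an entity's literal value. It rests on the XML 1.0 rule that character references in an entity declaration are expanded when the declaration is read, and that the resulting replacement text is parsed again wherever the entity is referenced. `expand_char_refs` is a hypothetical helper, not a PXP function, and the two literals in the usage note are illustrative, not copied from "isonum.ent".

```ocaml
(* One pass of numeric character reference expansion over [s].
   Sketch only: decodes &#NN; and &#xHH; for code points below 128,
   is lenient about the digit syntax, copies everything else unchanged,
   and never rescans its own output. *)
let expand_char_refs (s : string) : string =
  let n = String.length s in
  let buf = Buffer.create n in
  (* If a decodable reference starts at index [i], return the character
     and the index just past the terminating ';'. *)
  let decode_at i =
    if i + 1 >= n || s.[i] <> '&' || s.[i + 1] <> '#' then None
    else
      match String.index_from_opt s i ';' with
      | None -> None
      | Some j ->
        let digits = String.sub s (i + 2) (j - i - 2) in
        let code =
          if digits <> "" && digits.[0] = 'x'
          then int_of_string_opt ("0" ^ digits)   (* "x26" becomes "0x26" *)
          else int_of_string_opt digits
        in
        (match code with
         | Some c when c >= 0 && c < 128 -> Some (Char.chr c, j + 1)
         | _ -> None)
  in
  let rec go i =
    if i < n then
      match decode_at i with
      | Some (c, next) -> Buffer.add_char buf c; go next
      | None -> Buffer.add_char buf s.[i]; go (i + 1)
  in
  go 0;
  Buffer.contents buf
```

Under these assumptions, `expand_char_refs "&#38;#38;"` yields `"&#38;"`, which is still a valid reference when the entity is later used, whereas `expand_char_refs "&#x26;&#x26;"` yields `"&&"`, two bare ampersands that are not well-formed in content. So an ampersand entity has to be escaped twice in the first style; two complete references side by side would produce exactly the kind of complaint shown above.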
Any thoughts?
Best regards,
Dario Teixeira
P.S. This is the programme I used for testing. Its code is pretty much
lifted from the PXP manual:
open Pxp_document
open Pxp_yacc

(* Warner object handed to PXP: prints each warning on stdout. *)
class warner =
  object
    method warn w = print_endline ("WARNING: " ^ w)
  end

(* Walk the tree, printing element types and marking data nodes. *)
let rec print_structure n =
  let ntype = n#node_type in
  match ntype with
  | T_element name ->
      print_endline ("Element of type " ^ name);
      let children = n#sub_nodes in
      List.iter print_structure children
  | T_data ->
      print_endline "Data"
  | _ ->
      assert false

(* Parse and validate test.xml, then dump its structure. *)
let () =
  try
    let config = {default_config with warner = new warner} in
    let doc = parse_document_entity config (from_file "test.xml") default_spec in
    print_structure (doc#root)
  with
    exc -> print_endline (Pxp_types.string_of_exn exc)