From: "Mikkel Fahnøe Jørgensen" <mikkel@dvide.com>
To: Richard Jones <rich@annexia.org>
Cc: Till Varoquaux <till@pps.jussieu.fr>,
Yaron Minsky <yminsky@gmail.com>,
"caml-list@inria.fr" <caml-list@inria.fr>
Subject: Re: [Caml-list] xpath or alternatives
Date: Wed, 30 Sep 2009 12:49:12 +0200
Message-ID: <caee5ad80909300349r103957ffs69a33949c4ae265e@mail.gmail.com>
In-Reply-To: <20090930101622.GA15517@annexia.org>
2009/9/30 Richard Jones <rich@annexia.org>:
> On Wed, Sep 30, 2009 at 01:00:15AM +0200, Mikkel Fahnøe Jørgensen wrote:
>> In line with what Yaron suggests, you can use a combinator parser.
> It's interesting you mention xmlm, because I couldn't write
> the code using xmlm at all.
If you can manage to convert an XML document into a JSON-like tagged
tree structure, then a simple solution like
  module Value = struct
    type value_type =
        Object of (string * value_type) list
      | Array of value_type list
      | String of string
      | Int of int
      | Float of float
      | Bool of bool
      | Null
  end

  ..

  (* The elided code is assumed to open Value and to define [fail];
     the handler in pattern_path suggests it raises Invalid_argument. *)
  let get_object v = match v with Object x -> x
    | _ -> fail "json object expected"

  ..

  (* Follow the path [names] down from [value]; "*" tries every field.
     [Found] (also from the elided code) carries the match out of
     List.iter. Raises Not_found if nothing matches. *)
  let pattern_path value names =
    let rec again value = function
      | "*" :: names -> List.iter (fun (_, v) -> try again v names
          with Invalid_argument _ | Not_found -> ()) (get_object value)
      | name :: names -> again (List.assoc name (get_object value)) names
      | [] -> raise (Found value)
    in try again value names; raise Not_found with Found value -> value

combined with a path split function
  (* Split [s] on every occurrence of [c], scanning right to left.
     Empty components are kept, so split '/' "a//b" is ["a"; ""; "b"]. *)
  let split c s =
    let n = String.length s in
    let rec again i lst =
      begin try let k = String.rindex_from s i c in
        again (k - 1)
          ((if i = k then "" else String.sub s (k + 1) (i - k)) :: lst)
      with Not_found -> String.sub s 0 (i + 1) :: lst
      end
    in again (n - 1) []

will do almost exactly what you are asking for - notice that a "*"
component tries every child subtree at that level, and the first match
wins. You can add your own XPath-like functions as you discover a need
for them.
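
To make the pieces above easier to try out, here is a self-contained sketch that puts them in one file. Two things are assumptions, since they sit in the elided parts of the original code: that Found is an exception carrying the matched value, and that fail raises Invalid_argument (invalid_arg stands in for it here). The type is declared at top level rather than inside module Value, split is restated with "match ... with exception" (OCaml 4.02 or later), and the sample document doc is made up for illustration.

```ocaml
(* JSON-like tree, as in the message. *)
type value_type =
  | Object of (string * value_type) list
  | Array of value_type list
  | String of string
  | Int of int
  | Float of float
  | Bool of bool
  | Null

(* Assumption: the elided exception carries the matched value. *)
exception Found of value_type

(* Assumption: the elided [fail] raises Invalid_argument. *)
let get_object = function
  | Object x -> x
  | _ -> invalid_arg "json object expected"

(* Walk [names] down from [value]; "*" tries each field in turn.
   Returns the first match, raises Not_found if there is none. *)
let pattern_path value names =
  let rec again value = function
    | "*" :: names ->
      List.iter
        (fun (_, v) ->
           try again v names with Invalid_argument _ | Not_found -> ())
        (get_object value)
    | name :: names -> again (List.assoc name (get_object value)) names
    | [] -> raise (Found value)
  in
  try again value names; raise Not_found with Found value -> value

(* Split [s] on [c], scanning right to left; empty components are kept. *)
let split c s =
  let rec again i lst =
    match String.rindex_from s i c with
    | k -> again (k - 1) (String.sub s (k + 1) (i - k) :: lst)
    | exception Not_found -> String.sub s 0 (i + 1) :: lst
  in
  again (String.length s - 1) []

(* Hypothetical sample document. *)
let doc =
  Object [
    "store", Object [
      "book", Object [ "title", String "OCaml" ];
      "dvd", Object [ "title", String "Film"; "price", Int 10 ];
    ]
  ]

(* An exact path, and a path with a wildcard level. *)
let title = pattern_path doc (split '/' "store/book/title")
let price = pattern_path doc (split '/' "store/*/price")
```

Here title is String "OCaml" and price is Int 10: the wildcard first tries "book", finds no "price" there, and moves on to "dvd". A path that matches nothing raises Not_found.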
I believe the xmlm examples have a tree transformation operation that
could easily be adapted to produce a JSON-like tree, if modified a
little:
  let out_tree o t =
    let frag = function
      | E (tag, childs) -> `El (tag, childs)
      | D d -> `Data d
    in
    Xmlm.output_doc_tree frag o t

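
One possible shape for that adaptation is sketched below. It is only an illustration, not code from xmlm: the types name, attribute, tag and tree are local stand-ins that mirror the xmlm definitions so the sketch compiles without the library, and the mapping itself (to_value, the "@" prefix for attributes, dropping text that sits next to child elements) is a hypothetical design choice.

```ocaml
(* Local stand-ins mirroring xmlm: a name is (namespace URI, local name),
   a tag is a name plus its attribute list. *)
type name = string * string
type attribute = name * string
type tag = name * attribute list
type tree = E of tag * tree list | D of string

type value_type =
  | Object of (string * value_type) list
  | Array of value_type list
  | String of string
  | Int of int
  | Float of float
  | Bool of bool
  | Null

(* Concatenate the character data found directly under an element. *)
let text_of childs =
  String.concat ""
    (List.concat (List.map (function D d -> [d] | E _ -> []) childs))

(* Hypothetical mapping:
   - an element with no attributes and no child elements becomes a String
     holding its text;
   - any other element becomes an Object whose fields are its attributes
     (keyed "@name") followed by its child elements (keyed by local name);
     text mixed in with child elements is dropped.
   The name of the element passed in is not kept; the caller wraps it. *)
let rec to_value (t : tree) : value_type =
  match t with
  | D d -> String d
  | E ((_, attrs), childs) ->
    let attr_fields =
      List.map (fun ((_, n), v) -> ("@" ^ n, String v)) attrs in
    let elem_fields =
      List.concat
        (List.map
           (function
             | E (((_, n), _), _) as child -> [ (n, to_value child) ]
             | D _ -> [])
           childs)
    in
    match attr_fields @ elem_fields with
    | [] -> String (text_of childs)
    | fields -> Object fields

(* <book id="42"><title>OCaml</title></book> as a tree. *)
let xml =
  E ((("", "book"), [ (("", "id"), "42") ]),
     [ E ((("", "title"), []), [ D "OCaml" ]) ])

let json = to_value xml
```

With this mapping json is Object ["@id", String "42"; "title", String "OCaml"]. Repeated sibling tags end up as duplicate keys, so a lookup based on List.assoc only sees the first one; grouping them into an Array would be a natural next refinement.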
> My best effort, using xml-light, is around 40 lines:
If you spend those 40 lines on a layer on top of a lightweight xml
parser, you might get away with 3 lines the next time.