From: skaller <skaller@users.sourceforge.net>
To: Brian Hurt <bhurt@spnz.org>
Cc: briand@aracnet.com, John Prevost <j.prevost@gmail.com>,
Ocaml Mailing List <caml-list@inria.fr>
Subject: Re: [Caml-list] looping recursion
Date: 29 Jul 2004 08:05:59 +1000
Message-ID: <1091052358.5870.1000.camel@pelican.wigram>
In-Reply-To: <Pine.LNX.4.44.0407280907470.6739-100000@localhost.localdomain>

On Thu, 2004-07-29 at 00:36, Brian Hurt wrote:
> On 28 Jul 2004, skaller wrote:
>
> > On Wed, 2004-07-28 at 11:43, Brian Hurt wrote:
> > > On Tue, 27 Jul 2004 briand@aracnet.com wrote:
> >
> > > Very long lists are a sign that you're using the wrong data structure.
> >
> > What would you recommend for a sequence of tokens?
> > Streams are slow and hard to match on.. bucket lists
> > have lower storage overhead but hard to match on.
>
> Extlib Enumerations. For short lists, yeah they're slower than lists.

That doesn't matter -- the lists are long by specification.

> But for long lists, I could see them being a lot faster. Don't forget
> cache effects- streaming processing can have much better cache behavior
> than repeatedly walking a long list (too large to fit into cache).

Can't pattern match on them. One reason for building
a list is that I filter it. For example, in Felix I strip out
whitespace tokens; in Vyper (a Python interpreter written in Ocaml)
I did something like 13 separate passes to handle
the indentation and other quirks, preconditioning the input
to the parser so it became LALR(1).

So, I'd have to use a list as a buffer for the head of the stream
anyhow.
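
A minimal sketch of the kind of list pass described above, under stated assumptions: the token type and both function names are hypothetical and are not taken from Felix or Vyper. It shows one filtering pass and one pass that pattern matches several tokens deep, which is the part that is awkward on a stream.

```ocaml
(* Hypothetical token type, for illustration only. *)
type token =
  | WHITE
  | IDENT of string
  | INT of int
  | PLUS

(* Pass 1: drop whitespace tokens. *)
let strip_white tokens =
  List.filter (fun t -> t <> WHITE) tokens

(* Pass 2: look three tokens ahead and fold INT PLUS INT into one INT.
   Written with an accumulator so it does not grow the stack on long lists. *)
let fold_sums tokens =
  let rec go acc = function
    | INT a :: PLUS :: INT b :: rest -> go acc (INT (a + b) :: rest)
    | t :: rest -> go (t :: acc) rest
    | [] -> List.rev acc
  in
  go [] tokens
```

Each pass takes a list and returns a list, so further passes can be chained in whatever order the preprocessing needs.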
Also, there is a serious design problem with ExtLib Enums.
Although the data structure appears functional, it doesn't
specify precisely when things happen.
In particular if the input is a stream, that is, uses
mutators to extract elements, then instead of using
the persistence and laziness so you can use the Enums
as forward iterators -- for example in a backtracking
parser -- the Enums actually degrade to uncopyable
input iterators.
Since Ocamllex uses a mutable lex buffer, the Enums
based on them are also non-functional input iterators ..
[I can get around that by calling 'force()' but that
totally defeats the purpose of using Enums .. :]
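
The difference between an input iterator and a forward iterator can be sketched without ExtLib. This is an illustration only: make_source and force are hypothetical names, the closure merely stands in for a lexer reading from a mutable buffer, and none of this is the ExtLib Enum API.

```ocaml
(* A "next element" function backed by mutable state. *)
let make_source items =
  let remaining = ref items in
  fun () ->
    match !remaining with
    | [] -> None
    | x :: rest -> remaining := rest; Some x

(* Forcing: drain whatever is left of the source into an immutable list. *)
let force next =
  let rec go acc =
    match next () with
    | None -> List.rev acc
    | Some x -> go (x :: acc)
  in
  go []

let next = make_source [1; 2; 3]
let alias = next          (* not a copy: shares the same mutable state *)
let first = next ()       (* consumes the first element *)
let second = alias ()     (* sees the second element, not the first *)
let rest = force next     (* only what has not been consumed yet *)
```

Once an element has been pulled through either name it is gone for both, which is why a backtracking consumer needs the forced list rather than the source itself.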
Whereas, a plain old list is a purely functional
forward iterator, and unquestionably works with
a backtracking parser.
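
As a rough illustration of why a list works here, a toy backtracking parser might look like the following. The token type and the combinator names (expect, seq, alt) are made up for this sketch and are not from any particular library.

```ocaml
(* Hypothetical tokens. *)
type token = A | B | C

(* A parser takes a token list and returns the unconsumed tail on success. *)
let expect t = function
  | x :: rest when x = t -> Some rest
  | _ -> None

(* Sequence: run p, then q on what p left over. *)
let seq p q input =
  match p input with
  | Some rest -> q rest
  | None -> None

(* Ordered choice: if p fails, try q on the *same* input.
   The list is immutable, so there is nothing to rewind. *)
let alt p q input =
  match p input with
  | Some _ as ok -> ok
  | None -> q input

let ab = seq (expect A) (expect B)
let ac = seq (expect A) (expect C)
let ab_or_ac = alt ab ac
```

On the input [A; C], the first alternative consumes A and then fails on C; the second alternative starts again from the original list, which is exactly what an uncopyable input iterator cannot offer.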
As an example of a simple modification I could make that
won't work easily with uncontrolled control inversion:
suppose I cache the token stream on disk, and in
particular Marshal the tokens of file 'fred.flx' out as 'fred.tokens'.
[Now you *have* to force() all the iterators, or
each one inside the #include will write the file
to disk at the end of the sub-file .. but that
should only be done once -- it's quite slow writing
a file to disk .. forcing all the enums makes
separate copies of the tokens .. argggg .. ]
The problem goes away when I manually build lists
and preprocess them because I have explicit control.
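
A sketch of the explicit-control version of the cache, assuming the tokens for a file have already been built as a plain list. The token type and the names save_tokens and load_tokens are hypothetical; Marshal and Fun.protect are from the OCaml standard library (Fun.protect needs OCaml 4.08 or later).

```ocaml
(* Hypothetical token type, for illustration only. *)
type token = IDENT of string | INT of int

(* Write the whole token list once, at a point the caller chooses. *)
let save_tokens path (tokens : token list) =
  let oc = open_out_bin path in
  Fun.protect ~finally:(fun () -> close_out oc)
    (fun () -> Marshal.to_channel oc tokens [])

(* Read it back. Marshal is not type-safe: the annotation is trusted,
   so the file must really contain a token list. *)
let load_tokens path : token list =
  let ic = open_in_bin path in
  Fun.protect ~finally:(fun () -> close_in ic)
    (fun () -> (Marshal.from_channel ic : token list))
```

Because the list exists as a value before anything is written, the write happens exactly once and no iterator decides behind the caller's back when the file hits the disk.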
Bottom line is that Enums work fine to integrate
purely functional data structures together but they're
not very useful for mixing coupled streams together.
Crudely -- if you have a hierarchy of streams
you may need to read them in a particular order
due to the coupling .. with STL input iterators
you can do that, with hand written Ocaml
you can do that -- with Enums you can't.
--
John Skaller, mailto:skaller@users.sf.net
voice: 061-2-9660-0850,
snail: PO BOX 401 Glebe NSW 2037 Australia
Check out the Felix programming language http://felix.sf.net