From: Alex Baretta <alex@baretta.com>
To: brogoff <brogoff@speakeasy.net>
Cc: Ocaml <caml-list@inria.fr>
Subject: Re: [Caml-list] looping recursion
Date: Thu, 29 Jul 2004 08:28:00 +0200 [thread overview]
Message-ID: <410898F0.6030804@baretta.com> (raw)
In-Reply-To: <Pine.LNX.4.58.0407280924150.10739@shell1.speakeasy.net>
brogoff wrote:
> On Wed, 28 Jul 2004, Alex Baretta wrote:
>
>>brogoff wrote:
>>
>>>Sometimes lists are best. If lists are OK for 100, 1_000, or 10_000 items,
>>>why not 100_000 or 1_000_000? Map and friends shouldn't blow stack.
>>>
>>>-- Brian
>>
>>Actually, it's not that bad.
>
>
> Well, I'm still on the caml-list, so of course it isn't *that* bad. Also, as
> I said, if you need a tail recursive map over built in lists, you have at
> least two options. Unfortunately, my preference is to use Obj, which IMO
> points to a deficiency in the core language. Most, maybe all, of my issues
> with OCaml, amount to petty complaints, no showstoppers or dealbreakers.
> Hey, at least I'm not whining about the license!
Yes, I guess we simply can live with it. I can't wait to have STL
iterators in Ocaml, but meanwhile I can live with this functional
nonsense: lists, maps, sets... ;)
> There are quite a few cases where singly linked lists are the most efficient
> data structure, and they're just right for the problem. Streams and laziness
> are not at all efficient in comparison. Stack overflow commonly happens whenm
> one of my users (yup, I have users of my code who aren't me ;-) decides to
> run on data much larger than I'd anticipated.
Huge lists are usually not an efficient data structure. Lists are local
structures: at any point you can only observe a fixed number of head
elements (usually one at a time) and an unspecified tail. Such locality
generally makes caching the entire data structure useless. Long lists
(lists of a priori unbounded length) arise from user input: typically
the are obtained by reading in a file of a priori unbounded length. In
such cases lists are very much inefficient from the point of view of
memory complexity: lists cost O(n) where n is the size of the input;
streams cost O(1). Both cost O(n) in terms of time complexity. Usually
the potential speed benefit of lists is amply counterbalanced by
reduction in memory footprint, which usually means no memory swapping in
the kernel.
Here's an obvious example (not too obvious: I fell for it at first). I
have a XML syntax specifying relational databases. Essentially an entire
relational database can be represented with it. This XML lexicon is
meant for now only for the database initial schema definition and for
human-readable backups. I initially wrote the backup algorithm in terms
of transformations on a list of PXP trees which were finally serialized.
Algebraically, this is perfect. The only problem is that this algorithm
stores the entire contents of a database in memory before serializing
everything. Ooops, here's a SIGKILL landing in! Now I only use PXP trees
to represent the database schema, which I immediately begin to
serialize. All the while I pass around continuations which are invoked
iteratively on subnodes. If a subnode is an table-data node I declare an
SQL cursor in the database and begin walking through the table. Result,
where my initial memory footprint bought me a SIGKILL on a 2GB server
now I have bounded memory usage (a few megabytes). I don't swap, so the
the backup process actually runs an order of magnitude faster. And I
actually get my output at the end ;)
> Also, lists are builtin, and have syntactic support in OCaml, by which I mean
> nice, easy to read syntax. So the language encourages you to use them.
They are sooo cool! I actually ask all trainees to reimplement the List
module. But then the industrial practice is to use lists only where
there is no input or as the result of some kind of slicing or research
applied to bigger and more efficient data structures. I actually use a
list iteration to schedule tasks in my real-time control kernel, but of
course, I don't expect my collegues to write more than a few tasks at a
time for my kernel to schedule: safety, liveness and UI.
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
next prev parent reply other threads:[~2004-07-29 6:27 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-07-27 23:43 briand
2004-07-28 0:27 ` John Prevost
2004-07-28 0:38 ` John Prevost
2004-07-28 1:17 ` skaller
2004-07-28 1:05 ` briand
2004-07-28 1:43 ` Brian Hurt
2004-07-28 2:49 ` briand
2004-07-28 3:12 ` Brian Hurt
2004-07-28 3:20 ` Brian Hurt
2004-07-28 5:54 ` brogoff
2004-07-28 7:22 ` Alex Baretta
2004-07-28 16:38 ` brogoff
2004-07-28 19:40 ` Jon Harrop
2004-07-28 20:18 ` Brandon J. Van Every
2004-07-29 6:01 ` Alex Baretta
2004-07-28 21:22 ` brogoff
2004-07-29 9:13 ` Daniel Andor
2004-07-29 9:25 ` Keith Wansbrough
2004-07-29 9:41 ` Nicolas Cannasse
2004-07-29 9:57 ` Xavier Leroy
2004-07-29 10:44 ` Daniel Andor
2004-07-29 12:56 ` brogoff
2004-07-29 10:11 ` skaller
2004-07-29 12:41 ` brogoff
2004-07-29 6:28 ` Alex Baretta [this message]
2004-07-29 14:58 ` brogoff
2004-07-29 16:12 ` Brian Hurt
2004-07-29 17:49 ` james woodyatt
2004-07-29 19:25 ` Brian Hurt
2004-07-29 20:01 ` brogoff
2004-07-30 4:42 ` james woodyatt
2004-07-29 17:44 ` james woodyatt
2004-07-29 23:12 ` skaller
2004-07-29 22:42 ` Alex Baretta
2004-07-30 2:38 ` Corey O'Connor
[not found] ` <200407300136.14042.jon@jdh30.plus.com>
2004-07-30 12:45 ` Alex Baretta
2004-07-30 17:07 ` brogoff
2004-07-30 18:25 ` [Caml-list] kaplan-okasaki-tarjan deque (was "looping recursion") james woodyatt
2004-07-30 21:20 ` brogoff
2004-07-31 5:37 ` james woodyatt
2004-07-28 7:27 ` [Caml-list] looping recursion skaller
2004-07-28 14:36 ` Brian Hurt
2004-07-28 22:05 ` skaller
2004-07-28 0:37 ` skaller
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=410898F0.6030804@baretta.com \
--to=alex@baretta.com \
--cc=brogoff@speakeasy.net \
--cc=caml-list@inria.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox