From: Brian Hurt <brian.hurt@qlogic.com>
To: Diego Olivier Fernandez Pons
<Diego-Olivier.FERNANDEZ-PONS@cicrp.jussieu.fr>
Cc: <caml-list@inria.fr>
Subject: Re: [Caml-list] @, List.append, and tail recursion
Date: Fri, 31 Jan 2003 13:52:26 -0600 (CST) [thread overview]
Message-ID: <Pine.LNX.4.33.0301311126450.3577-100000@eagle.ancor.com> (raw)
In-Reply-To: <Pine.A41.4.44.0301311705230.4055078-100000@ibm1.cicrp.jussieu.fr>
On Fri, 31 Jan 2003, Diego Olivier Fernandez Pons wrote:
> Bonjour,
>
> Some comments on various contributions to this thread
>
> Brian Hurt wrote :
>
> > I'm basically doing sparse vector/matrix stuff, handling
> > (effectively) (colno * value) list for vectors, and (rowno * vector)
> > list for matrix. And I may be hitting lists long enough to trip the
> > problem.
>
> If you are doing sparse matrix operations and you still hit lists long
> enough to cause a stack overflow, then your matrix must be really
> huge.
It'd be more accurate to say I'm not sure my lists will stay short enough.
On average they are short enough and sparse enough to make using lists
worthwhile. I expect on average the lists will be about 20 elements or so
long. I should be able to stop thinking here- OK, I'm saving more than
enough computation time on the average case that being inefficient on the
rare case, where the lists has 30K or more elements in it, is OK.
But I want the rare case to *work* correctly, even if inefficiently. With
the "naive" non-tail-recursive implementation doesn't. Somewhere above
32K elements in the list the recursion trips the stack overflow. Change
the problem in some minor way, and suddenly I'm not generating lists with
32K elements in them, maybe just 30K elements in them, and everything
works OK.
Optimization is the wrong word. As Olivier's set_cdr shows, the cost of
recursion isn't signifigant (kudos to the compiler team). It's a
correctness issue more than anything: the code *should* work.
>
> If the ordrer of the terms does not matter (or if you can manage with
> the position information you keep in your sparse matrix) then you just
> need to write tail recursive functions, not taking care of the list
> being reversed
I am keeping the items in order. Because if they're in order, I can write
add in O(n). If they're not in order, the best (fastest running) way to
write add is to first sort both lists O(n log n) and then do the in-order
add.
> An other solution is to use a suitable data structure for your
> application : join-lists, catenable lists, double-linked lists ...
Several reasons, actually:
1) I want to stay as strictly functional as possible. I'm comming from an
imperitive background, and thus want to limit how much I fall back on old
habits. If it can be done functionally (purely applicative), I want to do
it that way.
2) The only thing I'm doing with the list that is at all a problem is
non-tail-recursion list construction. And this *should* work correctly.
So why should I use some other datastructure when the list primitives do
everything I need? In fact, linked lists of some sort are exactly what I
want- not even arrays. I want to be able to start creating a list of
elements without having to know how long it is, and I'm doing a lot of
walking the list and basically no accessing random elements. And I'm
always walking the list in the same direction.
3) I perfer to solve the general problem than have to resolve the specific
problem every time I hit it. This is even worse because the problem is
only likely to show up in production. Test cases with small sets of data
won't trigger the problem.
>
> There are not many applications in which you just can't work around in
> a simple way...
I don't want to have to work around it.
> In this case, the only solution is to write you program in
> C/assembler, with your own memory manager
You should only be programming in assembler if it absolutely cannot be
written in C. You should only be programming in C if you're directly
banging on hardware.
>
> That is why I suspect you may not be using the best data structures
> and algorithms avaiable. Could you explain what you are working on ?
The short form: solving a system of nonlinear equations via
Newton-Kantoravich (sp?). Basically, I compute the Jacobian F'(X) (a
matrix of the partial differentials) and the residual F(X) (a vector) and
solve F'(X) * Y = F(X) to get my new, improved guess X' = X - Y. Lather,
rinse, repeat, until F(X) is close enough to 0. This is basically
Newton's method extended to deal with multiple equations in multiple
variables.
In production I wouldn't be surprised to see systems of 30,000+ equations
in 30,000+ variables. The Jacobian matrix is going to be very sparse- I
expect the average row to have 20 or fewer non-zero elements. Thus the
attraction to sparse vectors. I'm going to be solving it via Gaussian
elimination (the Jacobian is likely to be malformed in multiple ways,
meaning I can't use any iterative method I know of. And yes, I've looked
at iterative methods as advanced as GMRES- they don't work). I think that
in general I can bound the size of vectors I'm producing. But there are
degenerate cases where I could get above 30K non-zero elements in a
vector.
The problem can't be solved at a higher level. Sparse vectors are the way
to go, and I can't gaurentee that I won't get sparse vectors of greater
than 30K elements.
> You will find in Chris Okasaki's thesis/book several implementations
> of catenable lists and many references. One of them embedds a queue in
> a tree which seems to be what you are looking for (head in O(1),
> append in O(1))
O(1), or O(log n)? Most tree operations are O(log n).
This is one of the things I considered, implementing the sparse vectors as
trees- maps, effectively. The problem is that most of the time I want to
walk in-order through all non-zero elements of the vector. With a
tree/map, this is O(n log n). With a list, it's O(n).
I don't see what the resistance here is. Were I jumping up and down
demanding someone else do the work, I could see the response being "it's
not worth it to us". Which is why I'm doing the work myself. There are,
from your point of view, two possibilities:
1) I succeed. In which case, a new set of behaviors become efficient.
Since you weren't using those behaviors anyways, you don't care- as
nothing else changes. Note that I am not proposing to change the
semantics of the language.
2) I fail. In which case nothing changes. This includes I come up
with something which breaks other stuff, at which point Xavier & co. go
"Thanks, but no thanks".
Brian
-------------------
To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr
Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
next prev parent reply other threads:[~2003-01-31 19:43 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2003-01-24 0:48 Brian Hurt
2003-01-30 18:10 ` Olivier Andrieu
2003-01-30 19:46 ` Brian Hurt
2003-01-30 20:52 ` Olivier Andrieu
2003-01-30 21:57 ` Brian Hurt
2003-01-31 2:16 ` james woodyatt
2003-01-31 17:05 ` Diego Olivier Fernandez Pons
2003-01-31 19:52 ` Brian Hurt [this message]
2003-02-01 10:18 ` Linear systems (was Re: [Caml-list] @, List.append, and tail recursion) Diego Olivier Fernandez Pons
2003-01-31 21:34 ` [Caml-list] @, List.append, and tail recursion Issac Trotts
2003-01-31 17:13 ` Brian Hurt
2003-01-31 17:42 ` brogoff
2003-01-31 19:18 ` Russ Ross
2003-01-31 19:32 ` Alexander V. Voinov
2003-02-01 2:30 ` brogoff
2003-01-31 23:12 ` Issac Trotts
2003-01-24 15:35 Andrew Kennedy
2003-01-30 1:44 ` brogoff
2003-01-30 9:57 ` Christophe Raffalli
2003-01-30 16:03 ` Brian Hurt
2003-01-31 10:33 ` Mattias Waldau
2003-01-31 17:32 Diego Olivier Fernandez Pons
2003-01-31 19:58 Harrison, John R
2003-01-31 21:04 ` Brian Hurt
2003-01-31 22:27 Harrison, John R
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Pine.LNX.4.33.0301311126450.3577-100000@eagle.ancor.com \
--to=brian.hurt@qlogic.com \
--cc=Diego-Olivier.FERNANDEZ-PONS@cicrp.jussieu.fr \
--cc=caml-list@inria.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox