From: Brian Hurt <bhurt@spnz.org>
To: Tato Thetza <thetza@sent.com>
Cc: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] Re: OCaml efficiency/optimization?
Date: Mon, 31 Oct 2005 18:29:50 -0600 (CST)
Message-ID: <Pine.LNX.4.63.0510311751150.9530@localhost.localdomain>
In-Reply-To: <1130495033.8413.246225066@webmail.messagingengine.com>
On Fri, 28 Oct 2005, Tato Thetza wrote:
> sorry I meant
> Ocaml for Experienced Programmers:
> http://www.bogonomicon.com/bblog/book/html/book1.html
>
> On Fri, 28 Oct 2005 03:21:43 -0700, "Tato Thetza" <thetza@sent.com>
> said:
>> I've been reading over
>> http://caml.inria.fr/pub/docs/manual-ocaml/index.html and have learned
>> two things:
>> -lists are immutable and singly linked, which explains why 1::[2;3] is
>> valid while [2;3]::1 is not, and why it's efficient.
>> -the proper way to ensure tail-recursive optimization
>>
>> question: are these and other optimizations documented somewhere
>> officially? I find it a little uncomfortable I've been learning OCaml
>> without knowing such internal details. Any secrets I should definitely
>> know if I were to use this language in production?
Answering the original question (somewhat belatedly), I present Brian's
steps to optimizing code:
1) Get it to work. I don't care how fast your program returns the wrong
answer, get it returning the right answer in all cases before doing
anything else. This also includes making the code maintainable- if you
can't figure out how the code works, you won't be able to figure out how
to make the code work faster. And in general, optimizers work better on
clean, maintainable code- generally maintainable code is also faster code.
2) Measure. Is the code fast enough? If so, you can stop now.
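For step 2, a minimal sketch of what "measure" can look like, assuming CPU
time from Sys.time is a good enough clock for the job. The helper time_it and
the workload sum_to are names invented here for illustration, not anything
from the original post:

```ocaml
(* Hypothetical helper: run a thunk, report the CPU time it took
   (as measured by Sys.time), and return the result with the timing. *)
let time_it (label : string) (f : unit -> 'a) : 'a * float =
  let start = Sys.time () in
  let result = f () in
  let elapsed = Sys.time () -. start in
  Printf.printf "%s: %.6f s\n" label elapsed;
  (result, elapsed)

(* Example workload, invented for illustration: sum the integers 1..n. *)
let sum_to n =
  let rec go acc i = if i > n then acc else go (acc + i) (i + 1) in
  go 0 1

let (total, seconds) = time_it "sum_to 1_000_000" (fun () -> sum_to 1_000_000)
```

Sys.time reports processor time, not wall-clock time, so a program that
mostly waits on I/O or a database would need a different clock.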
3) Optimize. This includes compiling to native code (ocamlopt), and using
ocamldefun, the OCaml Defunctorizer. This is stuff that is "easy" to do
(comparatively), and can have dramatic effects on performance.
4) Buy a faster computer. No, really- what's the programmer time worth?
From here on out it'll start being really easy to rack up tens, if not
hundreds, of thousands of dollars of programmer time improving the code.
At which point simply buying a faster machine might be significantly
cheaper. This isn't always the case, but generally it is- you definitely
shouldn't assume it isn't the case.
5) Profile. If the code isn't fast enough, what's taking the time?
Making a program 10x faster can be completely worthless- if the program
takes 100ms to fire off a database query that takes 15 minutes to
complete, making that program only take 10ms to fire off the same database
query isn't going to help. Also, optimizing initialization loops is
generally not useful. Where is the time going?
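The database example in step 5 can be worked through as plain arithmetic.
This sketch uses only the figures from that example (100 ms of program time,
a 15-minute query); the name total_after is made up for illustration:

```ocaml
(* Figures taken from the example in step 5. *)
let program_s = 0.100        (* 100 ms spent in our own code *)
let query_s = 15.0 *. 60.0   (* 15-minute database query, in seconds *)

(* Hypothetical helper: end-to-end time if only our own code
   becomes `factor` times faster and the query is unchanged. *)
let total_after (factor : float) : float = program_s /. factor +. query_s

let before = total_after 1.0
let after = total_after 10.0
let overall_speedup = before /. after

let () =
  Printf.printf "before: %.2f s, after: %.2f s, overall speedup: %.5fx\n"
    before after overall_speedup
```

The 10x speedup of the program saves about 90 ms out of roughly 900 seconds,
an overall speedup of about 1.0001x. The profile has to say where the time
goes before any optimization is worth doing.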
6) Look for large-scale optimizations. Are you doing things you don't
need to do? No operation completes faster than the operation you don't
do. What is the big-O notation of your core algorithms? Replacing an
O(n^2) algorithm with an O(n log n) algorithm can yield huge performance
gains (10,000x or more). This is where maintainable code comes in real
handy. This is also the point where I consider adding imperative code to
my program, replacing O(log N) tree lookups with O(1) array lookups. Note
that this is an *optimization*- applicative data structures are more
maintainable (IMHO), and thus should be the default approach taken until
it is proven that performance requirements demand imperative data
structures. But I'd definitely go through Okasaki's book (Purely
Functional Data Structures) before making that decision.
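As a small illustration of the big-O point in step 6, here is a sketch of
the same question ("does this list contain a duplicate?") answered two ways.
The problem and both function names are chosen here for illustration only:

```ocaml
(* O(n^2): compare every element against every later element. *)
let rec has_dup_quadratic = function
  | [] -> false
  | x :: rest -> List.mem x rest || has_dup_quadratic rest

(* O(n log n): sort once, then a single pass over adjacent pairs. *)
let has_dup_sorted xs =
  let rec adjacent = function
    | a :: (b :: _ as tl) -> a = b || adjacent tl
    | _ -> false
  in
  adjacent (List.sort compare xs)
```

Both return the same answers; the difference only shows up as the input
grows, which is why measuring on realistic input sizes matters.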
One comment here: implicit in this step is you being conversant with the
current literature- which generally means doing new searches on a regular
basis. You might think "Hey, I've been using red-black trees for decades,
and it's not like they're exactly new technology. Why do a literature
search on them?" But you'd miss this paper, which was only published six
years ago:
http://citeseer.ist.psu.edu/hinze99constructing.html
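To make the tree-versus-array trade-off from step 6 concrete, here is a
sketch of the same lookup table built both ways. It assumes the keys are
the dense integers 0..n-1, which is the case where an array can stand in
for a tree at all; the table contents and all names are invented here:

```ocaml
(* Applicative map from the standard library (a balanced binary tree). *)
module IntMap = Map.Make (struct
  type t = int
  let compare (a : int) (b : int) = compare a b
end)

let n = 1000

(* Applicative version: each lookup walks the tree, O(log n). *)
let squares_map =
  let rec build m i =
    if i >= n then m else build (IntMap.add i (i * i) m) (i + 1)
  in
  build IntMap.empty 0

(* Imperative version: O(1) indexed lookup into a flat array. *)
let squares_arr = Array.init n (fun i -> i * i)

let lookup_map k = IntMap.find k squares_map   (* raises Not_found if absent *)
let lookup_arr k = squares_arr.(k)             (* raises Invalid_argument if out of bounds *)
```

The map version keeps every earlier version of the table valid after an
update, which is part of why it is the easier one to maintain.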
7) Measure again. Actually, I take that back- you should be measuring
constantly (to show that you are, in fact, making things go faster), and
profiling regularly, from here on out.
8) Now you're into the tweak stage. Generally (not always, but
generally), using recursive functions and arguments, if there are fewer
than five arguments, will be faster than loops and references (the
arguments will just live in registers, while the reference assignments
have to get written out to memory). Replace tuples of all floats with
records (structures) of all floats, and unbox the floats where you can.
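A sketch of the two tweaks in step 8, with made-up function and type names.
Whether the recursive form actually wins depends on the compiler version and
the surrounding code, so treat this as something to measure, not a rule:

```ocaml
(* Loop with a reference: the accumulator is a mutable cell. *)
let sum_loop (a : float array) : float =
  let total = ref 0.0 in
  for i = 0 to Array.length a - 1 do
    total := !total +. a.(i)
  done;
  !total

(* Tail-recursive form: accumulator and index are plain arguments. *)
let sum_rec (a : float array) : float =
  let n = Array.length a in
  let rec go acc i = if i >= n then acc else go (acc +. a.(i)) (i + 1) in
  go 0.0 0

(* A tuple of floats holds a pointer to a separately boxed float per field. *)
type point_tuple = float * float * float

(* A record whose fields are all floats is stored as one flat block of floats. *)
type point_record = { x : float; y : float; z : float }

let norm2_tuple ((x, y, z) : point_tuple) : float =
  x *. x +. y *. y +. z *. z

let norm2_record (p : point_record) : float =
  p.x *. p.x +. p.y *. p.y +. p.z *. p.z
```

Both sums add the elements in the same order, so they return identical
results; only the generated code differs.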
9) If it still isn't fast enough, rewrite core routines in C, or even
assembly language. But these are Hail Mary passes of optimization- if
they aren't enough, you're screwed.
Brian