From: Jon Harrop <jon@ffconsultancy.com>
To: caml-list@yquem.inria.fr
Subject: Re: [Caml-list] MetaOcaml and high-performance [was: AST versus Ocaml]
Date: Tue, 10 Nov 2009 15:38:54 +0000 [thread overview]
Message-ID: <200911101538.54857.jon@ffconsultancy.com> (raw)
In-Reply-To: <20091109042328.9330E1727F@Adric.ern.nps.edu>
On Monday 09 November 2009 04:23:28 oleg@okmij.org wrote:
> Because offshoring produces a portable C or Fortran code file, you can
> use the code on 32 or 64-bit platform. The reason the native MetaOCaml
> without offshoring does not work on amd64 is because at that time
> OCaml didn't emit PIC code for amd64. So, dynamic linking was
> impossible. That problem has long been fixed in later versions of
> OCaml...
Has the problem been fixed in MetaOCaml?
> Fortunately, some people have considered MetaOCaml to be a viable
> option for performance users and have reported good results. For
> example,
>
> Tuning MetaOCaml Programs for High Performance
> Diploma Thesis of Tobias Langhammer.
> http://www.infosun.fmi.uni-passau.de/cl/arbeiten/Langhammer.pdf
>
> Here is a good quotation from the Introduction:
>
> ``This thesis proposes MetaOCaml for enriching the domain of
> high-performance computing by multi-staged programming. MetaOCaml extends
> the OCaml language.
> ...
> Benchmarks for all presented implementations confirm that the
> execution time can be reduced significantly by high-level
> optimizations. Some MetaOCaml programs even run as fast as respective
> C implementations. Furthermore, in situations where optimizations in
> pure MetaOCaml are limited, computation hotspots can be explicitly or
> implicitly exported to C. This combination of high-level and low-level
> techniques allows optimizations which cannot be obtained in pure C
> without enormous effort.''
That thesis contains three benchmarks:
1. Dense float matrix-matrix multiply.
2. Blur of an int image matrix as convolution with a 3x3 stencil matrix.
3. Polynomial multiplication with distributed parallelism.
I don't know about polynomial multiplication (suffice it to say that it does not
leverage shared-memory parallelism, which is what performance users value in
today's multicore era), but the code for the first two benchmarks is probably
10-100x slower than any decent implementation. For example, his fastest
2048x2048 matrix multiply takes 167s, whereas Matlab takes only 3.6s here.
In essence, the performance gain (if any) from offshoring to C or Fortran is
dwarfed by the lack of shared-memory parallelism.
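[For context, the single-threaded baseline being benchmarked is essentially the classic triple-loop dense multiply. The sketch below is a minimal plain-OCaml illustration of that baseline — not the thesis's staged code — using the i-k-j loop order, which already gives far better cache behavior than the textbook i-j-k order, and is still nowhere near a tuned BLAS.]

```ocaml
(* Naive O(n^3) dense float matrix multiply: a minimal sketch of the
   kind of single-threaded baseline under discussion. The i-k-j loop
   order keeps the inner loop walking rows contiguously. *)
let matmul (a : float array array) (b : float array array) : float array array =
  let n = Array.length a in
  let c = Array.make_matrix n n 0.0 in
  for i = 0 to n - 1 do
    for k = 0 to n - 1 do
      let aik = a.(i).(k) in
      for j = 0 to n - 1 do
        c.(i).(j) <- c.(i).(j) +. aik *. b.(k).(j)
      done
    done
  done;
  c

let () =
  let a = [| [| 1.0; 2.0 |]; [| 3.0; 4.0 |] |] in
  let b = [| [| 5.0; 6.0 |]; [| 7.0; 8.0 |] |] in
  let c = matmul a b in
  assert (c = [| [| 19.0; 22.0 |]; [| 43.0; 50.0 |] |])
```

An optimized implementation (ATLAS, a vendor BLAS, or Matlab's backend) adds blocking for cache, vectorization, and multiple threads on top of this, which is where the 40x-plus gap comes from.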
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e
Thread overview (2+ messages):
2009-11-09  4:23 oleg
2009-11-10 15:38 ` Jon Harrop [this message]