From: Dmitry Bely <dmitry.bely@gmail.com>
To: Xavier Leroy <Xavier.Leroy@inria.fr>, Caml List <caml-list@inria.fr>
Subject: Re: [Caml-list] Ocamlopt x86-32 and SSE2
Date: Mon, 11 May 2009 11:55:41 +0400
Message-ID: <90823c940905110055y24e3589et89edd4779b41edb0@mail.gmail.com>
In-Reply-To: <4A0407A9.4000009@inria.fr>
On Fri, May 8, 2009 at 2:21 PM, Xavier Leroy <Xavier.Leroy@inria.fr> wrote:
>> I see. The reason I asked: in trying to improve floating-point performance
>> on the 32-bit x86 platform, I have merged the floating-point SSE2 code
>> generator from the amd64 ocamlopt back end into the i386 one, creating an
>> ia32sse2 architecture. It also inlines sqrt() via the -ffast-math flag and
>> slightly optimizes emit_float_test (usually eliminating an extra jump),
>> features that are missing from the original amd64 code generator.
>
> You just passed black belt in OCaml compiler hacking :-)
Thank you, sensei :-)
>> Is this of any interest to anybody?
>
> I'm definitely interested in the potential improvements to the amd64
> code generator.
>
> Concerning the i386 code generator (x86 in 32-bit mode), SSE2 float
> arithmetic does improve performance and fits ocamlopt's compilation
> model much better than the current x87 float arithmetic, which is a
> bit of a hack. Several options can be considered:
>
> 1- Have an additional "ia32sse2" port of ocamlopt in parallel with the
> current "i386" port.
>
> 2- Declare pre-SSE2 processors obsolete and convert the current
> "i386" port to always use SSE2 float arithmetic.
>
> 3- Support both x87 and SSE2 float arithmetic within the same i386
> port, with a command-line option to activate SSE2, like gcc does.
>
> I'm really not keen on approach 1. We have too many ports (and
> their variants for Windows/MSVC) already. Moreover, I suspect
> packagers would stick to the i386 port for compatibility with old
> hardware, and most casual users would, too, out of laziness, so this
> hypothetical "ia32sse2" port would receive little testing.
>
> Approach 2 is tempting for me because it would simplify the x86-32
> code generator and remove some historical cruft. The issue is that it
> demands a processor that implements SSE2. For a list of processors, see
> http://en.wikipedia.org/wiki/SSE2
> As a rule of thumb, almost all desktop PCs bought since 2004 have SSE2,
> as do almost all notebooks bought since 2006. That should be OK for
> professional users (it's nearly impossible to purchase maintenance
> beyond 3 years, anyway) and serious hobbyists. However, packagers are
> going to be very unhappy: Debian still lists i486 as its bottom line;
> for Fedora, it's Pentium or Pentium II; for Windows, it's "a 1GHz
> processor", meaning Pentium III. All these processors lack SSE2
> support. Only MacOS X has been SSE2-compatible from the start.
>
> Approach 3 is probably the best from a user's point of view. But it's
> going to complicate the code generator: the x87 cruft would still be
> there, and new cruft would need to be added to support SSE2. Code
> compiled with the SSE2 flag could link with code compiled without,
> provided the SSE2 registers are not used for parameter and result
> passing. But as Dmitry observed, this is already the case in the
> current ocamlopt compiler.
I am curious whether passing unboxed floats is possible in the current
OCaml data model.

As for the proposed options, I tend to vote for #3 (and would implement
it if there is consensus). There is still plenty of low-power/embedded
x86 hardware that does not support SSE2. And one would be able to
compare the performance of the x87 and SSE2 backends to convince oneself
that the game really is worth the candle :-)
- Dmitry Bely
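
For readers who have not used the gcc behaviour that option 3 alludes to,
here is a rough sketch in C, not ocamlopt code. With gcc on 32-bit x86,
x87 is the default and SSE2 float arithmetic is requested explicitly
(for example with -msse2 -mfpmath=sse); gcc and clang then define the
__SSE2__ macro. The names fpu_mode, select_fpu_mode and can_run_on are
hypothetical and exist only for this illustration; the CPU check relies
on the gcc/clang builtin __builtin_cpu_supports and simply reports 0 on
other compilers or architectures.

```c
#include <assert.h>

/* Hypothetical names, for illustration only (not from ocamlopt or gcc). */
typedef enum { FPU_X87, FPU_SSE2 } fpu_mode;

/* 1 if this file was compiled with SSE2 code generation enabled.
   gcc and clang define __SSE2__ in that case. */
int compiled_with_sse2(void) {
#ifdef __SSE2__
    return 1;
#else
    return 0;
#endif
}

/* 1 if the CPU we are running on reports SSE2 support.
   Assumption: gcc or clang on x86; anything else reports 0. */
int cpu_has_sse2(void) {
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
    return 0;
#endif
}

/* Option 3 as a policy: x87 stays the default, SSE2 is opt-in. */
fpu_mode select_fpu_mode(int sse2_flag_given) {
    return sse2_flag_given ? FPU_SSE2 : FPU_X87;
}

/* Code built for x87 runs on any x86 CPU with an FPU;
   code built for SSE2 needs a CPU that implements SSE2. */
int can_run_on(fpu_mode mode, int cpu_sse2) {
    assert(mode == FPU_X87 || mode == FPU_SSE2);
    return mode == FPU_X87 || cpu_sse2;
}
```

A packager targeting old hardware would keep select_fpu_mode(0), while a
user who knows the target machine could check
can_run_on(select_fpu_mode(1), cpu_has_sse2()) before opting in.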