From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: * X-Spam-Status: No, score=1.8 required=5.0 tests=AWL,DNS_FROM_RFC_POST, SPF_NEUTRAL autolearn=disabled version=3.1.3 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by yquem.inria.fr (Postfix) with ESMTP id C3CF0BBAF for ; Tue, 5 May 2009 11:41:32 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApkBAIum/0nRVciuimdsb2JhbACWLT8BAQEKCQwHDwWmf4EJj1wBAwEDg3oF X-IronPort-AV: E=Sophos;i="4.40,296,1238968800"; d="scan'208";a="39467200" Received: from wf-out-1314.google.com ([209.85.200.174]) by mail4-smtp-sop.national.inria.fr with ESMTP; 05 May 2009 11:41:32 +0200 Received: by wf-out-1314.google.com with SMTP id 26so4060629wfd.0 for ; Tue, 05 May 2009 02:41:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=34xLdtOQyftXP9I7GTTLmP+ZJE15ysVD5JRvkFst354=; b=YPpcGlNF+ymq266/37JrZF5kShLXIuQyvvsQJXeBVmhaiLEwPPJmj6D1NV3bckG1Ag bHqiUXqkYPGJ6eerm5ece2rraUdBVSJccTHJ9OuqQJjp2aFcnkpKIRvmtMk3oV9vfAkn V6Rqfq8hfyGAS1aOAjZkID+VoLO9Spm6cuCQY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; b=ruNNmeBRDXc3Np1qjtbnvKyz9BNGXBVDpYmqLmBQBegWqmGfLQNJU/HFO2PLuwdu2d NwuNUxmUKtBFGO3sYQ85l4EqFljz6crik2p8Olzmf7QX1Gi5jIyO8wGNP8xcfTTJ3ynk 2usdAexFNydBbRQOEH7YVEUOHowA0PFWEq5sc= MIME-Version: 1.0 Received: by 10.143.43.7 with SMTP id v7mr3011720wfj.194.1241516490583; Tue, 05 May 2009 02:41:30 -0700 (PDT) In-Reply-To: <4A0005C8.8010609@inria.fr> References: <90823c940904281236x61204451nac149ee15b5df73a@mail.gmail.com> <4A0005C8.8010609@inria.fr> Date: Tue, 5 May 2009 13:41:30 +0400 Message-ID: <90823c940905050241y11f012e5xee8316e3e4337ff9@mail.gmail.com> Subject: Re: [Caml-list] Ocamlopt code generator question From: Dmitry Bely To: Caml List Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Spam: no; 0.00; ocamlopt:01 asmcomp:01 mlp:01 ofs:01 ofs:01 pointer:01 ocamlopt:01 unboxed:01 model:01 sqrt:01 ocaml:01 cvs:01 2009:98 eliminates:98 wrote:01 On Tue, May 5, 2009 at 1:24 PM, Xavier Leroy wrote: >> For amd64 we have in asmcomp/amd64/proc_nt.mlp: >> >> (* =A0xmm0 - xmm15 =A0100 - 115 =A0 =A0 =A0 xmm0 - xmm9: Caml function a= rguments >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0xmm0 - xm= m3: C function arguments >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0xmm0: Cam= l and C function results >> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0xmm6-xmm1= 5 are preserved by C *) >> >> let loc_arguments arg =3D >> =A0calling_conventions 0 9 100 109 outgoing arg >> let loc_parameters arg =3D >> =A0let (loc, ofs) =3D calling_conventions 0 9 100 109 incoming arg in lo= c >> let loc_results res =3D >> =A0let (loc, ofs) =3D calling_conventions 0 0 100 100 not_supported res = in loc >> >> What these first_float=3D100 and last_float=3D109 for loc_arguments and >> loc_parameters affect? My impression is that floats are always passed >> boxed, so xmm registers are in fact never used to pass parameters. And >> float values are returned as a pointer in eax, not a value in xmm0 as >> loc_results would suggest. > > The ocamlopt code generators support unboxed floats as function > parameters and results, as well as returning multiple results in > several registers. =A0(Except for the x86-32 bits port, because of the > weird floating-point model of this architecture.) =A0You're right that > the ocamlopt "middle-end" does not currently take advantage of this > possibility, since floats are passed between functions in boxed state. I see. Why I asked this: trying to improve floating-point performance on 32-bit x86 platform I have merged floating-point SSE2 code generator from amd64 ocamlopt back end to i386 one, making ia32sse2 architecture. It also inlines sqrt() via -ffast-math flag and slightly optimizes emit_float_test (usually eliminates an extra jump) - features that are missed in the original amd64 code generator. All this seems to work OK: beyond my own code all tests found in Ocaml CVS test directory are passed. Of course this is idea is not new - you had working IA32+SSE2 back end several years ago [1] but unfortunately never released it to the public. Is this of any interest to anybody? - Dmitry Bely [1] http://caml.inria.fr/pub/ml-archives/caml-list/2003/03/e0db2f3f54ce19e4= bad589ffbb082484.fr.html