From mboxrd@z Thu Jan 1 00:00:00 1970 Received: (from majordomo@localhost) by pauillac.inria.fr (8.7.6/8.7.3) id WAA25326; Mon, 20 Jan 2003 22:23:26 +0100 (MET) X-Authentication-Warning: pauillac.inria.fr: majordomo set sender to owner-caml-list@pauillac.inria.fr using -f Received: from concorde.inria.fr (concorde.inria.fr [192.93.2.39]) by pauillac.inria.fr (8.7.6/8.7.3) with ESMTP id WAA25408 for ; Mon, 20 Jan 2003 22:23:25 +0100 (MET) Received: from TheWorld.com (pcls3.std.com [199.172.62.105]) by concorde.inria.fr (8.11.1/8.11.1) with ESMTP id h0KLNOr10569 for ; Mon, 20 Jan 2003 22:23:24 +0100 (MET) Received: from watergate.world.std.com (pool-141-149-175-233.bos.east.verizon.net [141.149.175.233]) by TheWorld.com (8.9.3/8.9.3) with ESMTP id QAA14464 for ; Mon, 20 Jan 2003 16:23:23 -0500 Message-Id: <5.2.0.9.0.20030120161510.066e4598@pop.theWorld.com> X-Sender: chase@pop.theWorld.com X-Mailer: QUALCOMM Windows Eudora Version 5.2.0.9 Date: Mon, 20 Jan 2003 16:23:22 -0500 To: caml-list@inria.fr From: David Chase Subject: Re: Coyote Gulch test in Caml (was Re: [Caml-list] speed ) In-Reply-To: <20030118155054.C22850@speakeasy.org> References: <200301181749.48295.oleg_inconnu@myrealbox.com> <3E15B3B3.3040106@163.com> <20030103071042.T22850@speakeasy.org> <20030104193118.A26208@pauillac.inria.fr> <200301181749.48295.oleg_inconnu@myrealbox.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Sender: owner-caml-list@pauillac.inria.fr Precedence: bulk At 03:50 PM 1/18/2003 -0800, Shawn Wagner wrote: >On Sat, Jan 18, 2003 at 05:49:47PM -0500, Oleg wrote: >> On Saturday 04 January 2003 01:31 pm, Xavier Leroy wrote: >> > Apparently, the ocamlopt-generated code >> > offers less instruction-level parallelism than the g++-generated code >> > for the float computations. Still, I haven't really understood where >> > the factor of 2 comes from. > >> Just as wild guess: the code contains calls to "sin" and "cos" on the same >> value. Perhaps GCC manages to optimize those into one call to "sincos" > >It doesn't. I tried making a C++ version that does when I was fooling around >with it. Didn't really help. The single greatest speed increase I got (Which >did something like cut the runtime in half) was -ffast-math, which cuts out >the trig function calls in favor of direct use of the proper x86 >instructions. But the inlined-sincos (__sincos()) in glibc causes a segfault >on my athlon when I tried using it. :P Just a silly question, but if you want sin and cos to go faster, how much accuracy are you willing to trade away for improved performance? Just for example, by using the Pentium instructions, you reduce the number of (accurate) significant bits in the result from 53 (IEEE double) to 13 (for some inputs between zero and 2*PI). (If you are using 64-bit mantissas, the worst case is only 4 bits of accuracy.) I've noticed over the years that people focus on speed over many other things, usually because they can measure it. Well, we can measure accuracy of transcendental functions, too, so I thought I would ask the question. How much is enough for your application? Of the languages being benchmarked, which one has the most accurate transcendental functions? Is this less important than speed? David Chase ------------------- To unsubscribe, mail caml-list-request@inria.fr Archives: http://caml.inria.fr Bug reports: http://caml.inria.fr/bin/caml-bugs FAQ: http://caml.inria.fr/FAQ/ Beginner's list: http://groups.yahoo.com/group/ocaml_beginners