From: Brian Hurt
Date: Thu, 10 Jul 2008 23:01:31 -0400 (EDT)
To: Gerd Stolpmann
Cc: J C, caml-list@yquem.inria.fr
Subject: Re: [Caml-list] thousands of CPU cores
In-Reply-To: <1215717356.24773.17.camel@flake.lan.gerd-stolpmann.de>

On Thu, 10 Jul 2008, Gerd Stolpmann wrote:

> I wouldn't take this article too seriously. It's just speculation.

I would take the article seriously.

> Just open up your mind to this perspective: It's a big risk for the CPU
> vendors to haven taken the direction to multi-core.

*Precisely*.
It also stands in stark contrast to the last 50 or so years of CPU development, which focused on making single-threaded code faster. And, I note, it's not just one CPU manufacturer who has done this (which could be chalked up to stupid management or stupid engineers)- but *every* CPU manufacturer. And what do they get out of it, other than ticked-off software developers grumbling about having to switch to multithreaded code?

I can only see one explanation: they had no choice. They couldn't make single-threaded code any faster by throwing more transistors at the problem. We've maxed out speculative execution and instruction-level parallelism, pushed cache out well past the point of diminishing returns, and added all the execution units that'll ever be used. What more can be done? The only way left to increase speed is multicore.

And you still have the steady drum beat of Moore's law- which, by the way, only guarantees that the number of transistors per square millimeter doubles every so often (I think the current number is 2 years). So we have the new process which gives us twice the number of transistors as the old process, but nothing we can use those new transistors on to make single-threaded code go any faster. So you might as well give them two cores where they used to have one.

At this point, there are only two things that can prevent kilo-core chips: one, some bright boy could come up with some way to use those extra transistors to make single-threaded code go faster, or two, Moore's law could "expire" within the next 16 years. We're at quad-core already; another 8 doublings, one every 2 years, with all doublings spent on more cores, gets us to 1024 cores.

I think it'll be worse than this, actually, once it gets going. The Pentium III (single core) was 9.5 million transistors, while the Core Duo was 291 million. Even accounting for the 2 cores and some cost to put them together, you're looking at the P-III being about 1/16th the size of a Core.
If put on the same process so the P-III runs at more or less the same clock speed, how much slower would the P-III be? 1/10th? 1/2? 90% the speed of the Core? So long as it's above 1/16th the speed, you're ahead. If your code can scale that far, it's worthwhile to have 32 P-III cores instead of the dual-core Core Duo- it's faster. Yes, there are limits to this (memory bandwidth springs to mind), but the point is that more, simpler cores could be a performance win, making the core count grow faster than Moore's law alone would dictate. If we decide to go to P-III cores instead of Core cores, we could have 1024-core chips coming out in 8 years (4 doublings, not 8, to go from 4x16=64 cores to 1024 cores).

And remember, if Moore's law holds out for another 8 years after we hit 1K cores, that's another 4 doublings, and we're looking at CPUs with 16K cores- literally, tens of thousands of cores.

If Moore's law doesn't hold up, that's going to be a different, and much larger and smellier, pile of fecal matter to hit the rotary air impeller.

Brian
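
P.S. For anyone who wants to check the arithmetic above, here is a rough sketch in Python. The function names are made up for illustration. The assumptions are the ones already stated: one doubling every 2 years, every doubling spent on more cores, and the transistor ratio of about 30.6 rounded up to 32 P-III-sized cores per dual-core chip (16 per big core). It says nothing about memory bandwidth or how well real code scales.

```python
DOUBLING_PERIOD_YEARS = 2  # assumption from the text: one doubling every 2 years


def doublings_needed(start_cores, target_cores):
    """Count how many doublings take start_cores to at least target_cores."""
    doublings = 0
    cores = start_cores
    while cores < target_cores:
        cores *= 2
        doublings += 1
    return doublings


def years_needed(start_cores, target_cores):
    """Years to reach target_cores if every doubling goes into more cores."""
    return doublings_needed(start_cores, target_cores) * DOUBLING_PERIOD_YEARS


# Transistor counts as quoted in the text.
P3_TRANSISTORS = 9.5e6
CORE_DUO_TRANSISTORS = 291e6
raw_ratio = CORE_DUO_TRANSISTORS / P3_TRANSISTORS  # about 30.6 P-IIIs per dual-core chip

# Rounded as in the text: 32 small cores per dual-core chip, so 16 per big core.
SMALL_CORES_PER_BIG_CORE = 16

# Scenario 1: keep big cores, starting from a quad-core chip.
big_core_years = years_needed(4, 1024)  # 8 doublings, 16 years

# Scenario 2: swap each big core for 16 small ones first, then keep doubling.
small_core_start = 4 * SMALL_CORES_PER_BIG_CORE          # 64 cores
small_core_years = years_needed(small_core_start, 1024)  # 4 doublings, 8 years

# Break-even: with perfectly scalable code, 16 small cores beat one big core
# as long as each small core runs at more than 1/16th of the big core's speed.
break_even_speed = 1 / SMALL_CORES_PER_BIG_CORE

print(f"P-IIIs per Core Duo by transistor count: {raw_ratio:.1f}")
print(f"big cores:   4 -> 1024 in {big_core_years} years")
print(f"small cores: {small_core_start} -> 1024 in {small_core_years} years")
print(f"break-even relative speed per small core: {break_even_speed}")
```

The loop counts doublings with integers rather than taking a logarithm, just to keep floating-point rounding out of it.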