From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on yquem.inria.fr X-Spam-Level: X-Spam-Status: No, score=0.1 required=5.0 tests=AWL autolearn=disabled version=3.1.3 Received: from mail1-relais-roc.national.inria.fr (mail1-relais-roc.national.inria.fr [192.134.164.82]) by yquem.inria.fr (Postfix) with ESMTP id 5BC14BBCA for ; Fri, 9 May 2008 13:10:55 +0200 (CEST) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AsIAAOfNI0jU4368mmdsb2JhbACBU5A1AQEBAQEIBQgHEZhwAQ X-IronPort-AV: E=Sophos;i="4.27,460,1204498800"; d="scan'208";a="12031903" Received: from moutng.kundenserver.de ([212.227.126.188]) by mail1-smtp-roc.national.inria.fr with ESMTP; 09 May 2008 13:10:54 +0200 Received: from gate.lan.gerd-stolpmann.de (dslb-084-059-099-212.pools.arcor-ip.net [84.59.99.212]) by mrelayeu.kundenserver.de (node=mrelayeu5) with ESMTP (Nemesis) id 0ML25U-1JuQV321e5-00078E; Fri, 09 May 2008 13:10:54 +0200 Received: from [192.168.0.32] (fw.lan.gerd-stolpmann.de [192.168.1.1]) by gate.lan.gerd-stolpmann.de (Postfix) with ESMTP id 33FF8C074; Fri, 9 May 2008 13:10:53 +0200 (CEST) Subject: Re: [Caml-list] Re: Why OCaml rocks From: Gerd Stolpmann To: Jon Harrop Cc: caml-list@yquem.inria.fr In-Reply-To: <200805090609.36123.jon@ffconsultancy.com> References: <200805090139.54870.jon@ffconsultancy.com> <74cabd9e0805082145p120ce487h6c1194d87f3f8396@mail.gmail.com> <200805090609.36123.jon@ffconsultancy.com> Content-Type: text/plain Date: Fri, 09 May 2008 13:12:06 +0200 Message-Id: <1210331526.17578.32.camel@flake.lan.gerd-stolpmann.de> Mime-Version: 1.0 X-Mailer: Evolution 2.12.1 Content-Transfer-Encoding: 7bit X-Provags-ID: V01U2FsdGVkX19XiTgUkXTlHRqzj/80KSN98ULlnQA1gDzet7O SV5uPJbz4evzh5enoTO6FnxxwREKb021uk1NcHR0V8wnZucCdp YQVCB62AZ1TBTSCIW/HYGRDaaw1A52X X-Spam: no; 0.00; ocaml:01 gerd:01 stolpmann:01 parallelism:01 scalable:01 jocaml:01 haskell:01 haskell's:01 parallelism:01 o'caml:01 o'caml:01 mutable:01 mutable:01 ocaml:01 gerd:01 Am Freitag, den 09.05.2008, 06:09 +0100 schrieb Jon Harrop: > On Friday 09 May 2008 05:45:53 you wrote: > > On Thu, May 8, 2008 at 5:39 PM, Jon Harrop wrote: > > > 1. Lack of Parallelism: Yes, this is already a complete show stopper. > > > Exploiting multicores requires a scalable concurrent GC and message > > > passing (like JoCaml) is not a substitute. Unfortunately, this is now > > > true of all functional languages available for Linux, which is why we > > > have now migrated entirely to Windows and F#. I find it particularly > > > ironic that the Haskell community keep hyping the multicore capabilities > > > of pure code when the rudimentary GC in Haskell's only usable > > > implementation already stopped scaling. > > > > Fork? For something like a raytracer, I do not see how threads would be > > any more useful than fork. I think the parallelism capabilities are already excellent. We have been able to implement the application backend of Wink's people search in O'Caml, and it is of course a highly parallel system of programs. This is not the same class raytracers or desktop parallelism fall into - this is highly professional supercomputing. I'm talking about a cluster of ~20 computers with something like 60 CPUs. Of course, we did not use multithreading very much. We are relying on multi-processing (both "fork"ed style and separately started programs), and multiplexing (i.e. application-driven micro-threading). I especially like the latter: Doing multiplexing in O'Caml is fun, and a substitute for most applications of multithreading. For example, you want to query multiple remote servers in parallel: Very easy with multiplexing, whereas the multithreaded counterpart would quickly run into scalability problems (threads are heavy-weight, and need a lot of resources). > There are two problems with that: > > . You go back to manual memory management between parallel threads/processes. I guess you refer to explicit references between processes. This is a kind of problem, and best handled by avoiding it. We have some cases where we have to keep remote state. The solution was to have a timer, and delete it after some time of not accessing it. After all, most state is only temporary, and if it is lost, it can be created again (at some cost, of course). > . Parallelism is for performance and performance requires mutable data > structures. In our case, the mutable data structures that count are on disk. Everything else is only temporary state. I admit that it is a challenge to structure programs in a way such that parallel programs not sharing memory profit from mutable state. Note that it is also a challenge to debug locks in a multithreaded program so that they run 24/7. Parallelism is not easy after all. > Then you almost always end up copying data unnecessarily because you cannot > collect it otherwise, which increases memory consumption and massively > degrades performance that, in turn, completely undermines the original point > of parallelism. Ok, I understand. We are complete fools. :-) I think that the cost of copying data is totally overrated. We are doing this often, and even over the network, and hey, we are breaking every speed limit. > The cost of interthread communication is then so high in > OCaml that you will rarely be able to obtain any performance improvement for > the number of cores desktop machines are going to see over the next ten > years, by which time OCaml will be 10-100x slower than the competition. This is a quite theoretical statement. We will rather see that most application programmers will not learn parallelism at all, and that consumers will start question the sense of multicores, and the chip industry will search for alternatives. And _if_ application programmers learn parallelism, then rather in the multi-processing/multiplexing setup we use, and not the multithreading style you propagate. And on servers (where parallelism ought to happen), the poor support of Windows for it (lacking "fork" and other cool features) is no problem. Gerd > > > When was the last time you heard of a cool new windows app anyway? > > The last time we released a product. :-) > > > > . No 16Mb limit. > > > > What do you mean by 16mb limit? > > OCaml's strings and arrays are limited to 16Mb in 32-bit. > > > > . Inlining. > > > > isn't it best for the compiler to handle that? I wouldn't mind hearing > > another perspective on this, but I thought that compilers were smarter > > these days. > > Definitely not. Compilers uniformly suck at inlining. For example, agressive > inlining is often beneficial in numerical code and often damaging in symbolic > code. Compilers cannot tell the difference. > > This is very similar to "unboxed data structures are always better", which > also isn't generally true. > > I've got more gripes to add: > > . Missing types, like float32 and int16. > . DLLs. > -- ------------------------------------------------------------ Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany gerd@gerd-stolpmann.de http://www.gerd-stolpmann.de Phone: +49-6151-153855 Fax: +49-6151-997714 ------------------------------------------------------------