From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from mail4-relais-sop.national.inria.fr (mail4-relais-sop.national.inria.fr [192.134.164.105]) by walapai.inria.fr (8.13.6/8.13.6) with ESMTP id p9DM7VwV005213 for ; Fri, 14 Oct 2011 00:07:31 +0200 X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AqkAAJ5fl07U4367kWdsb2JhbABDhHWhMYI1IgEBAQEJCwsHFAMigVMBAQEEAQEBIEsLEAsaAiYCAicwGQmHdwalXpFjgSyBH4QOgRQEjESMXIwv X-IronPort-AV: E=Sophos;i="4.69,342,1315173600"; d="scan'208";a="112836355" Received: from moutng.kundenserver.de ([212.227.126.187]) by mail4-smtp-sop.national.inria.fr with ESMTP; 14 Oct 2011 00:07:26 +0200 Received: from office1.lan.sumadev.de (dslb-094-219-223-043.pools.arcor-ip.net [94.219.223.43]) by mrelayeu.kundenserver.de (node=mreu0) with ESMTP (Nemesis) id 0MF8bn-1RKEzu0vox-00G4EI; Fri, 14 Oct 2011 00:07:25 +0200 Received: from [192.168.0.29] (dslb-084-058-005-072.pools.arcor-ip.net [84.58.5.72]) by office1.lan.sumadev.de (Postfix) with ESMTPSA id 6FC7AC00C7; Fri, 14 Oct 2011 00:07:24 +0200 (CEST) From: Gerd Stolpmann To: caml-list@inria.fr Cc: plasma-list@ocaml-programming.de In-Reply-To: <1318436373.16477.216.camel@thinkpad> References: <1318436373.16477.216.camel@thinkpad> Content-Type: text/plain; charset="UTF-8" Date: Fri, 14 Oct 2011 00:07:22 +0200 Message-ID: <1318543642.16477.284.camel@thinkpad> Mime-Version: 1.0 X-Mailer: Evolution 2.28.3 Content-Transfer-Encoding: 7bit X-Provags-ID: V02:K0:feJyw3+g/mHhp8u/kp+MxgfWvh5WZDIo1gLca6neyFE 6afAQGOdeillKQq+n5+hqPCMVKSj5pEYhmlpumERLUZY8VSeov oadGamN9M9wuVX5hIEBhAAVOmrNY2uLkDkTm94jrM8VMgZQTe2 Qh4yxm2V4b/Qk+TQlQ/9iw+WU2P/JVpsRFgyzsSXTREKmUN/3y uHZVBp9ddcL8gL0NKEU8/q+y+tthhlmgD097UtklPXGN4mcOMS K4v2Ml3MIAakFaQdNt4JEslfjsGyUSfZSE1IK/yklKUFT2ands EmjsCXJpS68LqUPwXL6Efra/dkH5TLPoUls228kSlcWUkeDAai hAFv1Y0bgiIpinOxCfcHZ14NSt/MfGz75IairZJsv Subject: Re: [Caml-list] [ANN] Plasma MapReduce, PlasmaFS, version 0.4 There is now even plasma-0.4.1, fixing some performance bugs. Also, there are now some simple performance numbers: http://plasma.camlcity.org/plasma/perf.html Gerd Am Mittwoch, den 12.10.2011, 18:19 +0200 schrieb Gerd Stolpmann: > Hi, > > I've just released Plasma-0.4. Plasma consists of two parts (for now), > namely Plasma MapReduce, a map/reduce compute framework, and PlasmaFS, > the underlying distributed filesystem. > > Major changes in version 0.4: > > * Added a security system (including strong authentication, and > authorization). This is a quite big change, and makes PlasmaFS a > highly secure DFS. > * Datanodes are now monitored, and failed nodes are automatically > considered as unavailable. The monitoring system uses multicast > messaging. > * The namenode can now profit from multi-processing, removing a > potential bottleneck. > * Improved the caching subsystem. > * Better management of file buffers in map/reduce jobs. > > Of course, there are also numerous bug fixes and performance > improvements. > > Plasma MapReduce is a distributed implementation of the map/reduce > algorithm scheme. In a sentence, map/reduce performs a parallel List.map > on an input file, sorts and splits the output by some criterion into > partitions, and runs a List.fold_left on each partition. Only that it > does not do that sequentially, but in a distributed way, and chunk by > chunk. Because of this Plasma MapReduce can process very large files, > and if run on enough computers, this also will work in reasonable time. > Of course, map and reduce are Ocaml functions here. > > This all works on top of a distributed filesystem, PlasmaFS. This is a > user-space filesystem that is primarily accessed over RPC (but it is > also mountable as NFS volume). Actually, most of the effort went here. > PlasmaFS focuses on reliability and speed for big blocksizes. To get > this, it implements ACID transactions, replicates data and metadata with > two-phase commit, uses a shared memory data channel if possible, and > monitors itself. Unlike other filesystems for map/reduce, PlasmaFS > implements the complete set of usual file operations, including random > reads and writes. It can also be used as unspecialized global > filesystem. > > Both pieces of software are bundled together in one download. The > project page with further links is > > http://projects.camlcity.org/projects/plasma.html > > There is now also a homepage at > > http://plasma.camlcity.org > > This is an early alpha release (0.4). A lot of things work already, and > you can already run distributed map/reduce jobs. However, it is in no > way complete. > > Plasma is installable via GODI for Ocaml 3.12. > > There is now a chart comparing Plasma with Hadoop. In one sentence, > PlasmaFS bases on a superior filesystem design, and has now to prove > that the implementation is really working. Plasma map/reduce generalizes > the algorithm scheme compared with Hadoop, but has still some > shortcomings in the implementation: > > http://plasma.camlcity.org/plasma/dl/plasma-0.4/doc/html/Plasmafs_and_hdfs.html > > http://plasma.camlcity.org/plasma/dl/plasma-0.4/doc/html/Plasmamr_and_hadoop.html > > > For discussions on specifics of Plasma there is a separate mailing list: > > https://godirepo.camlcity.org/mailman/listinfo/plasma-list > > Gerd > -- > ------------------------------------------------------------ > Gerd Stolpmann, Darmstadt, Germany gerd@gerd-stolpmann.de > Creator of GODI and camlcity.org. > Contact details: http://www.camlcity.org/contact.html > Company homepage: http://www.gerd-stolpmann.de > *** Searching for new projects! Need consulting for system > *** programming in Ocaml? Gerd Stolpmann can help you. > ------------------------------------------------------------ > > -- ------------------------------------------------------------ Gerd Stolpmann, Darmstadt, Germany gerd@gerd-stolpmann.de Creator of GODI and camlcity.org. Contact details: http://www.camlcity.org/contact.html Company homepage: http://www.gerd-stolpmann.de *** Searching for new projects! Need consulting for system *** programming in Ocaml? Gerd Stolpmann can help you. ------------------------------------------------------------