From: Eray Ozkural
Date: Wed, 8 Feb 2017 09:30:18 +0300
To: Caml List
Subject: [Caml-list] ANN: parallpairs

---> https://github.com/examachine/parallpairs

Parallel all-pairs similarity search algorithms in OCaml

If you use this code, please cite the following paper. It is currently under review at IJPP.

https://arxiv.org/abs/1402.3010

1-D and 2-D Parallel Algorithms for All-Pairs Similarity Problem

Eray Özkural, Cevdet Aykanat (Submitted on 13 Feb 2014)

All-pairs similarity problem asks to find all vector pairs in a set of vectors the similarities of which surpass a given similarity threshold, and it is a computational kernel in data mining and information retrieval for several tasks. We investigate the parallelization of a recent fast sequential algorithm. We propose effective 1-D and 2-D data distribution strategies that preserve the essential optimizations in the fast algorithm. 1-D parallel algorithms distribute either dimensions or vectors, whereas the 2-D parallel algorithm distributes data both ways. Additional contributions to the 1-D vertical distribution include a local pruning strategy to reduce the number of candidates, a recursive pruning algorithm, and block processing to reduce imbalance. The parallel algorithms were programmed in OCaml which affords much convenience. Our experiments indicate that the performance depends on the dataset, therefore a variety of parallelizations is useful.
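To make the problem statement in the abstract concrete, here is a minimal brute-force sketch in OCaml. It is not the fast sequential algorithm the paper parallelizes and it is not taken from the parallpairs repository. The sparse-vector representation, the names `dot` and `all_pairs`, the use of the dot product as the similarity measure, and the choice of "at least the threshold" as the acceptance test are all assumptions made for illustration.

```ocaml
(* A sparse vector: (dimension, weight) pairs sorted by increasing dimension.
   Hypothetical representation, chosen only for this sketch. *)
type sparse = (int * float) list

(* Dot product of two sparse vectors by merging their sorted dimension lists. *)
let rec dot (x : sparse) (y : sparse) : float =
  match x, y with
  | [], _ | _, [] -> 0.0
  | (i, a) :: xs, (j, b) :: ys ->
    if i = j then (a *. b) +. dot xs ys
    else if i < j then dot xs y
    else dot x ys

(* Brute-force all-pairs search: return every (i, j, similarity) with i < j
   whose dot product is at least the threshold t. Quadratic in the number of
   vectors; no pruning is attempted. *)
let all_pairs (t : float) (vs : sparse array) : (int * int * float) list =
  let n = Array.length vs in
  let acc = ref [] in
  for i = 0 to n - 1 do
    for j = i + 1 to n - 1 do
      let s = dot vs.(i) vs.(j) in
      if s >= t then acc := (i, j, s) :: !acc
    done
  done;
  List.rev !acc

let () =
  let vs =
    [| [ (0, 1.0); (2, 1.0) ]; [ (0, 1.0); (1, 1.0) ]; [ (1, 2.0); (2, 1.0) ] |]
  in
  (* Only vectors 1 and 2 reach the threshold 1.5; prints: 1 2 2.00 *)
  List.iter
    (fun (i, j, s) -> Printf.printf "%d %d %.2f\n" i j s)
    (all_pairs 1.5 vs)
```

A baseline like this is mainly useful for checking the output of a faster or distributed implementation on small inputs.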

The code is quite interesting, as it shows how to effectively use OCaml for MPI code. There is a bunch of well-written parallel functional code that I will extract from this codebase and release separately. You need the latest ocamlmpi release as that contains the patches I made to make this code work.
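For readers who want a feel for the distribution by dimensions mentioned in the abstract before opening the repository, the sketch below simulates it sequentially in plain OCaml, with no MPI calls. It relies only on the fact that a dot product splits into partial sums over disjoint sets of dimensions. The cyclic assignment `d mod p`, the names `local_part` and `distributed_dot`, and the assumption of non-negative dimension indices are hypothetical; the actual partitioning and communication in parallpairs may well differ.

```ocaml
(* Sparse vector: (dimension, weight) pairs sorted by increasing dimension.
   Dimensions are assumed to be non-negative. *)
type sparse = (int * float) list

(* Dot product of two sparse vectors by merging their sorted dimension lists. *)
let rec dot (x : sparse) (y : sparse) : float =
  match x, y with
  | [], _ | _, [] -> 0.0
  | (i, a) :: xs, (j, b) :: ys ->
    if i = j then (a *. b) +. dot xs ys
    else if i < j then dot xs y
    else dot x ys

(* Components of v owned by simulated process r out of p.
   The cyclic assignment is an illustrative assumption. *)
let local_part (p : int) (r : int) (v : sparse) : sparse =
  List.filter (fun (d, _) -> d mod p = r) v

(* Each simulated process computes a partial score from its own dimensions.
   Summing the partial scores stands in for a reduction across processes. *)
let distributed_dot (p : int) (x : sparse) (y : sparse) : float =
  let partial r = dot (local_part p r x) (local_part p r y) in
  List.fold_left (fun acc r -> acc +. partial r) 0.0 (List.init p (fun r -> r))

let () =
  let x = [ (0, 1.0); (1, 2.0); (3, 1.0) ] in
  let y = [ (1, 3.0); (2, 1.0); (3, 2.0) ] in
  (* The partial scores over 2 simulated processes add up to the full score. *)
  assert (distributed_dot 2 x y = dot x y)
```

In a real MPI program each rank would hold only its own `local_part` of every vector, and the final sum would be a communication step rather than a local fold.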

This code is released under AGPL-3.0. Please do not ask me to release it under a BSD license. If you need a commercial license, you should purchase it.

Happy hacking!


Eray Ozkural, PhD



--
Eray Ozkural, PhD. Computer Scientist
Founder, Gok Us Sibernetik Ar&Ge Ltd.
http://groups.yahoo.com/group/ai-philosophy