To: caml-list@inria.fr
References: <88108a18-8044-bc46-bed3-aef241b4ff48@linux-france.org>
From: Francois BERENGER
Date: Tue, 20 Jun 2017 10:01:59 +0900
Subject: Re: [Caml-list] segmentation fault

On 06/20/2017 06:19 AM, Julia Lawall wrote:
>
>
> On Mon, 19 Jun 2017, Josh Berdine wrote:
>
>> On Mon, Jun 19 2017, Julia Lawall wrote:
>>
>>> On Mon, 19 Jun 2017, David MENTRÉ wrote:
>>>
>>>> Hello Julia,
>>>>
>>>> On 2017-06-18 at 21:41, Julia Lawall wrote:
>>>>> Over several
>>>>> runs on two different laptops, the backtraces have nothing obvious in
>>>>> common. The bytecode version does not seem to stack overflow. Adding
>>>>> Gc.print_stat() at a periodic quiescent point in the execution did not
>>>>> show a memory leak.
>>>>
>>>> A similar issue related to random crash in native code version was asked
>>>> by Alexey Egorov on this list 9 days ago. Daniel Bünzli advised to him
>>>> to frequently call Gc.full_major () to have a crash closer to the real
>>>> issue.
>>>
>>> OK, I will try this.
>>
>> Are you using Windows? I have only experienced segfaults from stack
>> overflow on Windows.
>
> No, Linux.
>
>>
>>>> In the case of Alexey, it was a non tail-recursive call that triggered a
>>>> stack overflow and that is not detected in OCaml native code. Apparently
>>>> this kind of issue is fixed in next to come OCaml 4.06.0.
>>>
>>> Why would something like this not be deterministic?
>>> In my case, it can
>>> happen after seconds or after tens of minutes.
>>
>> This sounds like an interaction with the GC to me.
>
> This was my thought as well. Also, of the four cores I have, two are in
> the Gc. But two are not.

If this is using parmap, I would suggest disabling parmap and trying a
sequential version of the program.

I have seen parmap mask exceptions.
Without parmap, you can see exactly which exception is raised and
where it comes from.

>>>> Of course, your issue might be entirely different. But frequently
>>>> calling Gc.full_major () seems a sensible starting point to me.
>>>>
>>>> Otherwise you have the usual suspects: are you using C bindings? Threads?
>>>
>>> There are no threads. The software may use C bindings. I don't think
>>> they are involved in the failing execution, but I'm not 100% sure.
>>
>> I have found running under valgrind to be helpful in this sort of
>> situation, though it is slow and some valid code can trigger many
>> warnings.
>
> OK, I'll try that.
>
> thanks,
> julia
>
>>
>> Cheers, Josh
>>
>
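
PS: here is a minimal sketch of what a sequential debugging run could look
like, combining the two suggestions from this thread (no parmap, and frequent
calls to Gc.full_major ()). It uses only the standard library. The names
debug_map, gc_every and work are hypothetical, made up for illustration; they
are not part of parmap or of Julia's program, and the real per-item function
would replace work.

```ocaml
(* Hypothetical debugging helper (not part of any library): a sequential,
   tail-recursive map that forces a full major collection every [gc_every]
   elements, so that a corrupted heap is more likely to crash close to the
   code that corrupted it. *)
let debug_map ?(gc_every = 100) (f : 'a -> 'b) (xs : 'a list) : 'b list =
  let rec loop i acc = function
    | [] -> List.rev acc
    | x :: rest ->
      let y = f x in
      if (i + 1) mod gc_every = 0 then Gc.full_major ();
      loop (i + 1) (y :: acc) rest
  in
  loop 0 [] xs

let () =
  (* Record backtraces so an escaping exception shows where it was raised. *)
  Printexc.record_backtrace true;
  (* [work] stands in for the real per-item computation (an assumption). *)
  let work x = if x < 0 then failwith "negative input" else x * x in
  match debug_map ~gc_every:2 work [1; 2; 3; -1] with
  | results -> List.iter (Printf.printf "%d\n") results
  | exception e ->
    Printf.eprintf "exception: %s\n%s"
      (Printexc.to_string e) (Printexc.get_backtrace ())
```

Once the sequential run is clean, the call to debug_map can be swapped back
for the parallel one. Compiling with -g is needed for the backtrace to be
informative in native code.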