From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail2-relais-roc.national.inria.fr (mail2-relais-roc.national.inria.fr [192.134.164.83])
  by sympa.inria.fr (Postfix) with ESMTPS id 4070E80161
  for ; Tue, 20 Jun 2017 08:34:38 +0200 (CEST)
X-IronPort-AV: E=Sophos;i="5.39,364,1493676000";
  d="scan'208";a="279723344"
Received: from 198.67.28.109.rev.sfr.net (HELO hadrien) ([109.28.67.198])
  by mail2-relais-roc.national.inria.fr with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Jun 2017 08:34:37 +0200
Date: Tue, 20 Jun 2017 08:34:37 +0200 (CEST)
From: Julia Lawall
X-X-Sender: jll@hadrien
To: Francois BERENGER
cc: caml-list@inria.fr
In-Reply-To:
Message-ID:
References: <88108a18-8044-bc46-bed3-aef241b4ff48@linux-france.org>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: multipart/mixed; BOUNDARY="8323329-52966783-1497940478=:24151"
Subject: Re: [Caml-list] segmentation fault

--8323329-52966783-1497940478=:24151
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8BIT

On Tue, 20 Jun 2017, Francois BERENGER wrote:

> On 06/20/2017 06:19 AM, Julia Lawall wrote:
> >
> >
> > On Mon, 19 Jun 2017, Josh Berdine wrote:
> > >
> > > On Mon, Jun 19 2017, Julia Lawall wrote:
> > >
> > > > On Mon, 19 Jun 2017, David MENTRÉ wrote:
> > > >
> > > > > Hello Julia,
> > > > >
> > > > > On 2017-06-18 at 21:41, Julia Lawall wrote:
> > > > > > Over several runs on two different laptops, the backtraces have
> > > > > > nothing obvious in common. The bytecode version does not seem to
> > > > > > stack overflow. Adding Gc.print_stat() at a periodic quiescent
> > > > > > point in the execution did not show a memory leak.
> > > > >
> > > > > A similar issue related to random crashes in the native code version
> > > > > was raised by Alexey Egorov on this list 9 days ago. Daniel Bünzli
> > > > > advised him to call Gc.full_major () frequently, to get a crash
> > > > > closer to the real issue.
> > > >
> > > > OK, I will try this.
> > >
> > > Are you using Windows? I have only experienced segfaults from stack
> > > overflow on Windows.
> >
> > No, Linux.
> >
> > >
> > > > > In the case of Alexey, it was a non-tail-recursive call that triggered
> > > > > a stack overflow, which is not detected in OCaml native code.
> > > > > Apparently this kind of issue is fixed in the upcoming OCaml 4.06.0.
> > > >
> > > > Why would something like this not be deterministic? In my case, it can
> > > > happen after seconds or after tens of minutes.
> > >
> > > This sounds like an interaction with the GC to me.
> >
> > This was my thought as well. Also, of the four cores I have, two are in
> > the Gc. But two are not.
>
> If this is using parmap, I would suggest disabling parmap and trying a
> sequential version of the program.

Parmap is there, but there are tests to avoid calling it when only one
core is requested.

julia

>
> I have seen parmap "masquerading" exceptions.
> But when you don't use parmap, you can truly see what exception you have
> and where it is coming from.
>
> > > > > Of course, your issue might be entirely different. But frequently
> > > > > calling Gc.full_major () seems a sensible starting point to me.
> > > > >
> > > > > Otherwise you have the usual suspects: are you using C bindings?
> > > > > Threads?
> > > >
> > > > There are no threads. The software may use C bindings. I don't think
> > > > they are involved in the failing execution, but I'm not 100% sure.
> > >
> > > I have found running under valgrind to be helpful in this sort of
> > > situation, though it is slow and some valid code can trigger many
> > > warnings.
> >
> > OK, I'll try that.
> >
> > thanks,
> > julia
> >
> > >
> > > Cheers, Josh
> > >
> >
>
> --
> Caml-list mailing list. Subscription management and archives:
> https://sympa.inria.fr/sympa/arc/caml-list
> Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
> Bug reports: http://caml.inria.fr/bin/caml-bugs
>
--8323329-52966783-1497940478=:24151--
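
The suggestion quoted in this thread, calling Gc.full_major () frequently so
that a crash caused by heap corruption shows up closer to its real cause,
could be sketched roughly as below. This is only an illustration and not code
from the program under discussion: the helper name run_with_frequent_gc, its
parameters, and the stand-in workload are all hypothetical. Only
Gc.full_major itself comes from the OCaml standard library.

```ocaml
(* Hypothetical helper (not from the thread): run [step i] for
   i = 0 .. n-1 and call [collect ()] after every [every] iterations.
   [collect] defaults to the standard library's Gc.full_major, which
   forces a complete major collection. *)
let run_with_frequent_gc ?(collect = Gc.full_major) ?(every = 1) ~n step =
  if every <= 0 then
    invalid_arg "run_with_frequent_gc: every must be positive";
  for i = 0 to n - 1 do
    step i;
    (* Collecting often means that a heap corrupted by [step] is more
       likely to be noticed soon after the faulty iteration. *)
    if (i + 1) mod every = 0 then collect ()
  done

(* Stand-in workload, assumed for illustration: allocate a short list
   per iteration and accumulate its length. *)
let () =
  let total = ref 0 in
  run_with_frequent_gc ~every:10 ~n:100 (fun i ->
      let l = List.init 1000 (fun j -> i + j) in
      total := !total + List.length l);
  Printf.printf "processed %d elements\n" !total
```

Lowering every makes the run slower but narrows the window between the
faulty operation and the crash; that trade-off is the reason the interval is
a parameter in this sketch.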